[Medin_standards] P021 keyword storage in NERC/MEDIN metadata

Charlesworth, Mark E. mecha at bodc.ac.uk
Tue Nov 9 10:51:01 GMT 2010


All,
I have just spoken to Roy about this and will do some further investigation and get back to you all by the end of the week. On the face of it, it looks as though this can be resolved fairly easily from a MEDIN standard perspective, but it may take further thought from a portal perspective.
Mark

From: medin_standards-bounces at biwebs1.nerc-liv.ac.uk [mailto:medin_standards-bounces at biwebs1.nerc-liv.ac.uk] On Behalf Of Lowry, Roy K.
Sent: 08 November 2010 16:45
To: medin_standards at mailman.nerc-liv.ac.uk
Cc: Clements, David O.; Jason Sadler; Thorne, Kay
Subject: [Medin_standards] FW: P021 keyword storage in NERC/MEDIN metadata

Houston, we may have a problem........

From: steve.donegan at stfc.ac.uk [mailto:steve.donegan at stfc.ac.uk]
Sent: 08 November 2010 16:36
To: Lowry, Roy K.
Cc: jds at geodata.soton.ac.uk
Subject: RE: P021 keyword storage in NERC/MEDIN metadata

Hi Roy,

I think the problem stems from the fact that MEDIN provides the keyword value, i.e. "Zoobenthos taxonomy-related counts", and in the ISO gmd:thesaurusName element gives a title of "SeaDataNet P021 parameter discovery vocabulary". The actual term ID and URL, i.e. http://vocab.ndg.nerc.ac.uk/term/P021/59/ZOOB, are not specified anywhere in the gmd:thesaurusName section. The ingest system doesn't touch this element at all; it's the portal that takes the keyword value and recursively looks it up in the available lists to get a definition. I think this is how it works, from what I've seen. Until the actual term URL and version number are specified in the metadata, I don't think there's a lot that can be done?
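
A minimal sketch of why a lookup keyed on the keyword text is fragile. It is not the portal's code: the function name, the data layout, and the broadened label are assumptions made for illustration, and a flat scan stands in for the recursive lookup described above. Only the keyword text, the thesaurus title, and the versioned term URL come from this thread.

```python
from typing import Dict, Optional

# Assumed layout: thesaurus title -> {preferred label -> term URL}
VocabLists = Dict[str, Dict[str, str]]

THESAURUS = "SeaDataNet P021 parameter discovery vocabulary"


def lookup_by_label(lists: VocabLists, keyword: str) -> Optional[str]:
    """Return the URL of the first term whose label matches exactly, else None."""
    for terms in lists.values():
        url = terms.get(keyword)
        if url is not None:
            return url
    return None


# State of the list when the metadata record was written.
lists_at_creation: VocabLists = {
    THESAURUS: {
        "Zoobenthos taxonomy-related counts":
            "http://vocab.ndg.nerc.ac.uk/term/P021/59/ZOOB",
    }
}

# Hypothetical later state: same term, but the label has been broadened.
# The new label text is invented purely for this example.
lists_after_broadening: VocabLists = {
    THESAURUS: {
        "Zoobenthos counts (hypothetical broadened label)":
            "http://vocab.ndg.nerc.ac.uk/term/P021/current/ZOOB",
    }
}

keyword = "Zoobenthos taxonomy-related counts"
found_before = lookup_by_label(lists_at_creation, keyword)
found_after = lookup_by_label(lists_after_broadening, keyword)
```

With only the text stored in the record, found_before is the term URL but found_after is None, because nothing in the record ties the old text to the term once its label changes. Storing the term URL alongside the text would avoid relying on the label at all.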

cheers,

Steve

________________________________
From: Lowry, Roy K. [mailto:rkl at bodc.ac.uk]
Sent: 08 November 2010 15:55
To: Donegan, Steve (STFC,RAL,SSTD)
Cc: Jason Sadler
Subject: P021 keyword storage in NERC/MEDIN metadata
Hi Steve,

I'm currently trying to understand and overcome the consequences for the MEDIN/NERC portals of the dynamic nature of the P021 vocabulary, whose governance allows term broadening and term deprecation. Basically, what can happen as a result is that the text associated with a given URI can change and, if they specify P021 and just P02, they can disappear. Anything that disappears (moves into P022) has a replacement P021 term indicated by a 1-to-1 mapping.

In SeaDataNet we manage this by refreshing the vocabulary in the metadata generation tools that produce the XML. This leaves the issue of stale P021 text and deprecated codes in XML files 'in transit' and in metadatabases generated from the ingestion of these files. We only ingest and store the URIs: translation to text is handled by a dynamic call to the vocabulary server.

If the URI has a version number embedded in it then this call returns the text as it was at the time of metadata creation.  Alternatively, replacing the version number by 'current' in a URL or '::' in a URN causes the most up-to-date text to be displayed.  We have adopted the latter approach in SeaDataNet, but it isn't the only approach.
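
A minimal sketch of the version swap described above. The URL layout is inferred from the single example URL quoted elsewhere in this thread (/term/<list>/<version>/<code>); the four-field URN layout <authority>:<list>:<version>:<code> and both function names are assumptions, not something the thread specifies.

```python
import re

# Assumed URL layout: .../term/<list>/<numeric version>/<code>
_URL_VERSION = re.compile(r"(/term/[^/]+/)\d+(/[^/]+/?)$")


def to_current_url(term_url: str) -> str:
    """Replace a numeric version segment with 'current'.

    URLs that do not match the assumed layout are returned unchanged.
    """
    return _URL_VERSION.sub(r"\1current\2", term_url)


def to_current_urn(term_urn: str) -> str:
    """Blank the version field of an assumed four-field URN, giving '::'.

    URNs that do not have four fields with a numeric third field are
    returned unchanged.
    """
    parts = term_urn.split(":")
    if len(parts) == 4 and parts[2].isdigit():
        parts[2] = ""
    return ":".join(parts)


url_now = to_current_url("http://vocab.ndg.nerc.ac.uk/term/P021/59/ZOOB")
urn_now = to_current_urn("SDN:P021:59:ZOOB")  # URN form assumed for illustration
```

Doing the swap at display time rather than at ingest keeps the versioned URI in the stored record, so either approach (text as at creation, or most up-to-date text) remains available.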

Deprecation in SeaDataNet is dealt with by a daily cron that sweeps the metadatabases and automatically translates any deprecated URIs into their replacement.
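
A minimal sketch of such a sweep, assuming the 1-to-1 mapping is available as a dictionary from deprecated URI to replacement URI. This is not the SeaDataNet cron job: the record layout, the "keyword_uris" field, the function name, and the term codes OLD1 and NEW1 are all hypothetical.

```python
from typing import Dict, List


def sweep_deprecated(records: List[dict], replacements: Dict[str, str]) -> int:
    """Rewrite deprecated keyword URIs in place and return the number changed."""
    changed = 0
    for record in records:
        uris = record.get("keyword_uris", [])
        for i, uri in enumerate(uris):
            new_uri = replacements.get(uri)
            if new_uri is not None and new_uri != uri:
                uris[i] = new_uri
                changed += 1
    return changed


BASE = "http://vocab.ndg.nerc.ac.uk/term/P021/current/"

# Hypothetical mapping: OLD1 has been deprecated in favour of NEW1.
replacements = {BASE + "OLD1": BASE + "NEW1"}

# Hypothetical metadatabase rows holding only URIs, as described above.
records = [
    {"id": 1, "keyword_uris": [BASE + "OLD1", BASE + "ZOOB"]},
    {"id": 2, "keyword_uris": [BASE + "ZOOB"]},
]

n_changed = sweep_deprecated(records, replacements)
```

Because the mapping is 1-to-1, the sweep is idempotent: running it again on already-translated records changes nothing.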

I'm not sure how this issue is being dealt with in the MEDIN/NERC case, but some code I've seen lately (which I think is the portal) seems to do a verifyTerm against the current vocabulary list, which, if you aren't refreshing content, seems like an accident waiting to happen.

Any clarification you can give me on what happens to MEDIN XML after they have been harvested would be helpful.

Cheers, Roy.

