[Medin_standards] P021 keyword storage in NERC/MEDIN metadata
hrz at geodata.soton.ac.uk
Tue Nov 16 14:31:37 GMT 2010
Hello,
Yes, that sounds like it will do the trick. It should be a fairly
minor change to the portal to integrate it but I won't
develop/test/deploy until we have documents with the new structure in
the discovery web service.
I'm imagining an implementation whereby if the anchor tag is not found
we fall back to the current way of doing things; this should make the
transition easier.
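
A minimal sketch of that fallback, not the portal's actual code. It uses Python's standard xml.etree.ElementTree and the ISO 19139 namespace URIs; the function name read_keyword is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 / XLink namespace URIs.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def read_keyword(keyword_el):
    """Return (term_text, term_url) for one gmd:keyword element.

    Hypothetical helper: term_url is None when only the old
    gco:CharacterString form is present.
    """
    # New structure: gmx:Anchor carries both the text and the term URL.
    anchor = keyword_el.find("gmx:Anchor", NS)
    if anchor is not None:
        return (anchor.text or "").strip(), anchor.get(XLINK_HREF)

    # Fallback: the current structure, text only.
    plain = keyword_el.find("gco:CharacterString", NS)
    if plain is not None:
        return (plain.text or "").strip(), None

    return None, None
```

When the URL comes back as None, the caller would carry on with the existing text-based lookup.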
Regards,
Homme
On Tue, Nov 16, 2010 at 01:41:09PM +0000, Charlesworth, Mark E. wrote:
> All,
> The solution to this from a standards point of view is that we will amend the xml which uses vocabs so that the term and code will both be included as follows:
>
> Use the gmx:Anchor element. This element substitutes for gco:CharacterString and works in the following way:
>
> <gmd:keyword>
> <gmx:Anchor xlink:href="http://vocab.ndg.nerc.ac.uk/term/P021/59/ZOOB">Zoobenthos taxonomy-related counts</gmx:Anchor>
> </gmd:keyword>
>
> instead of:
>
> <gmd:keyword>
> <gco:CharacterString>Zoobenthos taxonomy-related counts</gco:CharacterString>
> </gmd:keyword>
>
> This should give enough information for the portal to cope with deprecations in vocabs.
>
> Unless you are generating discovery metadata yourselves, this will not impact you: the changes will be made in the various tools themselves over the next month so that they generate correct XML. Once we have a full example XML file we will re-release the MEDIN discovery standard guidance document.
>
> Best Wishes
> Mark
>
>
> From: medin_standards-bounces at biwebs1.nerc-liv.ac.uk [mailto:medin_standards-bounces at biwebs1.nerc-liv.ac.uk] On Behalf Of Lowry, Roy K.
> Sent: 08 November 2010 16:45
> To: medin_standards at mailman.nerc-liv.ac.uk
> Cc: Clements, David O.; Jason Sadler; Thorne, Kay
> Subject: [Medin_standards] FW: P021 keyword storage in NERC/MEDIN metadata
>
> Houston, we may have a problem........
>
> From: steve.donegan at stfc.ac.uk [mailto:steve.donegan at stfc.ac.uk]
> Sent: 08 November 2010 16:36
> To: Lowry, Roy K.
> Cc: jds at geodata.soton.ac.uk
> Subject: RE: P021 keyword storage in NERC/MEDIN metadata
>
> Hi Roy,
>
> I think the problem stems from the fact that MEDIN provides the keyword value, i.e. "Zoobenthos taxonomy-related counts", and alongside it the ISO gmd:thesaurusName element gives a title of "SeaDataNet P021 parameter discovery vocabulary". There is no specification of the actual term ID and URL, i.e. "http://vocab.ndg.nerc.ac.uk/term/P021/59/ZOOB", elsewhere in the gmd:thesaurusName section. The ingest system doesn't touch this element at all; it's the portal that takes the keyword value and recursively looks it up in the available lists to get a definition. I think this is how it works, from what I've seen. Until the actual term URL and version number are specified in the metadata I don't think there's a lot that can be done?
>
> cheers,
>
> Steve
>
> ________________________________
> From: Lowry, Roy K. [mailto:rkl at bodc.ac.uk]
> Sent: 08 November 2010 15:55
> To: Donegan, Steve (STFC,RAL,SSTD)
> Cc: Jason Sadler
> Subject: P021 keyword storage in NERC/MEDIN metadata
> Hi Steve,
>
> I'm currently trying to understand/overcome the consequences of the dynamic nature of the P021 vocabulary, which has a governance that allows term broadening and term deprecation for the MEDIN/NERC portals. Basically, what can happen as a result is that the text associated with a given URI can change and, if they specify P021 and just P02, they can disappear. Anything that disappears (moves into P022) has a replacement P021 term indicated by a 1-to-1 mapping.
>
> In SeaDataNet we manage this by refreshing the vocabulary in the metadata generation tools that produce the XML. This leaves the issue of stale P021 text and deprecated codes in XML files 'in transit' and metadatabases generated from the ingestion of these files. We only ingest and store the URIs: translation to text is handled by a dynamic call to the vocabulary server.
>
> If the URI has a version number embedded in it then this call returns the text as it was at the time of metadata creation. Alternatively, replacing the version number by 'current' in a URL or '::' in a URN causes the most up-to-date text to be displayed. We have adopted the latter approach in SeaDataNet, but it isn't the only approach.
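
A small sketch of the URL rewrite described above, swapping a pinned version number for 'current'. The function name to_current is hypothetical, and the path layout /term/<list>/<version>/<code> is an assumption inferred from the single example URL in this thread; the URN '::' form is not covered.

```python
from urllib.parse import urlsplit, urlunsplit


def to_current(term_url):
    """Replace a numeric version segment in a vocabulary term URL with 'current'.

    Assumes the path looks like /term/<list>/<version>/<code>;
    any other URL is returned unchanged.
    """
    parts = urlsplit(term_url)
    segments = parts.path.split("/")  # ['', 'term', 'P021', '59', 'ZOOB']
    if len(segments) == 5 and segments[1] == "term" and segments[3].isdigit():
        segments[3] = "current"
    return urlunsplit(parts._replace(path="/".join(segments)))
```

For example, to_current("http://vocab.ndg.nerc.ac.uk/term/P021/59/ZOOB") yields the same URL with 59 replaced by current.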
>
> Deprecation in SeaDataNet is dealt with by a daily cron that sweeps the metadatabases and automatically translates any deprecated URIs into their replacement.
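
A rough sketch of what such a sweep could look like; it is not the SeaDataNet implementation. The names resolve and sweep, the in-memory dict standing in for a metadatabase, and the shape of the 1-to-1 replacement mapping are all assumptions for illustration.

```python
def resolve(uri, replacements):
    """Follow the 1-to-1 deprecation mapping until a non-deprecated URI is reached.

    `replacements` maps deprecated URI -> replacement URI (hypothetical shape).
    The `seen` set guards against an accidental cycle in the mapping.
    """
    seen = set()
    while uri in replacements and uri not in seen:
        seen.add(uri)
        uri = replacements[uri]
    return uri


def sweep(records, replacements):
    """Return a copy of `records` with every deprecated URI translated.

    `records` maps a record id to its list of keyword URIs; this dict is a
    stand-in for whatever store the daily job actually works on.
    """
    return {
        record_id: [resolve(uri, replacements) for uri in uris]
        for record_id, uris in records.items()
    }
```

Following the mapping repeatedly means a replacement that is itself deprecated later still ends up at a live term.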
>
> I'm not sure how this issue is being dealt with in the MEDIN/NERC case, but some code I've seen lately (which I think is the portal) seems to do a verifyTerm against the current vocabulary list, which if you aren't refreshing content seems like an accident waiting to happen.
>
> Any clarification you can give me on what happens to MEDIN XML files after they have been harvested would be helpful.
>
> Cheers, Roy.
>
>
--
Homme Zwaagstra
GeoData Institute
University of Southampton