[Seavox] Draft mapping

Roy Lowry rkl at bodc.ac.uk
Wed Feb 7 08:22:05 GMT 2007


Hi Luis,

At the moment I am doing my mapping using two computers side-by-side each running a Toad interface onto our Oracle database (a sort of VINE simulation!!).  This, however, is NOT how I want to do it.  Currently, the maps are stored in an Oracle table (see attached document) as a representation of simple RDF triples.  My plan was for Michael to develop import and export tools between this and an XML representation (OWL was the plan, but I feel SKOS would be better) so I could use VINE but he didn't manage it before leaving.  Once he is replaced I will take this forward again. In the meantime I will plod on with my lash-up. Note that within BODC we need the Oracle representation as that's the way our operational systems work and I don't want to re-engineer them at the moment!
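
A minimal sketch of the export step described above, assuming each mapping row can be read as a (subject URI, relationship, object URI) triple. The actual table layout is in the attached document and is not reproduced here, so the relationship codes and the example.org URIs are hypothetical. The property names are those of the current SKOS core vocabulary, and the output is N-Triples rather than RDF/XML to keep the example short.

```python
# Sketch only: the row layout, relationship codes and term URIs are assumptions.
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Assumed reading of a row: "subject is <relationship> than object".
# In SKOS, "A skos:narrowMatch B" says that B is narrower than A, so a
# subject that is BROADER than its object takes skos:narrowMatch.
PREDICATES = {
    "broader": SKOS + "narrowMatch",
    "narrower": SKOS + "broadMatch",
    "synonymous": SKOS + "exactMatch",
}


def to_ntriples(rows):
    """Serialise (subject, relationship, object) rows as N-Triples lines."""
    lines = []
    for subject, relationship, obj in rows:
        predicate = PREDICATES[relationship]  # KeyError on an unknown code
        lines.append(f"<{subject}> <{predicate}> <{obj}> .")
    return "\n".join(lines)


# Hypothetical URIs standing in for BODC and GCMD terms.
rows = [
    ("http://example.org/bodc/chlorophyll_pigments", "synonymous",
     "http://example.org/gcmd/chlorophyll"),
    ("http://example.org/bodc/nitrate_water_column", "broader",
     "http://example.org/gcmd/ocean_chemistry_nitrate"),
]
print(to_ntriples(rows))
```

The direction of each relationship is the easy thing to get wrong in such a conversion, which is why the lookup table spells out the assumed reading of a row.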

I agree about being careful about what is "synonymous".  I remember an e-mail debate with John on this a while back about the balance between an ontologist's viewpoint and the pragmatist's viewpoint. I am currently trying to achieve this balance (testing my judgement was one reason for circulating the draft).  In the example you quote, I am pretty comfortable that what I call 'chlorophyll pigments' and what GCMD calls 'chlorophyll' are exactly the same and what I call the 'water column' and what GCMD calls the 'hydrosphere' are exactly the same (oceans, seas, estuaries, rivers, lakes, puddles.....).  This BODC definition of 'water column' is  why I put BODC 'nitrate in the water column' broader than the GCMD term for 'ocean chemistry > nitrate', which I interpret as just covering salt water.  It's only when we get the term server API up and running and try things out that we will really get a feel for the correct balance to make smart searches behave in a scientifically credible manner (i.e. do what your average scientist expects).

An interesting quandary I've encountered is where a term has two components that have opposing relationships.  For example consider:

nutrients in the ocean 
nitrate in the water column

'Nutrients' is broader than 'nitrate' but 'ocean' is narrower than 'water column'.  In these cases I've adopted the view that the phenomenon dimension is more important than the spatial dimension and so would map 'nutrients in the ocean' broader than 'nitrate in the water column'.  Alternative viewpoints welcome.
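
The precedence rule above can be sketched as a small function. This is an illustration only: the function name and the relationship strings are hypothetical, and each term is assumed to have been split into a phenomenon component and a spatial component already.

```python
# Sketch only: names and relationship strings are illustrative assumptions.
RELATIONSHIPS = {"broader", "narrower", "synonymous"}


def combine(phenomenon, spatial):
    """Relationship of term A to term B from its two component relationships.

    The phenomenon dimension takes precedence; the spatial dimension
    only decides the outcome when the phenomena are synonymous.
    """
    if phenomenon not in RELATIONSHIPS or spatial not in RELATIONSHIPS:
        raise ValueError("unknown relationship")
    if phenomenon != "synonymous":
        return phenomenon
    return spatial


# 'nutrients in the ocean' against 'nitrate in the water column':
# phenomenon is broader, spatial is narrower, so the map is broader.
print(combine("broader", "narrower"))

# Synonymous phenomenon with a broader spatial component
# (the 'water column' against 'Oceans' case) also maps broader.
print(combine("synonymous", "broader"))
```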

Cheers, Roy.

>>> Luis Bermudez <bermudez at mbari.org> 2/6/2007 10:27 pm >>>
Hi Roy,

Great you finally started making some progress.

1) I am interested in the process you are following to do these  
mappings. What tool(s) are you using?

2) SKOS provides a good framework to map resources:
http://www.w3.org/2004/02/skos/mapping/spec/, which I plan to implement with VINE.

3) About broader and synonym, SKOS defines the following:

- If 'concept A has-broad-match concept B' then the set of resources  
properly indexed against concept A is a subset of the set of  
resources properly indexed against concept B.

- If two concepts are an 'exact-match' then the set of resources  
properly indexed against the first concept is identical to the set of  
resources properly indexed against the second. Therefore the two  
concepts may be interchanged in queries and subject-based indexes.
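
The two definitions above can be checked with plain set operations. A minimal sketch, assuming each concept is represented by the set of resources indexed against it; the dataset identifiers are invented for illustration, and "subset" is read here as allowing equality.

```python
# Sketch only: the dataset identifiers below are made up.
def has_broad_match(indexed_a, indexed_b):
    """True if everything indexed against A is also indexed against B."""
    return set(indexed_a) <= set(indexed_b)


def is_exact_match(indexed_a, indexed_b):
    """True if A and B index exactly the same resources."""
    return set(indexed_a) == set(indexed_b)


ocean_nitrate = {"cruise_01", "cruise_02"}
water_column_nitrate = {"cruise_01", "cruise_02", "river_survey_07"}

print(has_broad_match(ocean_nitrate, water_column_nitrate))
print(is_exact_match(ocean_nitrate, water_column_nitrate))
```

Under this reading an exact match is simply a broad match that holds in both directions, which is why a single resource indexed against only one of the two concepts is enough to rule it out.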

My concern is that we have to be very careful about synonyms or  
"exact-match", for example:

BODC Chlorophyll pigment concentrations in the water column =??
GCMD EARTH SCIENCE > Hydrosphere > Water Quality/Water Chemistry >  
Chlorophyll

Cheers,

-Luis


On Jan 31, 2007, at 7:07 AM, Roy Lowry wrote:

> Dear All,
>
> I'm starting to dip my toe into the development of a parameter  
> ontology that will eventually cover BODC Discovery Vocabulary, GCMD  
> Science Keywords and CF Standard Names.  The big step forward from  
> the existing map is that this one now introduces the simple  
> thesaurus relationships narrower, broader and synonymous rather  
> than just an indication that the terms are related.
>
> Attached is my first bash covering about 5% of the BODC PDV to  
> GCMD.  It's as an Excel spreadsheet, which I thought would be  
> easiest for most of you, with BODC PDV terms, GCMD terms and the  
> mappings each way.  Any comments welcome before I go into mass  
> production.  One issue causing me unease is  that I've mapped many  
> BODC terms broader even though I think the parameter dimension is  
> synonymous because I see the concept of 'water column' as broader  
> than 'Oceans' - more equivalent to the GCMD 'Hydrosphere', so views  
> on this are of particular interest.  I also now realise that  
> splitting nitrate+nitrite from nitrate in the BODC PDV wasn't a  
> smart move!
>
> Cheers, Roy.
>
> <BODC_GCMD_Map.xls>
> _______________________________________________
> Seavox mailing list
> Seavox at mailman.nerc-liv.ac.uk 
> http://mailman.nerc-liv.ac.uk/mailman/listinfo/seavox 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Vocabulary Mapping Revamp.doc
Type: application/msword
Size: 52224 bytes
Desc: not available
Url : http://mailman.nerc-liv.ac.uk/pipermail/seavox/attachments/20070207/4889cfc0/attachment-0001.doc 

