[Platforms] Principle decision needed

Lowry, Roy K. rkl at bodc.ac.uk
Tue Mar 27 22:25:27 BST 2012


Hi Marilynn,

My concern with having a simple synonym approach is that using the language tag allows a mechanism for a single concept to have multiple labels, whereas with synonyms, each name is a separate concept.

This may seem a strange comment if you haven't done any work with representing ship codes in standard knowledge encodings, such as SKOS or OWL.  The issue is that each synonym is encoded in RDF/XML as a reference (such as a URL) and having multiple references for one ship code is an issue.  Some form of discriminator is required, which probably brings us back to language codes!

Your concern about flag changes can be easily addressed - you have one entry for lang=en (the lowest common denominator, LCD), plus another for each country where the ship has been registered.  Note that this would mean total decoupling between the languages carried by a code and the flag of the ship.  However, in my opinion this is good.
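
A minimal sketch of the language-tagged approach described above, assuming a hypothetical in-memory record. The class name PlatformConcept, the code "XXXX" and the ship names are all illustrative and are not taken from the platform vocabulary; the point is only that one code carries several labels, distinguished by language tag, with lang=en as the fallback.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PlatformConcept:
    """One ship code (a single concept) carrying several language-tagged labels."""

    code: str  # hypothetical platform code
    labels: Dict[str, str] = field(default_factory=dict)  # language tag -> name

    def label(self, lang: str = "en") -> str:
        # Fall back to the English (lowest common denominator) label when no
        # label exists for the requested language. Assumes "en" is present.
        return self.labels.get(lang, self.labels["en"])


# Illustrative data only: one concept, two labels, no link to the ship's flag.
ship = PlatformConcept(
    code="XXXX",
    labels={"en": "Hoeegh Example", "no": "Höegh Example"},
)
```

Under this layout a flag change adds a label under a new language tag rather than creating a new concept.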

One final point.  I'm not sure if ISO 639 is the recommended standard for XML encodings. There's one we've used, but I've forgotten its name!  Hopefully Adam can help!

Cheers, Roy.
________________________________________
From: Marilynn Sørensen [marilynn at ices.dk]
Sent: 27 March 2012 12:59
To: Sjur Ringheim Lid; Lowry, Roy K.; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: Principle decision needed

Dear Roy, Sjur and Platform group,

Full multi-lingual support would mean creating a new field to hold the "language" code. This would also mean adopting ISO 639 (2 or 3 character variant) to use as the language code identifier. The responsibility would be on the platform management group to ensure that the language is correctly identified, which may not be easy if a vessel has changed ownership/flag. This still leaves us with the issue of dealing with the lowest common denominator, and even if we include local language support the "default" name attribute would still need to be easily read/translated by users of the webservices etc., and we would therefore still need a basic name with no extended language characters.

A simpler solution could be to create a new attribute field for "Synonyms". The name would be translated into English in the "Name" field, and the original spelling in the original language would be added to "Synonyms". This would require that we agree on a rule for English translation of special characters. See http://en.wikipedia.org/wiki/Typographic_ligat for a description of translations between the most common letters. We could create and agree a simple translation table based on this.
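
A rough sketch of what such a translation table could look like in practice. Only the "ö" to "oe" rule comes from this thread; the other mappings, the function names and the "Name"/"Synonyms" record layout are assumptions for illustration, not an agreed table.

```python
# Illustrative translation table. Only "ö" -> "oe" is taken from the thread;
# every other entry is an assumption that the group would need to agree on.
TRANSLATION_TABLE = str.maketrans({
    "ö": "oe", "Ö": "Oe",
    "ø": "oe", "Ø": "Oe",
    "ä": "ae", "Ä": "Ae",
    "æ": "ae", "Æ": "Ae",
    "å": "aa", "Å": "Aa",
})


def to_basic_name(original: str) -> str:
    """Return the name with extended characters replaced per the table."""
    return original.translate(TRANSLATION_TABLE)


def make_entry(original: str) -> dict:
    """Build a hypothetical record: basic name plus the original as a synonym."""
    basic = to_basic_name(original)
    synonyms = [original] if original != basic else []
    return {"Name": basic, "Synonyms": synonyms}
```

For example, make_entry("Höegh") would hold "Hoeegh" as the name and keep "Höegh" as a synonym, while a name without extended characters gets no synonym.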

What do you think of this solution?

Kind regards,
Marilynn

-----Original Message-----
From: Lid Sjur Ringheim [mailto:sjur.ringheim.lid at imr.no]
Sent: 26 March 2012 14:01
To: Lowry, Roy K.; Marilynn Sørensen; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: SV: Principle decision needed

Dear Roy and Marilynn,

As we are one of the countries where ships get named using those characters, we would welcome the possibility very much.

Roy's proposal for multilingual storage of ship names would actually be the very best option, as it would make it possible to register both the original name and an English-friendly name (possibly others?) for users not familiar with the letters.

Cheers,
Sjur
________________________________________
From: platforms-bounces at mailman.nerc-liv.ac.uk [platforms-bounces at mailman.nerc-liv.ac.uk] on behalf of Lowry, Roy K. [rkl at bodc.ac.uk]
Sent: 23 March 2012 22:44
To: Marilynn Sørensen; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: Re: [Platforms] Principle decision needed

Hi Marilynn,

Whilst we have been able to store full Latin-1 characters for a long time (since Oracle introduced Unicode support), I prefer to avoid them for two reasons:

1) The characters are extremely difficult to type from the keyboards I use - there may be keyboard shortcuts, but I don't know them so I usually end up opening Word, inserting symbol and then copying and pasting the character to wherever I need it.

2) There are issues that we've hit several times with character encoding mismatches causing web applications to render the Latin-1 characters incorrectly - they usually end up as square boxes. I don't know the technical details - all I know is that I have had to submit multiple bug reports and have experienced fixes for Java applications causing Perl applications to break and vice versa.
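
One common way this kind of mismatch arises is text being written in one encoding and read back in another. The sketch below illustrates that mechanism only; it is an assumption about the general cause, not a diagnosis of the specific bugs described above.

```python
name = "Sørensen"

# Case 1: bytes written as UTF-8 but read back as Latin-1.
# "ø" is two bytes in UTF-8 (0xC3 0xB8); Latin-1 shows each byte separately.
utf8_bytes = name.encode("utf-8")
misread_as_latin1 = utf8_bytes.decode("latin-1")

# Case 2: bytes written as Latin-1 but read back as UTF-8.
# The single byte 0xF8 is not valid UTF-8, so a lenient decoder substitutes
# the replacement character U+FFFD, which many fonts draw as a box.
latin1_bytes = name.encode("latin-1")
misread_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# Reading with the encoding that was used for writing round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")
```

If one application in a chain assumes Latin-1 and another assumes UTF-8, fixing either side alone can break the other, which would be consistent with fixes for one application breaking another.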

One way around the problem might be to introduce multilingual storage of ship names (obviously tagged with the appropriate languages) combined with the sort of technology Google uses to allow Latin-1 and various equivalents to be discovered interoperably.  That way we could cover all bases.  We could even use that to go beyond Latin-1 into full multilingual support. What do people think about that?
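
A minimal sketch of how a name and its equivalents could be made discoverable together, assuming a simple lookup index keyed on several folded spellings of each name. This is not a description of Google's method; the function names, the digraph table and the code "XXXX" are hypothetical.

```python
import unicodedata

# Assumed equivalents; only "ö" -> "oe" is taken from the thread.
DIGRAPHS = {"ö": "oe", "ø": "oe", "ä": "ae", "æ": "ae", "å": "aa"}
# Letters that Unicode normalization does not decompose into base + accent.
BASE_LETTERS = {"ø": "o", "æ": "ae"}


def strip_marks(text: str) -> str:
    """Drop diacritics, e.g. 'höegh' -> 'hoegh'."""
    text = "".join(BASE_LETTERS.get(ch, ch) for ch in text)
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_keys(name: str) -> set:
    """Return the folded spellings under which a name should be findable."""
    folded = unicodedata.normalize("NFC", name).casefold()
    digraph = "".join(DIGRAPHS.get(ch, ch) for ch in folded)
    return {folded, digraph, strip_marks(folded)}


def build_index(names_by_code: dict) -> dict:
    """Map every search key of every stored name to the codes that carry it."""
    index = {}
    for code, names in names_by_code.items():
        for name in names:
            for key in search_keys(name):
                index.setdefault(key, set()).add(code)
    return index


def find(index: dict, query: str) -> set:
    """Look a query up under all of its own search keys."""
    hits = set()
    for key in search_keys(query):
        hits |= index.get(key, set())
    return hits


index = build_index({"XXXX": ["Höegh Example"]})
```

With this index, queries typed as "Höegh Example", "Hoeegh Example" or "Hoegh Example" all resolve to the same code, so users without the extended characters on their keyboards can still find the ship.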

Cheers, Roy.
________________________________
From: platforms-bounces at mailman.nerc-liv.ac.uk [platforms-bounces at mailman.nerc-liv.ac.uk] On Behalf Of Marilynn Sørensen [marilynn at ices.dk]
Sent: 23 March 2012 18:45
To: platforms at mailman.nerc-liv.ac.uk
Subject: [Platforms] Principle decision needed


Dear Platform Group,

You have seen the request for standardization of "ö" to "oe" so all vessels with "Höegh" become "Hoeegh".

There is an alternative which the platform group needs to discuss.

a) Is it time to implement extended characters and allow "ö", "ø", "ä" etc? Both ICES and NOAA can handle this change.

b) If yes, how should this be implemented?

   * Update all old ship names

   * Make a "point in time" change

Please send your views to me as soon as possible and by 23 April at the latest. We need answers from all members of the platform group. If other groups should be contacted for their views, please let us know.

Kind regards,

Marilynn

----------------------------------------------------------

Marilynn Sørensen

Data Manager

International Council for the Exploration of the Sea

H.C. Andersens Boulevard 44-46, 1553 Copenhagen V.

Denmark

marilynn.sorensen at ices.dk

Direct tel: +45 33 38 67 20
