[Platforms] Principle decision needed
Neil Holdsworth
NeilH at ices.dk
Thu Mar 29 10:16:04 BST 2012
Hi Sjur,
Indeed, we would be able to programmatically identify the non-English name, which would be wise for validation. However, I think that if we are to do this, we should positively identify the language that has been used in the synonym. I think we all agree so far that we:
1) keep English as the default
2) set some rules for converting non-English alphabet characters
3) allow a single-instance 'other language' synonym associated with the correct language code
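
The three points above could be captured in a small record type. This is only a minimal sketch: the class names, field names and the example ship name are hypothetical, and the language code scheme is left open, as it is in the discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Synonym:
    name: str      # original spelling, may contain non-English letters
    language: str  # language code such as "no" (code scheme is an assumption)


@dataclass(frozen=True)
class PlatformName:
    default_name: str                  # point 1: English is the default
    synonym: Optional[Synonym] = None  # point 3: at most one other-language synonym

    def __post_init__(self) -> None:
        # Point 2 implies the default name has already been converted,
        # so it should contain basic (ASCII) characters only.
        if not self.default_name.isascii():
            raise ValueError("default name must use the English alphabet only")
        if self.synonym is not None and not self.synonym.language:
            raise ValueError("synonym must carry a language code")


# Hypothetical ship name, for illustration only.
ship = PlatformName("Havoerna", Synonym("Havørna", "no"))
```

Rejecting a non-ASCII default name at construction time is one way to make the validation mentioned above automatic.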
Neil
-----Original Message-----
From: Lid Sjur Ringheim [mailto:sjur.ringheim.lid at imr.no]
Sent: 29 March 2012 11:06
To: Neil Holdsworth; Lowry, Roy K.; Marilynn Sørensen; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl; Donald.Collins at noaa.gov
Subject: RE: [Platforms] Principle decision needed
Hi Neil and Roy,
Are there any scenarios where a ship would have more than two names registered? Usually the only reason to register a second name, as I see it, would be that the original name contains characters not in the English alphabet. If that is the case, it would be quite easy to pick out the right name to present when you need lang=en, as it would be the one without strange characters.
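
As a rough sketch of that heuristic, assuming "strange characters" means anything outside ASCII, the selection could look like this. The function name and the example names are hypothetical.

```python
from typing import Optional, Sequence


def pick_english_name(names: Sequence[str]) -> Optional[str]:
    """Return the first name made up only of ASCII characters, or None."""
    for name in names:
        if name.isascii():
            return name
    return None


# Hypothetical example: the converted spelling is picked for lang=en.
chosen = pick_english_name(["Havørna", "Havoerna"])
```

The heuristic returns None when every registered name contains extended characters, so a caller would still need a fallback in that case.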
Another way this could be solved is to make one of the names the default name.
We are currently updating our internal systems to allow multilingual synonyms. We get around the problem of choosing the right synonym to present by assigning one of them as the default synonym.
As I mentioned earlier, we would welcome the possibility of registering names containing non-English alphabet characters, as some of the ships we handle do have names with characters from the Norwegian alphabet.
We should also agree on a standard way to represent the characters when translating them to English, as Marilynn stated earlier.
Sjur
________________________________________
From: Neil Holdsworth [NeilH at ices.dk]
Sent: 29 March 2012 10:30
To: Lowry, Roy K.; Marilynn Sørensen; Lid Sjur Ringheim; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl; Donald.Collins at noaa.gov
Subject: RE: [Platforms] Principle decision needed
Hi Roy,
I think the first assumption, about the default language, is safe enough. I'm not quite 100% sure about the 2nd assumption, but having scanned the ship names and their respective country flags I haven't seen any that break that rule.
However, it is a decision that has some implications, so I am still waiting for some agreement/disagreement from others in the platform group.
Neil
-----Original Message-----
From: Lowry, Roy K. [mailto:rkl at bodc.ac.uk]
Sent: 29 March 2012 00:54
To: Neil Holdsworth; Marilynn Sørensen; Sjur Ringheim Lid; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: RE: [Platforms] Principle decision needed
Hi Neil,
I'm only too well aware of the need to consider resource implications. Should the language encoding approach not get the support, I feel that it's important to establish a convention so we know which of the synonyms is the lowest common denominator (which I would assume to be lang=en in any application generating SKOS or OWL). I would probably also assume that any synonym found containing extended characters was in the language of the ship's flag.
If everybody would be happy with this kind of assumption model then I guess we could go for the easy option.
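
A minimal sketch of this assumption model follows. The flag-to-language table is a hypothetical, illustrative fragment and not an agreed list; "und" is the ISO 639-2 code for an undetermined language.

```python
# Hypothetical mapping from flag country code to language code (illustrative only).
FLAG_LANGUAGE = {"NO": "no", "DK": "da", "DE": "de"}


def assumed_language(name: str, flag: str) -> str:
    """Guess the language tag of a ship name under the two assumptions above."""
    if name.isascii():
        # Assumption 1: a name without extended characters is the lang=en entry.
        return "en"
    # Assumption 2: a name with extended characters is in the language of the flag.
    return FLAG_LANGUAGE.get(flag, "und")
```

The second assumption is the weaker one: it gives a wrong tag whenever a vessel keeps its original name after a change of flag.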
Cheers, Roy.
________________________________________
From: Neil Holdsworth [NeilH at ices.dk]
Sent: 28 March 2012 09:06
To: Lowry, Roy K.; Marilynn Sørensen; Sjur Ringheim Lid; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: RE: [Platforms] Principle decision needed
Hi Roy,
I agree that the language encoding is the most thorough solution. We suggested the synonym approach on the basis that it was quite easy to implement and would take few resources.
If we are to go for the language encoding approach then I think we need a bit more positive agreement from the platform group - is anyone able to speak up?
Best, Neil
-----Original Message-----
From: platforms-bounces at mailman.nerc-liv.ac.uk [mailto:platforms-bounces at mailman.nerc-liv.ac.uk] On Behalf Of Lowry, Roy K.
Sent: 27 March 2012 23:25
To: Marilynn Sørensen; Sjur Ringheim Lid; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: Re: [Platforms] Principle decision needed
Hi Marilynn,
My concern with having a simple synonym approach is that using the language tag allows a mechanism for a single concept to have multiple labels, whereas with synonyms, each name is a separate concept.
This may seem a strange comment if you haven't done any work with representing ship codes in standard knowledge encodings, such as SKOS or OWL. The issue is that each synonym is encoded in RDF XML as a reference (such as a URL) and having multiple references for one ship code is an issue. Some form of discriminator is required, which probably brings us back to language codes!
Your concern about flag changes can be easily addressed - you have one entry for lang=en (the LCD), plus another for each country where the ship has been registered. Note that this would mean total decoupling between the languages carried by a code and the flag of the ship. However, in my opinion this is good.
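
To illustrate the point about one concept carrying several language-tagged labels, here is a minimal sketch that writes a single SKOS concept in Turtle using plain string formatting. The concept URI and ship name are hypothetical, and using skos:prefLabel for every language is an assumption (SKOS allows at most one preferred label per language tag).

```python
from typing import Mapping

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"


def concept_to_turtle(uri: str, labels: Mapping[str, str]) -> str:
    """Render one skos:Concept with one skos:prefLabel per language tag."""
    lines = [f"<{uri}> a skos:Concept"]
    for lang, name in sorted(labels.items()):
        # Escape backslashes and double quotes for a Turtle string literal.
        escaped = name.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'    skos:prefLabel "{escaped}"@{lang}')
    return SKOS_PREFIX + " ;\n".join(lines) + " .\n"


# Hypothetical URI and name: one ship code, two labels.
turtle = concept_to_turtle(
    "http://example.org/platform/XXXX",
    {"en": "Havoerna", "no": "Havørna"},
)
```

The language tag is the discriminator here: both labels hang off the same URI, so no second reference for the ship code is needed.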
One final point. I'm not sure if ISO 639 is the recommended standard for XML encodings. There's one we've used, but I've forgotten its name! Hopefully Adam can help!
Cheers, Roy.
________________________________________
From: Marilynn Sørensen [marilynn at ices.dk]
Sent: 27 March 2012 12:59
To: Sjur Ringheim Lid; Lowry, Roy K.; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: Principle decision needed
Dear Roy, Sjur and Platform group,
Full multi-lingual support would mean creating a new field to hold the "language" code. This would also mean adopting ISO 639 (2 or 3 character variant) to use as the language code identifier. The responsibility would be on the platform management group to ensure that the language is correctly identified, which may not be easy if a vessel has changed ownership/flag. This still leaves us with the issue of dealing with the lowest common denominator, and even if we include local language support the "default" name attribute would still need to be easily read/translated by users of the webservices etc., and we would therefore still need a basic name with no extended language characters.
A simpler solution could be to create a new attribute field for "Synonyms". The name would be translated into English in the "Name" field and the original spelling in the original language would be added to the "Synonyms". This would require that we agree on a rule for English translation of special characters. See http://en.wikipedia.org/wiki/Typographic_ligat for a description of translations between the most common letters. We could create and agree a simple translation table based on this.
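
Such a translation table could be as small as the sketch below. Only the "ö" to "oe" rule comes from this thread; every other entry is an illustrative assumption that the group would have to agree on.

```python
# Illustrative table; only "ö" -> "oe" is taken from the discussion.
TRANSLATION_TABLE = str.maketrans({
    "ö": "oe", "Ö": "Oe",
    "ø": "oe", "Ø": "Oe",
    "ä": "ae", "Ä": "Ae",
    "æ": "ae", "Æ": "Ae",
    "å": "aa", "Å": "Aa",
    "ü": "ue", "Ü": "Ue",
})


def to_english_name(original: str) -> str:
    """Convert a name to the spelling stored in the "Name" field."""
    return original.translate(TRANSLATION_TABLE)


name_field = to_english_name("Höegh")  # original spelling goes to "Synonyms"
```

Characters missing from the table pass through unchanged, so a validation step would still be needed to catch names that remain non-ASCII.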
What do you think of this solution?
Kind regards,
Marilynn
-----Original Message-----
From: Lid Sjur Ringheim [mailto:sjur.ringheim.lid at imr.no]
Sent: 26 March 2012 14:01
To: Lowry, Roy K.; Marilynn Sørensen; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: RE: Principle decision needed
Dear Roy and Marilynn,
As we are one of the countries where ships are named using those characters, we would welcome the possibility very much.
The proposal Roy makes about multilingual storage of ship names would actually be the very best, as it would make it possible to register the original name and an English-friendly name (possibly others?) for users not familiar with the letters.
Cheers,
Sjur
________________________________________
From: platforms-bounces at mailman.nerc-liv.ac.uk [platforms-bounces at mailman.nerc-liv.ac.uk] on behalf of Lowry, Roy K. [rkl at bodc.ac.uk]
Sent: 23 March 2012 22:44
To: Marilynn Sørensen; platforms at mailman.nerc-liv.ac.uk
Cc: dick at maris.nl
Subject: Re: [Platforms] Principle decision needed
Hi Marilynn,
Whilst we have been able to store full Latin-1 characters for a long time (since Oracle introduced Unicode support), I prefer to avoid them for two reasons:
1) The characters are extremely difficult to type from the keyboards I use - there may be keyboard shortcuts, but I don't know them so I usually end up opening Word, inserting symbol and then copying and pasting the character to wherever I need it.
2) There are issues that we've hit several times with character encoding mismatches causing web applications to render the Latin-1 characters incorrectly - they usually end up as square boxes. I don't know the technical details - all I know is that I have had to submit multiple bug reports and have experienced fixes for Java applications causing Perl applications to break and vice versa.
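
One common way such a mismatch arises, shown here only as an assumed illustration and not as a diagnosis of the actual bugs, is that text is written in one encoding and read back in another:

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text in one encoding and decode the bytes in another."""
    return text.encode(written_as).decode(read_as, errors="replace")


# UTF-8 bytes read as Latin-1: each extended letter turns into two characters.
as_latin1 = misread("Höegh", "utf-8", "latin-1")

# Latin-1 bytes read as UTF-8: the byte is invalid and becomes U+FFFD,
# the replacement character that many fonts draw as a box or question mark.
as_utf8 = misread("Höegh", "latin-1", "utf-8")
```

A fix applied on only one side of such a pair simply moves the mismatch, which is consistent with a repair in one application breaking another.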
One way around the problem might be to introduce multilingual storage of ship names (obviously tagged with the appropriate languages) combined with the sort of technology Google uses to allow Latin-1 characters and their various equivalents to be discovered interoperably. That way we could cover all bases. We could even use that to go beyond Latin-1 into full multilingual support. What do people think about that?
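
As a sketch of what discovering equivalents could mean in practice, each name can be indexed under several folded search keys. This is an assumed approach for illustration, not a description of what Google does, and both mapping tables are hypothetical.

```python
import unicodedata

# Letters with no Unicode decomposition need an explicit base-letter mapping.
_BASE = str.maketrans({"ø": "o", "æ": "ae"})
# Two-letter conversions, as in the "ö" -> "oe" convention.
_DIGRAPHS = str.maketrans(
    {"ö": "oe", "ø": "oe", "ä": "ae", "æ": "ae", "å": "aa", "ü": "ue"}
)


def _strip_marks(text: str) -> str:
    """Decompose characters and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_keys(name: str) -> set:
    """Return the accent-stripped and digraph-expanded keys for a name."""
    folded = unicodedata.normalize("NFC", name).casefold()
    return {
        _strip_marks(folded.translate(_BASE)),
        _strip_marks(folded.translate(_DIGRAPHS)),
    }


def matches(query: str, name: str) -> bool:
    """True if the query and the stored name share at least one key."""
    return bool(search_keys(query) & search_keys(name))
```

With this, a search for either "Hoegh" or "Hoeegh" would find a ship stored as "Höegh", whichever keyboard the user has.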
Cheers, Roy.
________________________________
From: platforms-bounces at mailman.nerc-liv.ac.uk [platforms-bounces at mailman.nerc-liv.ac.uk] On Behalf Of Marilynn Sørensen [marilynn at ices.dk]
Sent: 23 March 2012 18:45
To: platforms at mailman.nerc-liv.ac.uk
Subject: [Platforms] Principle decision needed
Dear Platform Group,
You have seen the request for standardization of "ö" to "oe" so that all vessels with "Höegh" become "Hoeegh".
There is an alternative which the platform group needs to discuss.
a) Is it time to implement extended characters and allow "ö", "ø", "ä" etc? Both ICES and NOAA can handle this change.
b) If yes, how should this be implemented?
* Update all old ship names
* Make a "point in time" change
Please send your views to me as soon as possible and by 23 April at the latest. We need answers from all members of the platform group. If other groups should be contacted for their views, please let us know.
Kind regards,
Marilynn
----------------------------------------------------------
Marilynn Sørensen
Data Manager
International Council for the Exploration of the Sea
H.C. Andersens Boulevard 44-46, 1553 Copenhagen V.
Denmark
marilynn.sorensen at ices.dk
Direct tel: +45 33 38 67 20
--
This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system.
********************************************************************************
This mail has been scanned by http://www.comendo.com and contains no viruses!
********************************************************************************
_______________________________________________
Platforms mailing list
Platforms at mailman.nerc-liv.ac.uk
http://mailman.nerc-liv.ac.uk/mailman/listinfo/platforms