<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

<meta name="Generator" content="Microsoft Exchange Server">

<!-- converted from rtf -->

<style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>

</head>

<body>

<font face="Calibri, sans-serif" size="2">

<div>Dear All,</div>

<div>&nbsp;</div>

<div>I propose basing language labelling in the NETMAR semantic resources on ISO 639-1, a collection of two-letter codes representing 136 languages.&nbsp; As far as I can ascertain, it is the most widely followed part of the standard, although the Dublin Core documentation cites ISO 639-3 (three-letter codes covering many more languages). GEMET largely follows ISO 639-1, although it adds extensions to differentiate between UK and US English and to specify a Chinese dialect.&nbsp; ISO 639-3 does not differentiate between US and UK English.</div>

<div>&nbsp;</div>

<div>I have two questions:</div>

<div>&nbsp;</div>

<ol style="margin-top: 0pt; margin-bottom: 0pt; margin-left: 36pt; ">

<li>Does anybody disagree with the choice of ISO 639-1 for NETMAR?</li>
<li>Do we need to differentiate between UK and US English?&nbsp; I&#8217;m thinking particularly of the ICAN semantic requirements here. If so, should we follow GEMET, which codes UK English as &#8216;en&#8217; and US English as &#8216;en-US&#8217;, as an extension of ISO 639-1?</li></ol>

<div>&nbsp;</div>

<div>Cheers, Roy.</div>

<div>&nbsp;</div>

</font>

</body>

</html>