Communication between the creators of content destined for the web and those contracted to translate it for multilingual web sites is far from smooth today. The wide variety of source content formats, content editors and content management systems makes integration with modern translation platforms costly. In addition, the potential to support this process with language technology such as machine translation is being hampered by a lack of system interoperability.
CNGL to the fore in Multilingual Web Standards Development
Several CNGL partners are playing a leadership role in the MultiLingualWeb – Language Technology Working Group at the W3C, which aims to develop standard means to simplify the creation and translation of Web content in the world’s languages. This has resulted in a revision to the specification of the Internationalization Tag Set (ITS2.0).
The Benefits of ITS2.0
ITS2.0 allows text in XML and HTML documents to be annotated with standardised meta-data, so that different content and translation processing tools can deal seamlessly with different hand-off scenarios. For example, ITS meta-data can express: whether text should be translated or not; how text analysis can be used to aid terminology management; the quality of a machine translation; who or what contributed to a translation; and translation quality data.
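As an illustration of the first two kinds of meta-data, the following minimal sketch reads local ITS markup from a small XML document using Python's standard library. The sample document, its element names and the `collect_segments` function are hypothetical and exist only for this example. The sketch handles only the local `its:translate` and `its:term` attributes and assumes the default that element content is translatable; it ignores global `its:rules` and all other data categories, so it is not a conformant ITS2.0 processor.

```python
import xml.etree.ElementTree as ET

# Namespace used by ITS markup in XML documents.
ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE = f"{{{ITS_NS}}}translate"
TERM = f"{{{ITS_NS}}}term"

# Hypothetical sample document, invented for this sketch.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <title>Installing the software</title>
  <para>Run <code its:translate="no">setup.exe</code> to start the
  <span its:term="yes">wizard</span>.</para>
</doc>"""


def collect_segments(elem, inherited=True, out=None):
    """Return (text, translatable) pairs for every text run under elem.

    A local its:translate attribute overrides the inherited value and is
    passed down to child elements. Without any markup, text is assumed
    to be translatable.
    """
    if out is None:
        out = []
    flag = elem.get(TRANSLATE)
    translatable = inherited if flag is None else (flag == "yes")
    if elem.text and elem.text.strip():
        out.append((elem.text.strip(), translatable))
    for child in elem:
        collect_segments(child, translatable, out)
        # Text after a child's closing tag belongs to the current element.
        if child.tail and child.tail.strip():
            out.append((child.tail.strip(), translatable))
    return out


root = ET.fromstring(SAMPLE)
segments = collect_segments(root)
# Elements flagged locally as terminology candidates.
terms = [e.text for e in root.iter() if e.get(TERM) == "yes"]

for text, translatable in segments:
    print("translate" if translatable else "protect  ", repr(text))
print("terms:", terms)
```

A translation tool could use the flags to lock `setup.exe` against editing while passing the remaining text to translators, and feed the flagged term to a terminology database.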
Current Status of ITS2.0
The technical specification of ITS2.0 is largely complete, featuring 19 distinct categories of meta-data and a comprehensive conformance test suite. Fifteen organisations have already provided implementations, with over 900 successful feature conformance tests completed to date.
Integrating ITS2.0 with other Interoperability Standards
ITS will only reach its full potential when used in conjunction with other relevant interoperability standards. To this end, David Filip and Dave Lewis of CNGL organised FEISGILTT'13, the second Federated Event for Interoperability Standardization in Globalization, Internationalization, Localization and Translation Technologies. This was co-located with Localization World, the language services industry's premier trade show, in London on 12-13 June 2013.
FEISGILTT'13 brought together localisation standardisation experts from the W3C, OASIS and ETSI, as well as other consortia such as the Unicode Consortium, Linport and Interoperability Now. With 27 presentations over three tracks covering practical implementation experience and harmonisation topics, this event provided a unique venue for addressing the major interoperability issues faced by the language services industry. Particular hot topics were the mapping between ITS2.0 and XLIFF, and the strategies this industry needs to employ to promote the benefits of standardised interoperability to the content management community.
Next Steps for ITS2.0
Professor David Lewis, Co-Chair of the MultilingualWeb-LT Working Group and leader of CNGL's Interoperability & Analytics theme, outlines the next steps for Internationalization Tag Set 2.0:
"Two follow-on EU-funded projects are starting, where CNGL partners TCD and DCU are in receipt of over €800k to advance the integration of ITS in the multilingual web and linked data. LIDER will develop a research roadmap around the use of linked data in multilingual text and media analytics, while FALCON will demonstrate the interoperability benefits of linked data integrated into the localisation tool chain", says Professor Lewis.
MLW-LT working group: http://www.w3.org/International/multilingualweb/lt/
OASIS XLIFF technical committee: https://www.oasis-open.org/committees/xliff/