Localisation
Goals
The main goal of the Next Generation Localisation (LOC) area is to build a standards- and web-based Next Generation Localisation Factory by embedding internationalisation and localisation into the full production process of multilingual digital content, largely eliminating the need for human intervention. LOC1 addresses the need for standards and guidelines for a localisation knowledge container; LOC2 is developing performance measurements and evaluation frameworks for the building blocks of an integrated localisation solution; and LOC3 focuses on the discovery, development and implementation of automated workflows in a Next Generation Localisation framework.
Current mainstream localisation scenarios are largely based on static processes characterised by pre-defined workflows. The tools and technologies employed in these processes are based on closed standards and often lack even basic interoperability. Metadata, the building block of universal localisation knowledge, is often locked up in proprietary technology silos from which it cannot be extracted. Standardised localisation knowledge containers that could make the round trip from content creation to localisation and back into the content creation process are not sufficiently supported by the localisation tools and technology frameworks available today. As a result, large multinational digital content publishers have developed their own highly proprietary environments, relying on internal “standards” and proprietary technologies. For them, joining open-standards-based technology and process automation development efforts leading to open, generic solutions would be the ideal and preferred choice. However, they will require proof that the vision of such an open and generic solution can be realised before they join industry-wide initiatives to standardise and connect localisation technologies in automated workflows. Smaller publishers do not have the resources to develop their own proprietary solutions and rely on the third-party technology market for the provision of adequate technologies. While individual tools and technologies are available and affordable, the sophisticated and expensive localisation process and automation technologies available today are out of reach for most small and medium-sized content publishers.
Methodology
To convince large multinational content publishers to join open-standards-based industry-wide initiatives, and small and medium-sized publishers to invest in state-of-the-art technologies, a solution is required that is scalable, modularised, interoperable and affordable. What is needed is a demonstrator system capable of delivering the proof that the vision of an open localisation platform can be achieved. The risks involved in building such a system are considerable. Leading global management systems have been developed by companies such as Idiom and GlobalSight (Ambassador). However, although they aimed to be comprehensive, they were not; for example, some services such as MT never became part of the core offering of these systems. While they attracted significant investment, more than US$50 million in some cases, they never reached their projected market potential. Although they demonstrate a good understanding of the basic technologies required for a “Localisation Factory”, significant research is still necessary to improve their overall architecture in order to provide a modularised and extensible framework, to enable seamless data flows, and to allow for the automatic configuration and execution of tasks. Given the enormity of this undertaking, even a project of the size and scale of the CNGL would not be in a position to tackle this effort starting from a greenfield scenario. Therefore, all LOC research areas will integrate and build on available research results and technologies wherever possible. The aim is the development and deployment of a localisation technology platform similar to that of Moses in MT or Festival in speech synthesis.
In LOC, research concentrates on the improvement of key areas of localisation automation, such as the construction of a data model to build, process and maintain localisation knowledge (LOC1), the evaluation and selection of suitable tools and technologies (LOC2) and the modelling of intelligent localisation processes (LOC3).
The availability of an industry-scale demonstrator system is a prerequisite for advancing this research and for measuring its success. In 2008, Welocalize bought Transware, the owners of GlobalSight, one of the industry-standard global management systems combining workflow definition and automatic execution with access to localisation industry standard applications. GlobalSight represents an investment of more than US$50m and consists of 1.5 million lines of code. In early 2009, Welocalize decided to move GlobalSight into the open source domain, thus making available and accessible, for the first time, an industrial-scale, “heavy-lifting”, highly sophisticated base system for the construction of an open localisation platform, comparable to that provided by Moses for MT and Festival for speech processing.
This development represents a seismic shift in our ability to remove a main barrier to our research efforts, i.e. the lack of an industry-scale test bed and framework for the deployment of component technologies developed within LOC and the other CNGL research areas. We can now concentrate on the individual scientific research tasks in LOC and the CNGL as a whole, and plan to integrate these in a platform and an environment that meets most of our requirements for a large-scale testing, demonstrator and deployment framework. This does not eliminate all system framework and integration efforts, but these are quite manageable in comparison to the effort necessary to build a new framework from the ground up.
Industry Engagement
LOC has collaborated closely with its main industrial partners, especially Symantec, VistaTEC, and Microsoft. Additional collaboration with international partners from Translators Without Borders also provided valuable input. Following the open sourcing of GlobalSight and the establishment of The Rosetta Foundation as a spin-off from the University of Limerick and CNGL, LOC also collaborated closely with The Rosetta Foundation and Welocalize. Engagement with industrial partners took place through site visits and focused one-to-one meetings between them and LOC researchers.
LOC supports the development of an open localisation platform that will serve as a test bed for the research in the different work packages. In addition, it will provide large multinational publishers with a solid case study for the viability of open standards for the negotiation of localisation data and localisation knowledge, thus providing them with the arguments necessary for a migration from an enclosed, proprietary localisation scenario to a more open, interconnected and interoperable framework. This platform will also encourage the uptake of localisation and process automation solutions by small and medium-sized enterprises, create new business opportunities and support the upscaling of localisation offerings by smaller firms. More than 20 companies have so far joined the Dynamic Coalition for a Global Localisation Platform: Localization4all, initiated by LOC and The Rosetta Foundation.
The Coalition will organise a workshop at the fifth annual IGF Meeting in Vilnius, Lithuania, on 14-17 September 2010. We expect the platform to generate increased activity in sectors of the localisation industry (some first indicators show that growth by a factor of 100, in certain sectors, is not out of reach). Subsequently, we expect employment to rise in these sectors, driven by growth in translation and localisation as well as in the technical support and development area. Among these will be a significant number of positions to be created by The Rosetta Foundation within the next two years.


