Integration of an existing rule-based open-source machine translation platform with efficient corpus-based machine translation modules and tools

Integration of an existing rule-based open-source machine translation platform with efficient corpus-based machine translation modules and tools is a 1-year project (June 2009 - June 2010), an SFI E.T.S. Walton Fellowship, with one academic partner (DCU).

In 2004 Prof. Mikel Forcada coordinated a group that started developing Apertium, a free/open-source machine translation (MT) platform designed to build rule-based machine translation systems for any language pair. Apertium is used by companies in Spain to offer MT services to institutional and private customers, and it also provides an environment for open, radically reproducible, and transferable research. Researchers in the MT group of the NCLT and the CNGL, led by Prof. Andy Way, have pursued corpus-based approaches to MT, culminating in MaTrEx, a modular, maintainable and efficient data-driven machine translation system that combines example-based and statistical machine translation.

The goal of this project is to release MaTrEx technology (as well as EBMT technologies from other groups) as free/open-source software and to integrate it with Apertium, so that hybrid rule-based and corpus-based systems can be built from linguistic data and available bilingual text corpora. This should both ease future research in the field and provide new solutions to the user community, and to the language industry in particular.

Please contact

for further information on this project.
