Current News

DCU's participation in WMT 2010

This year, the shared translation task of the ACL 2010 Joint Fifth Workshop on Statistical Machine Translation (WMT 2010) involved 8 European language pairs (English to French/German/Spanish/Czech and vice versa). DCU took part in the English to Spanish and English to Czech translation tasks.

Best System Matrix

The team comprised Sergio, Rejwanul, Sandipan, Pratyush, Ankit, Mikel Forcada, Pavel Pecina, Antonio Toral, Jinhua and the coordinator, Sudip Naskar.

In English-Spanish, DCU submitted five system outputs, which ranked 1st, 2nd, 4th, 5th and 6th. In English-Czech, DCU submitted a single system output, which placed 7th overall.

Brian O'Donovan giving talk at UL

Brian O'Donovan (IBM) is giving a talk on the topic "Confronting the Crisis: Can Technology Save Us?" as part of the series of conversations hosted by the Institute for the Study of Knowledge in Society (ISKS). The talk is scheduled for 4-6pm on Tuesday 9th March and will be held in the East Room of Plassey House, University of Limerick.

CNGL Researcher secures funding for work on 1641 Depositions

A CNGL researcher, Dr. Seamus Lawless, together with humanities researchers from TCD and the University of Aberdeen, has secured almost £334,000 in funding under the Arts and Humanities Research Council (AHRC - British Arts Council) Digital Equipment and Digital Enhancement for Impact scheme. The funding will help devise new techniques to analyse a rare manuscript collection, the 1641 Depositions, held by Trinity College Dublin. The project builds on an earlier £1 million project, a collaboration between Dr. Lawless and Prof. Vinny Wade in Trinity College Dublin and the universities of Aberdeen and Cambridge, which led to the recent digitisation of the archive (http://www.tcd.ie/history/1641).

1641 Depositions

The 1641 Depositions are witness testimonies, mainly by Protestants but also by some Catholics, describing their experience of the 1641 Rebellion - one of the most violent chapters of Irish history.

This AHRC funding will allow the researchers to interrogate the database for a variety of information including the development of the English language in Ireland and the settlers' lifestyle there in the 1640s, the language of atrocity appearing in the witness testimony and the reliability of the evidence in the depositions.

Researchers will work closely with IBM in Dublin, one of the world's leading technology companies, and use its LanguageWare© technology to analyse the depositions and to cross-correlate an array of features of the text - a process which would be too complicated and potentially take a lifetime for a scholar to undertake manually.

Dr Barbara Fennell, Senior Lecturer in Language and Linguistics at the University of Aberdeen, who will lead the project, said: "This body of material is unparalleled anywhere in early modern Europe, and provides a unique source of information on the 1641 rebellion."

The year-long project will bring together linguists, historians, digital humanities experts, geographers and computer scientists to create a new interactive research environment.

Dr. Lawless will work with the Department of History at Trinity College Dublin, researchers from the University of Aberdeen, the Digital Humanities Observatory, Dublin and the IBM LanguageWare© Group, Dublin to gather and evaluate their findings.

Coverage in the press today:
BBC News - 1641 massacre accounts examined
The Washington Post - Experts explore 1641 Irish slayings of Protestants

Report from ILT 1.9 meeting with UEA Norwich

ILT 1.9 are focussing on providing translation support for patients with limited English when they go to book an appointment at the GP's surgery. We are targeting Deaf users of Irish Sign Language (ISL) and also Bangla speakers. To this end we have developed a corpus based on patient-receptionist dialogues, and have recently finished filming an ISL user signing the corpus phrases. The corpus is also being translated into Bangla. We will then transcribe the ISL corpus using HamNoSys, a transcription system for sign languages. The corpus will be used in a corpus-based MT system with an avatar simulating the ISL output.

We want to use SiGML to represent the HamNoSys transcription: SiGML provides an interface between the transcription and the animation program that drives the avatar. A particular area of research interest is incorporating the non-manual features (NMFs) of ISL into the avatar animation. NMFs are used in sign language to convey intonation and emotion, amongst other things.

We recently visited the "virtual humans" team at UEA Norwich, where SiGML was developed. Reuse of their research results for our project will make an interesting collaboration.

After a very early start (5.30), we arrived at UEA and were treated to a nice lunch by our counterparts. The UEA team have completed an impressive number of projects involving avatars to help Deaf people communicate. They are currently involved in two EU projects: Dictasign and Signspeak.

Four views of Visio

They very kindly offered to collaborate with our team and to let us use their software. This will facilitate our task of building a corpus in HamNoSys transcription, as we had hoped.

On a personal note, further collaboration will allow me to build more NMFs into their system, which will support my PhD research.

After a long drive and a delayed flight home we arrived back in Dublin at midnight. A long but fruitful day.

Friedel Wolff to visit UL

Friedel Wolff will be visiting the University of Limerick and meeting with the Localisation Research Centre at CSIS on Wednesday, 11 April. As part of his visit, he will give a talk on the following topics:

ANLoc – Unlocking technology in Africa

African languages are mostly absent from the Internet and other information and communication technologies. Whereas technology can help solve many problems, current computer systems often fail to support African languages, and this complicates technology adoption. The African Network for Localisation works to improve this. The individuals and organisations in the network address several aspects, such as fonts, keyboards, locales, localisation tools, spell checkers, terminology development, and several activities to promote the goals of the project.

More information: http://www.africanlocalisation.net/

Open Source platforms for CAT, crowd-sourcing and localisation research

Translate.org.za develops localisation tools for their own needs in South Africa, as well as for the requirements of the wider localisation landscape, specifically in the world of Free and Open Source software. Pootle and the Translate Toolkit have been assisting the localisation teams of Mozilla Firefox, OpenOffice.org, Debian, and many others.

More information: http://translate.sourceforge.net/

About the Speaker

Friedel Wolff is a software developer, localiser and language technologist working for Translate.org.za. He holds a master's degree in computer science and is involved in the localisation of several key pieces of Free and Open Source software, such as Firefox, OpenOffice.org, GNOME, and spell checkers for South African languages. The CAT tools developed by him and his team are used by many in the world of Free and Open Source software and elsewhere. He has trained localisers in South Africa and elsewhere in Africa.

All Ireland Linguistics Olympiad 2010

Over 250 second-level students from all over Ireland took the first round paper of the CNGL All Ireland Linguistics Olympiad (AILO) in their own schools on Wednesday February 3rd 2010. 70 students have qualified for the final in DCU on March 24th 2010. CNGL has allocated a tutor to each school to offer guidance on how to tackle the linguistic and logic problems students will face in the final. More information can be found on the CNGL AILO Website.

Team competition at AILO 2009

Following the success of last year's inaugural AILO competition, CNGL invited transition-year, 5th- and 6th-year students in Ireland and Northern Ireland with an interest in languages and good analytical skills to put them together, learn about linguistics, and participate in this fun competition. The winners of the overall AILO individual competition will represent Ireland at the International Linguistics Olympiad in Sweden in July 2010.

DCU MT group releases free/open-source EBMT system ‘Marclator’

The Centre for Next Generation Localisation’s (CNGL) Machine Translation group, led by Prof. Andy Way at Dublin City University (DCU), announces the release of ‘Marclator’ (Marker-based Translator), a free/open-source system for Example-Based Machine Translation (EBMT). This release coincides with the 4th MT Marathon, a week-long event being hosted January 25th-30th by the CNGL and the National Centre for Language Technology (NCLT) at DCU in conjunction with the EuroMatrix+ project, where over 100 participants from 20 countries will have a chance to test and program open-source MT tools and systems.

The Marclator EBMT system release includes a fully functional marker-based chunker/tagger (based on Green’s “marker hypothesis”) with markers for some languages and a chunk aligner, as well as a proof-of-concept ‘naïve’ (monotone) recombination module or ‘decoder’.
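As a rough illustration of marker-based chunking, the sketch below starts a new chunk at each marker (closed-class) word, provided the chunk being closed already contains at least one non-marker word. This is a minimal sketch under stated assumptions, not Marclator's own code: the function name `marker_chunk` and the tiny English marker set are hypothetical, chosen only for this example, and the real system's marker files and chunking rules may differ.

```python
# Hypothetical, deliberately small English marker set (closed-class words).
# This is an assumption for illustration, not Marclator's actual marker list.
MARKERS = {"the", "a", "an", "in", "on", "of", "to", "with", "and", "he", "she", "it"}


def marker_chunk(sentence, markers=MARKERS):
    """Split a sentence into chunks that begin at marker words.

    A new chunk is opened at a marker word only if the current chunk
    already holds at least one non-marker word, so runs of consecutive
    markers (e.g. "on the") stay together in one chunk.
    """
    chunks = []
    current = []
    has_content = False  # True once the current chunk has a non-marker word

    for token in sentence.split():
        is_marker = token.lower() in markers
        if is_marker and has_content:
            # Close the current chunk and start a new one at this marker.
            chunks.append(current)
            current = []
            has_content = False
        current.append(token)
        if not is_marker:
            has_content = True

    if current:
        if has_content or not chunks:
            chunks.append(current)
        else:
            # Trailing markers with no content word join the previous chunk.
            chunks[-1].extend(current)

    return [" ".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    for chunk in marker_chunk("the man sat on the chair in the garden"):
        print(chunk)
```

In an EBMT setting, chunks produced this way on the source and target sides of a parallel corpus would then be paired by a chunk aligner; that step is not shown here.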

This free/open-source release results from collaboration with Prof. Mikel L. Forcada of Universitat d’Alacant in Spain who is currently a visiting researcher within the CNGL MT group at DCU through an ETS Walton Award from Science Foundation Ireland (SFI).

Through SFI funding of the Centre for Next Generation Localisation and additional funding from EU FP7 research projects currently coming on stream, DCU now boasts one of the largest academic research groups focused on MT worldwide. The Marclator release is seen as a first step in a strategy of participation in the free/open-source community, in parallel with a programme of commercial engagement with companies interested in adopting, tuning and deploying machine translation technology.

Over the past number of years, Prof. Andy Way has led the MT group at DCU in pursuing corpus-based approaches to MT. This work has culminated in the MaTrEx system, a modular, maintainable and efficient data-driven machine translation system that combines example-based machine translation (EBMT) and statistical machine translation (SMT), and that consistently ranks as one of the top-performing MT systems in open machine translation evaluations (e.g. WMT-09, IWSLT-09).

As a follow-on to the Marclator release, Prof. Way and Prof. Forcada will continue to collaborate toward a free/open-source release of a baseline MaTrEx system, combining Marclator with the Moses SMT decoder. This OpenMaTrEx release is anticipated for Spring 2010.

Resources:

http://www.cngl.ie
http://nclt.dcu.ie/mt
http://www.computing.dcu.ie/~mforcada/fosmt.html
http://www.computing.dcu.ie/~mforcada/marclator.html
http://www.euromatrixplus.net/
http://www.mtmarathon2010.info/web/Welcome.html

For more information please contact: info (AT) cngl.ie

CNGL at BT Young Scientist Exhibition on Thursday 14th January 2010

CNGL researchers exhibited at the BT Young Scientist Exhibition on Thursday 14th January 2010 from 09.30-12.30. Neil Peirce demo'd his interactive language learning game. Declan Dagger and Dominic Jones demo'd their TweetTranslate translation service, which plugs into Twitter and allows users, with the click of a button, to translate their tweet streams into multiple languages. Sara Morrissey demo'd her sign language translation tool. Thanks very much to Neil, Declan, Dominic, and Sara for getting involved in Young Scientist.

Sara Morrissey signing with a student at Young Scientist Exhibition 2010

Neil Peirce, Sara Morrissey, Declan Dagger and Dominic Jones at the Young Scientist Exhibition 2010

Find out more about the BT Young Scientist exhibition on their website: http://www.btyoungscientist.ie/

Investing in research boosts the economy

An article in the Irish Times argues that continued investment by the Irish government via Science Foundation Ireland supports the Irish economy in the long term by creating new jobs, generating new investment and attracting new industry. In particular, CSETs such as CNGL, with their academic-industry partnerships, are validating Ireland's developing potential as a research economy, as well as the role of research in anchoring the significantly greater investments by these companies in associated manufacturing facilities.

Read the full article at http://www.irishtimes.com/newspaper/finance/2010/0111/1224262051456.html

Intern joins ILT 1.9

Shane Gilchrist has begun working with the ILT 1.9 group on an intern contract for the month of January. He has been employed to create Irish Sign Language video from an English corpus of GP secretary-patient dialogue. This will form the first part of our bilingual corpus for MT. He will also take part in transcribing the ISL videos into an annotated format suitable for translation and then animation. He is also acting as a representative of the Deaf community and as a consultant for us. He is aided in his video translation by Alvean Jones, who will verify the translations for accuracy. Shane is currently finishing a Master's in General Linguistics at the University of Amsterdam.

Sara Morrissey with intern, Shane Gilchrist, who is working on Irish sign language video

Obama White House Calls for Machine Translation

The Executive Office of the President and National Economic Council issued its “Strategy for American Innovation.” Among the recommendations was a call for “automatic, highly accurate and real-time translation between the major languages of the world — greatly lowering the barriers to international commerce and collaboration.” See Global Watchtower for the full story.

Localisation Innovation Showcase

In conjunction with Innovation Dublin, CNGL hosted a 'Localisation Innovation Showcase' event at DCU on Friday October 16th. The CNGL showcase highlighted localisation business and technology innovation through exhibitions and demonstrations of products, technologies and projects across both industrial and academic partners of the SFI-funded CNGL centre.