IISSCoS: Improving Indexing for Search of Spontaneous Conversational Speech
IISSCoS: Improving Indexing for Search of Spontaneous Conversational Speech is a three-year (September 2008 – August 2011) SFI-funded Research Frontiers Programme, led by Dr. Gareth Jones.
The emergence of digital audio recording and networks such as the internet is enabling large-scale capture and distribution of spoken audio content from many diverse sources, e.g. broadcast news, lectures, and personal autobiographical testimonies. The value of these recordings is greatly increased if they can be searched efficiently to find interesting material. This requires that the spoken content be automatically recognised and assigned labels which describe the topics of the recordings for use with search engines. This project will develop new methods to improve the recognition accuracy and labelling of spoken audio for enhanced search.
Searching digital recordings of spontaneous conversational speech requires that they be automatically indexed by recognising their contents and assigning labels describing the topics covered. Current speech recognition (SR) systems typically make large numbers of errors arising from the complexity of the acoustic signal and their fixed vocabulary. Words outside this vocabulary cannot be used to search the data, and because speakers often assume contextual knowledge, important concepts are frequently not articulated in the speech. This project will develop methods using external text resources to identify out-of-vocabulary words and insert them into SR output or use them to contextually annotate the content.
Please contact Dr. Gareth Jones for further information on this project.


