Research and Development Project in Spoken Language Technology

Project in BIOSINF Master programme (1st year)

Teachers

Project Assistant: Assoc. Prof. Horia Cucu

Description

The Spoken Language Technology project introduces students to the field of Automatic Speech Recognition (ASR). It consists of an introductory theoretical part covering phonetics, acoustic modeling and language modeling, and an extended practical part that concludes with the development of a complete connected-digits recognition system for Romanian. The project guides students to collect their own speech database, to create a simple phonetic dictionary (comprising the 10 Romanian digits), to train an acoustic model on the speech database and to create a simple, rule-based task grammar. Finally, all these pieces are linked to create a complete connected-digits ASR system. The last part of the project evaluates the performance of the ASR system and optimizes the acoustic model. In addition to this compulsory part, the most motivated students are guided to extend the speaker-dependent ASR system into a speaker-independent one and to develop a graphical user interface for the application. The R&D project is based on the CMU Sphinx Speech Recognition Toolkit.
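To make the idea of a simple, rule-based task grammar more concrete, the following Python sketch generates a connected-digits grammar in JSGF, a grammar format that CMU Sphinx can read. It is an illustration only, not the grammar from the Project Guide: the function name build_digits_jsgf, the grammar name "digits" and the diacritic-free spellings of the Romanian digits are all assumptions made for this example.

```python
# Hypothetical word list: the 10 Romanian digits, spelled without diacritics
# (an assumption; the project's own dictionary may use different spellings).
ROMANIAN_DIGITS = [
    "zero", "unu", "doi", "trei", "patru",
    "cinci", "sase", "sapte", "opt", "noua",
]


def build_digits_jsgf(words, grammar_name="digits"):
    """Return the text of a JSGF grammar that accepts one or more of `words`."""
    alternatives = " | ".join(words)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        # The public rule accepts any non-empty sequence of digits.
        "public <digits> = <digit>+;\n"
        f"<digit> = {alternatives};\n"
    )


if __name__ == "__main__":
    print(build_digits_jsgf(ROMANIAN_DIGITS))
```

Every word used in such a grammar would also need an entry in the phonetic dictionary, so the two resources have to be kept in sync.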

The general architecture of an ASR system
Phonetic, acoustic and linguistic resources
Acoustic modeling: main principles
Language modeling: main principles
ASR evaluation
Construction of a connected-digits ASR system for Romanian
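As an illustration of the ASR evaluation topic, the sketch below computes the word error rate (WER), a common metric for speech recognizers: the minimum number of word substitutions, deletions and insertions needed to turn the recognized text into the reference transcription, divided by the number of reference words. The function name word_error_rate and the example transcriptions are assumptions made for this example; the Project Guide may prescribe a different tool or metric.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two strings, divided by the
    number of words in the reference transcription."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("unu doi trei patru", "unu trei trei"))
```

Because insertions are counted as errors, the WER can exceed 100% when the recognizer outputs many more words than the reference contains.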

Download

Research and Development Project in Spoken Language Technology: Project Guide v11, Audio Recording Guide v3.
Horia Cucu, “Towards a speaker-independent, large-vocabulary continuous speech recognition system for Romanian”, PhD Thesis, University “Politehnica” of Bucharest, Oct 2011.
Andi Buzo, “Automatic speech recognition over mobile telecommunication channels”, PhD Thesis, University “Politehnica” of Bucharest, Oct 2011.
Steve Renals and Thomas Hain, “Speech Recognition”, Chapter 12 in “Handbook of Computational Linguistics and Natural Language Processing”, 2010.
Daniel Jurafsky and James Martin, “Automatic Speech Recognition”, Chapter 9 in “Speech and Language Processing: An introduction to natural language processing, computational linguistics, and speech recognition” (2nd edition), Pearson Education, 2009.

Additional resources

Grading

Project presentation (oral evaluation): 100%