Voice Communication Interfaces with Intelligent Systems

Course in the BIOSINF and TMAM Master's programmes (2nd year)


Teacher: Prof. Dragoş Burileanu
Teaching Assistant: Dr. Cătălin Ungurean

Course Description

The course introduces the principles and paradigms of human-computer communication and discusses the main issues in implementing speech-based interfaces for various intelligent systems (e.g., computers and mobile devices), with a focus on speech synthesis technology. It also presents advanced spoken dialogue systems and strategies, and discusses the concept of multimodality in the design of current and future interactive interfaces.

The laboratory provides hands-on practice in developing voice-based interfaces for embedded applications.



Course Topics

  • Introduction. Definitions, key concepts
  • Human-computer interaction
  • Spoken language interfaces. Network-based and embedded applications
  • Fundamentals of speech production and perception; speech signal features and representations
  • Automatic speech synthesis for voice communication interfaces: the architecture of a generic text-to-speech (TTS) synthesis system; the natural language processing stage; speech signal generation techniques
  • TTS systems: quality assessment; state of the art and future trends; implementation issues for a Romanian-language TTS system
  • Dialogue systems: dialogue strategies, error handling, dialogue control models
  • Voice-based interfaces in mobile environments. Multimodality and ubiquitous computing

Laboratory Topics

  • Voice communication interfaces for the PC. Applications for the Romanian language
  • Development environments for Android: Android SDK, Eclipse IDE, Samsung Galaxy S3 Toolkit; basic applications
  • Complex UI components, menus and dialogs; multimedia elements
  • Text-to-speech synthesis-based interfaces, implemented on a smart mobile terminal
  • Speech recognition-based interfaces, implemented on a smart mobile terminal
  • Laboratory assessment

Assessment

Laboratory (homework + oral evaluation): 40%
Course final examination (written evaluation): 60%