Text-to-Speech Synthesis for Low-Resource Languages

We are sorry, this position has been filled.

The rapid improvement of speech technology over the past few years has resulted in very natural and intelligible text-to-speech (TTS) synthesis for a small number of languages, including English, French, and Mandarin. These high-resource languages have been studied extensively by speech researchers, who have built various language tools and collected and annotated massive amounts of speech data in them. However, there are thousands of languages in the world, many of them spoken by millions of people, that have not received this attention from the speech and natural language processing community. These low-resource languages (LRLs) lack rich computational resources and large amounts of annotated data. As a result, their speakers are deprived of the benefits of modern speech technology, which enables us to communicate across language barriers. We are developing tools that allow people to build TTS systems from "found" data in multiple LRLs. Candidates should have good programming skills in Python and, preferably, some coursework in speech and language.

Lab: Columbia Speech Lab, CEPSR 7LW3

Direct Supervisor: Erica Cooper

Position Dates: 5/5/2017 - 5/10/2017

Hours per Week: 35

Paid Position: Yes

Credit: Yes

Qualifications: N/A

Eligibility: Freshman, Sophomore, Junior

Professor Julia Hirschberg, julia@cs.columbia.edu, CEPSR 705, 212-939-7114