Speech Technology for Minority Languages: the Case of Irish (Gaelic)
Citation: Ni Chasaide, A., Wogan, J., O Raghallaigh, B., Ni Bhriain, A., Zoerner, E., Berthelsen, H. and Gobl, C., Speech Technology for Minority Languages: the Case of Irish (Gaelic), Proceedings of the 9th International Conference on Spoken Language Processing, INTERSPEECH 2006, Pittsburgh, 2006, 181-184
corpus.pdf (Published (publisher's copy) - Peer Reviewed) 75.47Kb
Abstract: Unit selection is a data-driven approach to speech synthesis that concatenates pieces of recorded speech from a large database to create novel sentences. Many corpora are available for English, including the Arctic database, which allows a user to build small, reliable speech synthesisers from only a small set of recorded sentences. Such resources are scarce for minority languages, however, despite their increasing importance for the survival of those languages. This paper describes current research on creating efficient Irish-language corpora for speech synthesis. Corpus design techniques are discussed, in particular two methods of data reduction that are applied to an aligned spoken corpus of Irish to create smaller, more efficient speech corpora.
Other Titles: Proceedings of the 9th International Conference on Spoken Language Processing, INTERSPEECH 2006
Type of material: Conference Paper
Availability: Full text available