This paper describes the development of the Italian modules and the building of a new Italian female voice for the MaryTTS text-to-speech (TTS) synthesis system. Building new resources for a new language in a TTS system, such as Natural Language Processing (NLP) modules and corpus-based voices, is a costly task. MaryTTS provides a number of useful tools to automate and simplify this task. Two state-of-the-art speech synthesis technologies are currently applied in modern TTS systems: unit selection and HMM-based synthesis. The paper gives a brief introduction to the distinctive characteristics of HMM-based speech synthesis, which was chosen for its higher degree of flexibility. It then describes the main steps needed to build the essential NLP modules of a TTS system with the MaryTTS tools. For Italian, NLP modules more advanced than the basic ones produced by the automatic MaryTTS procedures have been implemented. A detailed description of the Italian MaryTTS NLP modules is given (lexicon, letter-to-sound (LTS) rules, homograph pronunciation disambiguation, number expansion, part-of-speech tagger, and prosodic label prediction). Finally, the paper illustrates the MaryTTS process used to select a phonetically and prosodically balanced text corpus for TTS and reports the details of the procedure used to build the first Italian MaryTTS voice with HMM synthesis technology.
A New Language and a New Voice for MaryTTS
Book chapter (contribution in volume)
Bulzoni Editore, Roma, ITA
Multimodalità e Multilingualità - La sfida più avanzata della comunicazione orale (Multimodality and Multilinguism: new Challenges for the study of Oral Communication), edited by Vincenzo Galata, pp. 435–443. Roma: Bulzoni Editore, 2013