This work describes the use of the Julius ASR engine for Italian children's speech recognition in child-robot interaction within the European project ALIZ-E, "Adaptive Strategies for Sustainable Long-term Social Interaction". The goal of the project is to develop embodied cognitive robots for believable any-depth affective interaction with young users over an extended and possibly discontinuous period. Speech recognition plays an important role whenever verbal interaction between the robot and the child is crucial to the ALIZ-E experiments. This work focuses on the ALIZ-E Quiz Game scenario, in which the child and the robot ask each other general knowledge questions. Julius' low system requirements and small memory footprint make it an excellent candidate for adding speech recognition to a real-time integrated system that handles several components, such as the ALIZ-E one. Its simple API also made it very easy to integrate the desired features into the system. The Italian FBK ChildIt corpus was used to train the acoustic model, while a very simple target-specific language model was built from the questions-and-answers database of the Quiz Game scenario. Preliminary results on real data recorded in a Wizard-of-Oz setup show an average of 75.7% correctly recognized words, 10.4% inserted words, and a 34.7% word error rate (WER).
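The three figures reported above are mutually consistent under the usual word-level scoring definitions. The short sketch below checks the arithmetic. It assumes (this is not stated in the abstract) that the correct-word rate is (N - S - D) / N and that WER is (S + D + I) / N, where N is the number of reference words and S, D, I are substitutions, deletions and insertions. The function name `wer_from_rates` is hypothetical and used for illustration only.

```python
def wer_from_rates(correct_pct: float, inserted_pct: float) -> float:
    """Derive WER (%) from the correct-word rate and the insertion rate.

    Assumed definitions (not given in the abstract):
      correct  = (N - S - D) / N
      inserted = I / N
      WER      = (S + D + I) / N
    """
    # Substitutions plus deletions are whatever was not recognized correctly.
    sub_del_pct = 100.0 - correct_pct
    # Insertions are added on top, so WER can exceed 100 - correct.
    return sub_del_pct + inserted_pct


# Figures reported in the abstract.
wer = wer_from_rates(correct_pct=75.7, inserted_pct=10.4)
print(f"WER = {wer:.1f}%")
```

Under these assumptions, 24.3% substituted or deleted words plus 10.4% inserted words give the 34.7% WER quoted in the abstract.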
JULIUS ASR for Italian Children Speech
Book chapter
Bulzoni Editore, Roma, ITA
Multimodalità e Multilingualità - La sfida più avanzata della comunicazione orale (Multimodality and Multilinguism: new Challenges for the study of Oral Communication), edited by Vincenzo Galatà, pp. 275–283. Roma: Bulzoni Editore, 2013