Authors: | Beat Pfister |
Group: | Computer Engineering |
Type: | Techreport |
Title: | Lexical and Syntactic Analysis of Mixed-Lingual Sentences for Text-to-Speech |
Year: | 2002 |
Month: | November |
Pub-Key: | PW02 |
Institution: | SNSF Project |
Abstract: | Real-life texts contain numerous inclusions from foreign languages, such as proper names, technical terms, and foreign phrases. A text-to-speech (TTS) system that has to read such mixed-lingual texts aloud therefore needs the corresponding capability to analyze the words and sentences and derive the appropriate pronunciation. To get a more precise view of typical inclusions from other languages, this joint project of LATL/UniGe and TIK/ETHZ started with an investigation of various German and French texts. The outcome shows that foreign inclusions are quite frequent, that the majority of them are English, and that the size of the inclusions ranges from part of a word up to a whole phrase. This investigation in turn defined the requirements for a corresponding text analyzer. Two quite different basic approaches have been successfully pursued: LATL worked with government-binding theory, and TIK with definite clause grammars. Both approaches have proven to be appropriate, but a formal ranking has not been done so far. |
Resources: | [BibTeX] |