Language Weaver, the leader in statistical software-based language translation solutions for the enterprise, announced today the development of a new, highly effective method of translating names using statistical machine translation.
The findings are included in a research paper co-written by Language Weaver Founder, Vice-President and Chief Scientist Kevin Knight, a professor in USC's Department of Computer Science and a senior research scientist and fellow at the university's Information Sciences Institute.
The paper, titled "Name Translation in Statistical Machine Translation: Learning When to Transliterate," was co-written by Knight with colleagues Ulf Hermjakob of the University of Southern California Information Sciences Institute and Hal Daumé III of the University of Utah School of Computing.
Transliteration is the process of changing words, characters, and symbols of one language into corresponding characters of another. Unlike most common nouns, names are typically transliterated.
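As a rough illustration of the definition above, the following minimal Python sketch maps each character of a name to a character in another script. The Cyrillic-to-Latin table, the `transliterate` function and the example names are assumptions chosen for illustration; they are not taken from the paper and do not represent the researchers' statistical method, which learns when to transliterate rather than applying a fixed table.

```python
# Toy character table (hypothetical, illustrative only; not from the paper).
CYRILLIC_TO_LATIN = {
    "а": "a", "в": "v", "д": "d", "и": "i", "к": "k", "м": "m",
    "н": "n", "о": "o", "р": "r", "с": "s", "т": "t",
}


def transliterate(word: str) -> str:
    """Map each source character to a target-script character, keeping case."""
    out = []
    for ch in word:
        # Characters missing from the table pass through unchanged.
        latin = CYRILLIC_TO_LATIN.get(ch.lower(), ch)
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("Москва"))  # Moskva
```

A fixed table like this one carries over the spelling of a name but cannot decide whether a given word should be transliterated or translated, which is the decision the research addresses.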
The researchers present an improved, automated method of transliterating names in the framework of end-to-end statistical machine translation. Using the new method, Knight and his colleagues achieved better name translation accuracy than 3 out of 4 professional translators.
"Translating names of people, organizations and locations has traditionally been one of the more challenging aspects of translation for both humans and computers," said Knight. "When you translate a name into another language across that boundary when the sounds and characters are different, information has a tendency to get lost or transformed."
In an audio interview available for listening and download from Language Weaver's blog at http://blog.languageweaver.com, Knight said that he and his colleagues trained software to "learn when to transliterate" on a bitext of 7 million sentences and Google's English terabyte n-grams.
Ulf Hermjakob will present the paper and research findings at the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies on June 19 in Columbus, Ohio.
About Language Weaver, Inc.
Language Weaver has a unique approach to automatic language translation using proprietary statistical translation algorithms that resulted from research and development at the University of Southern California's Information Sciences Institute (USC/ISI). Language Weaver's translations are fluent and natural-sounding, and they save customers money and time through automation. For more information, visit http://www.languageweaver.com.