Next: Data Collection
Up: Speechalator: two-way speech-to-speech translation
Previous: Speechalator: two-way speech-to-speech translation
As an initial part of the DARPA Babylon project, we were tasked with
building an interlingua-based two-way speech-to-speech translation
system on a small device, for a language in which our group had no
significant previous experience. This required us to solve three
specific problems:
- How to collect sufficient and appropriate data for translation,
recognition, and synthesis in the most efficient way. Using
foreign language experts, we designed protocols to define the
translation domain and collect examples to allow an appropriate
interlingua to be designed.
- How to take advantage of both knowledge-based techniques, in defining
an interlingua, and statistical techniques, in learning the relationship
between surface forms and that interlingua, in such a way as to make
transfer to new domains and languages efficient.
- How to fit two recognizers, two synthesizers, and a two-way translation
system on a device with only 40Mb of available space and limited CPU
power. This required addressing engineering issues (the lack of
floating-point support, synthesis database compression, and an efficient
recognition decoding algorithm) as well as research issues in model
design for size and efficient access.
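A common way to work around missing floating-point hardware is fixed-point arithmetic, where real values are stored as scaled integers. The sketch below illustrates that general idea only; it is not taken from the Speechalator code. The Q16.16 format and the names `to_fixed`, `to_float`, `fixed_mul`, and `fixed_dot` are assumptions made for this example.

```python
# Illustrative Q16.16 fixed-point arithmetic using only integer operations
# at run time. The format and every name here are hypothetical choices for
# this sketch, not the system's actual implementation.

FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to Q16.16 (would be done offline, e.g. on model data)."""
    return int(round(x * ONE))


def to_float(f: int) -> float:
    """Convert Q16.16 back to a float, for inspection only."""
    return f / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values.

    The raw product carries 32 fractional bits, so shift right by 16.
    In a language with fixed-width integers the intermediate product
    would need a wider type (e.g. 64-bit) to avoid overflow.
    """
    return (a * b) >> FRAC_BITS


def fixed_dot(xs, ys) -> int:
    """Dot product: accumulate raw products, then rescale once at the end."""
    acc = 0
    for a, b in zip(xs, ys):
        acc += a * b
    return acc >> FRAC_BITS
```

For example, `fixed_mul(to_fixed(1.5), to_fixed(2.0))` yields the same integer as `to_fixed(3.0)`. Values that are not exactly representable, such as 0.1, pick up a small rounding error, which is the usual trade-off of this approach.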
The end result is a working prototype on a Compaq iPaq which can
recognize, translate and synthesize bi-directionally between two
languages, English and Egyptian Arabic, and do so in a reasonable
time. Although this prototype is limited (it was aimed at medical
interviews and deals with only many hundreds of sentence types), it
shows the feasibility of such a system.
This particular system was built over a period of six months, using
the tools and techniques we have developed over a number of years in
rapid development of speech-to-speech translation systems
[1], [2].
Alan W Black
2003-10-27