Initial phoneme labels are generated automatically by training database-specific acoustic models with the CMU SphinxTrain package and then force-aligning the recordings with the CMU Sphinx recognition system. The labels are then corrected by hand. For this voice, the manual correction was not done by native speakers of Arabic. The labeling team, while highly experienced in phonetic annotation, had no knowledge of Arabic beyond a basic introduction to its writing system and phoneme inventory. They had access to native speakers for questions, but in most cases had very little difficulty placing boundaries or identifying speech errors and autolabeling errors. Most of the problems referred to native speakers involved labeling of the pharyngeal fricative `ayn and incorrect transcription of doubled consonants and vowel length.
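One way to see what hand correction changed is to compare the automatic and corrected label sequences for an utterance. The following is only a minimal sketch under stated assumptions: labels are held in memory as (end time, phoneme) pairs, both sequences have the same number of segments, and a 20 ms tolerance separates a real boundary move from noise. The function name `compare_labels`, the tolerance, and the phoneme symbols are hypothetical and do not come from the tools described here.

```python
from typing import List, Tuple

# Assumed representation: (segment end time in seconds, phoneme symbol).
Label = Tuple[float, str]
Diff = Tuple[int, str, Label, Label]


def compare_labels(auto: List[Label], corrected: List[Label],
                   tolerance: float = 0.02) -> List[Diff]:
    """List segments where the corrected labels differ from the automatic ones.

    Each result is (index, kind, auto_label, corrected_label), where kind is
    "phone" if the symbol changed, or "boundary" if only the end time moved
    by more than `tolerance` seconds.
    """
    if len(auto) != len(corrected):
        # Inserted or deleted segments would need a sequence alignment first.
        raise ValueError("segment counts differ; align the sequences first")

    diffs: List[Diff] = []
    for i, (a, c) in enumerate(zip(auto, corrected)):
        if a[1] != c[1]:
            diffs.append((i, "phone", a, c))
        elif abs(a[0] - c[0]) > tolerance:
            diffs.append((i, "boundary", a, c))
    return diffs


# Hypothetical utterance: the labeler lengthened a vowel and moved one boundary.
auto_labels = [(0.10, "pau"), (0.18, "k"), (0.25, "a"), (0.40, "t")]
hand_labels = [(0.10, "pau"), (0.19, "k"), (0.31, "a:"), (0.45, "t")]

for index, kind, a, c in compare_labels(auto_labels, hand_labels):
    print(index, kind, a, c)
```

A report of this kind could help decide which segments to refer to a native speaker, for example those where a vowel length or consonant doubling symbol was changed.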
After labels have been hand-corrected, the voice can be built and evaluated.