The phrase break model is trained by examining the database again, this time ignoring the POS information and looking only at the junctures. An n-gram of order N is constructed, which represents the probability of different sequences of junctures. Using $J^{N}_{i-1}$ to represent the sequence of junctures preceding $j_i$ (the N-1 previous junctures for an n-gram of order N), we have:
$$P(j_i \mid J^{N}_{i-1}) = P(j_i \mid j_{i-1}, j_{i-2}, j_{i-3}, \ldots, j_{i-N+1}) \tag{3}$$
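As an illustration of equation (3), the following is a minimal sketch of how such a juncture n-gram could be estimated by relative-frequency counting. It is not the implementation used here: the function name `train_juncture_ngram`, the labels `"B"` (break) and `"N"` (non-break), and the `"<s>"` start-of-sequence padding symbol are all assumptions made for the example, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_juncture_ngram(sequences, order):
    """Estimate P(j_i | j_{i-1}, ..., j_{i-order+1}) by relative frequency.

    sequences: iterable of juncture label sequences, one per utterance.
    order:     n-gram order N (history length is N - 1).
    Returns a dict mapping each history tuple to a dict of
    {juncture: probability}.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pad with a hypothetical start symbol so that the first junctures
        # of each utterance also have a full-length history.
        padded = ["<s>"] * (order - 1) + list(seq)
        for i in range(order - 1, len(padded)):
            history = tuple(padded[i - order + 1:i])
            counts[history][padded[i]] += 1

    # Normalise the counts for each history into probabilities.
    model = {}
    for history, counter in counts.items():
        total = sum(counter.values())
        model[history] = {j: c / total for j, c in counter.items()}
    return model
```

For example, calling `train_juncture_ngram([["N", "N", "B", "N", "B"], ["N", "B", "N", "N", "B"]], order=2)` gives a bigram model in which a break follows a non-break with probability 2/3. With real data, unseen histories would have to be handled by some form of smoothing or backing off, which this sketch leaves out.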