A Mandarin Dictation Machine with Improved Chinese Language Modeling
具备高效率语言模型技术的国语听写机
Lee-Feng Chien 简立峰; Keh-Jiann Chen 陈克健; Lin-Shan Lee 李琳山

Abstract 摘要
Golden Mandarin (I) is the first successfully implemented real-time Mandarin dictation machine. It recognizes Mandarin speech with a very large vocabulary and almost unrestricted text, so that Chinese characters can be entered into computers by voice. Its achievable performance is limited, however, because the machine uses only a relatively simple Markov Chinese language model. This paper briefly summarizes the basic concepts and structure of the Mandarin dictation machine and proposes several ways to improve the efficiency and accuracy of the Chinese language model. The basic idea is that the statistical approach of the Markov Chinese language model and the grammatical approach of unification grammar can be properly integrated in a preference-first word lattice parsing algorithm. Preliminary experiments with this new language modeling approach indicate that, when a good parsing strategy is chosen, it can achieve much higher performance than the Markov Chinese language model previously used in Golden Mandarin (I), and at very high speed. This high performance is due entirely to the effective reduction of noisy word interference: the grammatical analysis eliminates all illegal combinations, while the Markovian probabilities and a properly designed preference-first parsing strategy point the search in the correct direction. With this new Chinese language model, the performance of the Mandarin dictation machine is expected to improve significantly in the future.

金声一号 (Golden Mandarin I) 是国际上第一套可以实时辨认大字汇、无限文句的国语语音听写系统。这套系统的语言模型较为简单，因此语言处理能力较为有限。为了改进这项缺失，本文提出一项新的语言模型方法。这个方法利用一最佳优先的格状词组剖析算法，成功地结合统计式马可夫语言模型与联并文法理论。实验结果证实这项新方法所得的正确率优于原有语言模型，且如果剖析策略适当，辨认速度甚至可以更快。根据分析，这是因为利用文法分析可以事先去除一些不合文法的词汇组合，而成功的剖析策略与语言模型机率可以导引正确搜寻方向。本文除提出这项新的语言模型方法外，对金声一号国语语音听写系统的设计以及统计式语言模型与文法理论的特性差异也都会加以介绍讨论。相信藉由这个新的语言模型方法，可以进一步提升国语听写机的成效。
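
The interplay described in the abstract, where grammatical analysis removes illegal word combinations and Markov probabilities steer the search, can be illustrated with a small sketch. This is not the authors' algorithm: the abstract gives no implementation details, so the sketch assumes a plain best-first agenda in place of the preference-first parsing strategies, and a hypothetical pairwise predicate `is_legal` in place of the unification grammar. The function name, the toy lattice, the bigram probabilities and the `floor` penalty for unseen word pairs are all invented for illustration.

```python
import heapq
import math


def preference_first_search(lattice, n_positions, bigram_logp, is_legal, floor=-10.0):
    """Return (log_prob, words) for the best legal path, or None if there is none.

    lattice      -- list of (start, end, word) hypotheses over syllable positions
    bigram_logp  -- dict mapping (previous_word, word) to a log-probability
    is_legal     -- hypothetical stand-in for the grammatical check
    floor        -- assumed log-probability for word pairs missing from bigram_logp
    """
    # Index the word hypotheses by their start position.
    edges_from = {}
    for start, end, word in lattice:
        edges_from.setdefault(start, []).append((end, word))

    # Agenda entries are (cost, position, words); cost is the negated
    # log-probability, so the most probable partial path is expanded first.
    agenda = [(0.0, 0, ("<s>",))]
    while agenda:
        cost, pos, words = heapq.heappop(agenda)
        if pos == n_positions:
            return -cost, list(words[1:])
        for end, word in edges_from.get(pos, []):
            prev = words[-1]
            if not is_legal(prev, word):
                continue  # the grammar filter prunes this combination
            step = bigram_logp.get((prev, word), floor)
            heapq.heappush(agenda, (cost - step, end, words + (word,)))
    return None


# Toy lattice over three syllable positions (all values are made up).
lattice = [(0, 2, "国语"), (0, 1, "果"), (1, 2, "雨"), (2, 3, "好"), (2, 3, "号")]
bigram_logp = {
    ("<s>", "国语"): math.log(0.5),
    ("国语", "好"): math.log(0.4),
    ("<s>", "果"): math.log(0.3),
    ("果", "雨"): math.log(0.9),
    ("雨", "号"): math.log(0.9),
}
illegal_pairs = {("雨", "号")}  # assumed to be rejected by the grammar

with_grammar = preference_first_search(
    lattice, 3, bigram_logp, lambda prev, word: (prev, word) not in illegal_pairs
)
without_grammar = preference_first_search(
    lattice, 3, bigram_logp, lambda prev, word: True
)
print(with_grammar[1])     # path chosen when the grammar filter is active
print(without_grammar[1])  # path chosen from bigram scores alone
```

In this toy data the bigram scores alone favour the noisy three-word path, while the filtered search returns the two-word path. Because every step cost is non-negative, the first complete path popped from the agenda is the most probable legal one.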