Analysis of present-day Mandarin
今日汉语的统计分析
Chao-Ming Cheng 郑昭明
Abstract 摘要
This study reports statistical information about Mandarin and Chinese characters, obtained from a frequency analysis of a natural-language corpus of 1,177,984 Chinese characters. The results suggest that (a) only about 40% (4,583 characters) of Chinese characters are used in present-day Chinese; (b) the frequency of Chinese words tends to be distributed lognormally, whereas that of Chinese characters is not; (c) there are in total 401 frequently used Mandarin sounds, consisting of 20 monograms, 219 digrams, and 162 trigrams; (d) the frequently used consonants of Chinese syllables are unaspirated stops, fricatives, dentals, and retroflexes; (e) a large proportion of the frequently used vowels are produced by closing the oral cavity, with the highest portion of the tongue in either the front or the back part of the mouth; (f) there are several patterns of combination of consonants and vowels, primarily conditioned by the place of articulation of the initial and the medial of the final; and (g) syllables with the 4th tone account for 40% of the total count.
Journal of Chinese Linguistics volume 10 (ISSN 0091-3723)
Copyright © 1982 Journal of Chinese Linguistics. All rights reserved.