Analysis of present-day Mandarin
今日汉语的统计分析
Chao-Ming Cheng 郑昭明

Abstract 摘要
This study reports statistical information on Mandarin and Chinese characters, obtained from a frequency analysis of a natural-language corpus of 1,177,984 Chinese characters. The results suggest that (a) only about 40% of Chinese characters (4,583 characters) are used in present-day Chinese; (b) the frequency of Chinese words tends to be lognormally distributed, whereas that of Chinese characters is not; (c) there are 401 frequently used Mandarin sounds in total, consisting of 20 monograms, 219 digrams, and 162 trigrams; (d) the frequently used consonants of Chinese syllables are unaspirated stops, fricatives, dentals, and retroflexes; (e) a large proportion of frequently used vowels are produced by closing the oral cavity, with the highest portion of the tongue in either the front or the back of the mouth; (f) consonants and vowels combine in several patterns, conditioned primarily by the place of articulation of the initial and by the medial of the final; and (g) syllables with the 4th tone account for 40% of the total count.

本研究从1,177,984个字的读物中对汉字与汉语语音作出现频率的分析。结果显示:(一)出现一次与一次以上者共4,583个字,说明了现行中文只使用了低于40%的汉字,(二)汉词的出现频率呈现一种对数常态分配,但汉字的频率则不呈现此种常态分配。(三)在此分析中,收集到的汉语语音共401个,(四)常用的汉语子音是非舒气塞音,擦音,齿音与卷舌音,(五)常用的汉语母音大多是以不甚张开的口腔并且使舌头最高部位在舌根或舌尖所发出的音,(六)子音与母音的组合呈现某些规则性,主要的是决定于子音的咬音位置与中间滑音的性质,与(七)汉语的四声以第四声用得最多,几占总使用次数的40%。
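
As an illustration only, the kind of character-frequency count described in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in the study: the function names `char_frequencies` and `coverage` are hypothetical, the sample string is a toy input rather than the study's corpus, and a character is counted only if it falls in the basic CJK Unified Ideographs range (U+4E00 to U+9FFF).

```python
from collections import Counter


def char_frequencies(text):
    """Count occurrences of each Chinese character in text.

    Assumption: only code points in U+4E00..U+9FFF are counted, so
    punctuation, Latin letters, and digits are ignored.
    """
    return Counter(ch for ch in text if "\u4e00" <= ch <= "\u9fff")


def coverage(counts, top_n):
    """Fraction of all character tokens covered by the top_n most frequent characters."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _, n in counts.most_common(top_n))
    return covered / total


# Toy input, not the 1,177,984-character corpus analyzed in the paper.
sample = "今日汉语的统计分析,汉语的分析。"
counts = char_frequencies(sample)

print(len(counts))            # number of distinct characters observed
print(counts.most_common(3))  # most frequent characters with their counts
print(coverage(counts, 5))    # share of tokens covered by the 5 most frequent
```

Applied to a large corpus, `len(counts)` corresponds to the number of distinct characters that occur at least once, which is the quantity the abstract reports as 4,583.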

Journal of Chinese Linguistics, volume 10 (ISSN 0091-3723)
Copyright © 1982 Journal of Chinese Linguistics. All rights reserved.