Knowledge Mining of Root Word Correlation Based on a Modern Chinese Corpus
基于当代汉语流通语料库的根词相关性知识挖掘研究
Yuqi Sheng 盛玉麒
Abstract 摘要
Earlier characterizations of basic vocabulary and general vocabulary have been too broad. The core element of the “basic vocabulary” is the “root word”, which is stable, productive, and frequent. Knowledge of “root word correlation” is the basis for parsing the structure and generative models of all phrases and sentences. This paper applies the theory and methods of corpus linguistics. Through a thorough description and quantitative analysis of Chinese root word correlation, based on a 14-million-character corpus of modern Chinese, it identifies the structural patterns of temporary phrases in Chinese and examines the problem of extracting knowledge for unknown-word identification. The study has theoretical significance and practical reference value for both Chinese ontology research and applied research.
Keywords 关键词
Corpus 语料库 Root word 根词 Correlation 相关性