Lexicalized statistical pattern matching: Search engine-aided analysis for the Chinese language
词汇化模板定量匹配——借助于搜索引擎的中文分析
Maosong Sun 孙茂松; Ruying Sun 孙如颖
Abstract 摘要
This article presents the idea of search engine-aided analysis for the Chinese language. At its core is the proposed concept of “lexicalized statistical pattern matching”. The basic methodology is to perform some degree of Chinese analysis at different linguistic levels by designing and exploiting a system of lexicalized statistical patterns, together with the simplest string matching technique used by search engines. The soundness of the idea is discussed through several typical case studies, and some related key issues are also addressed. The idea is preliminary and needs further validation through large-scale experiments.
Keywords 关键词
Lexicalized statistical pattern matching 词汇化模板定量匹配; Search engine 搜索引擎; Web corpus 互联网语料库; Chinese analysis 中文分析; Natural language processing 自然语言处理