A multimedia corpus of child Mandarin: the Tong corpus
一个多媒体汉语普通话儿童语料库:同语料库
Xiangjun Deng 邓湘君; Virginia Yip 叶彩燕

Abstract 摘要
This article presents a new multimedia corpus with 22 hours of recordings of a Mandarin-speaking child from the age of 1;7 to 3;4. We review the state of the art in the use of corpora for first language acquisition of Mandarin, and highlight the importance of corpus studies in evaluating children’s language developmental patterns vis-à-vis adult input. The transcripts in our new corpus are annotated with a morphological tier indicating parts of speech, and linked to audio or video files. This corpus goes beyond existing published corpora of child Mandarin in offering more data for a single child and in linking transcripts to media. It contributes to a number of fields including language acquisition, Chinese linguistics, corpus linguistics, developmental psycholinguistics, education, and speech and language therapy.

本文发布一个新的多媒体语料库的首阶段成果。这部分内容记录了一名普通话儿童从1岁7个月到3岁4个月期间的语言发展,共录得22个小时的语料。借此机会,我们回顾了汉语普通话一语习得研究中语料库使用的最新情况,强调语料库研究在考察儿童语言发展和成人语言输入时的重要作用。在我们这个新的语料库中,文字转写材料添加了词类注释层,并已实现与多媒体材料的链接。这个语料库在单个普通话儿童数据量和音频视频链接上超越了现有已发表的语料库。它将为语言习得、汉语语言学、语料库语言学、发展心理语言学、教育以及言语治疗等诸领域做出贡献。

Subject Keywords 主题词

Child language corpus 儿童语料库; Mandarin Chinese 汉语普通话; Language input 语言输入; Media linking 多媒体链接; Morphological tier 词类注释层


Journal of Chinese Linguistics vol.46, no.1 (January 2018): 69-92
Copyright © 2018 Journal of Chinese Linguistics. All rights reserved.

