More than Just Babble
The Patterns of Language in Young Cantonese-Speaking Children

Children's Language Ability Greater than Expected

Newborn infants with a few days' exposure to their mother tongue can distinguish it from other languages they have not been exposed to. The babbling of 10-month-olds reflects the phonetic characteristics of their native language. By 3, they will have developed a basic grasp of the rules of their language and acquired an impressive vocabulary. These are facts that many adults may find hard to believe. People usually think that toddlers know hardly anything about their language, and that it would take years of coaching by parents and teachers before children become competent speakers. Research in the past 30 years has shown that, contrary to widespread belief, very young children have far greater linguistic knowledge than previously imagined.

In-depth studies of the language abilities of young children were inspired by the revolutionary ideas of linguist-philosopher Noam Chomsky of the Massachusetts Institute of Technology, who argues that children are born with rich innate linguistic knowledge that cannot be learned from experience. This school of thought sees the language of young children as complex and systematic rather than rudimentary and unstructured.

A Cantonese Child Language Corpus

A group of seven researchers from The Chinese University of Hong Kong, the Hong Kong Polytechnic University, and the University of Hong Kong won a grant of HK$700,000 from the Research Grants Council of Hong Kong in 1991 to study the early grammar of Cantonese-speaking children. The group's leader, Dr. Thomas Lee, explains that they observed the language development of eight children aged between 1 and 2 over the course of a year. The children's conversations with parents and other adults were audio-recorded. The data were transcribed according to the format adopted by the Child Language Data Exchange System (CHILDES) and entered as computer text.

A major outcome of the project is the creation of a Hong Kong Cantonese Child Language Corpus (CANCORP) containing 14 megabytes of tagged utterances -- the largest child language corpus to date in Chinese-speaking communities. This corpus has been deposited at the Arts Faculty server (humanum.arts.cuhk.hk) of The Chinese University and at the child language archive at Carnegie Mellon University (poppy.psy.cmu.edu) in the USA. The computerization of child language has helped speed up data analysis and made it more systematic. It has also enabled the researchers to understand the characteristics of early child Cantonese, and the ways in which it is similar to and different from the language of non-Cantonese speaking children.

An illustration of the opening of a file recording a conversation with a child at age 2 years 4 months:
Key
%mor = the part of speech of each of the words in the preceding utterance
q = quantifier
nn = common noun
cl = classifier
nnpp = proper noun
nnpr = pronoun
vt = transitive verb
wh = interrogative word
sfp = sentence final particle
det = determiner
asp = aspect marker
vf = functional verb
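
To give a rough sense of how a tagged corpus of this kind can speed up analysis, the sketch below counts the distinct word types found under each part-of-speech tag on the %mor lines. It is only an illustration under stated assumptions: the `tag|word` layout of each token is assumed rather than taken from the corpus files, the function names are hypothetical, and the sample lines are invented, with English glosses standing in for the Cantonese words.

```python
from collections import defaultdict

# Tag names as read from the key above.
TAG_NAMES = {
    "q": "quantifier", "nn": "common noun", "cl": "classifier",
    "nnpp": "proper noun", "nnpr": "pronoun", "vt": "transitive verb",
    "wh": "interrogative word", "sfp": "sentence final particle",
    "det": "determiner", "asp": "aspect marker", "vf": "functional verb",
}

def parse_mor_line(line):
    """Split one %mor line into (tag, word) pairs.

    Assumes each token looks like 'tag|word'; this layout is an
    assumption made for illustration, not the documented corpus format.
    """
    prefix, _, body = line.partition(":")
    if prefix.strip() != "%mor":
        return []  # not a %mor line, e.g. a speaker line
    pairs = []
    for token in body.split():
        tag, sep, word = token.partition("|")
        if sep:  # ignore tokens that lack the tag|word shape
            pairs.append((tag, word))
    return pairs

def distinct_types_by_tag(lines):
    """Collect the set of distinct words seen under each tag."""
    types = defaultdict(set)
    for line in lines:
        for tag, word in parse_mor_line(line):
            types[tag].add(word)
    return types

# Invented sample lines; English glosses stand in for Cantonese words.
sample = [
    "%mor: nnpr|I vt|like det|this cl|CL nn|ball",
    "%mor: vt|want vt|take nn|taxi",
    "%mor: nnpr|I vt|like vt|eat nn|watermelon",
]

counts = {tag: len(words) for tag, words in distinct_types_by_tag(sample).items()}
for tag, n in sorted(counts.items()):
    print(f"{TAG_NAMES.get(tag, tag)}: {n} distinct type(s)")
```

Counting sets of word types rather than raw tokens mirrors the kind of vocabulary tally reported in the article, where what matters is how many different nouns or verbs a child uses, not how often each one occurs.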

Toddlers Have a Large and Varied Vocabulary

A conservative analysis of the data shows that children, by the age of 2, use over 60 distinct common nouns and a wide range of verb types. They have a grasp of the verbs of existence, location, and possession, viz. 'be', 'at', 'have'. They show active use of around 10 intransitive verbs (including 'sleep', 'walk', 'sit', 'fly'), up to 50 transitive verbs (like 'see', 'pinch', 'pierce', 'seek', 'know'), six directional verbs ('come', 'move-up', 'move-out', 'go', 'return', 'move-down'), and a couple of dative verbs, which take more than one object (e.g. 'give'). In addition, children at this stage use around half a dozen adjectives, including adjectives denoting size (e.g. 'big'), quantity (e.g. 'many/much'), colour (e.g. 'green'), and evaluation (e.g. 'pretty').

The research has also shown that the vocabulary of toddlers is not limited to nouns, verbs and adjectives; these young children spontaneously produce various function words, including aspect markers, which signal temporal relationships such as completion and progression (the 'perfective' and 'progressive' markers). Their vocabulary includes around 10 noun classifiers. Before they reach the age of 3, Cantonese children have started using adverbs such as 'also' and 'still'. The child's knowledge of these adverbs is surprising, since these forms encode complex semantic information such as presupposition and elements of logical structure. Among the function words that young children use, sentence final particles figure prominently. These are forms attached to the final position of the utterance to signal mood, attitude, and qualification. By the age of 2, Cantonese children are producing at least 15 final particles. The corpus shows that sentence final particles emerge right from the beginning of the two-word stage, when children begin to form sentences.

Do two-year-olds show any grasp of sentence structure? The data show that at this age they produce canonical subject-verb-object sentences, such as ones meaning 'Bernard kicked a hamburger' or 'I like this ball', in which the subject is the agent or experiencer. They also produce sentences in which the subject is a location, such as ones meaning 'My pocket has toilet-paper' or 'This place is a barbershop'.

Young children's grasp of Cantonese grammar is reflected in their use of sentences containing a series of verbs or verb phrases strung together. In these sentences, the first verb may express a desire or intention, as in utterances glossed word for word as 'want-take-taxi' or 'I-like-eat-watermelon'. These sentences may also contain verbs denoting successive actions, as in 'eat-breakfast-walk-street' or 'take-car-go-Park 'N Shop'. The researchers also found in young children's speech the type of complex sentence referred to as the 'pivotal construction', in which the object of the first verb serves as the subject or object of the second verb. Examples of these are 'Sister A-help-me-write' and 'have-lollipop-eat'.

Active Young Learners of Language

Research findings point to the conclusion that children are not passive learners of their mother tongue; instead, they form active hypotheses about the structure of their language based on what they hear and understand. They do not, however, always hit on the right hypothesis, and will often make systematic errors. At a certain point in his language development, one child was found always to attach a negator to classifiers, giving combinations glossed as 'not-classifier-particle', which are ungrammatical by adult standards. Think of an English-speaking child saying, 'a not cup (of water)'. While arguing that very young children have rich linguistic knowledge, the researchers do not dispute that there are still many details to be filled in. For example, it normally takes children more than five or six years to fully understand which classifiers go with which nouns. Errors of classifier-noun incompatibility have long been observed. An example from the group of children studied is an utterance glossed as 'my-[classifier for animals]-towel-particle'.

Characteristics of Mother Tongue Reflected

The speech of young Cantonese-speaking children also clearly reflects the characteristics of their mother tongue. For instance, sentence final particles are peculiar to certain languages such as Cantonese, and are not found in the speech of English-speaking children. These particles emerge at almost the same time as word combinations appear. In a similar vein, serial verb phrases figure more prominently in Cantonese child language than in the speech of English-speaking children. This kind of variation across different child languages poses new challenges for theoretical analysis.

Earlier Hypotheses Incorrect

The early child Cantonese data have led to a more precise understanding of specific details of children's language development. They have allowed the researchers to test hypotheses which aim to capture universal features of child language. One hypothesis, proposed in the late 1980s, was that children typically pass through a stage (from around 1 to 2 years old) in which their grammar lacks functional categories such as tense markers, complementizers (e.g. clause-introducing 'that'), and prepositions. It was believed that at that stage, not all syntactic categories are available to the child; children only have lexical categories such as nouns and verbs. However, the research evidence emerging in the early 1990s, which is based on a variety of languages, argues strongly against this claim. The argument is borne out by the research findings of this project -- which show that functional categories such as aspect markers, classifiers, modal auxiliaries, and sentence final particles are established between the ages of 1 and 1.

An Important Area of Study

An accurate profile of early child language as provided by research of this kind is important for practical issues such as policies for language in education. At the global level, research on early child language is as relevant to policy makers, educational administrators, and teachers as it is to language researchers. And, as Dr. Lee puts it, to the extent that we have come closer to grasping the reality of early child language, we have certainly advanced our understanding of the human mind.


Dr. Thomas Hun-tak Lee has been teaching in the Department of English since 1987. He obtained his Ph.D. in linguistics from UCLA, and has been a Nuffield research fellow at the University of York and a visiting scientist at MIT.

His main research interest lies in the comparative study of logical form properties of the Chinese language, and the acquisition of these properties by young children. He is currently a member of the editorial boards of Journal of East Asian Linguistics, Journal of Chinese Linguistics, Linguistics Abroad, and International Review of Chinese Linguistics.


Dr. Thomas Lee (second from right) and the rest of the research team