To Apply Modern Technology to Ancient Chinese Studies

CHANT (CHinese ANcient Texts): a computerized database of all ancient Chinese texts up to 600 AD

Chinese civilization boasts an extensive literature going back to the earliest times, which makes it difficult for sinologists to assemble primary materials exhaustively.

In 1988, Prof. D.C. Lau and Dr. Fong-ching Chen of the Institute of Chinese Studies (ICS) put forward a proposal for the establishment of a computerized database of the entire body of extant Han and pre-Han (206BC-AD220) traditional Chinese texts. The UPGC awarded grants amounting to HK$1.35 million to the project, and the Computer Services Centre of the University provided technical support. The entire database, consisting of nine million Chinese characters in 103 works, was completed in March 1991.

Uniqueness of the Database

While the establishment of databases of ancient Chinese texts has been attempted in both mainland China and Taiwan, the ICS database is the first of its kind in Hong Kong and differs significantly from those in China and Taiwan in research orientation and mode of compilation.

First, the database contains the entire body of extant traditional texts of a specific period, i.e. Han and pre-Han. Such texts constitute the sources of Chinese culture and are of great historical value. With the establishment of the database the international community of scholars will have ready access to all extant ancient Chinese texts of the period, thus facilitating a wide range of research, whether in Chinese literature, history, philosophy, linguistics, or lexicography. Lexicographers can use the database for exhaustive study of lexical items found in ancient works.

For the database, the best early editions have been chosen, mainly from the Sibu Congkan collection. Punctuation and textual notes are then added, and emendations are clearly marked in such a manner as to render it easy for the reader to recover the original text.

The Compilation of Concordances

The completion of the database makes it possible to compile a complete series of concordances to ancient texts in the Han and pre-Han periods, which is the ultimate objective of the entire research project.

Concordances are extremely useful tools for research as they give ready access to every occurrence of a word in the works concordanced (Table 1).
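As a rough illustration of what a concordance provides, the following minimal sketch indexes every character of a short text and lists each occurrence with a little surrounding context. It is not the ICS indexing system, whose design is not described here. The function names `build_concordance` and `show` are hypothetical, and the two sample lines (the opening of the Analects) are typed in for illustration only, not drawn from the database.

```python
from collections import defaultdict

def build_concordance(lines):
    """Map each character to every (line_number, position) where it occurs."""
    index = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        for pos, ch in enumerate(line):
            if ch.isalpha():  # skip punctuation and whitespace
                index[ch].append((line_no, pos))
    return index

def show(index, lines, ch, width=4):
    """List each occurrence of ch with up to `width` characters of context per side."""
    out = []
    for line_no, pos in index.get(ch, []):
        line = lines[line_no - 1]
        left = line[max(0, pos - width):pos]
        right = line[pos + 1:pos + 1 + width]
        out.append(f"{line_no}: {left}[{ch}]{right}")
    return out

# Illustrative sample text only (assumption: not taken from the ICS database).
sample = ["學而時習之，不亦說乎", "有朋自遠方來，不亦樂乎"]
index = build_concordance(sample)

for entry in show(index, sample, "不"):
    print(entry)
```

Looking up a character returns every place it appears, which is the property that makes a concordance exhaustive rather than selective.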

It is no exaggeration to say that, in the past 60 years or so, no other research tool has contributed more to ancient Chinese studies than the concordances produced by the Harvard-Yenching Institute in the 1930s under the editorship of Dr. William Hung. Unfortunately this work was cut short by the Second World War. Although some 60 concordances were published, a far greater number of texts remained to be done.

Prof. Lau recalls that, in 1965, when he called on Prof. Yang Lian Sheng at Harvard, he raised the question of reviving the concordance series and was told that the institute was no longer interested in resuming the work. He has since been waiting for an opportunity to complete the work interrupted by the war.

With the advent of the microcomputer and its extensive use, and with the completion of the database in 1991, the research group in the ICS decided to develop its own indexing systems for the compilation of concordances for all the texts in the database, viz., the ICS Ancient Chinese Texts Concordance Series. For this, Prof. Lau and Dr. F.C. Chen serve as consultants. Prof. Lau supervises and makes decisions on text management and textual notes, while Dr. Chen oversees conceptual design of the information retrieval system to be used for electronic publication, and monitors programming progress. Other key members of the team include Mr. Ho Che Wah, executive editor and project coordinator, and Mr. Ho Kwok Kit, computer projects officer.

Two Forms of the Concordance Series

Book Form

The ICS Ancient Chinese Texts Concordance Series, published by the Commercial Press (Hong Kong), consists of 87 titles in 62 volumes.

Electronic Form

As personal computers became more common in the 1990s, we began publishing ancient Chinese texts on electronic media (floppy disks and later CD-ROM) with search programmes that we developed ourselves. Searches based on single words, phrases and sentence patterns can be performed easily. Users save considerable time on data retrieval and preparation, which improves their research efficiency and lets them focus on advanced analytical work.
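The three kinds of search mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the ICS search programmes themselves are not described here, the helper `search` is hypothetical, the sentence pattern is expressed as an ordinary regular expression with `.` standing for any one character, and the sample lines are typed in for the example.

```python
import re

def search(lines, pattern):
    """Return (line_number, position, matched_text) for every match of pattern."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for m in re.finditer(pattern, line):
            hits.append((line_no, m.start(), m.group()))
    return hits

# Illustrative sample text only (assumption: not taken from the ICS database).
sample = ["學而時習之，不亦說乎", "有朋自遠方來，不亦樂乎"]

word_hits = search(sample, re.escape("乎"))       # single word
phrase_hits = search(sample, re.escape("不亦"))   # phrase
pattern_hits = search(sample, "不亦.乎")          # sentence pattern with one open slot

for hit in pattern_hits:
    print(hit)
```

Escaping the literal queries with `re.escape` keeps word and phrase searches exact, while the pattern search deliberately leaves one position open.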

The Internet is becoming ever more popular, and in view of this trend we have decided to publish all of our databases online in stages. In 1989, ICS began an electronic database of traditional ancient Chinese texts, hoping to input all traditional texts from the pre-Han (pre-220AD) period up to the Six Dynasties (581AD), totaling over 30 million words. Since 1998, over 1000 titles of traditional and excavated materials have been released through the Internet as the CHANT (CHinese ANcient Texts) Database. So far, there are five databases available, including the Jiaguwen Database, the Jinwen Database, the Jianbo Database and Jianbo Database II, the Pre-Han and Han Database, and the Six Dynasties Database. Meanwhile, a Computerized Database of the Entire Body of Extant Chinese Encyclopedias (Leishu) is expected to be released in 2006.

The basic requirements for using the CHANT Database are:

· PC (with Pentium or above CPU)

· 64MB RAM (128MB or above is recommended)

· High-color display (800x600 resolution, 16-bit color)

· Windows 98 (Traditional Chinese) or above

· Microsoft IE 5.X or above

A New Era in Sinology Study

The publication of the concordance series in both book and electronic form is the culmination of years of hard work. All basic information will be at the fingertips of the reader, who no longer has to spend hours on primary data collection, analysis, and comparison. The time saved can be used for work of a more intellectual nature.