It is well known that the most demanding task in Sinological research is dealing with the vast quantity of ancient texts. In the past, talented scholars, for instance Gu Yan Wu in the Qing Dynasty, could commit numerous texts to memory. Gu could memorize not only the Thirteen Classics but also all of their commentaries and even sub-commentaries, which amount to 8.7 million words. It is impractical for modern scholars to follow his method of studying ancient texts.

In view of the huge quantity of ancient texts, good tools and reference books are indispensable for Chinese studies. Reference books on ancient Chinese texts first appeared in the 1930s. With the support of the Harvard Yenching Institute, Hung Ye edited over 60 indexes or concordances of traditional Chinese texts. Unfortunately, the outbreak of the Second World War halted his endeavour prematurely, leaving hundreds of concordances still to be done.

Any large-scale work on traditional Chinese texts must be done with the long-term support of a research institute. In 1989, the Institute of Chinese Studies at the Chinese University of Hong Kong began to establish an electronic database of traditional ancient Chinese texts, aiming to enter all traditional texts from the Pre-Han (pre-220 AD) period up to the Six Dynasties (581 AD), totaling over 30 million words, into the database and to compile concordances to these texts. This project has received grants from the RGC in Hong Kong and the Chiang Ching Kuo Foundation for International Scholarly Exchange in Taiwan. The project was virtually finished in 1996. Seventy-five concordances have been published since 1993.

Before the texts were entered, researchers carefully compared different versions of each text and chose the best available version, the one with the least subsequent tampering, mostly from the Sibu congkan edition, and added modern punctuation. Variant readings from other versions, parallel passages, and citations found in Leishu were carefully collated and shown in footnotes. Thus the work was not merely a transfer of texts from print to electronic media but a systematic reworking of ancient texts.

As personal computers became more popular in the 1990s, we started publishing ancient Chinese texts on electronic media (floppy disks and later CD-ROM) with a search programme that we developed ourselves. It is called the CHANT Database. Searches on single words, phrases and sentence patterns can be performed easily. Users save a great deal of time on data retrieval and preparation, which improves their research efficiency and lets them focus on higher-level creative work.
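The three kinds of search mentioned above can be illustrated with a minimal sketch. It is not the CHANT search programme itself: the `CORPUS` list, the `search` function, and the use of regular expressions for sentence patterns are all assumptions made for illustration.

```python
import re

# Hypothetical mini-corpus of (title, passage) pairs; a real database
# would hold far more text and richer metadata.
CORPUS = [
    ("Lunyu", "學而時習之，不亦說乎？有朋自遠方來，不亦樂乎？"),
    ("Mengzi", "王何必曰利？亦有仁義而已矣。"),
]

def search(pattern, corpus=CORPUS, literal=True):
    """Return a (title, matched text) pair for every hit in the corpus.

    With literal=True the pattern is treated as a plain character or
    phrase; with literal=False it is treated as a regular expression,
    which is one possible way to express a sentence pattern.
    """
    regex = re.compile(re.escape(pattern) if literal else pattern)
    hits = []
    for title, text in corpus:
        for match in regex.finditer(text):
            hits.append((title, match.group(0)))
    return hits

single_word = search("亦")                           # single character
phrase = search("遠方")                              # two-character phrase
sentence_pattern = search(r"不亦.乎", literal=False)  # pattern with one open slot
```

Here the sentence pattern `不亦.乎` leaves one character open, so it finds both 不亦說乎 and 不亦樂乎 in the sample passage.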

The Institute of Chinese Studies (ICS) received a grant from the RGC in Hong Kong in 1994 to build "A Computerized Database of Excavated Wood/Bamboo and Silk Scripts of China" (Jianbo), which was successfully completed in 1996. ICS received another grant from the RGC in 1996 to establish "A Computerized Database of Oracular Inscriptions on Tortoise Shells and Bones" (Jiaguwen). This project was divided into two phases and was also completed on schedule. Over one million words of oracular inscriptions and their orthographic translation can now be searched via the CHANT website.

In 1999, ICS received a further grant from the Research Grants Council in Hong Kong to build "A Computerized Database of Bronze Inscriptions" (Jinwen). The project is now near completion and will first be released through the Internet in early 2003.

Besides the three databases on excavated materials, ICS also received a grant from the Research Grants Council to build "A Computerized Database of the Entire Body of Extant Chinese Encyclopedias (Leishu)". Since the scope of the project is very large, it is divided into two phases. The first phase, estimated at over 50 million words, is progressing well. We anticipate that the first batch of Leishu will be released through the Internet in 2004.

The Internet has become increasingly popular in recent years. In view of this trend, we have decided to publish all of our databases online in stages. Since 1998, over 1000 titles of traditional and excavated materials have been released through the Internet. All upcoming databases will also be accessible through the CHANT website.

Establishing a comprehensive online database of ancient Chinese texts is a daunting task that can only be accomplished through the long-term dedication and collaboration of academic institutes. Users' comments and ideas are also important for improving the interfaces and functions of the website. We have now successfully put all of our databases onto the Internet. We believe, however, that we still have a very long way to go. We would deeply appreciate any comments on the contents and the software to help us continue improving our Database.

The database system was established by the Centre in 1998. It consists of six databases of traditional and excavated ancient Chinese texts and includes about 80 million characters. It covers the most important cultural regions such as the Chu, the Qi, and the Lu, while the period extends from the Shang to the Six Dynasties. Its main features are as follows:
  • Uses seven major mainland and overseas collections of oracular inscriptions.
  • All jiagu characters were copied, emended and interpreted by our researchers.
  • Original jiagu characters and their orthographic translation can be displayed side by side.
  • There is a complete list of all jiagu characters with orthographic translation and individual numbers in the Yinxu jiagu keci leizuan. A number of previously unrecorded jiagu characters and bone pieces have been appended.
  • Multiple search functions: single words, phrases, and specific sentence structures.
  • Provides the frequency of each individual jiagu character.
  • Interpretation scripts are provided by the Cultural Relics Publishing House of Beijing.
  • Punctuation marks and textual notes are added. All emendations are clearly marked so as to allow readers to restore the original texts.
  • Scanned images of original texts can be viewed side by side with the interpretation in standardized characters.
  • Relevant parts of the interpretation script will be highlighted when a designated scanned image is clicked.
  • Search functions allow searches for a single character, a phrase, or a specific sentence structure, and the results can be sent to a printer or a file.
  • Based on the award-winning Compilation of Yin and Zhou Bronze Inscriptions.
  • Different searches on jinwen characters are available: by original jinwen radicals or by radicals of standard Chinese characters.
  • Punctuation marks are added to the orthographic translation, which is displayed in two formats for easy understanding of the original jinwen characters and their meanings.
  • Original jinwen characters are displayed alongside the orthographic translation in Windows.
  • Detailed information on each bronze vessel is included.
  • Provides the frequency of each individual jinwen character.
  • The entire corpus of transmitted texts of a specific period, regardless of category or length, has been entered into the database. It can assist scholars in the comprehensive study of the culture, history, and language of the relevant period.
  • The texts chosen for entry were the best available editions with the least subsequent tampering, mainly from the Sibu congkan. Textual comparisons were carried out for all texts. Modern punctuation marks were added. Variant readings are given in footnotes. All alterations are marked so that readers who disagree with our judgement can always restore the original readings.
  • The entire corpus of transmitted texts of a specific period, regardless of category or length, has been entered into the database. It can assist scholars in the comprehensive study of the culture, history, and language of the relevant period.
  • Since many of the texts of this period are lost or survive only in fragments, researchers made use of the works collected by Qing scholars.
  • The texts chosen for entry were the best available editions with the least subsequent tampering, mainly from the Sibu congkan. Textual comparisons were carried out for all texts. Modern punctuation marks were added. Variant readings are given in footnotes. All alterations are marked so that readers who disagree with our judgement can always restore the original readings.
  • Based on fine early printed editions, the major Leishu from the Tang to the Qing were entered, then re-punctuated and collated, with editorial marks added wherever emendations were made.
  • The database contains all major Leishu (encyclopaedias) from the Wei-Jin period down to the Qing dynasty, including the Taiping yulan, Yiwen leiju, and Cefu yuangui. There are altogether 50 titles and approximately 34 million words of text.
  • The texts chosen for entry were the best available editions with the least subsequent tampering. Textual comparisons were carried out for all texts. Modern punctuation marks were added. Variant readings are given in footnotes.
  • It provides a comprehensive database of Chinese Leishu which helps scholars to focus on textual comparison.
  • Contains nearly 150,000 words and phrases found in Pre-Qin and Han texts, with definitions, pronunciations, and usage examples.
  • The Database contains a total of 180,244 characters/phrases, covering the most frequently used words and phrases found in transmitted ancient Chinese texts.
  • Explains the meanings of the characters/phrases collected and provides their pronunciations. The information is instrumental in furthering research in disciplines ranging from ancient Chinese dialects to the lexicon of excavated texts.
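
Two of the features listed above, marked emendations that can be reverted and per-character frequency counts, can be sketched as follows. This is only an illustration: the `(x)[y]` notation, the function names, and the sample string are hypothetical and do not describe the markup or software actually used in the databases.

```python
import re
from collections import Counter

# Hypothetical markup: "(x)[y]" means the base edition reads x and the
# editors read y instead. Keeping both readings in the text is what
# makes every alteration reversible.
EMENDATION = re.compile(r"\((.*?)\)\[(.*?)\]")

def original_reading(marked_text):
    """Drop the editors' readings and restore the base edition."""
    return EMENDATION.sub(r"\1", marked_text)

def emended_reading(marked_text):
    """Drop the base-edition readings and keep the editors' text."""
    return EMENDATION.sub(r"\2", marked_text)

def char_frequency(text, ignore="，。？！、；："):
    """Count each character, skipping the listed punctuation marks."""
    return Counter(ch for ch in text if ch not in ignore)

# Placeholder characters only; this is not a real textual emendation.
marked = "甲(乙)[丙]丁甲"
base = original_reading(marked)    # "甲乙丁甲"
edited = emended_reading(marked)   # "甲丙丁甲"
counts = char_frequency(edited)    # 甲 is counted twice
```

Because both readings are stored in the same string, a reader who disagrees with an emendation can always recover the base text, which is the property the feature list describes.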