From Synergy to Knowledge: Corpus as a Natural Format for Integrating Multiple Educational Resources

Prof. Chu-Ren Huang
Institute of Linguistics, Academia Sinica, Taiwan

Presented at PANEL F: CORPUS-BASED EDUCATIONAL RESEARCH IN ASIA, in the 2nd International Education Conference (Redesigning Pedagogy: Culture, Knowledge and Understanding), 28-30 May 2007, National Institute of Education, Nanyang Technological University, Singapore.


Integrating information from multiple domains and cross-lingual sources is probably one of the most important skills for students to learn. In turn, the design of a high-performance learning environment needs to incorporate multilingual and multi-domain information. We demonstrate in this talk how various corpora and language resources can be effectively integrated to provide an infrastructure of synergy for new knowledge. First, by integrating a billion-word corpus with in-depth grammatical knowledge, Chinese WordSketch generates linguistic descriptions that can be easily applied in language pedagogy. The most salient usages of each word, as well as the contrasts between the uses of two near-synonyms, can be summarised automatically, with hundreds of supporting example sentences. Second, Hantology integrates the conventionalized structure of Chinese characters with the cutting-edge knowledge engineering theory of ontology. For language teaching, it provides a meaningful way to break Chinese characters down into components when learning to write them, as well as a more explanatory framework for deriving their meanings. Lastly, corpora can be integrated for structured learning, as in Adventures in Wen-Land. Three different curricula of elementary school Mandarin in Taiwan were converted to corpora and integrated as the main framework of this digital language learning site. Resources dedicated to linguistic skills (classifiers, idioms) and classical literature (Tang poems, The Dream of the Red Chamber, etc.) are linked through a tracked lexical list. This allows cross-curricular and cross-domain learning for both teachers and students.

The PPT is available in Laohong's corpus stuff folder.