About me

My mission: to advance Natural Language Processing and Artificial Intelligence technology to benefit humanity
My vision: to build a globally admired research lab in Natural Language Processing and Artificial Intelligence

I received my B.Sc., M.Sc. and Ph.D. from Xi'an Jiaotong Univ. in 1989, Shanghai Jiaotong Univ. in 1992 and the National Univ. of Singapore in 1999 (Ph.D. thesis submitted in Dec 1997 and Ph.D. degree conferred in June 1999), respectively. I also worked as a lecturer at Shanghai Jiaotong Univ. from April 1992 to March 1995 and as a postdoctoral research fellow at the National Univ. of Singapore from Jan 1998 to March 1999. I joined the Institute for Infocomm Research, Singapore as an associate scientist in April 1999 and was promoted to scientist and associate lead scientist in April 2002 and April 2005, respectively. From April 2002, I was also a Ph.D. supervisor at the institute (jointly with the School of Computing, National Univ. of Singapore). I joined Soochow Univ. as a full-time distinguished professor and Ph.D. supervisor in Aug 2006.

Currently, I am a member of the Academic Board of Soochow University, an associate editor of ACM Transactions on Asian Language Information Processing (2010.07-), an editorial member of Journal of Software (Chinese) (2012.01-) and a vice chairman of the Technical Committees on Chinese Information/China Computer Federation (2010.12-2016.12) and Natural Language Understanding/Artificial Intelligence Society of China. I was also a member of the Editorial Board of Computational Linguistics (2010.01-2012.12) and acted as a panel project appraisal expert of the Information Division, NSFC during 2009-2014. In addition, I have served on the program committees/reviewer boards of several prestigious international journals and conferences, including Bioinformatics (Journal), BMC Bioinformatics (Journal), Information Sciences (Journal), Information Systems (Journal), Information Processing and Management (Journal), Journal of Computer Science and Technology (Journal), IEEE TASLP (Journal), ACM Transactions on Asian Language Information Processing (Journal), Natural Language Engineering (Journal), Computer Speech and Language (Journal), ACL (Annual Meeting of the Association for Computational Linguistics), EMNLP (Empirical Methods in Natural Language Processing), COLING (International Conference on Computational Linguistics), SIGIR, AAAI, IJCAI and CIKM.

Finally, I have been a member of the ACM, the IEEE Computer Society and the ACL (Association for Computational Linguistics) since 1999.

NLP Lab, School of Computer Science and Technology, Soochow Univ., China
2006.8-present: Distinguished Professor, Ph.D. Supervisor
My current research interests include Natural Language Processing and Statistical Machine Translation, in particular information extraction, syntactic parsing, semantic parsing and discourse parsing. I am mainly involved in (in charge of or overseeing) several NSFC and 863 projects. For details, please refer to my research homepage.

Institute for Infocomm Research (I2R), Singapore
2005.4-2006.8: Associate Lead Scientist, Ph.D. Supervisor (Joint with School of Computing, the National Univ. of Singapore)
2002.4-2005.3: Scientist, Ph.D. Supervisor (Joint with School of Computing, the National Univ. of Singapore)
1999.4-2002.3: Associate Scientist
From April 1999 to March 2002, I was in charge of a research project in computational linguistics and natural language processing, named "Multilingual Efficient Analyzer & Thesaurus", which aimed to develop an efficient multilingual toolkit for English and Chinese, including word segmentation (for Chinese only), morphological analysis (for English only), part-of-speech tagging, shallow parsing, full parsing, knowledge representation and thesaurus construction. From April 2002 to Aug 2006, I was in charge of a research project in information extraction, named "Intelligent Multilingual Information Extraction", which aimed to advance information extraction technology and develop an efficient multilingual information extraction toolkit for English and Chinese, including named entity recognition, coreference resolution, entity relation extraction and event extraction, in the newswire domain. From April 2003 to Aug 2006, I was in charge of a joint I2R and NUS research project, "Information Extraction in the Biomedical Domain", which aimed to deploy the above toolkit in the biomedical domain. From Sept 2003 to May 2004, I was in charge of a joint I2R and MOM (Ministry of Manpower, Singapore) industry project, named "Validation of Materials Security Data Sheet", which deployed the above information extraction toolkit to validate the carcinogenic information in chemical data sheets.

School of Computing, National Univ. of Singapore, Singapore
1998.01-1999.03: Postdoctoral Research Fellow
My postdoctoral research was on a national project named "voice command system in a noise environment", which was jointly supported by the National Univ. of Singapore and the Ministry of Defense, Singapore. The purpose of this project was to enable military personnel to control military equipment via speech in a very noisy environment (e.g. underwater, by the Singapore Navy). This was done by applying noise reduction technology and deploying the speech recognition system developed during my Ph.D. study.

School of Computing, National Univ. of Singapore, Singapore
1995.04-1997.12: Ph.D.
The purpose of my Ph.D. research was to study language modeling in Mandarin speech recognition. Three modeling approaches were studied. First, N-gram modeling was studied to capture short-distance dependency. Next, a new MI (Mutual Information)-Trigger modeling approach was proposed to capture long-distance dependency. Finally, PCFG (Probabilistic Context-Free Grammar) was proposed to capture structural dependency. The merging of different language models was also studied. All the language modeling approaches and their various combinations were evaluated on an experimental continuous Mandarin speech recognition system through a new interface called the N-best BST speech lattice. It was found that different kinds of context dependencies can be well captured by N-gram, MI-Trigger and PCFG, with character recognition rates of 89.6%, 90.2% and 91.2% respectively, while language model merging can further improve the rate to 93.6%. The study concluded that different kinds of knowledge should be captured by different modeling approaches, and that proper merging of language models can further improve performance.

Dept. of Computer Science & Engineering, Shanghai Jiaotong Univ., China
1992.4-1995.3: Lecturer
During this period, I mainly focused on Chinese Information Processing, such as Chinese word segmentation and the automatic classification and summarization of Chinese documents. I also participated in several Chinese national projects under Prof. Wang Yongcheng, such as the "86.3" project "Research of Automatic Summarization of Chinese Text", and the "8.5" projects "Development of Automatic Segmentation System of Chinese Words" and "Automatic Keyword Extraction of Chinese Text".

Dept. of Computer Science & Engineering, Shanghai Jiaotong Univ., China
1989.9-1992.3: M.Sc.
During this period, I mainly focused on Chinese Information Processing, such as Chinese word segmentation and research on Chinese word databases. I also participated in several Chinese national projects under Prof. Wang Yongcheng, such as the "8.5" projects "Development of Automatic Segmentation System of Chinese Words" and "Automatic Keyword Extraction of Chinese Text", and the "7.5" projects "Research of Chinese Automatic Summarization" and "Research of Chinese Word Database".

Dept. of Computer Science & Technology, Xi'an Jiaotong Univ., China
1985.9-1989.7: B.Sc.