Seminar — Prof Pierre Zweigenbaum

Speaker:	Prof. Pierre Zweigenbaum (LIMSI-CNRS and CRIM-INALCO, Paris)
Title:	Semi-automatic Enrichment of Lexical and Terminological Resources in the Medical Domain
Date:	16 February 2007
Location:	MIB building, LG0.10
Abstract:	The medical domain is characterized by a rich terminology. Capturing this terminology, with its variants and relations, provides a key asset for medical language and information processing. While English medical terms are extensively represented and linked in the UMLS Metathesaurus®, the terminological resources available to researchers or to applications in other languages are much more limited. A similar situation holds for lexical resources. This situation motivated work in our group on acquiring lexical and terminological resources for French medical language. I will describe acquisition methods and experiments with two aims. The first is to learn lexical relations: within a given language, morphological relations between words, based on existing thesauri or on text corpora; between two languages, translational equivalents, identified through transducer induction, morphological segmentation, or parallel or comparable corpora. The second is to identify term variants: within a given language, through morphological and syntactic relations; between two languages, based on parallel corpora. Some of this work was performed within the UMLF (lexicon) and VUMeF (terminology) projects, while I was at Assistance Publique - Hôpitaux de Paris and at Inserm U729.