NaCTeM

Seminar — Prof Pierre Zweigenbaum

Speaker: Prof. Pierre Zweigenbaum (LIMSI-CNRS and CRIM-INALCO, Paris)
Title: Semi-automatic Enrichment of Lexical and Terminological Resources in the Medical Domain
Date: 16 February 2007
Location: MIB building, LG0.10
Abstract:

The medical domain is characterized by a rich terminology. Capturing this terminology, with its variants and relations, provides a key asset for medical language and information processing. While English medical terms are extensively represented and linked in the UMLS Metathesaurus®, the terminological resources available to researchers or to applications in other languages are much more limited. The same holds for lexical resources. This gap motivated work in our group on acquiring lexical and terminological resources for French medical language.

I will describe acquisition methods and experiments:

  • to learn lexical relations:
    • within a given language, morphological relations between words, derived from existing thesauri or from text corpora;
    • between two languages, translational equivalents, identified through transducer induction, morphological segmentation, or the use of parallel or comparable corpora;
  • to identify term variants:
    • within a given language, through morphological and syntactic relations;
    • between two languages, based on parallel corpora.

Some of this work was performed within the UMLF (lexicon) and VUMeF (terminology) projects, while I was at Assistance Publique - Hôpitaux de Paris and at Inserm U729.

   

Slides [PDF] | Slides 2x4 [PDF]