Seminar - Jasmin Saric
| Speaker: |
Jasmin Saric, European Media Lab Research, Heidelberg, Germany |
| Title: |
Extracting Data for Molecular Biology |
| Date: |
12:00, Friday 8th April |
| Location: |
Room F10, MSS Building |
| Abstract: |
Information extraction technology is becoming increasingly popular in the
biomedical domain. Its basic aim is to extract relations between entities,
such as interactions between proteins and genes. We apply this technology
to generate data for populating our molecular biology database. Applying
natural language processing (NLP) techniques is particularly difficult in
molecular biology because complex terminological variation occurs frequently
and gives rise to ambiguities. To resolve these ambiguities it is
indispensable to take semantic criteria into account. Ontologies can provide these
semantic criteria. However, building such domain-specific ontologies is
extremely labour-intensive. To overcome this hurdle we are trying to use
existing resources (such as the GENIA corpus) and machine learning approaches
to semi-automate this process. In my talk I will give an overview of
the activities in our group (the SDBV of EML Research in Heidelberg, Germany)
concerning Information Extraction, Ontology Learning, Parsing of Chemical
Compound Names and Manual Extraction of Kinetic Data from biology-related
scientific literature. |