Natural Language Processing in Biomedicine

ACL 2003 Workshop

Sapporo Convention Center, Sapporo, Japan

July 11, 2003



Workshop description

The vast amount of knowledge found in the existing scientific literature is a challenge for scientists working in rapidly growing areas such as molecular biology and biomedicine. Medline contains over 10 million abstracts, and approximately 40,000 new abstracts are added each month. Although there are growing numbers of sequence databases and other hand-constructed databases, most new information appears as unstructured text in Medline and full-text journals. Consistent access to natural language techniques and tools that automate and facilitate the process of knowledge discovery is therefore of paramount importance.

In the past couple of years, the need for the two fields of biomedicine and natural language processing to exchange ideas has been demonstrated by the emergence of special interest groups and dedicated workshops and tutorials. The needs of biomedicine are practical, and NLP techniques are mature enough to respond to those needs.

For NLP researchers, processing biomedical texts is a challenge, especially in the areas of terminology, information extraction from texts, knowledge discovery or ontology building from large collections of documents, sharing knowledge in the form of factual and textual databases, and annotation tools and techniques.

For biologists, it is now common to study protein complexes and pathways composed of dozens of dynamically interacting proteins. With the recent advent of high-sensitivity methods to rapidly identify components of multi-protein complexes, the extent of this complexity is likely to grow exponentially in the next few years.

At the same time, researchers in biomedicine have already constructed large-scale linguistic resources such as UMLS, SNOMED, MeSH, and the Gene Ontology, which can be used for knowledge-based NLP, intelligent IR, knowledge-triggered discovery of new scientific knowledge, etc.

The aim of this workshop is to bring together NLP researchers in biomedicine and to discuss recent advances in the computational analysis of text that go beyond traditional keyword-based indexing methods and begin to offer content-based analysis. Past experience has shown that the sharing of common resources, not only domain-specific dictionaries and thesauri but also properly annotated common test/training corpora, plays a significant role in the systematic development of NLP techniques in a specific domain. Processing a sublanguage like that of biomedicine requires the systematic construction of such common resources and the use of different NLP techniques. What is lacking in this area is standardisation of terminological resources, agreement on annotation standards and evaluation metrics, and initiatives similar to TREC and MUC for the biomedical domain. For these purposes, interaction between the two research fields is crucial.

Areas of interest

In this workshop, we will address the following issues:

Intended audience

This workshop follows up on workshops with similar objectives, such as NLP and Ontology Building (Tokyo, 2001), ISMB (2001, 2002), and ACL (2002). The organisers hope to create SIGs in areas of common interest such as acronym detection in molecular biology, annotation standards in biology, and evaluation techniques. The organisers plan to have a separate session on co-operation and sharing in resource building and on formulating challenge problems.

The Workshop will be organised during the 41st annual meeting of the Association for Computational Linguistics (ACL 2003), to be held at the Sapporo Convention Center, Sapporo, Japan, July 7-12, 2003. Expected number of participants: 45

Formatting guidelines

Papers (up to 8 pages) should be formatted according to the stylesheet provided at: http://www.cs.jhu.edu/~yarowsky/acl03/workshop_submissions.html

Please send your electronic submissions AND hard copies no later than the 29th of May (absolute deadline) to Prof. Junichi Tsujii (tsujii@is.s.u-tokyo.ac.jp).

Note also that you need to fill out the Copyright Transfer Agreement, which you can find at the URL above. It should be signed by ALL authors of the paper. Please fax the agreement to +81-3-5802-8872 (for Prof. Junichi Tsujii) or mail it to:

Prof. Junichi Tsujii
Department of Computer Science
Faculty of Information Science and Technology
University of Tokyo
7-3-1 Hongo Bunkyo-ku Tokyo
113-0033 Japan

Please remember that the inclusion of your paper in the proceedings is contingent upon the registration of at least one of the authors.

Important Dates

Organisers

Program Committee members

Workshop contact person

 Dr Sophia Ananiadou
 Computer Science, University of Salford,
 Manchester,  M5 4WT, UK
 Email: S.Ananiadou@salford.ac.uk

 Tel: +44 (0)161 295 0480
 Fax: +44 (0)161 295 5559