
GENIA Project Home Page

Welcome to the homepage of the GENIA project at the Tsujii laboratory of the University of Tokyo.

What's New

12 July 2005
Added 300 abstracts to GENIA Treebank beta version.
7 Oct 2004
The content of the CDROM distributed at BioNLP/NLPBA 2004 is available from the download page.
17 Sept 2004
The report on the Bio-Entity Recognition Task at BioNLP/NLPBA 2004 has been released.
16 Aug 2004
GENIA Treebank beta version is available for download.

Project outline

The GENIA project seeks to automatically extract useful information from texts written by scientists, to help overcome the problems caused by information overload. Although the methods are customized for application in the micro-biology domain, we intend the basic methods to be generalisable to knowledge acquisition in other scientific and engineering domains.

We are currently working on the key task of extracting event information about protein interactions. This type of information extraction requires combining many sources of knowledge, which we are now developing. These include a parser, an ontology, a thesaurus and domain dictionaries, as well as supervised learning models.

Members

Research Subtopics

Resources

GENIA Corpus

GENIA-based Corpus by Third Parties

Automatically Parsed MEDLINE Abstracts

Tools

Part-of-speech Tagger

Shallow Parser

Parser

XML File Manager

Presentations

Download

Publications


Links



The pages were last updated on the 19th Aug 2003 by Yuka Tateisi.

Department of Information Science, Faculty of Science, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113, Japan.

The GENIA project was partially supported by Grant-in-Aid for Scientific Research on Priority Areas (C) "Genome Information Science" from the Ministry of Education, Culture, Sports, Science and Technology of Japan and JST, CREST.