NaCTeM

KISTI Pathway project

Background

The construction of detailed, machine-readable models of biomolecular pathways is a major goal of systems biology, and hundreds of models capturing the physical entities and reactions involved in various pathways are already available from repositories such as the BioModels Database and the PANTHER Pathway repository.


Fragment of Yeast Cell Cycle pathway model showing references to literature supporting specific reactions.

However, the manual construction, quality control and maintenance of pathway models is a demanding and expensive effort. One of the key challenges is the information overload caused by the exponential growth of the biomedical scientific literature: currently, a new citation is added to the PubMed literature database on average once every 40 seconds.

Biomedical text mining systems are increasingly capable of creating rich structured representations of information automatically extracted from literature. Such text mining systems open many opportunities for supporting the curation, validation, and updating of pathway models.

Project

Following the joint signing of a memorandum of understanding, NaCTeM is collaborating with the Korea Institute of Science and Technology Information (KISTI) to develop the next generation of information extraction and text mining systems for supporting and automating various aspects of biomolecular pathway model curation.

Building on the PathText text mining integration technology for pathways, text mining systems such as MEDIE, and event extraction tools such as EventMine, we are developing methods for identifying literature relevant to specific reactions in pathway models and for automatically analysing documents to extract event structures that capture the full semantics of pathway reactions.


Visualization of simple event capturing a statement of binding between Whi3 and Cdc28 (from PMID 14685274).

Key among the aims of the project are the development of advanced ranking technology for determining the relevance of documents to given pathway reactions and the extension of the scope of event extraction resources and methods to fully capture the semantics of statements relevant to biomolecular pathways.

Supporting Tools

  • Argo - online environment for collaborative construction of text mining workflows and text annotation.
  • brat - online environment for collaborative text annotation.

BioNLP 2013 Shared Task

To encourage the development of event extraction technology capable of supporting pathway model curation tasks, we are organizing the Pathway Curation event extraction task as part of the upcoming BioNLP Shared Task 2013.

We will provide task participants with documents relevant to reactions in a variety of signaling and metabolic pathways, together with full manual event annotation for these documents, for use in the training and evaluation of event extraction methods. Please see the BioNLP Shared Task 2013 page for more information and updates.


Project Team

NaCTeM Principal Investigator: Prof. Sophia Ananiadou
NaCTeM researchers: Dr. Tomoko Ohta, Dr. Sampo Pyysalo, Dr. Makoto Miwa, Dr. Rafal Rak
NaCTeM software engineer: Dr. Andrew Rowley
KISTI Principal Investigator: Dr. Sung-Pil Choi
KISTI researcher: Dr. Hong-woo Chun

References

The following studies are relevant to the project: