BioNLP'09 Shared Task on Event Extraction
in conjunction with BioNLP, a NAACL-HLT 2009 workshop, June 4-5, 2009, Boulder, Colorado
This page contains answers to questions regarding the shared task. If you have a question that is not answered here or in the shared task description, please contact the organizers at the contact email address.

Questions and answers:

What tools and resources can be used?

In short: with the exception of manually created annotations for test set sentences, any tools and resources can be used, including the gold annotation for the development test set.

Following experience from previous shared tasks in the domain, the BioNLP'09 shared task will not have separate "open" and "closed" tracks. Instead, any available tools and resources can be used. The one important exception is that manually created annotations of the test set data cannot be used. No Event annotation for this data has been previously published, so all of the GENIA Event corpus data can be freely used; however, other corpora, both previously published and forthcoming, do contain annotations for some aspects of parts of the test data.

To allow participants who use separate, large-scale annotated PubMed resources to check that no test set documents are included, we are making available the following list of the PubMed IDs of documents for which manually created annotation may not be used:

For convenience, we will also seek to provide versions of other corpora that exclude test set documents.
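For participants who want to perform this check themselves, a minimal sketch follows. It assumes the exclusion list is distributed as one PubMed ID per line in a plain-text file; the file and directory names used here are hypothetical.

import os

# Minimal sketch: drop documents whose PubMed ID appears on the
# exclusion list. Assumes one PubMed ID per line in excluded_pmids.txt
# and corpus documents named <PMID>.txt (both names are hypothetical).
def load_excluded_ids(path):
    with open(path) as f:
        return set(line.strip() for line in f if line.strip())

def filter_corpus(corpus_dir, excluded):
    kept = []
    for name in os.listdir(corpus_dir):
        pmid, ext = os.path.splitext(name)
        if ext == ".txt" and pmid not in excluded:
            kept.append(os.path.join(corpus_dir, name))
    return kept

if __name__ == "__main__":
    excluded = load_excluded_ids("excluded_pmids.txt")
    for path in filter_corpus("corpus/", excluded):
        print(path)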

Which entities need to be recognized?

As gold standard annotations are provided for named entities of the protein/gene/RNA types, no entity recognition is necessary to participate in Task 1. However, for Task 2 the recognition of entities relevant to the events (e.g. Site for Binding and Phosphorylation events) is required, as described in the Event Definition. It is not necessary to assign types to these entities.

How should additional entities (e.g. binding sites) be recognized?

The development of entity recognition methods for additional entities such as the Sites for Binding events is a central part of Task 2. We expect taggers such as those integrated in U-Compare to be useful.

Participants are encouraged to make use of the annotations for these entities in the .a2 files of the training data, with the caveat that the annotation is not exhaustive: entities not relevant to events are not annotated. Additionally, other resources can be used (see the question "What tools and resources can be used?").
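As an illustration, these annotations are text-bound "T" lines in the standoff files; assuming the tab-separated line format ID<TAB>Type start end<TAB>text, a minimal sketch of reading them might look as follows (the file name is hypothetical).

# Minimal sketch: read text-bound annotations ("T" lines) from a
# standoff file, assuming the tab-separated line format
# ID<TAB>Type start end<TAB>text. Note that in .a2 files, "T" lines
# cover both event triggers and additional entities.
def read_entities(path):
    entities = {}
    with open(path) as f:
        for line in f:
            if not line.startswith("T"):
                continue  # skip event and other non-text-bound lines
            eid, type_and_span, text = line.rstrip("\n").split("\t")
            etype, start, end = type_and_span.split(" ")
            entities[eid] = (etype, int(start), int(end), text)
    return entities

# Example use: pick out the additional (untyped) entities, which we
# assume here carry the generic type "Entity" in the .a2 files.
sites = {k: v for k, v in read_entities("12345.a2").items()
         if v[0] == "Entity"}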

Which trigger words need to be detected?

Participants need to detect trigger words for events, but not for negation and speculation.

How is event trigger detection evaluated?

Performance in identifying trigger expressions (the words expressing events) will be considered only as part of determining whether an event as a whole has been successfully extracted.

Event trigger extraction will thus not be evaluated separately, and trigger annotations not specified as expressing an annotated event are ignored in the evaluation. Additionally, we will provide results under multiple criteria, including a set that does not require correct identification of the event triggers.
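To make the distinction concrete, the sketch below compares a predicted event against a gold event under two criteria, one requiring the trigger span to match and one ignoring it. The event representation and exact-span matching are simplifying assumptions made here; the official criteria may differ (e.g. approximate span matching).

# Hedged sketch: two ways of deciding whether a predicted event
# matches a gold event. The dict representation and exact-span
# trigger matching are assumptions, not the official criteria.
def match_with_trigger(pred, gold):
    return (pred["type"] == gold["type"]
            and pred["trigger"] == gold["trigger"]
            and pred["args"] == gold["args"])

def match_ignoring_trigger(pred, gold):
    return pred["type"] == gold["type"] and pred["args"] == gold["args"]

pred = {"type": "Phosphorylation", "trigger": (11, 26),
        "args": {"Theme": "T1"}}
gold = {"type": "Phosphorylation", "trigger": (10, 26),
        "args": {"Theme": "T1"}}
print(match_with_trigger(pred, gold))      # False: trigger spans differ
print(match_ignoring_trigger(pred, gold))  # True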

Why are no agents specified for events?

The event specifications do not include an "Agent" slot; instead, entities stated as causing events such as phosphorylation are annotated using a separate Regulation event.

Following the GENIA Event annotation scheme, in statements such as "Histone H3 phosphorylation by IKK-alpha", the enzymatic function of IKK-alpha in the Phosphorylation event is interpreted as a causal factor and annotated as a Positive_regulation event. Thus, for this statement, the annotation would include (with T1: Histone H3 and T2: IKK-alpha):

E1 Phosphorylation:T? Theme:T1
E2 Positive_regulation:T? Theme:E1 Cause:T2

(where "T?" are the identifiers of the event triggers, not relevant to this example.)

Can multiple "runs" be submitted?

Only a single final submission of results is allowed. Participants are strongly encouraged to test their methods and the submission process before the final submission, using the separate development test data set that will be provided along with the training data. Multiple test submissions on the development test data are allowed.

What are the precise evaluation criteria?

The detailed evaluation criteria will be given, and a web interface for evaluating development test submissions will be made available, upon the release of the training data.

Will results be made public?

The evaluation results will be provided to all participants and made publicly available, but the identities of the participants will be blinded. Each participant can decide whether or not to make their own identity public with full knowledge of their results and rank among the shared task participants.