bionlp09_shared_task_sample_data_rev3.tar.gz (8631 bytes)
It contains sample files of shared task data for training and evaluation. The data is in the following file types:
- *.txt - files containing target texts
- *.a1 - files containing annotations for proteins
- *.a2 - files containing annotations for other entities and events
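As a rough illustration of how the three file types fit together, the sketch below groups file names by base name and reports documents that are missing one of the types. It assumes that the .txt, .a1 and .a2 files for one document share a base name; the file names used here are hypothetical and are not taken from the archive.

```python
from collections import defaultdict
from pathlib import Path

# The three extensions named in the list above.
EXTENSIONS = (".txt", ".a1", ".a2")


def group_by_document(filenames):
    """Map each base name to the set of known extensions found for it.

    Assumption: files belonging to one document share a base name.
    """
    groups = defaultdict(set)
    for name in filenames:
        path = Path(name)
        if path.suffix in EXTENSIONS:
            groups[path.stem].add(path.suffix)
    return dict(groups)


def incomplete_documents(groups):
    """Return the base names that lack at least one of the three file types."""
    return sorted(stem for stem, exts in groups.items() if exts != set(EXTENSIONS))


# Hypothetical file names, for illustration only.
names = ["doc1.txt", "doc1.a1", "doc1.a2", "doc2.txt", "doc2.a1"]
groups = group_by_document(names)
print(incomplete_documents(groups))
```

A check of this kind may be useful after unpacking an archive, before any annotations are read.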
bionlp09_shared_task_training_data_rev2.tar.gz (721906 bytes)
bionlp09_shared_task_development_data_rev1.tar.gz (146695 bytes)
bionlp09_shared_task_evaluation_tools_v1.tar.gz (14180 bytes)
a2 file converter
generate-task-specific-a2-file_pl (4161 bytes)
*Please rename generate-task-specific-a2-file_pl to generate-task-specific-a2-file.pl after download.
The *.a2 files in the sample training data contain all the annotations required to fulfill all three tasks. For those who do not participate in Task 2 and/or Task 3, some of the annotations are unnecessary; e.g. the 'M' type annotations are only necessary for those who participate in Task 3. With this script, participants can filter out such unnecessary annotations by specifying the tasks they are interested in. Note that, as Task 1 is mandatory, only the following task specifications are allowed:
- generate-task-specific-a2-file.pl -t 1 filename(s) : generates *.a2.t1 files
- generate-task-specific-a2-file.pl -t 12 filename(s) : generates *.a2.t12 files
- generate-task-specific-a2-file.pl -t 13 filename(s) : generates *.a2.t13 files
- generate-task-specific-a2-file.pl -t 123 filename(s) : generates *.a2.t123 files
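The naming rule and the 'M' type example can be sketched as follows. This is not the converter script itself, only a minimal illustration under stated assumptions: the helper names are hypothetical, an 'M' type annotation is assumed to be a line beginning with the letter M, and the rules for Task 2 annotations are not covered because they are not described here.

```python
# The four task specifications listed above (Task 1 is always included).
ALLOWED_SPECS = ("1", "12", "13", "123")


def output_name(a2_filename, spec):
    """Return the task-specific file name, e.g. x.a2 with spec 13 gives x.a2.t13."""
    if spec not in ALLOWED_SPECS:
        raise ValueError(f"task specification must be one of {ALLOWED_SPECS}")
    return f"{a2_filename}.t{spec}"


def keep_line(line, spec):
    """Decide whether an annotation line is kept for the given specification.

    Assumption: 'M' type annotations are lines starting with 'M'.
    They are dropped unless Task 3 is part of the specification.
    """
    if line.startswith("M") and "3" not in spec:
        return False
    return True


# Placeholder annotation lines, for illustration only.
lines = ["E1\t(placeholder)", "M1\t(placeholder)"]
print(output_name("doc1.a2", "12"), [l for l in lines if keep_line(l, "12")])
print(output_name("doc1.a2", "13"), [l for l in lines if keep_line(l, "13")])
```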
standoff format checker
standoff-check_pl (11940 bytes)
*Please rename standoff-check_pl to standoff-check.pl after download.
It checks the format of the task-specific a2 files, /.a2.t12?3?/. For details of its usage and the format, please execute it without parameters.
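To make the file name pattern concrete, the sketch below tests a few names against a stricter form of it. Escaping the dots and anchoring the pattern at the end of the name are assumptions added here; the checker script may match names differently. The file names are hypothetical.

```python
import re

# Stricter form of /.a2.t12?3?/: literal dots, anchored at the end of the name.
TASK_A2 = re.compile(r"\.a2\.t12?3?$")


def is_task_specific_a2(filename):
    """Return True if the name ends in .a2.t1, .a2.t12, .a2.t13 or .a2.t123."""
    return TASK_A2.search(filename) is not None


# Hypothetical file names, for illustration only.
for name in ["doc1.a2.t1", "doc1.a2.t12", "doc1.a2.t13",
             "doc1.a2.t123", "doc1.a2", "doc1.a2.t2"]:
    print(name, is_task_specific_a2(name))
```

The four accepted suffixes are exactly the ones produced by the allowed task specifications.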
eventview_pl (4866 bytes)
*Please rename eventview_pl to eventview.pl after download.
It is a simple text-based event annotation viewer. It is not a fancy viewer; it was developed to support quick collection of event patterns in a readable form. The output is designed on the assumption that it will be used together with the Unix command 'grep'.