Software

Applications
MEDIE
An intelligent search engine for MEDLINE. Retrieves abstracts or sentences from MEDLINE using deep-parsing results.
Info-PubMed
An efficient GUI-based MEDLINE search tool. Lists the proteins or genes that interact with a given protein or gene.
Corpus
GENIA Corpus
A collection of annotated abstracts taken from the MEDLINE database. Technical terms, parts of speech, and syntactic trees are annotated. TEI version of GENIA corpus version 3.0 (Dr. Tomaz Erjavec, Jozef Stefan Institute, Slovenia).
NLP Tools
Enju
A deep syntactic parser for English. Outputs phrase structures and predicate-argument dependencies. High parsing speed (more than 20 sentences per second) and high accuracy (88-90% accuracy on predicate-argument dependencies).
LRDEP
A shift-reduce dependency parser. Uses maximum entropy models to score parser actions and a best-first strategy to search for the best parse.
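The combination of scored parser actions and best-first search can be illustrated with a minimal Python sketch. It is not LRDEP's code: the arc-standard transition set, the `LIKELY_ARCS` table, and the hand-written `action_probs` function (a toy stand-in for a trained maximum entropy model) are all assumptions made for illustration.

```python
import heapq
import itertools
import math

# Hypothetical (head, dependent) word pairs that the toy scorer prefers.
LIKELY_ARCS = {("dog", "the"), ("barks", "dog")}

def action_probs(words, stack, next_idx):
    """Toy stand-in for a maximum entropy model: P(action | parser state)."""
    weights = {}
    if next_idx < len(words):
        weights["SHIFT"] = 1.0
    if len(stack) >= 2:
        below, top = words[stack[-2]], words[stack[-1]]
        weights["LEFT"] = 5.0 if (top, below) in LIKELY_ARCS else 0.2
        weights["RIGHT"] = 5.0 if (below, top) in LIKELY_ARCS else 0.2
    total = sum(weights.values())
    return {action: w / total for action, w in weights.items()}

def apply_action(action, stack, next_idx, arcs):
    """Return the (stack, next_idx, arcs) state reached by one transition."""
    if action == "SHIFT":
        return stack + (next_idx,), next_idx + 1, arcs
    below, top = stack[-2], stack[-1]
    if action == "LEFT":
        # The top item becomes the head of the item below it.
        return stack[:-2] + (top,), next_idx, arcs + ((top, below),)
    # RIGHT: the item below becomes the head of the top item.
    return stack[:-2] + (below,), next_idx, arcs + ((below, top),)

def parse(words):
    """Best-first search over parser states, ordered by negative log-probability."""
    tie = itertools.count()  # tie-breaker so equal-cost states pop in insertion order
    agenda = [(0.0, next(tie), (), 0, ())]
    while agenda:
        cost, _, stack, next_idx, arcs = heapq.heappop(agenda)
        if next_idx == len(words) and len(stack) == 1:
            heads = [-1] * len(words)  # -1 marks the root word
            for head, dep in arcs:
                heads[dep] = head
            return heads, math.exp(-cost)
        for action, p in action_probs(words, stack, next_idx).items():
            new_state = apply_action(action, stack, next_idx, arcs)
            heapq.heappush(agenda, (cost - math.log(p), next(tie), *new_state))
    return None, 0.0

heads, prob = parse(["the", "dog", "barks"])
print(heads, round(prob, 3))
```

Because every action adds a non-negative cost (the negative log of a probability), the first complete parse popped from the agenda is the highest-probability one under the toy scorer.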
GENIA Tagger
Part-of-speech tagging and shallow parsing, tuned specifically for biomedical texts. POS tagging accuracy of 97-98%. Shallow parsing accuracy of 91-94%.
GENIA Sentence Splitter
A sentence splitter optimized for biomedical texts. The classifier achieved an F-score of 99.7 on 200 unseen GENIA abstracts.
Machine Learning
Amis
A maximum entropy estimator for feature forests. Implements parameter estimation algorithms for feature forests. Supports GIS, IIS, and limited-memory BFGS.
Maxent Classifier
A simple C++ library for maximum entropy classifiers. Fast parameter estimation using the BLMVM algorithm. Supports modelling with inequality constraints.
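For readers unfamiliar with maximum entropy classifiers, the following is a minimal Python sketch of the model itself. It is not the library's API or implementation: the function names and toy data are hypothetical, plain gradient ascent is used in place of BLMVM, and inequality constraints are not modelled.

```python
import math
from collections import defaultdict

def predict(weights, labels, features):
    """Return P(label | features) under a conditional maximum entropy model."""
    scores = {y: sum(weights[(f, y)] for f in features) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

def train(data, labels, epochs=200, rate=0.5):
    """Fit weights by gradient ascent on the conditional log-likelihood.

    The gradient for each (feature, label) weight is the observed feature
    count minus the count expected under the current model.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for features, gold in data:
            probs = predict(weights, labels, features)
            for f in features:
                grad[(f, gold)] += 1.0        # observed count
                for y in labels:
                    grad[(f, y)] -= probs[y]  # expected count
        for key, g in grad.items():
            weights[key] += rate * g / len(data)
    return weights

# Hypothetical toy data: feature lists paired with gold labels.
LABELS = ["NOUN", "VERB"]
DATA = [
    (["suffix=ing"], "VERB"),
    (["suffix=ed"], "VERB"),
    (["capitalized"], "NOUN"),
    (["suffix=s", "capitalized"], "NOUN"),
]

model = train(DATA, LABELS)
print(predict(model, LABELS, ["capitalized"]))
```

Quasi-Newton methods such as limited-memory BFGS or BLMVM optimize the same kind of objective but typically converge in far fewer iterations than this fixed-step update.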
Programming Language
LiLFeS
A logic programming language for typed feature structures. Similar to Prolog, with feature structures available as a built-in data structure. High-speed runtime system. C++ library support for feature structures.
Development Tools
RenTAL
A grammar converter from LTAG to HPSG. The conversion guarantees strong equivalence of the grammars.
