GENIA Treebank Beta

The treebank is a bracketed corpus annotated in (almost) Penn Treebank (PTB) style. The current release is a beta version containing 200+300 abstracts, each provided in both XML (.gda files) and PTB (.tree files) formats. The PTB files are converted automatically from the XML files.
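For readers unfamiliar with PTB-style bracketing, the following is a minimal sketch of how a bracketed tree, such as those in the .tree files, could be read into a nested list. The sample sentence and its tags are made up for illustration and are not taken from the corpus; the function names (tokenize, parse, leaves) are hypothetical, and the actual files may use labels or conventions this sketch does not handle.

```python
from typing import List


def tokenize(text: str) -> List[str]:
    """Split a bracketed string into parentheses and symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: List[str]) -> list:
    """Parse one bracketed constituent into a nested list [label, child, ...]."""
    pos = 0

    def node() -> list:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        items: list = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())      # nested constituent
            else:
                items.append(tokens[pos])  # label or word
                pos += 1
        if pos >= len(tokens):
            raise ValueError("missing ')'")
        pos += 1                           # consume ')'
        return items

    tree = node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the tree")
    return tree


def leaves(tree: list) -> List[str]:
    """Collect the words of a parsed tree from left to right."""
    # Skip the first element when it is a label; some PTB files wrap each
    # sentence in an extra unlabeled pair of parentheses, handled here too.
    children = tree[1:] if tree and isinstance(tree[0], str) else tree
    words: List[str] = []
    for child in children:
        if isinstance(child, list):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# Hypothetical example sentence, not taken from the corpus.
sample = "(S (NP (NN IL-2)) (VP (VBZ activates) (NP (NN NF-kappaB))))"
tree = parse(tokenize(sample))
print(leaves(tree))
```

The nested-list representation keeps the sketch free of dependencies; an existing treebank reader may be a better choice for real use.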

As this is a beta version, please expect errors. (We would appreciate error reports.) To download the corpus, go to the download page.

These pages were last updated on 6 September 2005 by Yuka Tateisi.

Department of Information Science, Faculty of Science, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113, Japan.