GENIA Sentence Splitter (GeniaSS) [1] is a sentence splitter optimized for biomedical texts. GeniaSS reads a text and splits it into sentences by inserting line breaks.


First, GeniaSS detects candidate split positions using a set of delimiters: periods, commas, single and double quotation marks, right parentheses, etc. Then it classifies each candidate as either a true sentence boundary or not.
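The two-step idea above can be illustrated with a small Ruby sketch. This is not GeniaSS's implementation: the method names (candidate_positions, split_here?, split_sentences) are hypothetical, the delimiter list covers only the characters named above, and the classification step is replaced by a toy rule (a period followed by whitespace and an uppercase letter) standing in for the real classifier.

```ruby
# Delimiters named in the text above (the "etc." is not expanded here).
DELIMITERS = ['.', ',', "'", '"', ')'].freeze

# Step 1: collect candidate split positions, i.e. the index just after
# each delimiter character.
def candidate_positions(text)
  positions = []
  text.each_char.with_index do |ch, i|
    positions << i + 1 if DELIMITERS.include?(ch)
  end
  positions
end

# Step 2 (stand-in): a toy rule, NOT the classifier GeniaSS uses.
# Accept a candidate only when the delimiter is a period followed by
# whitespace and then an uppercase letter.
def split_here?(text, pos)
  return false unless text[pos - 1] == '.'
  !(text[pos..-1] =~ /\A\s+[A-Z]/).nil?
end

# Combine both steps and return one string per sentence.
def split_sentences(text)
  sentences = []
  start = 0
  candidate_positions(text).each do |pos|
    next unless split_here?(text, pos)
    sentences << text[start...pos].strip
    start = pos
  end
  rest = text[start..-1].strip
  sentences << rest unless rest.empty?
  sentences
end
```

Calling puts split_sentences(text) prints one sentence per line, which mirrors the line-break output described above. The toy rule will mis-split many real biomedical sentences (for example after abbreviations followed by a capitalized gene name), which is the kind of case a trained classifier is meant to handle.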

How to use

1) make
2) ./geniass arg1 arg2

arg1 is the target file to split. arg2 is the output file name. To get the output in stand-off format, run

3) ruby sentence2standOff.rb arg1 arg2 arg3

arg1 and arg2 are the same as in 2). arg3 is the name of the output stand-off file.

Note: you need to run GeniaSS in the directory that contains EventExtracter.rb, Classifying2Splitting.rb, and model1-1.0.
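The requirement in the note above can be checked before running the tool. The following Ruby sketch is only an illustration: missing_files and run_geniass are hypothetical helpers that are not part of the GeniaSS distribution, and the only facts taken from this README are the three file names and the ./geniass arg1 arg2 invocation.

```ruby
# File names taken from the note above.
REQUIRED_FILES = %w[EventExtracter.rb Classifying2Splitting.rb model1-1.0].freeze

# Return the required entries that are absent from dir
# (defaults to the current working directory).
def missing_files(dir = Dir.pwd)
  REQUIRED_FILES.reject { |name| File.exist?(File.join(dir, name)) }
end

# Hypothetical wrapper: run ./geniass from the current directory only
# when all required files are present there.
def run_geniass(input, output)
  missing = missing_files
  raise "missing in #{Dir.pwd}: #{missing.join(', ')}" unless missing.empty?
  system('./geniass', input, output)
end
```

With this sketch, run_geniass('input.txt', 'output.txt') would raise an error that names the missing files instead of starting the tool in the wrong directory; the two file names in that call are placeholders.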


