NaCTeM Software Tools
The National Centre for Text Mining bases its service systems on a number of text mining software tools.
- Part-of-speech (POS) taggers
  - A part-of-speech tagger for English
  - GENIA Tagger — Part-of-speech tagging for biomedical text (Web Service)
- Parsers
  - Enju — A deep syntactic parser for English
  - CFG Parser — A fast CFG parser for English
  - GENIA Tagger — Shallow parsing for biomedical text (Web Service)
- Named entities/terms
  - Named-entity Recognizer — Part of the GENIA Tagger
  - NEMine — Recognizes gene/protein names in text.
  - Yeast MetaboliNER — Recognizes yeast metabolite names in text.
  - ACELA — Tool for efficient annotation of named entities
  - Smart dictionary lookup — Machine learning-based gene/protein name lookup
  - Smart Dictionary Lookup Tool Web Service — Looks up term variations of a given gene/protein name based on an automatically trained similarity measure
  - Term Normalization Tool — Normalizes terms with string rewriting rules automatically generated from a dictionary.
  - DECA — A species disambiguation system for biological named entities
- Other tools
  - EventMine — A machine learning-based event extraction system.
  - brat — A free, open-source, web-based tool for text annotation visualisation and editing.
  - Cafetiere — An easy-to-use text mining system for carrying out text mining on your own document collection
  - Sentence and paragraph breaker — An accurate sentence and paragraph detector based on heuristic rules
  - Clinical Document Classification — Automatic document classification demo
  - Sentiment Analysis Tool — Analyses sentiment of input text.