Web Demonstration
Automatic recognition of multi-word terms and acronyms
After using this service, tell us what you think via our feedback form.
Usage
- Select the data entry method (text, file upload, or URL).
- Choose a POS tagger (TreeTagger for general text, GENIA for biomedical text).
- Click Analyze.
C‑value termhood scoring relies on how often candidate terms occur, so very short inputs tend to yield few or poorly ranked terms. Try the sample buttons for a quick demo.
Limitations
- Documents larger than 2 MB are rejected to protect the server; use the TerMine Processing Service for larger jobs.
- Text must be ASCII-encoded.
- Layout of original HTML/PDF may not be reproduced.
- Some HTML/PDF may not be extractable.
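For plain-text input, the size and encoding limits above can be checked locally before submitting. The following is a minimal sketch, not part of TerMine: the names `MAX_BYTES`, `check_bytes`, and `check_file` are hypothetical, and it assumes "2 MB" means 2 × 1024 × 1024 bytes, which the service may count differently.

```python
MAX_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" read as 2 MiB


def check_bytes(data):
    """Return a list of reasons the raw text would likely be rejected."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append(f"{len(data)} bytes exceeds the {MAX_BYTES}-byte limit")
    if not data.isascii():  # bytes.isascii() requires Python 3.7+
        problems.append("contains non-ASCII bytes")
    return problems


def check_file(path):
    """Read a plain-text file as raw bytes and run the same checks."""
    with open(path, "rb") as handle:
        return check_bytes(handle.read())
```

An empty list means neither limit is exceeded. The ASCII check is only meaningful for plain text; HTML and PDF uploads are converted by the service and cannot be validated this way.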
Background
TerMine integrates C‑value multi‑word term extraction with AcroMine acronym recognition. Candidate terms are identified by linguistic analysis (POS tagging followed by extraction of adjective/noun sequences) and then ranked by statistical analysis (termhood scoring). The approach is designed for scalable, high‑throughput processing.
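The two stages can be illustrated with a simplified sketch of the C‑value measure as published in Frantzi et al. (2000). This is not TerMine's actual implementation. The coarse tags "ADJ" and "NOUN", the function names, and the `max_len` cutoff are assumptions made for illustration. A candidate a with |a| words and frequency f(a) scores log2|a| · f(a) if it is not nested in any longer candidate. Otherwise the mean frequency of the longer candidates containing it is subtracted from f(a) first.

```python
import math
from collections import Counter


def extract_candidates(tagged, max_len=5):
    """Collect every (ADJ|NOUN)+ NOUN sequence of 2..max_len words.

    `tagged` is a list of (word, tag) pairs; the tag names are
    hypothetical and would need mapping from a real tagger's tagset.
    """
    candidates = []
    n = len(tagged)
    for start in range(n):
        for end in range(start + 2, min(start + max_len, n) + 1):
            span = tagged[start:end]
            if any(tag not in ("ADJ", "NOUN") for _, tag in span):
                break  # every longer span from this start contains the same tag
            if span[-1][1] == "NOUN":
                candidates.append(tuple(word.lower() for word, _ in span))
    return candidates


def is_nested(short, long):
    """True if `short` is a contiguous sub-sequence of the strictly longer `long`."""
    k = len(short)
    return k < len(long) and any(
        long[i:i + k] == short for i in range(len(long) - k + 1)
    )


def c_value(freq):
    """Score each candidate (a tuple of words) from a frequency mapping."""
    scores = {}
    for a, f_a in freq.items():
        containing = [f_b for b, f_b in freq.items() if is_nested(a, b)]
        if containing:
            # Discount by the mean frequency of the longer candidates.
            f_a = f_a - sum(containing) / len(containing)
        scores[a] = math.log2(len(a)) * f_a
    return scores


tagged = [("basal", "ADJ"), ("cell", "NOUN"), ("carcinoma", "NOUN"),
          ("is", "VERB"), ("common", "ADJ")]
freq = Counter(extract_candidates(tagged))
ranked = sorted(c_value(freq).items(), key=lambda item: -item[1])
```

Because nested occurrences are discounted, a sequence that appears only inside a longer term scores zero, while one that also occurs on its own keeps part of its frequency. The quadratic nesting check is adequate for a demonstration but would need indexing for large candidate sets.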
References
- Frantzi, K., Ananiadou, S. & Mima, H. (2000) Automatic recognition of multi‑word terms. International Journal of Digital Libraries 3(2), 117–132.
- Okazaki, N. & Ananiadou, S. (2006) Building an abbreviation dictionary using a term recognition approach. Bioinformatics.
- GENIA Tagger — POS tagging, shallow parsing, and NER for biomedical text.
- TreeTagger — language‑independent POS tagger.