Batch submission to TerMine (request access)
When you submit a batch request, your job will enter a queue. When your job is complete, you will receive an email containing the URL where you can view the results.
Please note: if you want to analyze a PDF document, you must specify a URL. PDF uploading is not currently supported.
About the C-value and TerMine
Technical terms are important for knowledge mining, especially in the biomedical domain, where vast numbers of documents are available. A domain-independent method for term recognition is therefore very useful for automatically recognizing terms in documents.
C-value is a domain-independent method for automatic term recognition (ATR) that combines linguistic and statistical analyses, with the emphasis placed on the statistical part. The linguistic analysis enumerates all candidate terms in a given text by applying part-of-speech tagging, extracting word sequences made up of adjectives and nouns, and filtering them with a stop-list. The statistical analysis assigns a termhood score to each candidate term using the following four characteristics:
- the occurrence frequency of the candidate term
- the frequency of the candidate term as part of other longer candidate terms
- the number of these longer candidate terms
- the length of the candidate term
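The four characteristics above can be combined as in the commonly published C-value formula: a candidate a that is not nested in any longer candidate scores log2|a| · f(a), and a nested candidate scores log2|a| · (f(a) − (1/|T_a|) · Σ f(b)), summed over the set T_a of longer candidates containing a. The Python below is only a minimal sketch of that formula, not TerMine's implementation. The function name `c_value`, the input layout (a dict from word tuples to frequencies) and the toy counts are assumptions made for illustration, and the sketch omits the linguistic filter and any frequency or score thresholds.

```python
import math
from collections import defaultdict


def c_value(freq):
    """Score candidate terms with the published C-value formula (sketch).

    freq maps each candidate term (a tuple of words) to its total
    occurrence frequency in the corpus. Hypothetical input layout.
    """
    # For every candidate, collect the longer candidates that contain it
    # as a contiguous word sequence.
    containers = defaultdict(set)
    for longer in freq:
        n = len(longer)
        for size in range(1, n):
            for start in range(n - size + 1):
                sub = longer[start:start + size]
                if sub in freq:
                    containers[sub].add(longer)

    scores = {}
    for term, f in freq.items():
        # Length of the candidate; note log2(1) == 0, so single-word
        # candidates score zero under this form of the formula.
        length_weight = math.log2(len(term))
        longer_terms = containers.get(term)
        if not longer_terms:
            # Not nested: only frequency and length matter.
            scores[term] = length_weight * f
        else:
            # Nested: subtract the average frequency of the longer
            # candidates that contain this one.
            nested = sum(freq[b] for b in longer_terms)
            scores[term] = length_weight * (f - nested / len(longer_terms))
    return scores


# Toy counts, invented for illustration only.
toy_freq = {
    ("adenoid", "cystic", "basal", "cell", "carcinoma"): 2,
    ("basal", "cell", "carcinoma"): 5,
    ("cell", "carcinoma"): 7,
}
toy_scores = c_value(toy_freq)
```

With these toy counts, "cell carcinoma" is nested in two longer candidates, so its score is log2(2) · (7 − (5 + 2)/2) = 3.5, while "basal cell carcinoma" is nested in one and scores log2(3) · (5 − 2). Because log2(1) is 0, single-word candidates always score zero here, which is why some variants use log2(|a| + 1) instead.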
We have been developing a system for terminological management called TerMine, which employs the C-value method to extract terms. The implementation is optimized for scalability and processing speed: given a set of 1.3 million MEDLINE abstracts (2 GB of text), the standalone version of TerMine extracts 9.8 million term candidates and their termhood scores in about ten minutes.