NaCTeM


Anatomy resources

Overview

Anatomical entities such as kidney, muscle and blood are central to much of biomedical scientific discourse, and the detection of mentions of anatomical entities is thus necessary for the automatic analysis of the structure of domain texts.

This page provides various tools and resources related to anatomical entities and their detection in text.


Anatomical Entity Mention (AnEM) corpus

To advance automatic anatomical entity mention detection, we have created the AnEM corpus, a domain- and species-independent resource manually annotated for anatomical entity mentions using a fine-grained classification system. The corpus consists of 500 documents (over 90,000 words) selected randomly from citation abstracts and full-text papers with the aim of making the corpus representative of the entire available biomedical scientific literature. The corpus annotation covers mentions of both healthy and pathological anatomical entities and contains over 3,000 annotated mentions.


[Figure: example annotations]

To allow the corpus to serve as a reference standard for the development and evaluation of methods for anatomical entity mention detection, we make the corpus available under the open CC-BY-SA licence and provide standard train/test splits and evaluation tools.
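As an illustration of what evaluating mention detection against a reference standard can involve, the following is a minimal sketch of exact-match precision, recall and F1-score over sets of mentions. It is not the evaluation tool distributed with the corpus; the `Mention` structure, the matching criterion (identical document, offsets and label) and the label names used in the example are assumptions made for illustration only.

```python
from typing import Iterable, NamedTuple, Tuple


class Mention(NamedTuple):
    """A hypothetical representation of one annotated mention."""
    doc_id: str   # document identifier
    start: int    # character offset where the mention begins
    end: int      # character offset where the mention ends
    label: str    # entity type (label names here are illustrative)


def evaluate(gold: Iterable[Mention],
             predicted: Iterable[Mention]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact matching.

    A predicted mention counts as correct only if a gold mention has the
    same document, start offset, end offset and label.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: one of two predictions matches one of two gold mentions.
gold = [Mention("doc1", 0, 6, "Organ"),
        Mention("doc1", 20, 25, "Organism_substance")]
pred = [Mention("doc1", 0, 6, "Organ"),
        Mention("doc1", 30, 36, "Tissue")]

print(evaluate(gold, pred))
```

Other matching criteria (for example, crediting overlapping spans) give different scores, so results are only comparable when the same criterion and the same train/test split are used.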

Corpus description

The AnEM corpus is presented in the following manuscript:

Annotation visualisations

The AnEM corpus annotations can be browsed online using visualisations created with the brat tool.
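For readers who want to work with brat-style annotations programmatically, the sketch below reads text-bound annotation lines in brat's standoff format (an ID beginning with `T`, then the type and character offsets, then the covered text, separated by tabs). That the corpus files follow this layout is an assumption, as are the sample lines and label names; consult the distribution for the actual format.

```python
from typing import List, Tuple

# (annotation ID, label, list of (start, end) fragments, covered text)
TextBound = Tuple[str, str, List[Tuple[int, int]], str]


def parse_brat_textbounds(ann_text: str) -> List[TextBound]:
    """Extract text-bound ("T") annotations from brat standoff content."""
    mentions: List[TextBound] = []
    for line in ann_text.splitlines():
        # Skip relations, events, notes and anything else that is not "T...".
        if not line.startswith("T"):
            continue
        ann_id, type_and_span, surface = line.split("\t", 2)
        label, span = type_and_span.split(" ", 1)
        # Discontinuous annotations list several fragments separated by ';'.
        fragments = [
            (int(start), int(end))
            for start, end in (frag.split() for frag in span.split(";"))
        ]
        mentions.append((ann_id, label, fragments, surface))
    return mentions


# Hypothetical sample content; labels and offsets are illustrative only.
sample = (
    "T1\tOrgan 0 6\tkidney\n"
    "T2\tOrganism_substance 22 27\tblood\n"
    "#1\tAnnotatorNotes T1\texample note"
)

for mention in parse_brat_textbounds(sample):
    print(mention)
```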

Downloads

The AnEM corpus data and evaluation tools as well as a set of supplementary data (feature representations, models, system outputs and evaluation results) are available for download:


Licence

1. Annotations

The annotations in the anatomy resources are copyrighted and licensed under the Creative Commons BY-SA 3.0 licence.

Briefly, under this open licence, you are free to use and build on these resources as long as you attribute them correctly and distribute works building on the resources under a similar licence.

Please attribute the resources by citing the DSSD'12 paper (see below) in publications and linking to this page in online resources.

2. Texts

The abstracts contained in the anatomy resources are from PubMed, a database of the U.S. National Library of Medicine (NLM). Please see the NLM page on copyright information regarding the copyright of the abstracts.

The full-text extracts contained in the anatomy resources are from articles in the Open Access Subset of the PubMed Central (PMC) database of the NLM. These articles are made available under a Creative Commons or similar licence. Please see the REFERENCES file in the distribution for references to the articles, and the PMC version of each article for the specific licence terms.


References


See also

  • Previously released Anatomy resources (the content of the linked page is due to be merged with this one; please link to this page instead)

Contact

For any queries relating to the corpus, please contact: sampo pyysalo at gmail dot com.