New article describing annotated corpus for drug reactions
2018-08-15
We are pleased to announce the publication of a new article in the Journal of Cheminformatics describing the construction of a new, richly annotated corpus for pharmacovigilance:
Thompson, P., Daikou, S., Ueno, K., Batista-Navarro, R., Tsujii, J. and Ananiadou, S. (2018). Annotation and Detection of Drug Effects in Text for Pharmacovigilance. Journal of Cheminformatics, 10:37.
Abstract
Pharmacovigilance (PV) databases record the benefits and risks of different drugs, as a means to ensure their safe and effective use. Creating and maintaining such resources can be complex, since a particular medication may have divergent effects in different individuals, due to specific patient characteristics and/or interactions with other drugs being administered. Textual information from various sources can provide important evidence to curators of PV databases about the usage and effects of drug targets in different medical subjects. However, the efficient identification of relevant evidence can be challenging, due to the increasing volume of textual data. Text mining (TM) techniques can support curators by automatically detecting complex information, such as interactions between drugs, diseases and adverse effects. This semantic information supports the quick identification of documents containing information of interest (e.g., the different types of patients in which a given adverse drug reaction has been observed to occur). TM tools are typically adapted to different domains by applying machine learning methods to corpora that are manually labelled by domain experts using annotation guidelines to ensure consistency. We present a semantically annotated corpus of 597 MEDLINE abstracts, PHAEDRA, encoding rich information on drug effects and their interactions, whose quality is assured through the use of detailed annotation guidelines and the demonstration of high levels of inter-annotator agreement (e.g., 92.6% F-Score for identifying named entities and 78.4% F-Score for identifying complex events, when relaxed matching criteria are applied). To our knowledge, the corpus is unique in the domain of PV, according to the level of detail of its annotations. To illustrate the utility of the corpus, we have trained TM tools based on its rich labels to recognise drug effects in text automatically. 
The corpus and annotation guidelines are available at: http://www.nactem.ac.uk/PHAEDRA/.
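The inter-annotator agreement figures quoted in the abstract are F-Scores computed under relaxed matching criteria. The sketch below illustrates one common way such a score can be computed for entity spans; it is not the procedure used in the article. It assumes that a relaxed match means any character overlap between two spans with the same label, and the span offsets, the labels `Drug` and `Disorder`, and the function names are hypothetical.

```python
from typing import List, Tuple

# A span is (start, end, label), with character offsets and an exclusive end.
Span = Tuple[int, int, str]


def spans_match(a: Span, b: Span, relaxed: bool) -> bool:
    """Return True if two spans count as the same annotation."""
    if a[2] != b[2]:
        return False
    if relaxed:
        # Assumption: relaxed matching accepts any character overlap.
        return a[0] < b[1] and b[0] < a[1]
    # Exact matching requires identical boundaries.
    return a[0] == b[0] and a[1] == b[1]


def agreement_f_score(annotator_a: List[Span],
                      annotator_b: List[Span],
                      relaxed: bool = True) -> float:
    """F-Score of annotator A against annotator B, treating B as the reference."""
    if not annotator_a or not annotator_b:
        return 0.0
    # Spans from A that have a counterpart in B (drives precision).
    matched_a = sum(
        any(spans_match(a, b, relaxed) for b in annotator_b) for a in annotator_a
    )
    # Spans from B that have a counterpart in A (drives recall).
    matched_b = sum(
        any(spans_match(b, a, relaxed) for a in annotator_a) for b in annotator_b
    )
    precision = matched_a / len(annotator_a)
    recall = matched_b / len(annotator_b)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations of the same abstract by two annotators.
first = [(0, 9, "Drug"), (25, 40, "Disorder"), (50, 58, "Drug")]
second = [(0, 9, "Drug"), (22, 40, "Disorder"), (60, 70, "Drug")]

print(f"exact:   {agreement_f_score(first, second, relaxed=False):.3f}")
print(f"relaxed: {agreement_f_score(first, second, relaxed=True):.3f}")
```

In this toy example the second pair of spans differs only in its start offset, so it counts as agreement under relaxed matching but not under exact matching, which is why relaxed scores are typically the higher of the two.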