NaCTeM

RobotAnalyst

Introduction

RobotAnalyst was developed as part of the Supporting Evidence-based Public Health Interventions using Text Mining project to support the literature screening phase of systematic reviews. RobotAnalyst is designed for searching and screening reference collections obtained from literature database queries. It combines search engine functionality with machine learning and text mining technology, including topic modelling and relevancy feedback-based text classification models, to minimise the human workload involved in identifying relevant references.

Features

RobotAnalyst offers the following functionality:
  • Create a new collection: Upload a collection of references in RIS format into the system.
  • Update an existing collection: Upload further references or import from PubMed.
  • Faceted Search: Search a collection for relevant studies by applying various filters based on keywords, multiword terms, authors, publication year or journal name.
  • Cluster-based search: Navigate large collections by browsing automatically generated clusters, each labelled with descriptive keywords. Choose between different clustering algorithms and set the number of clusters.
  • Topic-based search: Identify relevant topics by a keyword search or browse a network graph of the topics. Each topic is associated with a set of references pertaining to the topic.
  • Similarity-based search: Find the references that are the most similar to a particular reference of interest, based on the topics it contains.
  • Mark, annotate and view screening decisions: Mark references for inclusion or exclusion, add notes to references, and retrieve results by screening decision, using the notes as an additional filter.
  • Screening with relevancy feedback: Prioritise references by relevancy predictions (inclusion confidence) from RobotAnalyst's text classification functionality and update the predictive model after making new screening decisions.
  • Cluster-based screening: Combine relevancy predictions with clustering to screen only the clusters with the highest proportion of references predicted to be relevant. Alternatively, screen clusters that are predicted to be irrelevant and exclude the irrelevant references.
  • Quality assurance: Re-examine screening decisions, especially those that deviate from the system's predictions, using RobotAnalyst's filtering and sorting functionality.
  • Exporting results: Export the set of references matching any search or screening filter, in RIS format.
  • Finishing screening: Estimate the recall rate for relevant references and automatically finalise screening by applying the predictive model's suggestions to any references without manual screening.
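The screening-with-relevancy-feedback loop described above can be illustrated with a minimal sketch. The details of RobotAnalyst's own classifier are not given here, so this example swaps in a tiny naive Bayes model over title words purely for illustration. The names `NaiveBayesScreener`, `tokenize` and `prioritise`, and the example titles, are all hypothetical and are not part of RobotAnalyst.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenisation (a deliberate simplification)."""
    return text.lower().split()


class NaiveBayesScreener:
    """Hypothetical stand-in for a relevancy-feedback text classifier.

    This is a small multinomial naive Bayes model, NOT RobotAnalyst's method.
    """

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def update(self, text, include):
        """Record one manual screening decision (the relevancy feedback)."""
        self.word_counts[include].update(tokenize(text))
        self.doc_counts[include] += 1

    def inclusion_confidence(self, text):
        """Return an estimate of P(include | text) with add-one smoothing."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in (True, False):
            total_words = sum(self.word_counts[label].values())
            # Smoothed class prior.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                # Smoothed word likelihood; the extra 1 reserves mass for unseen words.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Normalise the two log scores into a probability.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights[True] / (weights[True] + weights[False])


def prioritise(screener, unscreened):
    """Order unscreened references by descending inclusion confidence."""
    return sorted(unscreened, key=screener.inclusion_confidence, reverse=True)


# Two manual decisions seed the model (invented example titles).
screener = NaiveBayesScreener()
screener.update("school based physical activity intervention trial", True)
screener.update("protein folding simulation methods", False)

unscreened = [
    "molecular simulation of protein structure",
    "physical activity intervention in primary school children",
]
queue = prioritise(screener, unscreened)

# After the reviewer screens the top reference, the model is updated
# and the remaining references can be re-ranked.
screener.update(queue[0], True)
queue = prioritise(screener, queue[1:])
```

In this sketch each new decision is folded into the model with `update`, and the remaining references are re-ranked, so the reviewer always sees the references currently predicted to be most relevant first. A real system would use richer features and a stronger model; the loop structure is the point of the example.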

Evaluation

RobotAnalyst has been evaluated by systematic reviewers to measure its performance and gather feedback. NaCTeM is presenting the initial evaluation results at the Global Evidence Summit (GES) in Cape Town, South Africa, Sept. 13–16, 2017. The workshop "Screening evidence for systematic reviews using a text mining system: the RobotAnalyst" is organised in collaboration with the National Institute for Health and Care Excellence (NICE) and Cochrane Switzerland. NaCTeM and NICE are organising another workshop at the GES, "RobotAnalyst: an online system to support citation screening in evidence reviewing", focussed on an interactive demonstration of RobotAnalyst's functionality and use cases.

Availability

The current version of RobotAnalyst, with the functionality described above, is scheduled for release on September 30, 2017. (Those attending the Global Evidence Summit will have the opportunity to preview the system.) If you wish to use the system, please contact us to request an account.

References

Kontonatsios, G., Brockmeier, A. J., Przybyła, P., McNaught, J., Mu, T., Goulermas, J. Y., and Ananiadou, S. (2017). A semi-supervised approach using label propagation to support citation screening. Journal of Biomedical Informatics

Sato, M., Brockmeier, A. J., Kontonatsios, G., Mu, T., Goulermas, J. Y., Tsujii, J. and Ananiadou, S. (2017). Distributed Document and Phrase Co-embeddings for Descriptive Clustering. In: Proceedings of EACL

Haynes, C., Kay, N., Harrison, K., McLeod, C., Shaw, B., Leng, G., Kontonatsios, G. and Ananiadou, S. (2016). Using text mining to facilitate study identification in public health systematic reviews. In: Guidelines International Network (G-I-N) conference

Hashimoto, K., Kontonatsios, G., Miwa, M. and Ananiadou, S. (2016). Topic Detection Using Paragraph Vectors to Support Active Learning in Systematic Reviews. Journal of Biomedical Informatics, 62, 59–65

Mu, T., Goulermas, J. Y., Korkontzelos, I. and Ananiadou, S. (2016). Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities. Journal of the Association for Information Science and Technology, 67, 1

Mo, Y., Kontonatsios, G. and Ananiadou, S. (2015). Supporting Systematic Reviews Using LDA-based Document Representations. Systematic Reviews, 4, 172

O'Mara-Eves, A., Thomas, J., McNaught, J., Miwa, M. and Ananiadou, S. (2015). Using text mining for study identification in systematic reviews: A systematic review of current approaches. Systematic Reviews 4:5 (Highly Accessed)

Miwa, M., Thomas, J., O'Mara-Eves, A. and Ananiadou, S. (2014). Reducing systematic review workload through certainty-based screening. Journal of Biomedical Informatics

Contact

To obtain further information about RobotAnalyst, please contact Prof. Sophia Ananiadou.