RobotAnalyst was developed as part of the Supporting Evidence-based Public Health Interventions using Text Mining project to support searching and screening in systematic reviews. RobotAnalyst builds upon state-of-the-art text mining technologies, including topic modelling and feedback-based text classification models, to minimise the human workload involved in the study identification phase.


RobotAnalyst offers users the following services:
  • Create a new collection: uploads a set of citations, encoded in the standard RIS format, into the system.
  • Update a collection: adds citations to an existing collection (already uploaded to the system), retrieved either from a RIS-formatted file on your local disk or from PubMed.
  • Faceted search: a search engine that enables users to find relevant studies by applying various filters (e.g., keywords, authors, year, publication type, and journal name).
  • Topic-based search: RobotAnalyst automatically induces clusters of thematically related citations. Additionally, the system generates a network graph in which each node represents a topic (described by a set of keywords) and each edge represents the semantic similarity between two topics. The topic-based search engine allows users to re-order the list of studies by their relevance to a specified topic (i.e., the most relevant studies are placed towards the top of the list).
  • Similarity-based search: Users can retrieve citations related to a given study. The system uses words from titles and abstracts to compute pairwise similarities between citations.
  • Semi-automatic citation screening: To directly reduce the time and cost needed to complete the screening phase of a systematic review, RobotAnalyst implements a feedback-based (i.e., active learning) text classification model that aims to automatically exclude irrelevant studies while keeping all eligible studies in the final review. The active learner is iteratively trained on an increasing number of validated labelled citations. At each learning cycle, the model selects a small sample of automatically labelled citations and interactively requests feedback from the analyst (i.e., a systematic reviewer corrects erroneous predictions made by the model). The manually corrected sample of citations is then used to re-train (update) the model. We have conducted experiments which demonstrate that active learning classification approaches can substantially decrease the screening burden without reducing the sensitivity of the review.
  • Save a screened dataset: The analyst can terminate the screening process once all eligible studies have been identified. RobotAnalyst will then produce two files (RIS format), one for studies to be included and one for studies to be excluded.
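The feedback loop described under semi-automatic citation screening can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny Naive Bayes classifier is a stand-in (the model RobotAnalyst actually uses is not described on this page), the certainty-based selection rule is one common choice rather than the system's documented strategy, and the names `screen`, `ask_reviewer`, `seed_ids`, `batch_size` and `cycles` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (stand-in classifier)."""

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def prob_relevant(self, text):
        v = len(self.vocab) + 1
        t0 = sum(self.counts[0].values())
        t1 = sum(self.counts[1].values())
        log_odds = math.log((self.docs[1] + 1) / (self.docs[0] + 1))
        for w in tokenize(text):
            p1 = (self.counts[1][w] + 1) / (t1 + v)
            p0 = (self.counts[0][w] + 1) / (t0 + v)
            log_odds += math.log(p1 / p0)
        log_odds = max(-30.0, min(30.0, log_odds))  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-log_odds))


def screen(citations, ask_reviewer, seed_ids, batch_size=2, cycles=3):
    """Iteratively train on reviewer-validated labels (hypothetical helper).

    citations    -- list of title/abstract strings
    ask_reviewer -- callable(index) -> 1 (include) or 0 (exclude)
    seed_ids     -- indices the reviewer labels before the first cycle
    """
    labels = {i: ask_reviewer(i) for i in seed_ids}

    def train():
        ids = list(labels)
        return NaiveBayes().fit([citations[i] for i in ids],
                                [labels[i] for i in ids])

    for _ in range(cycles):
        unlabelled = [i for i in range(len(citations)) if i not in labels]
        if not unlabelled:
            break
        model = train()
        # Assumed selection rule: show the citations predicted most likely
        # to be relevant first (certainty-based sampling).
        ranked = sorted(unlabelled,
                        key=lambda i: model.prob_relevant(citations[i]),
                        reverse=True)
        for i in ranked[:batch_size]:
            labels[i] = ask_reviewer(i)  # reviewer confirms or corrects

    model = train()
    # Validated labels take precedence; the rest are model predictions.
    predictions = {
        i: labels[i] if i in labels
        else int(model.prob_relevant(citations[i]) >= 0.5)
        for i in range(len(citations))
    }
    return labels, predictions
```

In this sketch `ask_reviewer` stands for the analyst's interactive feedback; in a simulation it can simply look up a gold-standard label. The returned `predictions` could then be split into an include list and an exclude list, mirroring the two files produced when a screened dataset is saved.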


RobotAnalyst is accessible via a web interface. It is currently in the evaluation phase, in which systematic reviewers are using it in their own tasks so that we can measure its performance, gather feedback and improve the interface. Once this step is complete, a thorough description will be published and RobotAnalyst will be made freely available. If you wish to use the system in the meantime, please contact us to request an account.


Sato, M., Brockmeier, A. J., Kontonatsios, G., Mu, T., Goulermas, J. Y., Tsujii, J. and Ananiadou, S. (2017). Distributed Document and Phrase Co-embeddings for Descriptive Clustering. In: Proceedings of EACL

Haynes, C., Kay, N., Harrison, K., McLeod, C., Shaw, B., Leng, G., Kontonatsios, G. and Ananiadou, S. (2016). Using text mining to facilitate study identification in public health systematic reviews. In: Guidelines International Network (G-I-N) conference

Hashimoto, K., Kontonatsios, G., Miwa, M. and Ananiadou, S. (2016). Topic Detection Using Paragraph Vectors to Support Active Learning in Systematic Reviews. In: Journal of Biomedical Informatics, 62, 59-65

Mu, T., Goulermas, J. Y., Korkontzelos, I. and Ananiadou, S. (2016). Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities. Journal of the Association for Information Science and Technology, 67, 1

Mo, Y., Kontonatsios, G. and Ananiadou, S. (2015). Supporting Systematic Reviews Using LDA-based Document Representations. Systematic Reviews, 4, 172

O'Mara-Eves, A., Thomas, J., McNaught, J., Miwa, M. and Ananiadou, S. (2015). Using text mining for study identification in systematic reviews: A systematic review of current approaches. Systematic Reviews 4:5 (Highly Accessed)

Miwa, M., Thomas, J., O'Mara-Eves, A. and Ananiadou, S. (2014). Reducing systematic review workload through certainty-based screening. Journal of Biomedical Informatics


To obtain further information about RobotAnalyst, please contact Prof. Sophia Ananiadou.