The role of reliable data in medicine should not be underestimated. This applies not only to information describing general population-level phenomena covered in scientific publications, but also to health service records describing individuals. Although text mining methods have been widely applied to the former category, the latter has attracted much less attention. One of the main reasons is that these data were previously stored in a format that made them less accessible for digital processing, i.e., as paper documents, which were frequently handwritten. However, the increasing adoption of digital solutions by both health service institutions and individual medical practitioners has started to change the picture. This new situation presents both challenges and opportunities for text mining methods, since individual medical records contain potentially valuable knowledge. In this project, we aim to analyse medical reports using text mining techniques, with the specific goal of quantifying the risk associated with the evidence described.

This project is being undertaken by NaCTeM in cooperation with a commercial partner, Pacific Life Re. The main task is to analyse an individual’s medical report and determine the level of risk associated with the conditions described.
