Text Mining Methods for Real Time Intelligence on Graphene Enterprise
The project aims to develop new data sources and methods for real-time intelligence to understand and map enterprise development and commercialisation in a rapidly emerging new technology. More specifically, the project focusses on the development of new ventures and small and mid-size enterprises (SMEs) and their commercialisation of graphene. Graphene is a nanoscale two-dimensional material with exceptional properties, holding great promise for path-breaking applications across a range of domains including electronics, medicine, batteries, and sensors. The field is expanding rapidly, with thousands of new patents and hundreds of companies already entering the graphene domain.
Project goals
The project will develop novel and scalable methods to mine and combine information from three sources:
- unstructured enterprise webpages;
- unstructured data from Twitter; and
- data from established structured databases, including data on patenting.
Web pages are used to extract information on enterprise business strategies, trials, tests and new products, funding, managerial and ownership developments, and relationships with other businesses and research organisations. Twitter feeds are accessed to provide data on fast-breaking developments related to graphene, including developments associated with start-ups and SMEs. Databases on publications and patent applications (such as the Web of Knowledge and Derwent Innovations) are accessed to validate company names and corroborate the presence (or absence) of intellectual property applications and grants by graphene-related topic areas.
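As an illustration of the validation step described above, the following is a minimal sketch of how company names found on web pages might be checked against assignee names from a patent database. It is not the project's actual method: the function names, the suffix list, and the example company names are all hypothetical, and a real system would likely need fuzzier matching.

```python
import re

# Hypothetical list of legal-form suffixes to ignore when comparing names.
_SUFFIXES = {"ltd", "limited", "inc", "plc", "llc", "gmbh", "corp", "corporation"}


def normalise_name(name: str) -> str:
    """Lower-case, drop punctuation, and strip trailing legal-form suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in _SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def corroborate(web_names, patent_assignees):
    """Return {web name: True/False} for whether a patent assignee matches."""
    known = {normalise_name(a) for a in patent_assignees}
    return {name: normalise_name(name) in known for name in web_names}


# Invented example data, not taken from the project.
web_names = ["Example Graphene Ltd.", "Sample Nano Inc"]
patent_assignees = ["EXAMPLE GRAPHENE LIMITED", "Other Materials GmbH"]
result = corroborate(web_names, patent_assignees)
print(result)
```

Exact matching after normalisation keeps the sketch simple; it records absence as well as presence, which mirrors the "presence (or absence)" check mentioned in the text.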
Outputs from the information extraction suite are stored in a repository at processing time, so that the information is available on the fly at demonstration time. For instance, users can retrieve graphene-based products grown on specific substrates (e.g., epitaxial graphene grown on SiC), properties of graphene (e.g., conductivity, flexibility), which companies produce which products, and information about companies (e.g., location, partnerships, funders, and social media environments used).
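The store-then-query pattern described above can be sketched as follows. This is only an assumed design using an in-memory SQLite table of (subject, relation, object) facts; the schema, the relation names, the `query` helper, and the company "Example Graphene Ltd" are hypothetical and are not drawn from the project's repository.

```python
import sqlite3

# Hypothetical repository: one table of extracted facts, filled at processing time.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (subject TEXT, relation TEXT, object TEXT, source TEXT)"
)

# Invented example facts; only the graphene examples echo the text above.
facts = [
    ("epitaxial graphene", "grown_on", "SiC", "web"),
    ("graphene", "has_property", "conductivity", "web"),
    ("graphene", "has_property", "flexibility", "web"),
    ("Example Graphene Ltd", "produces", "epitaxial graphene", "web"),
    ("Example Graphene Ltd", "located_in", "Exampletown", "web"),
]
conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", facts)
conn.commit()


def query(relation, subject=None, obj=None):
    """Look up stored facts at demonstration time, with optional filters."""
    sql = "SELECT subject, object FROM facts WHERE relation = ?"
    params = [relation]
    if subject is not None:
        sql += " AND subject = ?"
        params.append(subject)
    if obj is not None:
        sql += " AND object = ?"
        params.append(obj)
    sql += " ORDER BY subject, object"
    return conn.execute(sql, params).fetchall()


products_on_sic = query("grown_on", obj="SiC")
graphene_properties = query("has_property", subject="graphene")
print(products_on_sic)
print(graphene_properties)
```

Because extraction runs ahead of time, each lookup is a plain indexed read with no text mining at query time, which is what makes on-the-fly retrieval practical.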
The project is funded by NESTA.