OSSMETER
Description
OSSMETER aims to extend the state of the art in the automated analysis and measurement of Open Source Software, and to develop a platform that supports decision makers in discovering, comparing, assessing and monitoring the health, quality, impact and activity of Open Source Software.
To achieve this, OSSMETER will compute trustworthy quality indicators by performing advanced analysis and integration of information from diverse sources including the project metadata, source code repositories, communication channels and bug tracking systems of Open Source Software projects.
OSSMETER does not aim at building another OSS forge but instead at providing a meta-platform for analysing existing Open Source Software projects that are developed in existing Open Source Software forges and foundations such as SourceForge, Google Code, GitHub, Eclipse, Mozilla and Apache.
OSSMETER is a 30-month small or medium-scale focused research project (STREP) funded by the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement number 318736 (OSSMETER). It started in October 2012.
NaCTeM's role in OSSMETER
NaCTeM is leading workpackage 4, which concerns the extraction of quality metrics related to the communication channels and bug tracking facilities of Open Source Software projects, using Natural Language Processing and text mining techniques.
Text mining objectives
The objective of workpackage 4 is to derive results that contribute to the overall measurement and evaluation of the quality of user support and the level of user satisfaction over time in relation to Open Source Software. This is carried out through analysis of discussion threads in Open Source Software online forums via:
- classification of Open Source Software online discussion threads into sets of questions and their answers
- identification of contents (problems, solutions, complaints, feedback)
- identification of opinions (positive, negative) in threads
Methods to achieve this objective will be based on supervised text mining techniques that automatically identify questions and answers in threads, and then analyse thread types (e.g. problems, solutions, complaints) based on the extracted questions and answers. Opinion mining techniques for classifying sentiment in threads will combine supervised methods using statistical, linguistic and pragmatic features with resources such as WordNet and Wiktionary. Analysing online threads at several levels will produce rich, multi-layer, feature-based annotations over the input texts, enabling indexing, flexible interrogation, manipulation and re-use in subsequent OSSMETER processes.
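As a rough illustration of what supervised identification of questions and answers can look like, the sketch below trains a small multinomial Naive Bayes classifier on bag-of-words features. It is not the OSSMETER implementation: the choice of Naive Bayes, the names `tokenize`, `NaiveBayes` and `TRAIN`, and the six training messages are all assumptions made up for this example, and a real system would use far more data and richer linguistic features.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase word tokens; '?' is kept as its own token (a strong question cue)."""
    return re.findall(r"[a-z']+|\?", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.label_counts = Counter()            # label -> number of messages
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            total_tokens = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_tokens + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training messages (not taken from any OSS forum).
TRAIN = [
    ("How do I install the plugin on Linux?", "question"),
    ("Why does the build fail with this error?", "question"),
    ("Can anyone tell me where the config file is?", "question"),
    ("You need to run the installer as root.", "answer"),
    ("Try updating to the latest version, that fixed it for me.", "answer"),
    ("The config file is in your home directory.", "answer"),
]

clf = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])

print(clf.predict("Where do I find the log file?"))   # question
print(clf.predict("You need to update the plugin."))  # answer
```

The same structure could in principle be reused for sentiment classification by replacing the labels with positive and negative and training on opinion-annotated messages.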
OSSMETER website: http://ossmeter.eu
OSSMETER LinkedIn group: http://linkedin.com/groups/OSSMETER-6531488
OSSMETER on Twitter: http://linkedin.com/groups/OSSMETER-6531488
Project team
Principal Investigator: Prof. Sophia Ananiadou
Researchers: Dr. Ioannis Korkontzelos, Mr. Paul Thompson
Software Engineers: Jacob Carter, Andrew Rowley
Related publications
Internationally refereed conference proceedings
B. Almeida, S. Ananiadou, A. Bagnato, A. B. Barbero, J. Di Rocco, D. Di Ruscio, D. Kolovos, I. Korkontzelos, S. Hansen, P. Maló, N. Matragkas, R. Paige, J. Vinju (2015). OSSMETER: Automated Measurement and Analysis of Open Source Software. Project showcase at STAF 2015 - Software Technologies: Applications and Foundations
Miwa, M., Thompson, P., Korkontzelos, I. and Ananiadou, S. (2014). Comparable Study of Event Extraction in Newswire and Biomedical Domains. In Proceedings of Coling 2014
Kontonatsios, G., Mihaila, C., Korkontzelos, I., Thompson, P. and Ananiadou, S. (2014). A hybrid approach to compiling bilingual dictionaries of medical terms from parallel corpora. In: Statistical Language and Speech Processing, Second International Conference, SLSP 2014, pages 57-69, Springer
Kontonatsios, G., Korkontzelos, I., Tsujii, J. and Ananiadou, S. (2014). Combining String and Context Similarity for Bilingual Term Alignment from Comparable Corpora. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, pp. 1701-1712, Association for Computational Linguistics
Kontonatsios, G., Korkontzelos, I., Tsujii, J. and Ananiadou, S. (2014). Using a Random Forest Classifier to Compile Bilingual Dictionaries of Technical Terms from Comparable Corpora. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, Gothenburg, Sweden, pp. 111-116, Association for Computational Linguistics
Kontonatsios, G., Thompson, P., Batista-Navarro, R. T. B., Mihaila, C., Korkontzelos, I. and Ananiadou, S. (2013). Extending an interoperable platform to facilitate the creation of multilingual and multimodal NLP applications. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Sofia, Bulgaria, pp. 43-48
Korkontzelos, I. and Ananiadou, S. (2014). Locating Requests among Open Source Software Communication Messages. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, Iceland, pp. 1347-1354, European Language Resources Association (ELRA)
Internationally refereed workshop proceedings
J. Williams, N. Matragkas, D. Kolovos, I. Korkontzelos, S. Ananiadou, and R. Paige (2014). Software Analytics for MDE Communities. In Proceedings of the Open Source Software for Model Driven Engineering Workshop (OSS4MDE’14).
Mihaila, C., Kontonatsios, G., Batista-Navarro, R. T. B., Thompson, P., Korkontzelos, I. and Ananiadou, S. (2013). Towards a Better Understanding of Discourse: Integrating Multiple Discourse Annotation Perspectives Using UIMA. In: Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, Association for Computational Linguistics, Sofia, Bulgaria, pp. 79-88 (LAW Challenge Award)
Ioannis Korkontzelos, Torsten Zesch, Fabio Massimo Zanzotto, and Chris Biemann (2013). SemEval-2013 Task 5: Evaluating Phrasal Semantics. In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013), Atlanta, Georgia, USA.
Georgios Kontonatsios, Ioannis Korkontzelos, Sophia Ananiadou and Jun’ichi Tsujii (2013). Using a Random Forest Classifier to recognise translations of biomedical terms across languages. In Proceedings of the 6th Workshop on Building and Using Comparable Corpora, Association for Computational Linguistics, Sofia, Bulgaria.
Journal papers
Ioannis Korkontzelos, Dimitrios Piliouras, Andrew Dowsey, and Sophia Ananiadou (to appear). Boosting Drug Named Entity Recognition using an Aggregate Classifier. Artificial Intelligence in Medicine, Special Issue.
Tingting Mu, John Y. Goulermas, Ioannis Korkontzelos, and Sophia Ananiadou (In Press). Descriptive Clustering via Discriminant Learning in a Coembedded Space of Multi-level Similarities. In: Journal of the Association for Information Science and Technology
Book chapters
Mihaila, C., Batista-Navarro, R. T. B., Alnazzawi, N., Kontonatsios, G., Korkontzelos, I., Rak, R., Thompson, P. and Ananiadou, S. (In Press). Mining the biomedical literature. In: Health Care Analytics, CRC Press
Korkontzelos, I. and Ananiadou, S. (2014). Term Extraction. In: Oxford Handbook of Computational Linguistics (2nd Ed.)
Korkontzelos, I. (2014). Mining Big Textual Data. In: Big Data, Mining and Analytics: Key Components to Strategic Decisions (Editor: Prof. Stephan Kudyba), CRC Press/Taylor & Francis Group