Speaker: Prof. Dr. Iryna Gurevych, Ubiquitous Knowledge Processing (UKP Lab), Computer Science Department, Technische Universität Darmstadt
Title: Analyzing the Collaborative Writing Process in Wikipedia - Case Studies at UKP Lab
Date: Monday 19th November 2012, 14:00 - 15:00
Location: IT407 in the IT Building (Accessed across the bridge from the Kilburn building)

With the emergence of collaborative writing platforms, there is an increasing demand for computational analysis of the collaborative writing process. With over 23 million articles in 285 languages, Wikipedia is a unique corpus for research of this kind. Its size and its huge number of (mostly anonymous) authors are not the only reasons it is one of the most valuable assets for computational linguistics researchers: it also provides a full edit history and discussion pages for most of its articles, which let researchers investigate otherwise unobservable processes, namely text production, reception and collaboration.

In the talk, I will present a couple of case studies from this area at UKP Lab. We propose an annotation schema for the discourse analysis of Wikipedia Talk pages aimed at the coordination efforts for article improvement. We apply the annotation schema to a corpus of 100 Talk pages from the Simple English Wikipedia and make the resulting dataset freely available. Furthermore, we perform automatic dialog act classification on Wikipedia discussions. The second part of the talk presents FlawFinder, a modular system for automatically predicting quality flaws in unseen Wikipedia articles. It competed in the inaugural edition of the Quality Flaw Prediction Task at the PAN Challenge 2012 and achieved the best precision of all systems.



Iryna Gurevych holds an endowed Lichtenberg-Chair "Ubiquitous Knowledge Processing" of the Volkswagen Foundation at the Computer Science Department of the Technische Universität (TU) Darmstadt, Germany. She received her PhD in Computational Linguistics from the University of Duisburg-Essen. Her research primarily concerns lexical semantic processing, text mining and information management algorithms. She is particularly interested in innovative applications of language processing in digital humanities, collaborative information management, educational research, and forensic linguistics. Iryna Gurevych has authored and co-authored papers in Computational Linguistics, Empirical Methods in NLP, Natural Language Engineering, Language Resources and Evaluation and Semantic Computing.