NaCTeM

Frequently Asked Questions

What does Cheshire bring to NaCTeM?

There are many text mining tools and techniques available to users, each of which works in its own way and produces its own output. Cheshire can act as a central controller: it takes in requests for processing, assigns them to particular pieces of software, stores the output, and converts it for use by other programs. All of this is possible because of the powerful internal architecture and scalability of the Cheshire system. A user can create a workflow that tells Cheshire the order and type of tools to use in an analysis, and can later modify that workflow to allow for reprocessing. The strength of this solution is that Cheshire can store the intermediate results, so a change to the workflow does not require reprocessing from the start.
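The idea of storing intermediate results can be illustrated with a minimal Python sketch. This is not Cheshire's actual API: the WorkflowRunner class, the step names, and the toy tools are all hypothetical, and the sketch assumes that a step name uniquely identifies a tool and its configuration.

```python
from typing import Callable, Dict, List, Tuple

# A workflow step is a (name, tool) pair. Hypothetical: the name is assumed
# to uniquely identify the tool and its configuration.
Step = Tuple[str, Callable[[str], str]]


class WorkflowRunner:
    """Runs tools in order and stores every intermediate result."""

    def __init__(self) -> None:
        # Cache key: (input document, names of all steps applied so far).
        self.cache: Dict[Tuple[str, Tuple[str, ...]], str] = {}
        # Record of the tools that were actually executed, in order.
        self.calls: List[str] = []

    def run(self, document: str, steps: List[Step]) -> str:
        result = document
        prefix: Tuple[str, ...] = ()
        for name, tool in steps:
            prefix = prefix + (name,)
            key = (document, prefix)
            if key not in self.cache:
                # Only run the tool when this exact prefix has not been seen.
                self.calls.append(name)
                self.cache[key] = tool(result)
            result = self.cache[key]
        return result


# Toy stand-ins for real text mining tools.
def tokenise(text: str) -> str:
    return " ".join(text.split())


def lowercase(text: str) -> str:
    return text.lower()


def uppercase(text: str) -> str:
    return text.upper()


runner = WorkflowRunner()
first = runner.run("Text  Mining", [("tokenise", tokenise), ("lowercase", lowercase)])
# The workflow is modified: only the second step changes.
second = runner.run("Text  Mining", [("tokenise", tokenise), ("uppercase", uppercase)])
```

In this sketch the second run reuses the stored output of the unchanged first step, so only the replaced tool is executed again.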

Other benefits include the integration of information retrieval technologies, which help to filter large document collections when working on a focussed analysis. Because the information retrieval techniques and text mining tools are closely integrated, collections can be filtered on many different properties of the documents, including the results of other tools. For example, the Termine service allows a user to investigate a term in a document by presenting a list of other terms closely related to it within a given collection. The service can then provide the user with a list of all documents related to that term and offer to repeat the process, effectively narrowing down the search.
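The narrowing process described above can be sketched roughly in Python. This is an illustration only: it uses a simple co-occurrence count as a stand-in for term relatedness, which is an assumption and not how the Termine service ranks terms. The function names and the sample collection are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical collection: document id -> set of terms found in the document.
Collection = Dict[str, Set[str]]


def related_terms(term: str, docs: Collection, top_n: int = 3) -> List[str]:
    """List the terms that most often share a document with `term`."""
    counts: Counter = Counter()
    for terms in docs.values():
        if term in terms:
            counts.update(terms - {term})
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


def documents_with(term: str, docs: Collection) -> Collection:
    """Keep only the documents that contain `term`."""
    return {doc_id: terms for doc_id, terms in docs.items() if term in terms}


collection: Collection = {
    "d1": {"protein", "kinase", "binding"},
    "d2": {"protein", "kinase"},
    "d3": {"protein", "membrane"},
    "d4": {"gene", "expression"},
}

# Step 1: investigate a term and narrow the collection to matching documents.
suggestions = related_terms("protein", collection)
narrowed = documents_with("protein", collection)

# Step 2: repeat the process with one of the suggested terms.
narrowed_again = documents_with("kinase", narrowed)
```

Each repetition works on the already filtered collection, so the set of documents shrinks with every term the user follows.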
