NaCTeM

Frequently Asked Questions

How are the documents ranked in your output?

Documents are ranked by their relevance to your chosen term. The current default algorithm is CORI, which was developed by Callan et al. at Carnegie Mellon as a database selection algorithm but can equally be applied to documents within a single database. It is a derivative of the more common TF-IDF algorithm, enhanced to account for confounding factors such as differences in document size. This helps prevent large documents, which tend to contain more occurrences of a term simply because they are longer, from outranking smaller but more relevant ones.
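To illustrate the idea of size normalisation, here is a minimal sketch of a CORI-style score for a single term in a single document. It is not the code that runs behind this service. The function name, its parameters, and the constants (0.5, 1.5 and a default belief of 0.4) are assumptions chosen for illustration in the spirit of the published CORI and INQUERY formulas, and the example numbers are invented.

```python
import math


def cori_style_score(tf, doc_len, avg_doc_len, docs_with_term, num_docs,
                     k=0.5, len_weight=1.5, default_belief=0.4):
    """Illustrative length-normalised score for one term in one document.

    All constants are assumed values for this sketch, not the service's settings.
    """
    # Term component: grows with tf but is damped for longer-than-average documents.
    t = tf / (tf + k + len_weight * doc_len / avg_doc_len)
    # Inverse-frequency component: terms found in fewer documents weigh more.
    i = math.log((num_docs + 0.5) / docs_with_term) / math.log(num_docs + 1.0)
    # Combine with a default belief so that every score lies between 0.4 and 1.0.
    return default_belief + (1.0 - default_belief) * t * i


# Hypothetical collection: 1000 documents, average length 300 words,
# and the query term appears in 10 of them.
short_doc = cori_style_score(tf=3, doc_len=100, avg_doc_len=300,
                             docs_with_term=10, num_docs=1000)
long_doc = cori_style_score(tf=6, doc_len=1000, avg_doc_len=300,
                            docs_with_term=10, num_docs=1000)

# The short document ranks first even though the long one has twice the occurrences.
print(short_doc > long_doc)
```

In this sketch the long document contains the term six times and the short one only three times, yet the short document scores higher because its occurrences are denser relative to its length. A ranking based on raw term counts alone would put the long document first.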
