Evaluation in natural language processing

Lecturer: Diana Santos
Type: Foundational Course
Section: Language and Computation
Week: TBA
Time: TBA
Webpage: http://www.linguateca.pt/Diana/esslli07.html


What are the purposes of evaluation? Different kinds of evaluation (of a hypothesis, of a resource, of a system in terms of its requirements, of a system in terms of usability, of model adequacy, of economic impact). Measures and concepts (properties of measures, relationship with desirable properties, statistical remarks). Evaluation of user-visible vs. user-transparent tasks; black-box vs. glass-box evaluation. The evaluation contest paradigm. Evaluation resources (golden resources, pooling, ablation). Baselines, ceilings, inter-annotator agreement. Corpus-based evaluation. Detailed examples: parsing, information retrieval, information extraction, machine translation, morphological analysis, and generation.