Trinity College Dublin, The University of Dublin

Module Descriptor School of Computer Science and Statistics

Module Code: CS7IS3
Module Name: Information Retrieval and Web Search
Module Short Title
ECTS: 5
Semester Taught: MT
Contact Hours

2 lecture hours per week

Module Personnel: Assistant Professor Seamus Lawless
Learning Outcomes

Having completed the module, the student will be able to:

IS3LO1. Explain the process of content indexing in information retrieval including stop word removal, conflation (stemming, string-comparison), and the language dependency of these methods.

IS3LO2. Demonstrate an understanding of the importance and application of data structures in efficient information retrieval, in particular inverted file structures.

IS3LO3. Demonstrate knowledge of the theoretical basis and operation of standard algorithms for ranked information retrieval, including term weighting and ranking models, e.g. tf-idf weighting, the vector-space model, the probabilistic model, and language modelling.

IS3LO4. Describe the process of relevance feedback for improved ranking in information retrieval, and apply standard relevance feedback algorithms.

IS3LO5. Understand the importance of evaluation in the development of search engines, and the application of test collections and standard evaluation metrics such as precision and recall in measuring the effectiveness of information retrieval systems, both in terms of the system's performance and user satisfaction with the system.

IS3LO6. Appreciate the application and operation of search engines in diverse environments, e.g. web search, audio-visual search, context-aware and mobile search, patent search, and search in microblogs.

IS3LO7. Begin to combine technologies relevant to search systems in novel ways to synthesise new information retrieval applications.

Learning Aims

The use of information retrieval techniques and web search technologies to identify relevant information from the enormous volumes of online digital media is rapidly becoming a vital part of everyday life. Online content can take many forms, including formally published text-based materials, web pages, social media, and audio-visual content. Designing systems which can reliably search and discover information, and effectively deliver that information to users, poses many challenges.

This module aims to present students with an in-depth examination of the theoretical and practical issues involved in searching for information across large collections of documents, especially in the context of the World Wide Web. The module introduces relevant approaches from information retrieval and examines search technologies in applications such as web search, image and video search, microblog search, and mobile search. The module will introduce students to the practical engineering issues raised by the design and implementation of information retrieval systems and the algorithmic approaches used in ranking and evaluation.

Module Content

Specific topics addressed in this module include:

  • Introduction to Web Search
  • Boolean Retrieval
  • Text Processing
    • Stopword Removal, Stemming, Spelling Correction…
  • Index Construction and Compression
  • Probabilistic Information Retrieval
  • Computing Scores for Ranking
    • BM25, Vector Space Model, PageRank…
  • Classification
    • Naïve Bayes, kNN, decision boundaries
  • Evaluation
    • Precision, Recall, F-score, NDCG…
  • Link Analysis
  • Web Crawling
  • Question Answering
  • Personalisation 
Recommended Reading List
  • Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze: 2008, Introduction to Information Retrieval, 1st edition, Cambridge University Press, 506 pp., ISBN 978-0521865715 - https://nlp.stanford.edu/IR-book/
  • Ricardo Baeza-Yates, Berthier Ribeiro-Neto: 2010, Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd edition, Addison Wesley, ISBN 978-0321416919
Module Prerequisites
Assessment Details

Exam: 50%

Coursework: 50%

Assessment in the Supplemental session will be based on 100% exam.

Module Website
Academic Year of Data: 2018/19