
Trinity College Dublin, The University of Dublin

Module Descriptor
School of Computer Science and Statistics

Module Code: CS4400
Module Name: Scalable Computing
Module Short Title
ECTS: 5
Semester Taught: Michaelmas Term
Contact Hours

Lecture hours: 22, Tutorial hours: 11

Total hours: 33

A significant individual project is undertaken over the semester in the student’s own time, with contact hours scheduled in support of that activity. Students are also expected to engage with staff online as necessary to progress their work.

Module Personnel: Assistant Professor Stephen Barrett
Learning Outcomes

When students have successfully completed this module they should be able to:

  • Describe the basic characteristics, structure and operation of a distributed system, and the issues which a distributed system poses to a systems architect;
  • Identify and evaluate appropriate architectural models for distributed problem scenarios;
  • Design, construct, document, deploy and test scaled distributed system solutions to realistic real-world problems;
  • Reason about the performance trade-offs of decentralised architectures;
  • Make use of appropriate documentation and reference material;
  • Develop a strategy for parallel implementation of a complex algorithm;
  • Utilise container and virtualisation infrastructure proficiently;
  • Describe and analyse social structures as employed in social web applications, and develop behaviour measurement infrastructure;
  • Consider the ethical and engineering issues regarding data sovereignty.
Learning Aims

This module aims to provide a theoretical and practical understanding of the Internet as a high-scale application platform that is becoming increasingly important economically and socially. It covers the fundamental content, social and meta-data structures that make up the web and how they can be represented, analysed and manipulated. It addresses the practical tools and techniques of internet application programming, ranging from introductory client/server programming techniques through to high-scale user-facing service delivery and analytics-focussed computation over large-scale data, covering the key architectures and techniques relevant to today’s deployments.

Students will complete significant practical work, building towards an end-to-end, cloud-based individual project.

Module Content

The module will be delivered in 4 phases, sequenced as described below. Each phase will be motivated by a lecture series, but will also require focus on and delivery of graded practical development work. Students are cautioned that this practical programme will build from week to week towards a final project output. Grade values for each phase are noted.

 Phase 1: Introduction to Distributed Systems Programming (Weeks 1-3 – 20%)

 This phase will equip students with a practical engineering method and the capability to design, develop, test and deploy classic client/server distributed systems operating on cloud infrastructure. A selection of distributed systems theory topics will be reviewed on a case study basis, including data synchronisation and replication, fault tolerance, and the CAP theorem. A socket-based development task will be set, necessitating practical design derived from these studies.

 Phase 2: Web Application and Service Development (Weeks 4-6 – 20%)

 This phase will introduce core web programming theory, technologies and techniques, and review practical systems architectures. Topics will include an overview of the concept of web stacks (e.g. LAMP) and relevant systems and technical architectures, web service design and development, and RESTful service design. A development task focussed on web service development will be set, combining individual distributed components to deliver a unified service.

 Phase 3: Introduction to Computational Analytics (Weeks 7-9 – 10%)

 Towards reading week, the topic of web analytics will be introduced via a set of readings and lectures that will invite the student to develop an understanding of the nature and potential of large-scale computation over internet-based data sources and end-user activity. Key topics include social network design and measurement, social network analysis, practical network computation and visualisation, and the ethics and practical implications of these trends for data sovereignty and privacy. Under guidance, students will develop and submit a short written report for an individual data analytics case study that they will implement in phase 4.
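
 As an illustration of the kind of social network measure discussed in this phase, the following is a minimal Haskell sketch of degree centrality over an undirected edge list. The choice of measure, the names Edge, degrees and degreeCentrality, and the input format are assumptions made for this example only; the module does not prescribe them. The sketch also assumes the edge list contains no self-loops or duplicate edges.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical representation: an undirected tie between two named members.
type Edge = (String, String)

-- | Degree of each member: every edge contributes 1 to both endpoints.
-- Assumes no self-loops and no duplicate edges in the input.
degrees :: [Edge] -> Map.Map String Int
degrees edges =
  Map.fromListWith (+) (concat [ [(a, 1), (b, 1)] | (a, b) <- edges ])

-- | Normalised degree centrality: degree divided by (n - 1), where n is the
-- number of members that appear in at least one edge.
degreeCentrality :: [Edge] -> Map.Map String Double
degreeCentrality edges = Map.map normalise ds
  where
    ds = degrees edges
    n  = Map.size ds
    normalise d
      | n <= 1    = 0
      | otherwise = fromIntegral d / fromIntegral (n - 1)
```

 For example, with the edges [("ann","bob"),("ann","cat"),("bob","cat"),("cat","dan")], member "cat" is tied to all three other members and so receives a centrality of 1.0.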

 Phase 4: Scalable Service Delivery (Weeks 7-12 – 50%)

 The final phase of this module will focus on methods for the design, implementation and execution of scaled social network analytics, which students will be required to deliver as a data service. Topics in this practically focussed phase will include models for large-scale data processing; relevant technologies such as object stores, GFS, Map-Reduce and BigTable; the emergence of dedicated analytics architectures; stream processing; and various application architectures for B2B, B2C and C2C scenarios. Topics will be introduced in terms of their relevance as design patterns for students’ design and implementation work.
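
 To make the Map-Reduce pattern named above concrete, the following is a minimal single-machine sketch in Haskell. It models only the shape of the computation (map, group by key, reduce) and involves no distribution, fault tolerance or storage layer. The function names mapReduce and wordCount are hypothetical and chosen for this example; they are not taken from the module material or from any particular framework.

```haskell
import qualified Data.Map.Strict as Map
import Data.Char (toLower)

-- | In-memory map-reduce: the mapper emits key/value pairs for each input,
-- the pairs are grouped by key (the "shuffle" step), and the reducer
-- collapses each group to a single result.
-- Values within a group are not kept in emission order.
mapReduce :: Ord k
          => (a -> [(k, v)])   -- mapper
          -> (k -> [v] -> r)   -- reducer
          -> [a]               -- inputs
          -> Map.Map k r
mapReduce mapper reducer inputs = Map.mapWithKey reducer grouped
  where
    grouped =
      Map.fromListWith (++) [ (k, [v]) | x <- inputs, (k, v) <- mapper x ]

-- | Classic word count expressed with mapReduce: emit (word, 1) for every
-- whitespace-separated word, then sum the counts per word.
wordCount :: [String] -> Map.Map String Int
wordCount = mapReduce mapper reducer
  where
    mapper line      = [ (map toLower w, 1) | w <- words line ]
    reducer _ counts = sum counts
```

 In GHCi, wordCount ["a b a", "b c"] evaluates to fromList [("a",2),("b",2),("c",1)]. In a real deployment the mapper and reducer would run on separate workers; this sketch keeps everything in one process so that the data flow is easy to follow.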

Recommended Reading List

Graham Hutton, Programming in Haskell, Cambridge press, 2016.

Extensive use will also be made of research papers and other material from the literature and ongoing research examples from our work at TCD.

Module Prerequisites

Students are required to be competent in at least one high-level programming language (e.g. Python, Java, C++ or C#). Previous experience with concurrent programming is beneficial, but concurrency will be reviewed in the module. Haskell is the module programming language. Students unfamiliar with functional programming in practice should devote some preparatory time to building simple Haskell-based systems. Consult the module web site for more details and suggested readings.

Assessment Details

100% coursework.

A summary of assessment follows. The final grade awarded will be a simple accumulation of grades achieved in each element. Assessment of each component will be based on the quality of the student’s solution design and implementation, and the development methodology employed.

 

Phase 1: Introduction to Distributed Systems Programming (Weeks 1-3 – 20%)

A socket-based development task will be set that must be implemented and deployed using a defined engineering methodology and deployment infrastructure.

 

Phase 2: Web Application and Service Development (Weeks 4-6 – 20%)

Students will complete a prescribed development task focussed on web service development.

 

Phase 3: Introduction to Computational Analytics (Weeks 7-9 – 10%)

Students will deliver a short written report for an individual data analytics case study, describing a social network measure of individual or community behaviour within a target social network.

 

Phase 4: Scalable Service Delivery (Weeks 7-12 – 50%)

Students will deliver a scaled social network analytics system, implemented as a web-accessible data service, capable of gathering and processing social network data to yield the measurement of network members described in the phase 3 report.

Module Website
Academic Year of Data: 2017/18