All course material is on the College e-Learning environment (Blackboard). Please go to mymodule.tcd.ie once you are enrolled in this course. For those considering taking the course, some information is provided here.
Video Presentation for Junior Sophister students in March 2020
Why take Computer Vision?
Computer vision is a technology which is increasingly in the spotlight, and it is important that everyone involved in technology understands both the possibilities it presents and its current limitations. Progressively more complex cameras are being integrated into many devices (smartphones now have cameras producing depth images as well as colour images, wide angle cameras are being integrated into cars so that a bird's-eye image can be created, cameras are appearing in smart glasses, ...), and this in turn is pushing the development of progressively more complex vision systems. The very existence of all these cameras means that much more video data is being produced (every minute, 720 hours of video was being uploaded to YouTube alone in 2024), and systems are desperately needed which can 'understand' this video so that we can search it (and don't have to try to watch it all).
Since 2012 Computer Vision has been changed (and is still changing) by the emergence of Deep Learning, which has permitted the development of superhuman (better than human) recognition systems. However, there are huge questions over the reliability of such systems, as they are only as good as the data on which they are trained.
This course is about traditional computer vision (i.e. it is not a deep learning course), but we will consider the state of the art in deep learning in computer vision. And I'll try to explain why I don't want to be in an autonomous vehicle which uses a deep learning system!
Traditional computer vision deals with image and video processing, trying to extract information from images and video in a reliable manner.
The ultimate motivation. The motivation for developing computer vision is the human vision system, which is the richest sense that we have. To us vision seems easy, but in reality we are processing around 60 images per second with millions of points (pixels) in each image. In fact, over half the human brain is involved in processing visual information, which is a good indication that this is a very complex task. On top of this, only the photoreceptors in the middle of the eye are sensitive to colour and there is a big blind spot in the retina where the optic nerve is connected, yet somehow we think we see a complete image. Clearly there is more going on here than meets the eye ;-) Ultimately computer vision wants to emulate human vision, but this goal is still a very long way away. In the meantime we are applying computer vision to progressively more complex applications.
Interesting and intuitive applications. The course is liberally illustrated with applications, as the best way to understand something is usually in context. Hence we look at real applications ranging from factory inspection systems to autonomous vehicles, from licence plate recognition to robot interaction with the world, from face recognition to augmented reality, etc.
What's in this course?
- Techniques. The techniques used in computer vision are many and varied. A broad sample of them is presented in this course ranging from basic image processing right through to the extraction of three dimensions from images (using stereo). Experience of using computer vision techniques is provided through use of the industry standard OpenCV libraries.
- Image Processing (Digitization, Colour models incl. backprojection & Smoothing)
- Histograms
- Binary image processing (Thresholding, Connectivity & Mathematical morphology)
- Region based processing (Connected Components, k-means, mean shift, watershed segmentation)
- Geometric Operations
- Edge based processing (Edge detection, Contour following, Hough, Least Squares, RANSAC)
- Video based processing (Moving object detection (GMM, Codebook), Tracking (mean shift, optical flow))
- Features (Moravec, Harris, SIFT)
- Recognition (Template and Chamfer Matching, SPR/SVM, Principal Component Analysis, Robust object detection, Deep learning in Vision, HOG)
- Performance (Metrics, ground truth)
- Intro to Deep Learning
- Applications. The techniques are best explained using practical examples, and also (if you think about it) the whole idea is to solve problems. Hence lots of examples are given and the tutorials address solving real world problems using the techniques that you will have learnt. The exam takes a similar form.
- Mathematics. Computer Vision techniques have mainly been developed in a rather ad hoc fashion, and afterwards (when we know that they work) they are formalised using mathematics. Hence there is a fair amount of mathematics in the course (although none of it should really bother you), but it is always preceded by explanations and examples.
- Programming. Computer Vision is also a practical subject and the course attempts to make you aware of how difficult image processing & analysis actually is. It does so by getting you (the student) to write programs (using the OpenCV libraries in C++ under Microsoft Developer Studio) for some of the well known methodologies that have been developed.
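To give a flavour of the kind of program involved, here is a minimal sketch of two of the listed techniques, thresholding and connected components, written in plain C++ with no library dependencies. It is not taken from the course material: the `GreyImage` type and the functions `thresholdImage` and `countRegions` are hypothetical names introduced purely for illustration. In practice, OpenCV offers `cv::threshold` and `cv::connectedComponents` for these operations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A tiny greyscale image stored row by row (hypothetical helper type).
struct GreyImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> pixels;  // width * height values in 0..255
};

// Binary thresholding: pixels strictly brighter than `threshold` become 1
// (foreground), all others become 0 (background).
std::vector<std::uint8_t> thresholdImage(const GreyImage& image,
                                         std::uint8_t threshold) {
    assert(image.pixels.size() == image.width * image.height);
    std::vector<std::uint8_t> binary(image.pixels.size());
    for (std::size_t i = 0; i < image.pixels.size(); ++i) {
        binary[i] = image.pixels[i] > threshold ? 1 : 0;
    }
    return binary;
}

// Count 4-connected foreground regions using an iterative flood fill.
int countRegions(const std::vector<std::uint8_t>& binary,
                 std::size_t width, std::size_t height) {
    assert(binary.size() == width * height);
    std::vector<bool> visited(binary.size(), false);
    int regions = 0;
    for (std::size_t start = 0; start < binary.size(); ++start) {
        if (binary[start] == 0 || visited[start]) continue;
        ++regions;  // an unvisited foreground pixel starts a new region
        std::vector<std::size_t> stack{start};
        visited[start] = true;
        while (!stack.empty()) {
            const std::size_t index = stack.back();
            stack.pop_back();
            const std::size_t x = index % width;
            const std::size_t y = index / width;
            // Mark a neighbouring foreground pixel and queue it for expansion.
            auto visit = [&](std::size_t neighbour) {
                if (binary[neighbour] != 0 && !visited[neighbour]) {
                    visited[neighbour] = true;
                    stack.push_back(neighbour);
                }
            };
            if (x > 0) visit(index - 1);               // left
            if (x + 1 < width) visit(index + 1);       // right
            if (y > 0) visit(index - width);           // above
            if (y + 1 < height) visit(index + width);  // below
        }
    }
    return regions;
}
```

Calling `thresholdImage` on an image and passing the result to `countRegions` gives the number of separate bright objects, assuming a fixed threshold suits the lighting. Choosing that threshold reliably is one of the reasons image analysis is harder than it first appears.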
What do previous students have to say?
- Some student comments from 2024/25
- "It was very interesting and I hope I will get to use what I have learnt here in real life."
- "Best module this year, Professor Kenneth has a genuine interest and passion for teaching which reflects on his students as well."
- "I thoroughly enjoyed this module but felt the workload was quite far above other 5 ECT modules I've taken this semester (Computer Engineering stream). I would highly recommend to other students as long as they're willing to put in the work."
- "I enjoyed it and found it very interesting. I enjoyed seeing how the techniques we learned can be applied to solve a vast range of problems."
- "Very strong, the module structure is far above almost all modules I've taken in this course and I wish this was the standard rather than an exception. The distribution of the module's marks between exam, assignments and mini-tests match the work expected of each of them and reward me for working hard during the semester, without being overly volatile or punishing."
- "One of the best modules I've taken, very well organised and I feel like I've actually gained some understanding instead of just memorising facts."
How is the course taught?
Lectures
- All presentations are made using a combination of PowerPoint, HTML and various movie formats.
- Copies of all PowerPoint lecture presentations are provided electronically through Blackboard (the College e-Learning system). Please note that you need to be taking notes, perhaps on a printout of the slides.
Examples
- The course is example-based. Techniques are generally introduced in the context of examples (most of which work on the provided software system).
- As much hands-on experience of computer vision operations as possible is given through the course. We make use of an industry standard package called OpenCV which provides hands-on experience with many of the operations as well as providing a platform on which to develop your own vision operations.
Notes
- Lecture slides are provided through Blackboard (Trinity's eLearning portal).
- In addition to the lecture notes, links to useful tutorials on the web are provided for most topics.
- The course is based partly on a text which was published by Wiley in May 2014:
A Practical Introduction to Computer Vision with OpenCV by Kenneth Dawson-Howe.
Application
- Applications of the techniques are worked on in group tutorials throughout the semester (around one every week).
- The coursework is based in C++ using the industry standard OpenCV package.