Instructor: Mohit Gupta (mohitg@cs.wisc.edu) | Office Hours: Tue 11 am - 12 pm on Zoom |
TA: Shantanu Gupta (sgupta@cs.wisc.edu) | Office Hours: Thu 12 pm - 1 pm in CS 3224 + Zoom |
Course description: The goal of computer vision is to compute properties of the three-dimensional world from digital images. Problems in this field include reconstructing the 3D shape of an environment, determining how things are moving, and recognizing people and objects and their activities, all through analysis of images and videos.
This course will provide an introduction to computer vision, including such topics as image formation, feature detection, motion estimation, image mosaics, 3D shape reconstruction, and object recognition. Applications of these techniques include building 3D maps, creating virtual characters, organizing photo and video databases, human-computer interaction, video surveillance, and automatic vehicle navigation. This is a project-based course, in which you will implement several computer vision algorithms and do a final project on a research topic of your choice.
This course assumes a reasonable knowledge of linear algebra and calculus as a prerequisite. The programming assignments will be in MATLAB, so familiarity with MATLAB is essential. We will also upload some tutorial videos of our own on Canvas to help you get started if you are new to MATLAB.
Please speak to the instructor if you are unsure whether you can take the course.
Class Date | Topic | Homeworks and project |
---|---|---|
T, Jan 24 | Introduction and Fun with Optical Illusions | HW1 out |
R, Jan 26 | Image Formation | |
T, Jan 31 | Image Sensing | |
R, Feb 2 | Binary Images and Processing | |
T, Feb 7 | Image Processing 1: Basic Image Filtering | HW1 due, HW2 out |
R, Feb 9 | Image Processing 2: Fourier-Domain Image Filtering | |
T, Feb 14 | Edge Detection | |
R, Feb 16 | Boundary Detection | |
T, Feb 21 | 2D Object Recognition using SIFT | HW2 due, HW3 out |
R, Feb 23 | Image Alignment | Project proposal due |
T, Feb 28 | Making Panoramas | |
R, Mar 2 | Face Detection | |
T, Mar 7 | Image Segmentation | HW3 due, HW4 out |
R, Mar 9 | Introduction to Neural Networks and Convolutional Neural Networks | |
T, Mar 14 | no class (spring recess) | |
R, Mar 16 | no class (spring recess) | |
T, Mar 21 | Radiometry and Surface Appearance | |
R, Mar 23 | 3D Shape from Photometric Stereo | HW4 due, HW5 out |
T, Mar 28 | no class | |
R, Mar 30 | no class | |
T, Apr 4 | Shape from Focus/Defocus | Project mid-term report due |
R, Apr 6 | Camera Calibration and Shape from Stereo | HW5 due, HW6 out |
T, Apr 11 | Shape from Uncalibrated Stereo | |
R, Apr 13 | Modern 3D Cameras: Structured Light and Time-of-Flight | |
T, Apr 18 | Optical Flow and Motion Measurement | |
R, Apr 20 | Image Tracking | HW6 due, HW7 out |
F, Apr 21 | Advanced Topics: Computational Cameras | |
T, Apr 25 | Project presentations | |
R, Apr 27 | Project presentations | |
T, May 2 | Project presentations | |
R, May 4 | Project presentations | HW7 due |
F, May 5 | | Project webpage due |
Grading will be based on a 100-point system. There are two main components: (a) homeworks (60% of the grade), and (b) the final research project (40% of the grade). Details about these components are given below.
In addition to the homeworks, we will also put up some quizzes on Canvas. However, they are optional, and will not contribute to your grade. They are only to help you check your understanding of the course material.
The course will consist of 7 homework assignments. They will together account for 60% of your final grade. Some homeworks are lighter than others; their relative weights in your final homework grade are determined by the total number of points listed at the top of each homework. Guidelines for completing homework assignments are given here. Please read these carefully.
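As an illustration of the point-based weighting, here is a minimal Python sketch. It assumes the homework component is the total points earned divided by the total points available, scaled to the 60% homework share. The function name and the point totals are hypothetical, and late penalties are not modeled.

```python
def homework_component(scores, homework_share=60.0):
    """Return the homework contribution to the final grade.

    scores: list of (points_earned, points_possible) pairs, one per homework.
    Each homework is implicitly weighted by its points_possible, so a
    100-point homework counts twice as much as a 50-point one.
    """
    earned = sum(e for e, _ in scores)
    possible = sum(p for _, p in scores)
    if possible <= 0:
        raise ValueError("total possible points must be positive")
    return homework_share * earned / possible


# Hypothetical point totals, for illustration only.
example_scores = [(45, 50), (90, 100), (70, 100)]
print(round(homework_component(example_scores), 2))
```

Under this assumption, losing 10 points on a 100-point homework costs the same as losing 10 points on a 50-point one, although the percentage score on the smaller homework drops more.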
A discussion for each homework assignment will be created on Piazza. Please post all of your questions on the discussion board so that others may learn from your questions as well. Do not email the professor or TA directly with homework questions.
Include all the files in a zip file named hwX_yourNetID.zip (where X is the homework number) and upload the zip file to Canvas. All homeworks are to be submitted by midnight on the due date.
Only for the homeworks (not project), students will be allowed a total of 5 (five) late days over the semester; each additional late day will incur a 20% penalty.
The final project is research-oriented. It can be a pure vision project or an application of vision techniques in the student's own research area. You are expected to implement one (or more) related research papers, or to develop novel ideas of your own and implement them using the techniques discussed in class. Students are encouraged to propose their own project topics, subject to the instructor's approval. Some example projects from previous years' offerings of the course are at this page.
You should work on the project in groups of 2-3. In your submission, please clearly identify the contribution of all group members. Please note that members in the same group will not necessarily get the same grade.
NOTE: You can use any programming language you like for your projects.
There will be three checkpoints: a project proposal, an intermediate milestone report, and a final project webpage. Create a webpage for the project at the beginning. This page will include the proposal, the mid-term report, interesting implementation details, some results (e.g., images, videos, etc.), and so on. Your website should also include downloadable source and executable code. Do not modify the website after the due date of the assignment. Upload the URL of your webpage in the corresponding Canvas assignment BEFORE the due date.
Proposal (5%, due February 23)
This will be a short report (one or two pages is usually enough). You will explain what problem you are trying to solve, why you want to solve it, and what the possible steps to the solution are. Please include a timetable.
Mid-term report (5%, due April 4)
In this report, you will need to give a brief summary of current progress, including your current results, the difficulties that arise during the implementation, and how your proposal may have changed in light of current progress.
Final presentation (15%, from April 25 through May 4)
This will be a short presentation in class about your project. It will be 5 minutes per team. Please add a link to the presentation on the project webpage.
Webpage (15%, due May 5)
Assemble all the materials you have finished for your project in a webpage or wiki, including the motivation, the approach, your implementation, the results, and discussion of problems encountered. If you have comparison results or some interesting findings, put them on the webpage and discuss these as well.
The material covered in the course will be available in the presentation slides, which will be fairly self-contained. The slides will be made available via Canvas.
Optional textbook readings include Computer Vision: Algorithms and Applications, by Richard Szeliski. An online version is available, or you may purchase the book at a variety of locations. Also see Robot Vision by Berthold K. P. Horn and Optics by E. Hecht.
The materials from this class rely significantly on slides prepared by other instructors, especially Shree Nayar, Oliver Cossairt, Peter Belhumeur and Alyosha Efros. Also see other similar courses by Fredo Durand and Bill Freeman (MIT), Peter Belhumeur (Columbia), Irfan Essa (Georgia Tech.), Steve Seitz (U. Washington) and Kyros Kutulakos (U. Toronto). The instructor is extremely thankful to the researchers for making their notes available online.
This course follows the University of Wisconsin-Madison Code of Academic Integrity. Unless specifically authorized by the instructor, all coursework is to be done by the student working alone. Violations of the rules will not be tolerated.