CS 4501 Computer Vision (Spring 2025)

University of Virginia

Guidelines and suggestions for course projects

Your course project provides an opportunity for you to explore an interesting problem in an area of computer vision. Any topic in computer vision is acceptable, including image recognition, image filtering, multiview 3D reconstruction, and applications of 3D vision in other domains (e.g. robotics, ecology). Your project will be worth 35% of your final class grade, and will have four deliverables:

  1. Project proposal : 2 pages excluding references (2%)
  2. Mid-term report : 4 pages excluding references (8%)
  3. Final presentation and report : 8 pages excluding references (25%)

All write-ups should use the CVPR style.



Team Formation

You are responsible for forming project teams of two or three people. In some cases, we will also accept smaller or larger teams, but a 2-3 person group is preferred. If you have trouble forming a group, please send us an email and we will help you find project partners.

Scope

As a broad target, the final project should involve approximately as much work as a homework assignment in a typical graduate-level course for each student in the group. Thus the total work should scale roughly linearly with the group size, and be distributed roughly equally. An ambitious, well-done project from a group of two should be on the order of a conference paper in depth of experimentation.

Project Proposal

You must turn in a brief project proposal that provides an overview of your idea and also contains a brief survey of related work on the topic. We will provide a list of suggested project ideas for you to choose from. However, you are strongly encouraged to find project ideas that you are excited about. A good starting point is the papers we will discuss in class. We will provide feedback based on your project proposals and presentations.

Proposals should be approximately two pages long, and should include the following information:

  • Project title and list of group members.
  • Overview of project idea. This should be approximately half a page long.
  • A short literature survey of 4 or more relevant papers. The literature review should take up approximately one page.
  • Description of potential data sets to use for the experiments.
  • Plan of activities, including what you plan to complete by the presentation date and how you plan to divide up the work.

The grading breakdown for the proposal is as follows:

  • 40% for clear and concise description of proposed method
  • 40% for literature survey that covers at least 4 relevant papers
  • 10% for plan of activities
  • 10% for quality of writing

The project proposal will be due at 11:59 PM on Friday, 02/21/2025, and must be submitted via Canvas.

Mid-term and Final Report

Your final report is expected to be 8 pages excluding references, in accordance with the length requirements for a CVPR paper, while the mid-term report should be 4 pages excluding references. Each report should have roughly the following format:

  • Introduction: problem definition and motivation
  • Background & Related Work: background info and literature survey
  • Methods
    • Overview of your proposed method
    • Intuition on why it should be better than the state of the art
    • Details of models and algorithms that you developed
  • Experiments
    • Description of your testbed and a list of questions your experiments are designed to answer
    • Details of the experiments and results
  • Conclusion: discussion and future work

The grading breakdown for the final report is as follows:

  • 10% for introduction and literature survey
  • 30% for proposed method (soundness and originality)
  • 30% for correctness, completeness, and difficulty of experiments and figures
  • 10% for empirical and theoretical analysis of results and methods
  • 20% for quality of writing (clarity, organization, flow, etc.)

The project mid-term report will be due at 11:59 PM on Friday, 03/21/2025, and must be submitted via Canvas.

The project final report will be due at 11:59 PM on Friday, 04/28/2025, and must be submitted via Canvas.

Note that late days do not apply to the final report.

Presentation

Presentation skills are critical to your career. A good presentation should convey your project to the target audience clearly and efficiently. Our course provides many opportunities for you to practice, such as paper presentations, the project proposal presentation, and the final project presentation.


Project Suggestions

You are encouraged to propose your own topics – some of you may already have done so. Take a look at the resources listed at the end of this page for potential topics. Below are some ideas:

  • Take a look at the latest papers from CVPR, ECCV, ICCV, NeurIPS, and ICML to find topics, software, and datasets that you can build upon.
  • Also check out the workshops associated with these conferences. For example, take a look at the recent Fine-grained Visual Recognition workshops for datasets and Kaggle challenges related to fine-grained classification tasks such as recognizing animal species or product images.
  • The website https://paperswithcode.com tracks the state of the art across datasets. This is a quick way to find baselines to compare with or build upon.
  • The website https://registry.opendata.aws contains a number of publicly available datasets hosted on AWS. These include satellite imagery, RADAR, and other data on which you can try out computer vision techniques.
  • Explore the use of computer vision services on the cloud to solve some challenging problems. Some choices are AWS Rekognition, Google Cloud, and Microsoft Azure.
  • Generative modeling: Train and generate data on novel domains using GANs, VQ-GAN, or diffusion models. Build an interface for interactively editing images.
  • Probing and understanding multimodal models: Models such as Llama 3.2 or Qwen2.5-VL show powerful multimodal understanding capability. Try to probe these models on specific domains, such as medical imaging or remote sensing, with tasks such as image classification or object grounding, to understand their capabilities and biases.
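
For the probing idea in the last bullet, one simple way to surface capabilities and biases is to break accuracy down by domain rather than report a single number. The following is a minimal sketch, not a prescribed method: the function name `per_group_accuracy`, the `(input, label, group)` triple format, and the `predict` callable (a stand-in for whatever model you wrap) are all assumptions made for illustration.

```python
from collections import defaultdict


def per_group_accuracy(examples, predict):
    """Return {group: accuracy} for (input, label, group) triples.

    `predict` is a hypothetical callable that wraps the model being probed
    and maps one input to one predicted label.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, label, group in examples:
        total[group] += 1
        if predict(item) == label:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Toy stand-in for a real model: it always answers "positive".
def always_positive(_item):
    return "positive"


# Made-up examples; the strings are placeholders for real images.
examples = [
    ("scan_0", "positive", "medical"),
    ("tile_0", "positive", "remote_sensing"),
    ("tile_1", "negative", "remote_sensing"),
]

print(per_group_accuracy(examples, always_positive))
# → {'medical': 1.0, 'remote_sensing': 0.5}
```

A large gap between groups is a starting point for the failure analysis, not a conclusion; with few examples per group the differences may not be meaningful.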

List of ideas for inspiration:

  • Scene text recognition
  • Improving object detection using depth estimation
  • Dust removal from images
  • Fast face-retrieval using vocabulary trees on deep features
  • Hyperspectral image classification
  • Character recognition in movies
  • Analysis of medical images
  • Stereo reconstruction
  • Counting heads in images
  • Implementation of a VR engine
  • Poselet-based person identification
  • Gaze tracker
  • Photo stitching across seasons/day-night
  • Real-time tracking of wildlife for non-invasive ecological monitoring and research
  • Investigate and analyze the failure modes of multimodal language models on vision tasks
  • Design a model for generating diverse and realistic 3D room layouts from textual descriptions
  • Investigate CV model (e.g. detection, depth estimation) performance in adverse weather
  • Create a model for disaster (e.g. wildfire, flood) prediction from satellite imagery
  • Create a model for human emotion recognition from video

You could also find inspiration in other online courses such as Stanford CS231n.


Computing Resources

Some vision projects may involve large-scale data and require GPU computing resources. We recommend that you check out “AWS Education” and “Google Cloud Platform”.

  • AWS: https://aws.amazon.com/education/awseducate
    • UVA is an “AWS member institution”, so you are in the higher allowance tier. Use your .edu email and the full school name “University of Virginia” when you register to get the full benefits (a total of $100 annually).
    • To get GPUs, use g3 (up to 4 NVIDIA Tesla M60 GPUs) or p2 (up to 16 NVIDIA K80 GPUs) instances in EC2. Check the pricing first and make your plan accordingly!
  • Google Cloud Platform: https://cloud.google.com
    • You get $300 in credits for the first 12 months, and free-tier resources (not including GPUs) are always free
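
To plan GPU usage against a fixed credit, it helps to estimate how many instance-hours the credit covers before launching anything. The sketch below is illustrative only: the function name and the reserve amount are assumptions, and the hourly rate is a placeholder rather than an actual price, so look up the current rate for your instance type and region.

```python
def affordable_gpu_hours(credit_usd, hourly_rate_usd, reserve_usd=0.0):
    """Hours of instance time a credit covers after setting aside a reserve.

    The reserve is a hypothetical buffer for storage, data transfer, and
    instances accidentally left running.
    """
    if hourly_rate_usd <= 0:
        raise ValueError("hourly_rate_usd must be positive")
    usable = max(credit_usd - reserve_usd, 0.0)
    return usable / hourly_rate_usd


# 100.0 is the annual AWS allowance mentioned above.
# 1.00 per hour is a PLACEHOLDER rate, not a real price.
hours = affordable_gpu_hours(100.0, 1.00, reserve_usd=20.0)
print(f"{hours:.1f} GPU-hours")
```

Dividing the result by the number of planned training runs gives a rough per-run time budget.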