Project

The main deliverable of the course is a semester-long project, designed to give you the open-ended opportunity to either:

  1. Build a computer vision-powered application or demo. Computer vision models are powerful tools to solve exciting real-world problems. Utilizing various image processing techniques, deep learning architectures, and computer vision APIs, these models can function as ready-to-use tools. They can be employed to automate visual inspection processes, perform image-based data analytics, generate or manipulate visual content, enhance real-time video streams, or simply build something visually cool.

  2. Conduct a research project. Should you wish to explore the research aspects of computer vision more thoroughly, we invite you to undertake a research project tailored to your interests. Your focus could be on identifying a specific research topic within the realms of computer vision, image processing, and deep learning. You can conduct comparative studies to uncover the limitations of current CV models, or enhance the overall design—be it through optimizing data pipelines, training objectives, or architectures.

Be aware that the line separating a demo from a research project can be somewhat indistinct; the instructor will assist you in appropriately categorizing your project idea. Additionally, there is no grading preference for either application/demo or research projects, so feel free to select the option that most excites you!

Project logistics

Both project formats may be done in teams of 2-5 students (individual projects are not allowed as teamwork is an essential learning objective). We expect every team member to contribute to the project (and individual contributions should be clearly listed).

We will organize your project progress into two key milestones: (1) a preliminary proposal, and (2) a final submission/presentation. The dates for these milestones will be disclosed soon.

The preliminary proposal should sketch out the research question or application you’re keen to explore, along with the methodology you intend to employ. This should feature a concise overview of the computer vision methods you aim to utilize, as well as a list of potential metrics for evaluating success.

For the final submission, both a write-up and a code repository are required, regardless of the project format. We will schedule a presentation/poster session for each team to present their work during the final week of the semester. Additional specifics will be provided soon, but anticipate the following:

  • Application/demo submissions should include a functional demo of your application, possibly built with a platform such as Gradio or Streamlit. The demo should be accompanied by a brief written explanation that outlines the problem you’re addressing, the computer vision model(s) you’ve employed, and your implementation and evaluation process.

  • Research project submissions should include a final report resembling a research paper (4 to 9 pages, excluding references) and a code repository that allows your findings to be replicated. Clear and succinct writing is crucial; any lack of clarity or unnecessary complexity may lead to point deductions. For projects involving multiple contributors, a delineation of each participant’s role is mandatory. All submissions must be LaTeX-formatted and provided in PDF format (exceptions must be approved by the instructor). Using a user-friendly web platform such as Overleaf is strongly encouraged.
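For teams choosing the application/demo format, the following is a minimal sketch of how a vision function could be wrapped in a Gradio interface. It is only an illustration, not a required structure: the function name `to_grayscale`, the helper `build_demo`, and the labels are hypothetical, and the grayscale conversion is a stand-in for whatever model your project actually uses. It assumes NumPy is installed, and Gradio is needed only when the interface is built.

```python
import numpy as np


def to_grayscale(image: np.ndarray) -> np.ndarray:
    """Placeholder for a real model: map an RGB image (H, W, 3) to grayscale (H, W)."""
    # ITU-R BT.601 luma weights; replace this function with your model's inference call.
    weights = np.array([0.299, 0.587, 0.114])
    gray = image[..., :3].astype(np.float64) @ weights
    # Round before casting so floating-point error does not truncate values downward.
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


def build_demo():
    """Wrap the function in a Gradio interface (assumes `pip install gradio`)."""
    import gradio as gr  # imported lazily so the core function works without Gradio

    return gr.Interface(
        fn=to_grayscale,
        inputs=gr.Image(type="numpy", label="Input image"),
        outputs=gr.Image(label="Grayscale output"),
        title="Project demo (placeholder)",
    )
```

To serve the demo locally, call `build_demo().launch()` from a script or notebook. Keeping the image-processing function separate from the interface code makes it easier to test the model logic on its own.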

Project Proposal Submission Instructions

Please submit your project proposal by [TBD] at 4:00PM ET via Gradescope. Your submission should be in PDF format and include detailed information about your project idea, significance, expected deliverables, potential risks, and a preliminary timeline to monitor progress.

Proposal Feedback Sessions

Take advantage of the opportunity to receive feedback from the instructors and enhance your proposal. We’ve allocated times on [TBD], from 9:00AM to 12:00PM and from 3:00PM to 4:00PM, for proposal review sessions. Each group will be allotted a 15-minute session to discuss its proposal with the instructor. To book your slot, please use the Calendly link provided in the email sent to you.

Access to Computing Facilities

Once your proposal has been reviewed, you may be eligible for access to NYU HPC and Google Cloud credits. You are also free to use your own computing resources. As you draft your proposal, consider which of these resources would best suit your project and mention it in your submission. You are encouraged to build upon existing open-source codebases.