Use Teammaker to form your team. You can log in to this site to indicate your partner preference. Once you and your partner have specified each other, a GitHub repository will be created for your team.
The goal of the project is to give you several weeks to explore an AI topic of your choice in more depth. In the next section are some suggestions, but feel free to consider other ideas (just be sure to discuss them with an instructor first).
Please keep in mind the amount of time you have to do this project; you should plan to do about the same amount of work on the project each week that you would on a normal lab. In other words, don't pick a project you can finish in two days, but don't pick one that would take two months either.
Since it's typically very difficult to know exactly how long something will take, it's best to design your project as a phased development plan with a set of goals. Each phase should build on the previous ones, but each should conceivably be something you could write a paper about if necessary. This way, you can work through as many phases as you have time for without running the risk of falling just short of your 'goal' and having nothing to talk about.
Here are some potential project ideas:
Or come up with your own idea (again, be sure to run it by me before you get started). All the algorithms we've used this semester have a multitude of variants you can explore, and many of them can be combined with each other in interesting ways (e.g. combine a CNN with RL to play an Atari game, use a GA to evolve a board evaluation heuristic for MiniMax, etc.).
You are encouraged to make use of existing libraries (e.g. keras, aitk, etc.), as well as other resources you may find on the web. However, keep in mind the standard ethics policy: outside resources are fine, so long as you use proper attribution and it's clear what work you personally did.
As a reminder, this applies both to the coding portion and to the writing portion; any outside resources you use (including things like ChatGPT) need to be properly acknowledged.
In recent years, a large amount of work in machine learning has been motivated by various contests and challenges. One of the earliest and best known was the Netflix prize, which offered $1M to the team that could improve the accuracy of the site's recommendation system by 10%. The Netflix prize was claimed in 2009; since then, machine learning contests have become commonplace.
Find a machine learning challenge of your choice from Kaggle. Some of these contests are currently active, with prizes available. Others are inactive, but are still interesting challenges to attempt for a project.
Kaggle competitions vary widely in what sort of data and instructions are provided. You should therefore think carefully about the competition you choose: not just "is it a cool problem?" but also "how hard will this data be to work with?" and "how clearly are the expectations of the competition defined?". Please check with me and describe your plan of attack before you get too involved in a particular contest.
In order to download some data sets, you may need to sign up for a free account. Kaggle also has a discussion forum, which may have useful suggestions, especially if you are working on an active contest.
There are many sources of data available on the internet; Kaggle is just one example! Feel free to look around and see what's out there.
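One quick way to judge how hard a data set will be to work with is to summarize it before committing to a project. The sketch below assumes the data comes as a CSV file with a header row (many data sets do, but not all); the function name summarize_csv and the max_rows cutoff are illustrative choices, not part of any course tooling.

```python
import csv
from collections import Counter


def summarize_csv(path, max_rows=10000):
    """Return the row count, column names, and per-column count of empty cells.

    Only the first max_rows data rows are read, so this stays quick on
    large files. Assumes the first line of the file is a header row.
    """
    missing = Counter()
    rows = 0
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        columns = reader.fieldnames or []
        for row in reader:
            if rows >= max_rows:
                break
            rows += 1
            for name in columns:
                value = row.get(name)
                # Short rows yield None; blank cells yield an empty string.
                if value is None or value.strip() == "":
                    missing[name] += 1
    return {"rows": rows, "columns": columns, "missing": dict(missing)}
```

Calling summarize_csv on a downloaded file gives a rough picture of its size and of which columns have gaps, which is useful to bring along when you describe your plan of attack.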
Some of the project ideas will likely involve large data sets that could quickly blow through your disk quota. To avoid this, you can save them to /scratch (instructions), which is unlimited, but isn't backed up.
As a general rule, /scratch is a good place for things that are large but can be re-created if they're lost (e.g. data files or large program output). You should still keep your source code in your home directory (and in git). Definitely don't add giant data files to your git repo, though.
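To catch large files before they end up in a commit, you could scan your project directory for anything over a size limit. This is a minimal sketch: the function name find_large_files and the 50 MB default are assumptions chosen for illustration, not a course or git requirement.

```python
import os


def find_large_files(root, limit_bytes=50 * 1024 * 1024):
    """Return (relative_path, size_in_bytes) for files under root larger
    than limit_bytes, biggest first. The 50 MB default is arbitrary."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own bookkeeping directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # ignore symbolic links, e.g. links into /scratch
            size = os.path.getsize(full)
            if size > limit_bytes:
                large.append((os.path.relpath(full, root), size))
    return sorted(large, key=lambda item: item[1], reverse=True)
```

Anything this reports is a candidate for moving to /scratch rather than adding to your repo.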
Also, take a look at the department's suggestions for long running jobs. As that page suggests, the screen program is very helpful, but remember that your screen sessions will last until you manually end them, so try to avoid leaving dozens of abandoned instances of screen on a server.
By the end of the first week, you need to have turned in the project checkpoint in the file checkpoint.tex. Details about what to include in your checkpoint are provided in the file. We will then take time during lab that day for each group to describe their project idea to the class in 3-5 minutes.
In the LaTeX file, project.tex, you will describe your project. This file already contains a basic structure that you should follow. Feel free to change the section headings or insert additional sections. Recall that you use the pdflatex command to convert the LaTeX into a PDF file.
Your project repo contains a file called rubric.md. This gives a detailed account of how your project will be graded.
Before the deadline, you need to submit the following things through git:
As your project develops and you create more files, be sure to use git to add, commit, and push them. Run git status to check that all of the necessary files are being tracked in your git repo.
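If you would rather check for untracked files from a script, the same information is available in machine-readable form from git status --porcelain, which marks untracked entries with a leading "?? ". The helper names below are hypothetical, and the sketch assumes git is installed and that file names contain no unusual characters (git quotes those in its output).

```python
import subprocess


def untracked_paths(porcelain_output):
    """Extract untracked paths from the text printed by `git status --porcelain`."""
    paths = []
    for line in porcelain_output.splitlines():
        # Untracked entries are prefixed with "?? ".
        if line.startswith("?? "):
            paths.append(line[3:])
    return paths


def list_untracked(repo_dir="."):
    """Run git in repo_dir and return the paths it reports as untracked."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return untracked_paths(result.stdout)
```

An empty list from list_untracked() means git sees no untracked files; it does not tell you whether your commits have been pushed, so still confirm that separately.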