Overview
The focus for the last few weeks of the semester will be to propose and complete a final project of your choosing.
You will collaborate in groups of 2 or 3 individuals, preferably within the same lab section.
The deliverables are as follows:
- A project proposal. Your pre-proposal (~1 page) is due November 7. The full proposal (~3 pages), including
adjustments based on my comments and including a timeline, is due November 13. (10%)
- Weekly checkpoint demonstrations in lab (to me and/or classmates). (10%)
- A conference-style paper (~7-8 pages) due December 14. (50%)
- A final presentation on December 21, 9am-12pm. (30%)
- All project-related material (e.g., code and data) so that I can verify/replicate all experiments.
Project ideas
Your project must be novel work related to the field of machine learning. It should go beyond what we have covered in the course in terms of assignments or core lecture materials. More detailed topics, data/paper resources, and advice are available here.
Proposal (10%)
By midnight on November 7, send your pre-proposal (a 1-page abstract) as a PDF. I will use this to provide advice in lab on November 9. Your pre-proposal should include:
- The names of all group members
- A project title
- A description of your project
- A set of questions that you still need to answer to flesh out the details of the project. I will try to answer these,
but I will also use these questions to gauge how much thought you have put into your project.
By midnight on November 13, you will submit a full proposal (3 pages plus the timeline, as a PDF via your Git repo) with the following:
- The title and group member names
- A central hypothesis. What is the main question you would like to answer (i.e., what are your goals)?
- Provide a background for the problem. You should cite at least 3 papers. The background should connect to the main questions of your paper. For example, if you are attacking a particular application, provide background on the problem and relevant ML work in that domain. You should also cite work related to the algorithms you plan to use.
- What is/are the central algorithm(s) for your project? Do you plan to implement them or to use libraries? If you plan to use libraries, which ones?
- What data set(s) are you using? Are you creating them for a real-world problem, or are you using standard repository data sets? Be specific in either case.
- What experiments and what type of analysis do you plan to execute? What do you expect the results to be?
- References for the work cited
- A week-by-week timeline. What are your concrete progress goals? Be very specific and realistic - a couple of sentences for each week (November 16, November 23, November 30 (no meeting), December 7, December 12). Each week, you will update me on which goals you have completed.
Checkpoint (10%)
Each week in lab, I will ask you to update me on your progress (via the checkpoints/README.md file in your Git repo). Your grade will be determined by your ability to:
- Make sufficient progress each week and/or be proactive about seeking assistance in the case of major roadblocks.
- Sufficiently document updated accomplishments and goals (via your checkpoints/README.md)
- Demonstrate your progress (e.g., through code review or analysis of graphs)
- Present a mid-project review to the lab. This will be on December 7 and will be good practice for your final presentation. You will be expected to a) motivate/introduce the goals of your project, b) give an overview of your approach, and c) provide preliminary results. Each group will have about 7 minutes.
Paper (50%)
Sample papers from CS68 Bioinformatics Project (each of these was machine learning related but there is more of a focus on the applications than some of your projects):
Your final paper is due in the paper/ directory of your Git repository by midnight on December 14.
All relevant figures and tex files should also be present. If you use an online editor (e.g., Overleaf), recompile your final source files on our systems to ensure compatibility.
You have been provided a sample report document in your lab directories.
You must use the provided LaTeX style files to produce your report. I am more than happy to answer questions, but Google, Piazza, and Stack Overflow are where I find all my answers, so you should try them first.
Details about the paper requirements are available here.
Note that your grade is not based on how novel your results are, but rather on your ability to convey your understanding of the problem and analyze your results. My final grade is based on (a) the design and execution of the experiment as well as (b) the thoroughness and readability of the paper.
Presentation (30%)
Your group will present during the exam period on December 21 from 9am-noon. I aim to provide ~15 minutes per group plus 2-3 minutes for questions (20 minutes maximum). The grade for this portion is based entirely on your delivery, not the difficulty of the project or the impressiveness of the results. Having a great project but failing to communicate your design, results, and analysis will result in a poor grade. Please work on your slides throughout the project, and practice at least once with another group present. All members of the group are expected to present equally. Please follow Prof. Newhall's Presentation Guidelines for tips.
A few general comments:
- Your presentation should use figures and diagrams wherever possible.
In particular, you will probably have to make new figures in addition to what you
plan on putting in your paper. A visual aid is always better than words on a slide.
- Slides should not be cluttered; provide a concise outline of the main points,
not a transcript of what you are going to say. You don't want the audience
reading your slide; you want them paying attention to you. When in doubt,
use figures and illustrations.
- Practice. Practice. Practice.
The easiest way to handle nerves is to be comfortable with what you plan to say,
and to have given a talk to an audience beforehand.
- The presentation will be in Science Center 181. Please let me know how you plan
to present: using the room Mac or your own laptop. Either way,
do a dry run in that room or a similar room with your preferred device.
I will use rubrics that are very similar to the following:
Submitting your Project
You should hand in:
- In the subdirectory paper, place all files required to compile this document, including your images, bib file, and source tex files. I must be able to compile the document from source.
- When I compile your report, the resulting PDF should include the names of all participants, e.g., agilchr1-tgelles1.pdf. This requires two changes: change TARGET on line 2 of your file paper/Makefile (TARGET=agilchr1-tgelles1) and rename userID1-userID2.tex to match the format of your target, e.g., agilchr1-tgelles1.tex. You are still responsible for your file compiling on our systems if you use an editor like Overleaf.
- In the code directory, place all of the scripts and main programs
you developed for the project. Include a README describing the purpose of
each file and how to use it to produce the results. Your code should
be well designed and commented. If you used downloaded software, state as much
in the README file.
- Submit data used for evaluation in the repository if it is small in size. If it is large, please contact me
ahead of the deadline to make arrangements. Place a README file as well,
describing the source of the data (paper, website) and any pre-processing
steps you utilized (e.g., throwing out incomplete data). The idea is to
make it easy for me or someone else to re-use your data and replicate your results.
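
The two renaming changes described for the paper directory (setting TARGET in paper/Makefile and renaming userID1-userID2.tex) can also be scripted. The following is only a minimal sketch under stated assumptions: the function name set_report_target is hypothetical, and it assumes the Makefile contains a single line beginning with TARGET=, as the instructions imply. Editing the two files by hand works just as well.

```python
import re
from pathlib import Path


def set_report_target(paper_dir, user_ids):
    """Set TARGET in paper/Makefile and rename the sample tex file to match.

    Hypothetical helper: assumes the Makefile has one line starting with TARGET=.
    """
    paper = Path(paper_dir)
    target = "-".join(user_ids)  # e.g. "agilchr1-tgelles1"

    # Rewrite only the first TARGET=... line; the rest of the Makefile is untouched.
    makefile = paper / "Makefile"
    text = makefile.read_text()
    new_text, count = re.subn(
        r"(?m)^TARGET\s*=.*$", lambda m: f"TARGET={target}", text, count=1
    )
    if count != 1:
        raise ValueError("no TARGET line found in Makefile")
    makefile.write_text(new_text)

    # Rename the placeholder source file so it matches the new target name.
    old_tex = paper / "userID1-userID2.tex"
    if old_tex.exists():
        old_tex.rename(paper / f"{target}.tex")
    return target
```

For example, calling set_report_target("paper", ["agilchr1", "tgelles1"]) from the top of the repository would leave TARGET=agilchr1-tgelles1 in the Makefile and produce paper/agilchr1-tgelles1.tex. Recompile afterward to confirm the report still builds.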