Introduction
For the final project, you will build a system to submit to SemEval-2019 Task 4: Hyperpartisan News Detection.
The full project description and deadlines are listed below. Some items may become more fleshed out, but all changes will be noted carefully so that you don’t overlook them.
Project Proposal
Due 11:59pm Wednesday November 21
To start your final project, you will submit a project proposal.
The project proposal should be 1-2 pages long and include the sections described below. You are required to write your proposal in LaTeX using the NAACL-2019 style and format guidelines. You must use either the LaTeX or Overleaf template.
You’re allowed to work in groups of up to three for the final project; your proposal should list all of your team members.
Note: You may want to make the following alterations to the naaclhlt2019.sty file while you are working on the paper. These instructions apply to both the LaTeX and Overleaf templates:
- Comment out lines 358-360 by placing a percent sign at the start of each line. When you are done, it will look like this:
  % \AtTextUpperLeft{%confidential
  % \put(0,\LenToUnit{1cm}){\parbox{\textwidth}{\centering\naaclhv\confidential}}
  % }
- Replace line 377:
  - Old line:
    \bf Anonymous NAACL submission
  - New line:
    \bf\@author
Project Description
Your project will involve building a system to complete the Hyperpartisan News Detection task, but there are many different directions you could choose to explore within that task. At a high level, what question(s) are you trying to answer with your project? What kinds of things will you try? How will you know if they are successful?
Reading List
List at least three relevant academic papers that you will read for your project. If you are choosing to explore a technique or method that we haven’t covered at all in class, you should also list at least two sources that you will use to learn the basics of the topic. You don’t have to find papers where people wrote about this exact task. There are plenty of papers about sentiment classification that you may find interesting and relevant. The papers you find this week might not be ones you end up including in your final project, but it’s good to start looking around to see what’s out there. (You don’t have to have read these papers carefully at this point, but you’re welcome to summarize them here if you’ve had the chance to read them.)
Data Set
The Hyperpartisan News Detection task organizers have provided you with 800,000 training articles and 200,000 validation articles. Are there other datasets you want to use for your project? What, if any, help will you need getting access to them? How are they formatted? Are they labeled? For what? How big are they? What, if any, pre-processing will you need to do to make them appropriate for your task?
Preliminary Results
You already have preliminary results for a baseline system from Labs 07 and 08. These should be included in your proposal. Now would be a great time to learn how to make tables in LaTeX!
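If you haven’t made a table in LaTeX before, here is a minimal sketch to start from. It is only one way to lay out a results table: the row labels, the choice of Accuracy and F1 as columns, and the use of the booktabs package are all assumptions for illustration, so swap in whatever your baseline actually reports. The cells are left as placeholders on purpose.

```latex
\documentclass{article}
\usepackage{booktabs} % assumed here; provides \toprule, \midrule, \bottomrule

\begin{document}

\begin{table}[t]
  \centering
  % l = left-aligned text column, r = right-aligned numeric column
  \begin{tabular}{lrr}
    \toprule
    System            & Accuracy & F1 \\
    \midrule
    % Placeholder rows: replace each -- with your own measured value.
    Baseline (Lab 07) & --       & -- \\
    Baseline (Lab 08) & --       & -- \\
    \bottomrule
  \end{tabular}
  \caption{Preliminary baseline results on the validation set.}
  \label{tab:baseline}
\end{table}

% Refer to the table by label so the number stays correct if tables move.
Table~\ref{tab:baseline} reports our preliminary results.

\end{document}
```

To use this inside the NAACL template, copy only the table environment (and the \usepackage line, if the template does not already load booktabs) into your paper rather than the whole file.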
Planned Methodologies
This section may not be fully fleshed out at this point, and that’s ok, but you should demonstrate that you’ve started thinking about what your approach to the task will be. In particular, this section should discuss the types of models you will build, the features you might want to extract, and the way(s) that you will evaluate your model(s).
Project Update (due Dec 3, 2018)
For your project update, you should submit a 2-3 page writeup that includes all of the information discussed below. You are required to write your update in LaTeX using the NAACL-2019 LaTeX template or Overleaf template.
Literature Review
For the papers that you listed in your proposal (and any others that you’ve found and read since then) describe the following:
- What were the main objectives of the paper?
- How do the objectives relate to your project topic?
- How is what you’re doing similar to what the authors of this paper did? How is it different?
- How sound do the methodologies of the paper seem to be?
- What (briefly) were the results of the paper?
- What can you take from the paper to inform what you’re doing on your final project?
Changes to the Proposal / Current Methodologies
If anything has changed with respect to your project description, data set, or methodologies, you should describe the changes. Why did your original plan not work out, and how do you expect your new plan to be better? In particular, many groups will probably have updated methodologies here, including more detail on what they have done and are planning to do after spending more time with the literature and the data. If your group’s methodologies were complete and have not changed since the proposal, you should include a copy of them here.
Updated Results
Ideally, you will have updated results since Labs 07 and 08 by now. If you don’t have updated results, you should explain why and outline a plan for having both updated and final results in time to write up the final report.
Project Report (due Dec 18, 2018)
For your final project, you should submit a 5-6 page writeup. You are required to write your project report in LaTeX using the NAACL-2019 LaTeX template or Overleaf template.
Abstract
In a single paragraph, give the “elevator pitch” for your project. What did you work on? How did you approach it? How did it work? What are the main take-aways from your work?
Introduction
Describe the task that you are working on, and very generally describe your approach. Why is the task interesting? Why is it hard? How does it fit in with other tasks in NLP? For example, is it a task like POS tagging that is mostly useful for downstream applications? If it’s an application, who do you see using it and how will it benefit them?
Previous Work
Describe how the task has been approached before. What approaches have been tried, and how successful have they been? If you’re taking an approach that’s been successful on other tasks and applying it to a new task, then describe how the approach has been used in other tasks, and how this task might be similar. How is what you’re doing similar to what’s been done before, and how is it different?
Methodology
Describe the data that you used. If you collected your own data, describe how you got it.
Describe your approach in enough detail that someone else with an understanding of the underlying NLP concepts could replicate it. For example:
- You should be clear about any design decisions you made. If your model has parameters, how did you set them?
- You do not need to explain how common algorithms work. If you use the Viterbi algorithm, you can mention it by name, but you don’t need to describe how it works, since it’s reasonable to assume that your readers are familiar with it.
- If there are details you’re unsure about including, please check with me.
- You should say how your system is being evaluated.
Results
In table, graph, and/or prose form, give the results of your model, and compare your results to your baseline as well as to any previous work on the same task. Point out anything you find particularly interesting about your results, and address any areas where your system did not do well.
Discussion
This section should pop back up to the “big picture” view from the Introduction and Previous Work. What was interesting about the results of your system? (How) does your project fill a hole in the previous work? What questions are still unanswered, and how might you address those in future work?
References
You should cite any work of others that you mention, including both research papers and data.
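As a rough sketch of how citations are usually written in LaTeX, the example below assumes the natbib package and a BibTeX file. The entry, its key placeholder2018, the file name refs.bib, and the plainnat style are all hypothetical stand-ins chosen so the example compiles on its own; in your paper, keep whatever bibliography style the NAACL template sets and use real entries for the papers and data you mention.

```latex
% filecontents writes refs.bib so this example compiles on its own.
% The entry below is a placeholder, not a real paper.
\begin{filecontents}{refs.bib}
@inproceedings{placeholder2018,
  author    = {Lastname, Firstname and Coauthor, Another},
  title     = {Placeholder Title of a Relevant Paper},
  booktitle = {Proceedings of a Placeholder Conference},
  year      = {2018}
}
\end{filecontents}

\documentclass{article}
\usepackage{natbib} % assumed here; provides \citet and \citep

\begin{document}

% Textual citation: the authors are part of the sentence.
\citet{placeholder2018} describe a related approach.

% Parenthetical citation: the whole reference sits in parentheses.
A related approach has been described before \citep{placeholder2018}.

\bibliographystyle{plainnat} % stand-in style for this standalone example
\bibliography{refs}

\end{document}
```

Compiling this takes a LaTeX run, then a BibTeX run, then two more LaTeX runs so the citations resolve. Datasets can be cited the same way, for example with a @misc entry.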
Software Requirements
- Notes for installing software and getting set up on the TIRA virtual machine
- Notes for running your predict code on the TIRA virtual machine
Somewhat related
From the Washington Post 11/17/2018: ‘Nothing on this page is real’: How lies become truth in online America