Introduction
In this week’s lab, you’ll read some papers related to the Hyperpartisan News task. You’ll also implement (and analyze) one or more extensions to your Hyperpartisan News classifier from Lab 7.
There is no new starter code for this week’s assignment, but you should copy over your code from last week. (If you switched partners between Lab 7 and Lab 8, you should start by deciding whose code to use as your starting point.)
Readings
Before beginning your readings for the week, be sure to read How to Read a Paper if you haven’t already read it.
Each person in your group should read these papers:
- A Stylometric Inquiry into Hyperpartisan and Fake News, by the organizers of the Semeval task
- Hyperpartisan Facebook Pages Are Publishing False And Misleading Information At An Alarming Rate, by Buzzfeed editors, about their initial (non-computational) analysis of the dataset.
If you are working alone, you must read one of the papers below. If you aren’t working alone, you should divide up reading both of these:
Summaries
In a file called Writeup.md, write a one-paragraph summary of each of the articles above.
Classifier Extension(s)
Based on the readings, choose one or more extensions to add to your Hyperpartisan News prediction system from Lab 7. At least one extension must come from an idea you got by reading one of the papers.
When reporting performance, run each experiment in two ways:
- Using 5-fold cross-validation on the training data
- Training on the training data and testing on the validation data.
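The two evaluation setups above can be sketched in plain Python. This is a minimal illustration, not the Lab 7 code: `train_majority` is a hypothetical stand-in for your real classifier, the label strings "true" and "false" are assumed, and the toy data is made up. In practice a library routine (for example scikit-learn's `cross_val_score`) would normally handle the fold splitting, and real data should be shuffled or stratified before it is split.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A predictor maps one article's text to a label.
Predictor = Callable[[str], str]


def train_majority(texts: Sequence[str], labels: Sequence[str]) -> Predictor:
    """Hypothetical stand-in classifier: always predicts the most common training label."""
    most_common = Counter(labels).most_common(1)[0][0]
    return lambda text: most_common


def accuracy(predict: Predictor, texts: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


def k_fold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds and return (train_idx, test_idx) pairs.

    No shuffling is done here; that is a simplification for this sketch.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        folds.append((train_idx, test_idx))
        start += size
    return folds


def cross_validate(trainer, texts, labels, k: int = 5) -> List[float]:
    """Setup 1: k-fold cross-validation using only the training data."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(texts), k):
        predict = trainer([texts[i] for i in train_idx], [labels[i] for i in train_idx])
        scores.append(
            accuracy(predict, [texts[i] for i in test_idx], [labels[i] for i in test_idx])
        )
    return scores


def train_then_validate(trainer, train_texts, train_labels, val_texts, val_labels) -> float:
    """Setup 2: train on all of the training data, test on the validation data."""
    predict = trainer(train_texts, train_labels)
    return accuracy(predict, val_texts, val_labels)


# Made-up toy data, only to show the call pattern.
toy_texts = [f"article {i}" for i in range(10)]
toy_labels = ["true", "false", "false", "true", "false",
              "false", "false", "true", "false", "false"]

fold_scores = cross_validate(train_majority, toy_texts, toy_labels, k=5)
print("per-fold accuracy:", fold_scores)
print("mean accuracy:", sum(fold_scores) / len(fold_scores))
```

Reporting the per-fold scores alongside their mean makes it easier to see how much the result varies from fold to fold before comparing it with the single validation-set number.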
In your Writeup.md:
- Start by describing the performance you got by using the baseline classifier in Lab 7. (The baseline classifier is one of your DummyClassifiers that you ran in Lab 7.) Use whichever combination of accuracy, precision, recall, and F-measure makes the most sense in this situation.
- Next, describe the performance you got using Naive Bayes from Lab 7. Again, choose appropriate metrics.
- For each extension (one or more), answer the following:
  - What did you implement?
  - What paper(s) gave you this idea? Did a paper describe exactly what you are doing, or did you derive your idea from the paper’s idea?
  - Why did you anticipate that this extension would be helpful with the task? Cite evidence from one of the papers, if possible. (You don’t need to repeat information if you already provided this in the previous question.)
  - Describe your system’s performance with the extension. Again, choose the appropriate metrics.
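When choosing metrics for the questions above, it may help to compute several side by side. The sketch below computes accuracy, precision, recall, and F-measure (F1) for one class from gold and predicted labels. It assumes the labels are the strings "true" and "false" and treats "true" (hyperpartisan) as the positive class; the function name `binary_metrics` and the toy predictions are made up for illustration and are not results from any real system.

```python
from typing import Dict, Sequence


def binary_metrics(
    gold: Sequence[str], predicted: Sequence[str], positive: str = "true"
) -> Dict[str, float]:
    """Accuracy plus precision, recall, and F1 for the `positive` class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)  # true positives
    fp = sum(g != positive and p == positive for g, p in pairs)  # false positives
    fn = sum(g == positive and p != positive for g, p in pairs)  # false negatives
    correct = sum(g == p for g, p in pairs)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels, only to show the call pattern.
gold = ["true", "true", "false", "false", "false", "true", "false", "false"]
always_false = ["false"] * len(gold)  # what a majority-class baseline would predict here
some_model = ["true", "false", "false", "true", "false", "true", "false", "false"]

baseline = binary_metrics(gold, always_false)
model = binary_metrics(gold, some_model)
print("baseline:", baseline)
print("model:   ", model)
```

In this toy data the always-"false" baseline is right more than half the time yet never finds a hyperpartisan article, which is one reason accuracy on its own may not be the most informative metric to report.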
If the extension you added didn’t take you much time at all to do, add more extensions. If the extension you added was quite substantial, you’re welcome to just do one.