sentiment-intermediate

List of lists representation

Here is the list of lists after reading the file smallReviews.txt (before calculating sentiment and adjusting the ratings):

[['terrible', 1], ['waste', 1], ['talent', 1], ['bloated', 1], ['plot', 1], ['cast', 4], ['full', 4], ['talent', 4], ['memorable', 4], ['performances', 4], ['run', 3], ['time', 3], ['bloated', 3], ['independent', 3], ['film', 3], ['anything', 3], ['run', 3], ['mill', 3]]

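Each word in the file is paired with the rating of the review in which it appeared. Here is the list after paring it down to unique words and summing the adjusted ratings for each word. The values shown are consistent with subtracting 2 from each rating before summing (1 becomes -1, 3 becomes 1, 4 becomes 2), so 'talent', seen with ratings 1 and 4, ends up at -1 + 2 = 1: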

[['terrible', -1], ['waste', -1], ['talent', 1], ['bloated', 0], ['plot', -1], ['cast', 2], ['full', 2], ['memorable', 2], ['performances', 2], ['run', 2], ['time', 1], ['independent', 1], ['film', 1], ['anything', 1], ['mill', 1]]
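
The paring-down step for the list of lists could be sketched as follows. This is only an illustration, not the required solution: the function name summarize_pairs and the constant OFFSET are hypothetical, and the adjustment of subtracting 2 from each rating is an assumption inferred from the example values above.

```python
# Assumption: the adjusted rating is the raw rating minus 2,
# which matches the example data (1 -> -1, 3 -> 1, 4 -> 2).
OFFSET = 2  # hypothetical name


def summarize_pairs(pairs):
    """Collapse [word, rating] pairs into unique words with summed adjusted ratings."""
    totals = []
    for word, rating in pairs:
        adjusted = rating - OFFSET
        # Look for an existing entry for this word.
        for entry in totals:
            if entry[0] == word:
                entry[1] += adjusted
                break
        else:
            # No entry found: first time this word is seen.
            totals.append([word, adjusted])
    return totals


pairs = [['terrible', 1], ['waste', 1], ['talent', 1], ['bloated', 1],
         ['plot', 1], ['cast', 4], ['full', 4], ['talent', 4],
         ['memorable', 4], ['performances', 4], ['run', 3], ['time', 3],
         ['bloated', 3], ['independent', 3], ['film', 3], ['anything', 3],
         ['run', 3], ['mill', 3]]

print(summarize_pairs(pairs))
```

Words keep the order in which they were first seen, which is why 'talent' stays in third position even though its second occurrence comes later.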

Parallel list representation

Here are the parallel lists after reading the file smallReviews.txt (before calculating sentiment and adjusting the ratings):

['terrible', 'waste', 'talent', 'bloated', 'plot', 'cast', 'full', 'talent', 'memorable', 'performances', 'run', 'time', 'bloated', 'independent', 'film', 'anything', 'run', 'mill']
[1, 1, 1, 1, 1, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3]

The first list holds every word seen in the file; the second holds the rating for the movie in which that word appeared. The two lists always have the same length, and the rating at a given index belongs to the word at the same index. Here are the lists after paring them down to unique words and summing the adjusted ratings for each word:

['terrible', 'waste', 'talent', 'bloated', 'plot', 'cast', 'full', 'memorable', 'performances', 'run', 'time', 'independent', 'film', 'anything', 'mill']
[-1, -1, 1, 0, -1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1]
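
The same paring-down step for the parallel lists might look like the sketch below. As before, this is only one possible approach: summarize_parallel and OFFSET are hypothetical names, and subtracting 2 from each rating is an assumption inferred from the example values.

```python
# Assumption: the adjusted rating is the raw rating minus 2,
# which matches the example data (1 -> -1, 3 -> 1, 4 -> 2).
OFFSET = 2  # hypothetical name


def summarize_parallel(words, ratings):
    """Return (unique_words, totals) as two parallel lists."""
    unique_words = []
    totals = []
    for word, rating in zip(words, ratings):
        adjusted = rating - OFFSET
        if word in unique_words:
            # Same index in both lists refers to the same word.
            totals[unique_words.index(word)] += adjusted
        else:
            unique_words.append(word)
            totals.append(adjusted)
    return unique_words, totals


words = ['terrible', 'waste', 'talent', 'bloated', 'plot', 'cast', 'full',
         'talent', 'memorable', 'performances', 'run', 'time', 'bloated',
         'independent', 'film', 'anything', 'run', 'mill']
ratings = [1, 1, 1, 1, 1, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3]

unique_words, totals = summarize_parallel(words, ratings)
print(unique_words)
print(totals)
```

Both lists must be updated together on every step. Appending to one without the other breaks the index correspondence that the parallel representation relies on.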