Many students have asked for a "study guide". Below I lay out the
topics you should be familiar with for the first exam. There will be
no programming questions on the exam. All of the questions will be
conceptual, covering mostly the lecture material but also the topics
from the labs.
I will not provide sample questions for you to answer. You can expect
short-answer questions as well as extended questions requiring one or
two paragraphs. Some questions will require you to know the
mathematical formulas that we've used to describe various NLP
algorithms.
This list of topics is a superset of what will be covered on the
exam. Since the exam takes place in a 90-minute class period, not
everything can be covered.
Exam 1 Topics
There's really nothing special about this list; it's just a slightly
expanded version of the syllabus.
- Regular expressions (a short example follows this list)
- Types, tokens, Zipf's law, precision, recall, F-measure (formulas sketched below)
- Basic probability: chain rule, Bayes' rule, Markov assumption, maximum likelihood estimation (worked formulas below)
- Language modeling: n-gram models (see the bigram formulas below)
- Smoothing: Laplace, discounting, interpolation, backoff, stupid backoff, Good-Turing, Kneser-Ney (the Laplace formula is sketched below)
- Data sets: training, test, development
- Part-of-speech tagging: n-gram POS taggers, Markov taggers, transformation-based learning (see the Viterbi sketch below)
- Morphology: segmentation, trie data structure, successor/predecessor frequency/entropy, orthographic similarity (Levenshtein distance; a sketch follows below)
- Hidden Markov models: formulation, computing likelihood (forward algorithm), decoding (Viterbi algorithm), training (forward-backward algorithm); the Viterbi sketch below illustrates decoding
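A few of the topics above are easiest to review with a formula or a
small sketch in front of you, so some illustrations follow. They are
review aids only, not exam material. First, regular expressions: a
minimal Python example (the pattern and the sample sentence are my own
toy illustration):

```python
import re

# Find all words ending in "ing" -- a toy pattern for illustration.
text = "The students were studying and reviewing smoothing formulas."
matches = re.findall(r"\b\w+ing\b", text)
print(matches)  # ['studying', 'reviewing', 'smoothing']
```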
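For the evaluation metrics and Zipf's law, the standard formulations,
written in terms of true positives (tp), false positives (fp), and
false negatives (fn):

```latex
% Zipf's law: the frequency of the r-th most frequent type is
% roughly inversely proportional to its rank r.
f(r) \propto \frac{1}{r}

% Precision, recall, and balanced F-measure:
P = \frac{tp}{tp + fp}, \qquad
R = \frac{tp}{tp + fn}, \qquad
F_1 = \frac{2PR}{P + R}
```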
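For basic probability and n-gram language modeling, the standard
formulas are the chain rule, Bayes' rule, the bigram (first-order
Markov) approximation, and the maximum likelihood estimate from
counts C(·):

```latex
% Chain rule:
P(w_1 \dots w_n) = \prod_{i=1}^{n} P(w_i \mid w_1 \dots w_{i-1})

% Bayes' rule:
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

% Bigram (first-order Markov) approximation:
P(w_1 \dots w_n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})

% Maximum likelihood estimate for a bigram:
P_{\mathrm{MLE}}(w_i \mid w_{i-1}) = \frac{C(w_{i-1} w_i)}{C(w_{i-1})}
```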
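For smoothing, Laplace (add-one) is the simplest case to memorize;
here V is the vocabulary size:

```latex
% Laplace (add-one) smoothing for a bigram model:
P_{\mathrm{Laplace}}(w_i \mid w_{i-1}) = \frac{C(w_{i-1} w_i) + 1}{C(w_{i-1}) + V}
```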
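For orthographic similarity, a minimal dynamic-programming sketch of
Levenshtein distance with unit insertion, deletion, and substitution
costs (the function name and test strings are mine):

```python
def levenshtein(s, t):
    """Edit distance between strings s and t with unit costs."""
    m, n = len(s), len(t)
    # dist[i][j] = edit distance between s[:i] and t[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i          # delete all of s[:i]
    for j in range(n + 1):
        dist[0][j] = j          # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                             dist[i][j - 1] + 1,        # insertion
                             dist[i - 1][j - 1] + sub)  # substitution/match
    return dist[m][n]

print(levenshtein("kitten", "sitting"))  # 3
```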
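Finally, for Markov taggers and HMM decoding, a compact Viterbi
sketch. The tag set and the probability tables below are invented toy
numbers, purely for illustration:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state sequence for obs under an HMM (plain probabilities)."""
    # V[t][s] = probability of the best path that ends in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = prob
            back[t][s] = prev
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy tag set and probabilities, invented for illustration only.
states = ["N", "V"]
start_p = {"N": 0.6, "V": 0.4}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"fish": 0.6, "swim": 0.4}, "V": {"fish": 0.3, "swim": 0.7}}
print(viterbi(["fish", "swim"], states, start_p, trans_p, emit_p))  # ['N', 'V']
```

In practice you would work in log probabilities to avoid underflow on
longer sequences, but the plain form above is easier to check by hand.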