Introduction
This course introduces a broad range of topics in natural language processing, including language modeling, part-of-speech tagging, machine translation, syntactic parsing, vector semantics, and text classification, as well as the application of computational tools to cognitive modeling and psycholinguistics.
Course Goals
By the end of the course you will:
- learn the algorithms and data structures central to natural language processing
- be able to think about language from a computational and engineering perspective
- learn how to access and extract useful representations from large text corpora
- learn how to use text corpora as the basis for training probabilistic machine learning algorithms
- build components of large NLP systems such as language models, part-of-speech taggers and text classifiers
- gain exposure to the concepts, terminology and systems required to read and discuss primary literature in NLP
Class Information
Professor: Spencer Caplan
Office: Science Center 262A
Phone: (610) 957-6257
Office Hours:
- Tuesdays 4:30 -- 5:30pm
- Thursdays 2:30 -- 4:00pm
- or by appointment as needed
Lecture time: Tuesdays and Thursdays 11:20am -- 12:35pm
Lecture location: Singer 222
Lab A time: Tuesdays 1:05 -- 2:35pm
Lab B time: Tuesdays 2:45 -- 4:15pm
Lab location (both A and B): Science Center 240
Class discussion board: EdStem
Note to enrolled students: please don't hesitate to contact Prof. Caplan if you are having trouble accessing any of the course resources (Ed, GitHub, etc.)
Textbook
You should not purchase any textbooks this semester. All readings (both required and optional) will be posted to the course website. Many readings come from:
- Jurafsky and Martin, Speech and Language Processing, 3rd edition, 2021 (draft edition online)
Weekly Schedule
| WEEK | DAY | ANNOUNCEMENTS | TOPIC & READING | LABS |
|---|---|---|---|---|
| 1 | Jan 18 | Asynchronous Prep week | Prep Week; Required Reading, Optional | |
| | Jan 20 | | | |
| 2 | Jan 25 | Synchronous Zoom class; Lab 0 (Welcome) due | Introduction; Required Reading, Lecture Slides, Optional | |
| | Jan 27 | Synchronous Zoom class | | |
| 3 | Feb 01 | First in person class! Lab 1 (Counts) due | Language Modeling; Required Reading, Lecture Slides, Optional | |
| | Feb 03 | Add/Drop Deadline (Feb 04) | | |
| 4 | Feb 08 | | Noisy Channel; Required Reading, Lecture Slides, Optional | |
| | Feb 10 | | | |
| 5 | Feb 15 | Lab 2 (Lang Mod) due; Quiz 1 (in lab) | Guest Lecture: Ryan Budnick (UPenn); Lecture Slides | |
| | Feb 17 | | Machine Translation; Required Reading, Lecture Slides, Optional | |
| 6 | Feb 22 | | | |
| | Feb 24 | | | |
| 7 | Mar 01 | Lab 3 Part-1 (MT Pset) due | Vector Semantics; Lecture Slides, Optional | |
| | Mar 03 | | Guest Lecture: Jonathan Washington (Swarthmore) | |
| | Mar 08 | Spring Break | | |
| | Mar 10 | | | |
| 8 | Mar 15 | | Vector Semantics; Required Reading, Lecture Slides, Optional | |
| | Mar 17 | Lab 3 Part-2 (MT project) due (Mar 18) | POS Tagging; Required Reading, Lecture Slides | |
| 9 | Mar 22 | | | |
| | Mar 24 | Withdrawal and CR/NC Deadline (Mar 25) | Classification; Required Reading, Lecture Slides | |
| 10 | Mar 29 | Lab 4 (Word Vectors) due | | |
| | Mar 31 | | | |
| 11 | Apr 05 | | Guest Lecture: Jordan Kodner (Stony Brook); Lecture Slides | |
| | Apr 07 | | Parsing; Required Reading, Lecture Slides | |
| 12 | Apr 12 | Lab 5 (POS) due | | |
| | Apr 14 | | | |
| 13 | Apr 19 | Lab 6 (Spam Filter) due; Quiz 2 (in lab) | | |
| | Apr 21 | | Morphology; Lecture Slides | |
| | Apr 26 | | | |
| | Apr 28 | | | |