CS41: Final Exam Study Guide
This study guide includes topics covered over the entire
semester. The final will be cumulative, although weighted
towards the second half of the semester. This study guide is
intended to be comprehensive, but not guaranteed to be
complete. You are responsible for all material discussed in
lecture or lab, and all material on homework assignments. You
are also expected to be familiar with data structures introduced
in CS35.
In addition to using this study guide, you should:
- review your lecture notes
- review your homework assignments
- review lab questions
You can also look over the assigned readings from the textbook,
although they were not required.
You should be able to define and explain the following terms:
- stable matching
- matching instability
- Gale-Shapley algorithm
- algorithm analysis
- asymptotic notation (big-Oh, big-Omega, big-Theta)
- direct proof
- induction
- proof by contradiction
- data structures
- list
- array
- adjacency list and adjacency matrix
- any data structure from CS35 that you want to use (but make
sure you are doing the correct worst-case analysis!)
- graph algorithms
- breadth-first search (BFS)
- depth-first search (DFS)
- topological ordering
- spanning tree
- minimum spanning tree (MST)
- Kruskal's Algorithm
- Prim's Algorithm
- directed, undirected graphs
- bipartite graph
- cycle
- connected component
- greedy algorithm
- stays-ahead method
- exchange argument
- divide and conquer
- recurrence relations
- recursion tree
- substitution method (and partial substitution method)
- mergesort
- Karatsuba multiplication
- dynamic programming
- network flow
- augmenting path
- Ford-Fulkerson algorithm
- maximum flow / minimum cut
- computational complexity
- intractability
- decision problem
- optimization problem
- complexity classes (P, NP, NP-Complete, NP-hard)
- The Cook-Levin Theorem
- NP-Complete problems (SAT, VertexCover, IndependentSet, SetCover, 3-Coloring, 3SAT, ...)
- polynomial-time reduction
- polynomial-time verifier
- approximation
- approximation algorithm
- approximation ratio
- minimization problem, maximization problem
- travelling salesman problem (TSP)
- maximum cut
- randomization
- randomized algorithms
- discrete probability
- random variable
- expected value
- linearity of expectation
Practice problems:
- Divide and Conquer. An investment company has designed
several new trading strategies and wants to compare them.
Unfortunately, the strategies were tested on different commodities
over different time periods, so the total profit earned is not a
fair comparison. Instead, the company wants to compare each
strategy's profit to the maximum possible profit that could have
been made (with perfect hindsight) during the same time span.
Each strategy was allowed to buy a fixed amount of the commodity
once and then sell what it bought at some later date.
During each testing period, the investment company recorded the
minimum and maximum prices that were reached each day. Your
algorithm receives a list of $n$ pairs, where the $i^{th}$ pair
has the minimum price of the commodity and the maximum price of
the commodity on day $i$. Your algorithm should output the
largest price difference that could have been achieved by buying
at the minimum price on day $i$ and selling at the maximum price
on day $j > i$.
For example, suppose $n=3$ and the (min, max) prices were
$[ (8, 14), (1, 2), (4, 6) ]$. Then you should return a per-unit
profit of 5, corresponding to buying for 1 on day 2 and selling
for 6 on day 3 (recall that the trading strategies must buy
first, and cannot sell on the same day).
Clearly, there is a simple algorithm that takes time
$O(n^2)$: try all possible pairs of buy/sell days and see which
makes the most money. Design a divide and conquer algorithm to
determine the best profit in hindsight more efficiently. Set up a
recurrence that describes the running time of your algorithm, and
solve it to show that your algorithm is faster than $O(n^2)$.
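As a concrete reference point, the simple $O(n^2)$ baseline mentioned above could be sketched as follows (the function name is illustrative; your divide and conquer solution should beat this running time):

```python
def best_profit_brute_force(prices):
    """prices[i] = (min_price, max_price) on day i.
    Try every buy day i and every later sell day j > i: O(n^2)."""
    best = float('-inf')
    n = len(prices)
    for i in range(n):
        for j in range(i + 1, n):
            # buy at day i's minimum, sell at day j's maximum
            best = max(best, prices[j][1] - prices[i][0])
    return best
```

On the example above, `best_profit_brute_force([(8, 14), (1, 2), (4, 6)])` returns 5.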
- Dynamic Programming. In the Subset-Sum
problem, you are given $n$ items $\{1, ..., n\}$. Each item
has a nonnegative integer weight $w_i$. You are also given
an integer weight threshold $W$. For a subset of
elements $S$, define $w(S)$ to be the sum of
$w_i$, taken over all $i$ in $S$. Your goal is to output
the set $S$ that maximizes $w(S)$, subject to $w(S) \leq W$.
The Subset-Sum problem can be solved using a
dynamic programming approach where the subproblems use a
subset of the elements and a smaller threshold. The dynamic
programming table stores in entry $(i, j)$ the solution to the
subset sum problem for items $\{1, ..., i\}$ with threshold $j$.
- Explain how to calculate SubsetSum$(i, j)$, assuming
that the table already contains correct answers to all
smaller subproblems $(i', j')$, namely those such that
either $(i' \leq i$ and $j' < j)$ or $(i' < i$ and $j'
\leq j)$.
- What is the running time of the dynamic programming
algorithm that fills this table for successively larger
subsets and thresholds and then returns entry $(n, W)$?
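One natural way to fill the table described above uses a take-or-skip choice for item $i$: either the solution to $(i, j)$ ignores item $i$ (giving entry $(i-1, j)$), or it includes item $i$ (giving $w_i$ plus entry $(i-1, j - w_i)$, allowed only when $w_i \leq j$). A minimal sketch of that scheme:

```python
def subset_sum_max(weights, W):
    """Max total weight of a subset of `weights` not exceeding W.
    opt[i][j] = answer for items {1, ..., i} with threshold j."""
    n = len(weights)
    opt = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w = weights[i - 1]
        for j in range(W + 1):
            opt[i][j] = opt[i - 1][j]              # skip item i
            if w <= j:                             # or take item i
                opt[i][j] = max(opt[i][j], w + opt[i - 1][j - w])
    return opt[n][W]
```

The two nested loops make the table-filling cost easy to read off; recovering the set $S$ itself takes one more backward pass over the table.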
- Intractability. Consider the following
decision-version of Subset-Sum, which we'll
call SSum. In this version, you're given $n$ items
$\{1,..., n\}$ with item weights $w_i$, and a weight
threshold $W$, and you must output YES iff there exists
a subset whose item weights total exactly $W$; i.e., if there
is $S$ such that $w(S) = W$.
- Show that SSum is NP-Complete.
- Show how to solve SSum using your solution for
Subset-Sum from the previous problem.
- Why does your answer to (b) not pose a
contradiction with your dynamic programming algorithm for
Subset-Sum (assuming $P \neq NP$)?
- Network flow. (Kleinberg and Tardos 7.12) You are
given a flow network with unit-capacity edges: it consists of
a directed graph $G=(V,E)$, a source $s \in V$, a sink $t \in
V$, and $c_e = 1$ for all $e \in E$. You are also given an
integer $k$.
Your goal is to delete $k$ edges in order to reduce the
maximum $s \rightsquigarrow t$ flow in $G$ by as much as
possible. In other words, you should find a set of edges $F
\subset E$ so that $|F| = k$ and the maximum
$s\rightsquigarrow t$ flow in $G' = (V, E \setminus F)$ is as
small as possible.
Give a polynomial-time algorithm to solve this problem.
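The natural building block here is a maximum-flow computation. As a reminder of the mechanics, here is a minimal Ford-Fulkerson sketch for unit-capacity edges, using BFS to find augmenting paths and an adjacency-matrix residual graph for brevity (nodes are assumed to be labeled $0, \ldots, n-1$):

```python
from collections import deque

def max_flow_unit(n, edges, s, t):
    """Max s->t flow; `edges` is a list of directed (u, v) pairs,
    each with capacity 1."""
    cap = [[0] * n for _ in range(n)]   # residual capacities
    for u, v in edges:
        cap[u][v] += 1
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:             # no augmenting path: flow is max
            return flow
        # with unit capacities, push 1 unit along the path found
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= 1              # use forward residual capacity
            cap[v][u] += 1              # add backward residual capacity
            v = u
        flow += 1
```

Recall from class that when this loop terminates, the nodes reachable from $s$ in the residual graph define a minimum cut; that fact is what connects the flow computation to the edge-deletion question above.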
- Approximation Algorithms. (Kleinberg and Tardos
11.10) Suppose you are given an $n \times n$ grid graph $G$.
Associated with each node $v$ is a weight $w_v$, which is a
nonnegative integer. You may assume that the weights of all
nodes are distinct. Your goal is to choose
an independent set $S$ of nodes of the grid so that the
sum of the weights of the nodes in $S$ is as large as
possible. Consider the following greedy algorithm for this
problem:
- Start with $S$ equal to the empty set
- While some node remains in $G$
- Pick the heaviest remaining node $v$.
- Add $v$ to $S$
- Delete $v$ and all its neighbors from $G$
- Return $S$
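The greedy loop above can be sketched directly in Python (the dict-based graph representation is just one convenient choice; the algorithm works on any graph, and the problem applies it to the $n \times n$ grid):

```python
def greedy_independent_set(weights, adj):
    """weights: node -> weight; adj: node -> set of neighbors.
    Repeatedly take the heaviest remaining node, then delete it
    and all of its neighbors from the graph."""
    remaining = set(weights)
    S = set()
    while remaining:
        v = max(remaining, key=lambda u: weights[u])  # heaviest left
        S.add(v)
        remaining.discard(v)
        remaining -= adj[v]          # neighbors of v can't join S
    return S
```

Note that $S$ is always an independent set, since a node's neighbors are removed the moment the node is selected.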
First, let $S$ be the independent set returned by this algorithm, and
let $T$ be some other independent set in $G$. Show that for each node $v$
in $T$, either $v$ is in $S$, or $v$ is adjacent to some
$v'$ in $S$ with $w_{v'} \geq w_v$.
Show that this algorithm returns an independent set with total
weight at least 1/4 of the maximum total weight of an independent set.
- Randomized Algorithms. (Kleinberg and Tardos 13.8)
Let
$G =(V,E)$ be an undirected graph with $n$ nodes and $m$ edges. For a subset
of vertices $X\subseteq V$, let $G[X]$ denote the subgraph induced on $X$; i.e.,
the graph whose vertices are $X$ and whose edge set consists of all
edges in $G$ with both endpoints in $X$.
We are given a natural number $k \leq n$ and are interested in finding
a set of $k$ nodes that induces a "dense" subgraph of $G$; we'll phrase
this concretely as follows. Give a randomized algorithm that
produces, for a given graph $G$ and natural number $k$, a set $X$ of $k$
vertices with the property that the induced subgraph $G[X]$
has at least
$\frac{mk(k-1)}{n(n-1)}$ edges.
Your algorithm should have expected polynomial time but
always output correct answers. What is the expected running
time of your algorithm?
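One candidate design, sketched below: sample a uniformly random $k$-subset, count the induced edges, and resample until the count reaches the target. Each edge survives the sampling with probability $\frac{k(k-1)}{n(n-1)}$, so by linearity of expectation the expected number of induced edges is exactly $\frac{mk(k-1)}{n(n-1)}$; the part the exercise asks you to work out is bounding the expected number of trials before a sample meets that target.

```python
import random

def dense_subgraph(vertices, edges, k):
    """Las Vegas approach: always returns a valid X, with the
    running time being the random quantity.
    vertices: a collection of nodes; edges: list of (u, v) pairs."""
    n, m = len(vertices), len(edges)
    target = m * k * (k - 1) / (n * (n - 1))
    while True:
        X = set(random.sample(list(vertices), k))
        induced = sum(1 for u, v in edges if u in X and v in X)
        if induced >= target:
            return X
```

Termination is guaranteed: by an averaging argument, some $k$-subset always achieves at least the expected number of induced edges.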