CS63 Artificial Intelligence

Exam 2 Review

Introduction

For each topic there is a list of terms. You should be able to define these terms, explain their relevance to AI, and provide a concrete example.

You do not need to memorize any formulas. However, given a formula, such as the perceptron learning rule or the update rule for Q-learning, you should understand how to apply it to a specific problem.
The exam will focus on algorithms/methods that you explored in labs. For each ML approach, what are its strengths and weaknesses? For what types of problems would each be most appropriate?
Machine Learning

Terminology
- Machine learning
  - Supervised learning
  - Reinforcement learning
  - Unsupervised learning
- Training set
- Testing/Validation set
- Generalization
- Overfitting
- Error
- Confusion Matrix
- Learning rate
Questions
1. Why would we want to give an AI system the opportunity to learn?
2. Why not just program in the ability we want?
3. What sorts of problems is ML good for?
4. What sorts of problems is ML inappropriate for?
5. What are the benefits and drawbacks of accuracy as an evaluation metric? When (and how) might we want to use a confusion matrix?
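
As an illustration of question 5, the following minimal sketch compares accuracy with a confusion matrix on a made-up, imbalanced data set. The data, the always-predict-0 "model", and the helper names `accuracy` and `confusion_matrix` are all assumptions chosen for this example, not something taken from the labs.

```python
from collections import Counter

def accuracy(targets, predictions):
    """Fraction of predictions that match their targets."""
    correct = sum(t == p for t, p in zip(targets, predictions))
    return correct / len(targets)

def confusion_matrix(targets, predictions, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(targets, predictions))
    return [[counts[(t, p)] for p in labels] for t in labels]

# Hypothetical imbalanced data: nine negatives and one positive.
targets = [0] * 9 + [1]
# A "model" that always predicts the majority class.
predictions = [0] * 10

print(accuracy(targets, predictions))                  # 0.9
print(confusion_matrix(targets, predictions, [0, 1]))  # [[9, 0], [1, 0]]
```

Accuracy looks high here, but the second row of the matrix shows that the only positive example was missed, which accuracy alone hides.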
Artificial Neural Networks

Terminology
- Unit
- Weight
- Layer
- Feed-forward
- Fully connected
- Activation function
  - Step
  - Sigmoid
  - ReLU
- Weighted Sum
- Activation value
- Bias
- Linearly separable
- Hidden layer representations
- Back-propagation learning rule: Derived by doing gradient descent on error
Perceptron learning rule:
```
  w += learningRate * (target - output) * input
```
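
One possible way to apply the rule above is sketched below in Python. It assumes a step activation that fires when the net input is greater than 0, treats the bias as a weight on a constant input of 1, and uses 3-input OR as a practice function. The function names, the learning rate of 1.0, and the epoch limit are illustrative choices, not part of the course materials.

```python
from itertools import product

def step(net_input):
    """Step activation: 1 when the net input is greater than 0, else 0."""
    return 1 if net_input > 0 else 0

def predict(weights, bias, inputs):
    net_input = bias + sum(w * x for w, x in zip(weights, inputs))
    return step(net_input)

def train_perceptron(examples, learning_rate=1.0, max_epochs=100):
    """Apply w += learningRate * (target - output) * input until no errors remain."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in examples:
            output = predict(weights, bias, inputs)
            delta = learning_rate * (target - output)
            if delta != 0:
                errors += 1
                weights = [w + delta * x for w, x in zip(weights, inputs)]
                bias += delta  # assumption: bias is a weight on a constant input of 1
        if errors == 0:
            break
    return weights, bias

# Practice function (an assumption for this sketch): 3-input OR.
examples = [(bits, int(any(bits))) for bits in product([0, 1], repeat=3)]
weights, bias = train_perceptron(examples)
print(weights, bias)
```

Because OR is linearly separable, training stops once every example is classified correctly. On a function that is not linearly separable, the loop would simply run until `max_epochs`.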
Questions
1. Consider a two-layer neural network with 3 inputs and 1 output that uses the step activation function (returns 1 when the netInput is greater than 0 and otherwise returns 0). Such a model can only solve problems from the class of linearly separable functions.
  For each of the following problems, explain whether the function is linearly separable. You may want to use 3D pictures of cubes to visualize this. If a function is linearly separable, determine a set of weights that solves the problem (you can do this by hand; you don't need to use the perceptron learning rule).
  - The output turns on whenever more than one of the inputs is on.
  - The output turns on whenever both inputs two and three are on.
  - The output turns on when exactly one input is on.
2. In what ways are artificial neural networks similar to and different from biological neural networks?
3. Is backprop guaranteed to converge on a solution?
4. Given a network with an input layer of size 10, a hidden layer of size 5, and an output layer of size 2, how many weights and biases does that network contain?
5. What are the differences between the original 'perceptron learning rule' and the later 'error backpropagation algorithm'?
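
For counting questions like question 4, the arithmetic can be sketched as follows. This assumes a fully connected feed-forward network in which every non-input unit has one bias. The function name and the 3-4-1 example sizes are made up so that the exercise above is left to you.

```python
def count_dense_parameters(layer_sizes):
    """Return (weights, biases) for a fully connected feed-forward network.

    layer_sizes lists the number of units per layer, input layer first.
    """
    # Each pair of adjacent layers contributes (units in) * (units out) weights.
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # Assumption: every unit outside the input layer has exactly one bias.
    biases = sum(layer_sizes[1:])
    return weights, biases

weights, biases = count_dense_parameters([3, 4, 1])
print(weights, biases)  # 16 5
```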
Deep Learning

Terminology
- Vanishing gradient problem
- Loss
- Optimizer
- Convolution
- Keras layer types: Conv2D, MaxPooling2D, Flatten, Dense
- MNIST data sets (handwritten digits and fashion)
Questions
1. Describe the factors that resulted in modern deep learning gaining traction in the early 2010s. What changed compared to earlier work with artificial neural networks? What stayed the same?
2. What sort of problems are CNNs well suited to solve? What sorts are they particularly poorly suited to solve?
3. Given a network with 50x50 input images of depth 1 and a Conv2D layer with a single 5x5 kernel, how many tunable parameters would be in this layer?
4. What if the images were instead of size 100x100? How many tunable parameters would be in this layer?
5. What if the images were instead of the original size but with depth 3? How many tunable parameters would be in this layer?
6. What if the images were instead of depth 3 and you had 10 5x5 kernels? How many tunable parameters would be in this layer?
7. What if you added 10 2x2 pooling layers after the Conv2D layer? How many tunable parameters does this add?
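
A rough way to check answers to questions 3 through 7 is sketched below. It assumes a standard convolution layer in which each kernel spans the full input depth, shares its weights across all image positions, and has one bias (the Keras `Conv2D` default). The function name and the 3x3 examples are hypothetical.

```python
def conv2d_parameters(kernel_height, kernel_width, input_depth, num_kernels,
                      use_bias=True):
    """Count tunable parameters in a convolution layer.

    Image width and height do not appear here: the same kernel weights
    are reused at every position of the image.
    """
    per_kernel = kernel_height * kernel_width * input_depth
    if use_bias:
        per_kernel += 1  # assumption: one bias per kernel
    return per_kernel * num_kernels

print(conv2d_parameters(3, 3, 1, 8))  # 80
print(conv2d_parameters(3, 3, 3, 8))  # 224
```

Max pooling only takes the maximum over each window, so it has no tunable parameters of its own.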
Reinforcement Learning

Terminology
- State
- Action
- Value
- Reward
- Policy
- Discount
- Epsilon-greedy action selection
- Exploration vs Exploitation
- Temporal difference learning
- Q-learning, Q-table, Q-value
- Function Approximation/Deep Q-learning
  - Q network
  - Experience replay
- Catastrophic forgetting
Q-learning update rule
```
  Q(s,a) += learningRate * (reward + discount * max Q(s', a') - Q(s,a))
```
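
The update rule can be sketched in Python as follows, assuming a Q-table stored as a dictionary keyed by (state, action) pairs with unseen entries defaulting to 0. The function name `q_update`, the default learning rate and discount, and the example values are assumptions for illustration. Special handling of terminal states is left out.

```python
from collections import defaultdict

ACTIONS = ["n", "e", "s", "w"]

def q_update(q_table, state, action, reward, next_state,
             learning_rate=0.5, discount=0.9):
    """Apply one Q-learning update and return the new Q(s,a)."""
    # max Q(s', a'): value of the best action available in the next state.
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    # Difference between the new estimate and the current estimate.
    td_error = reward + discount * best_next - q_table[(state, action)]
    q_table[(state, action)] += learning_rate * td_error
    return q_table[(state, action)]

# Hypothetical starting point: only one entry is non-zero.
q_table = defaultdict(float)
q_table[((2, 2), "n")] = 1.0

# Moving east from (1,2) into (2,2) with a reward of 0.
new_value = q_update(q_table, (1, 2), "e", 0.0, (2, 2))
print(new_value)  # 0.45
```

Here the entry for moving east from (1,2) rises from 0 toward the discounted value of the best action in (2,2), which is how value propagates backward from rewarding states.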
Questions
1. Explain in your own words how the Q-learning update rule works to modify the Q-values based on the current action taken and the reward received.
2. Why did we use Approximate Q-learning to solve the cart pole problem? How effective was it?
3. Is RL guaranteed to find the optimal policy?
4. Consider the following grid-based environment, where the rewards of being in each location are shown.
```
  -------------
2 | 0 | 0 | +1|
  -------------
1 | 0 | 0 | -1|
  -------------
0 | 0 | 0 | 0 |
  -------------
    0   1   2  
```
  We will represent the state in column,row format. Suppose that the actions the agent can take are to go north, east, south, or west. If it tries to go in a direction that would take it off the grid, then it remains in its current state and receives the reward for that state on that action. Suppose that after 500 steps of training the Q-table contains the following values.
```
       actions
state  n     e     s     w
  0,0  0.73  0.69  0.65  0.65 
  0,1  0.76  0.81  0.65  0.72 
  0,2  0.00  0.90  0.17  0.00 
  1,0  0.81  0.00  0.35  0.48 
  1,1  0.90 -0.97  0.62  0.67 
  1,2  0.82  1.00  0.79  0.68 
  2,0  0.00  0.00  0.00  0.21 
  2,1  0.00  0.00  0.00  0.00 
  2,2  0.00  0.00  0.00  0.00 
    
```
  Using the grid below, draw an arrow in each location to show the agent's current policy based on the Q-table.
```
  -------------
2 |   |   |   |
  -------------
1 |   |   |   |
  -------------
0 |   |   |   |
  -------------
    0   1   2  
    
```
5. Is this policy optimal?
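
Reading a policy off a Q-table means picking the action with the highest Q-value in each state. A minimal sketch, assuming the Q-table is stored as a dictionary of per-state action values, is shown below. The two-state table is made up and is not the table from question 4.

```python
ARROWS = {"n": "^", "e": ">", "s": "v", "w": "<"}

def greedy_policy(q_table):
    """Map each state to the action with the highest Q-value.

    Ties are broken in favor of the action listed first.
    """
    return {state: max(values, key=values.get)
            for state, values in q_table.items()}

# Hypothetical Q-table with states in column,row format.
q_table = {
    (0, 0): {"n": 0.2, "e": 0.5, "s": 0.1, "w": 0.0},
    (1, 0): {"n": 0.7, "e": 0.3, "s": 0.0, "w": 0.4},
}

policy = greedy_policy(q_table)
for state, action in policy.items():
    print(state, ARROWS[action])
```

States whose Q-values are all equal, such as unvisited states, have no meaningful preferred action, so the tie-breaking choice there is arbitrary.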
Genetic Algorithms

Terminology
- Encoding/Individual
- Population
- Generation
- Fitness
- Selection
- Crossover
- Mutation
- Replacement
Questions
1. What are the three operators used by a GA to model reproduction? How are they inspired by biological evolution?
2. How is a GA similar to and different from (Stochastic) Beam Search?
3. How did we use a GA to solve reinforcement learning problems?
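
The terms above can be tied together in a small sketch. It assumes a toy problem (maximize the number of 1s in a bit string), tournament selection, one-point crossover, bit-flip mutation, and generational replacement that carries over the single best individual. All names, rates, and sizes are illustrative and may differ from the GA used in lab.

```python
import random

def fitness(individual):
    """Toy fitness: the number of 1 bits."""
    return sum(individual)

def select(population, rng, k=3):
    """Tournament selection: the fittest of k randomly chosen individuals."""
    return max(rng.sample(population, k), key=fitness)

def crossover(parent_a, parent_b, rng):
    """One-point crossover producing a single child."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(individual, rng, rate=0.05):
    """Flip each bit independently with probability rate."""
    return [1 - bit if rng.random() < rate else bit for bit in individual]

def evolve(pop_size=20, length=16, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        best = max(population, key=fitness)
        best_per_generation.append(fitness(best))
        children = [best]  # replacement keeps the current best unchanged
        while len(children) < pop_size:
            child = crossover(select(population, rng),
                              select(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    return max(population, key=fitness), best_per_generation

best, history = evolve()
print(fitness(best), history)
```

Because the best individual is copied into each new generation, the best fitness recorded in `history` never decreases, although nothing guarantees that the optimum is reached.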
Evaluating Machine Learning
Explain all of the following issues with deep learning systems and give one example of each problem:
- Long-tail events
- Lack of common sense
- Lack of interpretability
- Bias
- Spurious statistical correlations
- Vulnerability to adversarial attacks
Achieving general-purpose intelligence
- What is self-supervised learning? What issues does it address with supervised learning?
- What is embodiment? How does it relate to issues with convolutional networks?
