

# Welcome! Do now:

Set your clicker to frequency AA:

- hold the power button till lights flash
- press A twice
- lights should flash again

### Announcements

- Lab 1 grades posted: y'all did well! But *please* don't get complacent.
- Clickers:
  - Present/absent: answered more than 50% of the questions
  - Correctness: check+ (>50% correct), check (<50% correct), 0 (absent)
  - IGNORE what you see on iClicker Cloud! (\*raises fists at developers\*)

# Revisiting...

- Left and Right shift
- Overflow: signed vs unsigned

Why did the US decide to ban exports of certain computer chips to China?

- A. To slow down China's military capabilities
- B. To slow down China's manufacturing industry
- C. To slow down China's AI industry
- D. Xenophobia
- E. None of these

Why did the US decide to ban exports of certain computer chips to China?

A. To slow down China's military capabilities

**B.** To slow down China's manufacturing industry

C. To slow down China's AI industry

D. Xenophobia

E. None of these

#### Moore's Law: The number of transistors on microchips doubles every two years

Moore's law describes the empirical regularity that the number of transistors on integrated circuits doubles approximately every two years. This advancement is important for other aspects of technological progress in computing – such as processing speed or the price of computers.



Data source: Wikipedia (wikipedia.org/wiki/Transistor\_count). OurWorldinData.org – Research and data to make progress against the world's largest problems.

Licensed under CC-BY by the authors Hannah Ritchie and Max Roser.

# Moore's Law

- In 1965: Intel co-founder Gordon Moore **predicted**, in his original paper, a doubling of transistors every year for the next 10 years.
- Today: An observation that the number of transistors on a microchip roughly doubles every two years, while cost is halved over the same time period.

### Your mission, if you choose to accept it:

# Build a CPU

This message will self-destruct in...

### Abstraction



### Abstraction





### Logic Gates

- Input: Boolean value(s) (high and low voltages for 1 and 0)
- Output: Boolean value, the result of a Boolean function
  - Always present, but may change when input changes





### More Logic Gates

Note the circle on the output. This circle means bitwise "not" (flip bits).



| A | B | A NAND B | A NOR B |
|---|---|----------|---------|
| 0 | 0 | 1        | 1       |
| 0 | 1 | 1        | 0       |
| 1 | 0 | 1        | 0       |
| 1 | 1 | 0        | 0       |
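The table above can be reproduced in a few lines of Python. This is only a sketch: the function names `nand`, `nor`, and `truth_table` are made up for illustration, and bits are modeled as the integers 0 and 1.

```python
def nand(a: int, b: int) -> int:
    """AND followed by NOT (the circle on the gate's output)."""
    return 1 - (a & b)


def nor(a: int, b: int) -> int:
    """OR followed by NOT."""
    return 1 - (a | b)


def truth_table(gate):
    """List (A, B, output) rows in the same order as the slide's table."""
    return [(a, b, gate(a, b)) for a in (0, 1) for b in (0, 1)]


if __name__ == "__main__":
    for name, gate in (("NAND", nand), ("NOR", nor)):
        for a, b, out in truth_table(gate):
            print(f"{a} {name} {b} = {out}")
```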

# **Combinational Logic Circuits**

• Build up higher level processor functionality from basic gates



- Outputs are boolean functions of inputs
- Outputs continuously respond to changes to inputs

# What does this circuit output?




# Building more interesting circuits...

• Build up XOR from basic gates (AND, OR, NOT)

| A | B | A ^ B |
|---|---|-------|
| 0 | 0 | 0     |
| 0 | 1 | 1     |
| 1 | 0 | 1     |
| 1 | 1 | 0     |

• Q: When is A ^ B == 1?



# Building an XOR circuit


- General strategy:
- 1. Determine truth table (given below)
- 2. Determine for which rows the result is 1
  - express each row with 1 result in terms of input values A, B combined with AND, NOT
  - combine each row expression with OR
- 3. Translate expression to a circuit

| A | B | A ^ B |
|---|---|-------|
| 0 | 0 | 0     |
| 0 | 1 | 1     |
| 1 | 0 | 1     |
| 1 | 1 | 0     |
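The general strategy can be sketched in Python: take each row whose result is 1, write it as an AND of the inputs (negated where the input is 0), and OR the rows together. The helper names `sum_of_products` and `xor_from_gates` are hypothetical, and the code assumes a two-input table.

```python
# Truth table for XOR: (A, B) -> result
XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def sum_of_products(table):
    """Steps 1-2: build an AND/NOT term per row with result 1, join with OR."""
    terms = []
    for (a, b), out in table.items():
        if out == 1:
            left = "A" if a else "~A"
            right = "B" if b else "~B"
            terms.append(f"({left} & {right})")
    return " | ".join(terms)


def xor_from_gates(a: int, b: int) -> int:
    """Step 3: the same expression using only AND, OR, NOT on single bits."""
    not_a, not_b = 1 - a, 1 - b
    return (not_a & b) | (a & not_b)


if __name__ == "__main__":
    print(sum_of_products(XOR_TABLE))
    for (a, b), expected in XOR_TABLE.items():
        print(a, b, xor_from_gates(a, b), expected)
```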

# Which of these is an XOR circuit?

Draw an XOR circuit using AND, OR, and NOT gates.

I'll show you the clicker options after you've had some time.

| A | B | A ^ B |
|---|---|-------|
| 0 | 0 | 0     |
| 0 | 1 | 1     |
| 1 | 0 | 1     |
| 1 | 1 | 0     |

### Which of these is an XOR circuit?









E: None of these are XOR.


### XOR Circuit: Abstraction

$$A \mathbin{\hat{}} B == (\sim A \,\&\, B) \mid (A \,\&\, \sim B)$$



A:0 B:1 A^B:

A:1 B:1 A^B:

# Recall Goal: Build a CPU (model)

Three main classifications of hardware circuits:

- 1. ALU: implement arithmetic & logic functionality
  - Example: adder circuit to add two values together
- 2. Storage: to store binary values
  - Example: set of CPU registers ("register file") to store temporary values
- 3. Control: support/coordinate instruction execution
  - Example: circuitry to fetch the next instruction from memory and decode it



# Recall Goal: Build a CPU (model)

Three main classifications of hardware circuits:

- 1. ALU: implement arithmetic & logic functionality
  - Example: adder circuit to add two values together

Start with ALU components (e.g., adder circuit, bitwise operator circuits). Combine component circuits into ALU!

The CPU:

| 1. Processing Unit | 2. Control Unit |
|--------------------|-----------------|
| ALU, registers     | PC, IR          |


# Arithmetic Circuits

- 1 bit adder: A+B
- Two outputs:
  - 1. Obvious one: the sum
  - 2. Other one: ??

| A | B | Sum (A + B) | $C_{out}$ |
|---|---|-------------|------|
| 0 | 0 |             |      |
| 0 | 1 |             |      |
| 1 | 0 |             |      |
| 1 | 1 |             |      |

### Which of these circuits is a one-bit adder?

| A | B | Sum (A + B) | $C_{out}$ |
|---|---|-------------|-----------|
| 0 | 0 | 0           | 0         |
| 0 | 1 | 1           | 0         |
| 1 | 0 | 1           | 0         |
| 1 | 1 | 0           | 1         |
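One way to check a candidate circuit against this table is to model it in Python. The sketch below assumes the sum column is A XOR B and the carry column is A AND B, which is what the four rows show; `half_adder` is an illustrative name, not something from the slides.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two single bits."""
    total = a ^ b      # sum bit is 1 when exactly one input is 1
    carry = a & b      # carry out is 1 only when both inputs are 1
    return total, carry


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, half_adder(a, b))
```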


















# More than one bit?

• When adding, we sometimes have *carry in* too

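$$
\begin{array}{cccccccc}
  &   & 1 & 1 & 1 & 1 &   &   \\
  & 0 & 0 & 1 & 1 & 0 & 1 & 0 \\
+ & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
\hline
\end{array}
$$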

### Write Boolean expressions for Sum = 1 and $C_{out} = 1$

| A | B | $C_{in}$ | Sum | $C_{out}$ |
|---|---|-------------------|-----|-----------|
| 0 | 0 | 0                 | 0   | 0         |
| 0 | 1 | 0                 | 1   | 0         |
| 1 | 0 | 0                 | 1   | 0         |
| 1 | 1 | 0                 | 0   | 1         |
| 0 | 0 | 1                 | 1   | 0         |
| 0 | 1 | 1                 | 0   | 1         |
| 1 | 0 | 1                 | 0   | 1         |
| 1 | 1 | 1                 | 1   | 1         |

• When is Sum 1?

$$(\sim C_{in} \,\&\, (A \mathbin{\hat{}} B)) \mid (C_{in} \,\&\, \sim(A \mathbin{\hat{}} B)) == C_{in} \mathbin{\hat{}} (A \mathbin{\hat{}} B)$$

• When is C<sub>out</sub> 1?

$$(A \,\&\, B) \mid ((A \mathbin{\hat{}} B) \,\&\, C_{in})$$
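The two expressions can be checked against ordinary arithmetic with a short Python sketch. The function name `full_adder` is an assumption for illustration; the logic is the Sum and C<sub>out</sub> expressions from this slide.

```python
def full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    """Return (sum, carry_out) for single bits a, b and a carry-in."""
    total = c_in ^ (a ^ b)                 # Sum is 1 for an odd number of 1 inputs
    c_out = (a & b) | ((a ^ b) & c_in)     # carry when at least two inputs are 1
    return total, c_out


if __name__ == "__main__":
    for c_in in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                total, c_out = full_adder(a, b, c_in)
                # The two output bits together encode a + b + c_in.
                print(a, b, c_in, "->", total, c_out, a + b + c_in == total + 2 * c_out)
```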

# One-bit (full) adder

• Need to include: carry-in and carry-out

| A | B | $C_{in}$ | Sum | $C_{out}$ |
|---|---|-------------------|-----|-----------|
| 0 | 0 | 0                 | 0   | 0         |
| 0 | 1 | 0                 | 1   | 0         |
| 1 | 0 | 0                 | 1   | 0         |
| 1 | 1 | 0                 | 0   | 1         |
| 0 | 0 | 1                 | 1   | 0         |
| 0 | 1 | 1                 | 0   | 1         |
| 1 | 0 | 1                 | 0   | 1         |
| 1 | 1 | 1                 | 1   | 1         |





### Multi-bit Adder (Ripple-carry Adder)



#### Three-bit Adder (Ripple-carry Adder)
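A ripple-carry adder chains one-bit full adders so that each carry-out feeds the next bit's carry-in. The sketch below is a software model under that assumption; `ripple_carry_add` and its `n_bits` parameter are hypothetical names, and the final carry-out is returned separately so overflow is visible.

```python
def full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum, carry_out)."""
    return c_in ^ (a ^ b), (a & b) | ((a ^ b) & c_in)


def ripple_carry_add(x: int, y: int, n_bits: int = 3) -> tuple[int, int]:
    """Add the low n_bits of x and y one bit at a time, least significant first."""
    carry = 0
    result = 0
    for i in range(n_bits):
        a = (x >> i) & 1
        b = (y >> i) & 1
        bit, carry = full_adder(a, b, carry)   # carry ripples to the next bit
        result |= bit << i
    return result, carry


if __name__ == "__main__":
    print(ripple_carry_add(0b011, 0b001))      # fits in 3 bits
    print(ripple_carry_add(0b111, 0b001))      # wraps around, carry out is set
    print(ripple_carry_add(0b0011010, 0b0001111, n_bits=7))
```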





# Arithmetic Logic Unit (ALU)

- One component that knows how to manipulate bits in multiple ways
  - Addition
  - Subtraction
  - Multiplication / Division
  - Bitwise AND, OR, NOT, etc.
- Built by combining components
  - Take advantage of abstraction and sharing/reusing hardware when possible (e.g., subtraction using adder)

3-bit inputs A and B:








# Which of these circuits lets us select between two inputs?










#### Multiplexor: Chooses an input value

<u>Inputs</u>: 2<sup>N</sup> data inputs, N signal bits

<u>Output</u>: one of the 2<sup>N</sup> input values



- Control signal c chooses which input becomes the output
  - When c is 1: choose a, when c is 0: choose b

#### N-Way Multiplexor

Choose one of N inputs, need log<sub>2</sub> N select bits



#### Example 1-bit, 4-way MUX

• When select input is 2 (0b10): C chosen as output
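A multiplexor can be modeled with the same AND/OR/NOT building blocks. This sketch assumes the inputs are ordered so that select values 0, 1, 2, 3 pick A, B, C, D; the slide only confirms that 2 picks C. The names `mux2` and `mux4` are illustrative.

```python
def mux2(a: int, b: int, c: int) -> int:
    """2-way MUX: when c is 1 choose a, when c is 0 choose b."""
    return (c & a) | ((1 - c) & b)


def mux4(a: int, b: int, c: int, d: int, s1: int, s0: int) -> int:
    """1-bit, 4-way MUX built from three 2-way MUXes.

    s1 is the high select bit, s0 the low one (assumed ordering).
    """
    low_pair = mux2(b, a, s0)     # s0 == 0 -> a, s0 == 1 -> b
    high_pair = mux2(d, c, s0)    # s0 == 0 -> c, s0 == 1 -> d
    return mux2(high_pair, low_pair, s1)


if __name__ == "__main__":
    # Select input 2 (0b10): C is chosen as the output.
    print(mux4(0, 0, 1, 0, s1=1, s0=0))
```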





## ALU: Arithmetic Logic Unit



- Arithmetic and logic circuits: ADD, SUB, NOT, ...
- Control circuits: use op bits to select output
- Circuits around ALU:
  - Select input values X and Y from instruction or register
  - Select op bits from instruction to feed into ALU
  - Feed output somewhere
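The idea of computing several results and letting the op bits select one can be sketched as follows. The op-code numbering and the 3-bit width are assumptions made for this example, not the encoding of any particular machine.

```python
MASK = 0b111  # model 3-bit values, so results wrap around at 8

# Hypothetical op-code assignment for this sketch.
OP_ADD, OP_SUB, OP_AND, OP_OR = 0, 1, 2, 3


def alu(x: int, y: int, op: int) -> int:
    """Compute every candidate result, then select one using the op bits."""
    results = [
        (x + y) & MASK,   # OP_ADD
        (x - y) & MASK,   # OP_SUB (wraps like two's complement)
        x & y,            # OP_AND
        x | y,            # OP_OR
    ]
    return results[op]    # plays the role of the multiplexor


if __name__ == "__main__":
    print(alu(0b011, 0b001, OP_ADD))
    print(alu(0b001, 0b011, OP_SUB))
```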

# Goal: Build a CPU (model)

Three main classifications of hardware circuits:

- 1. ALU: implement arithmetic & logic functionality
  - Example: adder circuit to add two values together
- 2. Storage: to store binary values
  - Example: set of CPU registers ("register file") to store temporary values
- 3. Control: support/coordinate instruction execution
  - Example: circuitry to fetch the next instruction from memory and decode it



# Goal: Build a CPU (model)

Three main classifications of hardware circuits:

- 2. Storage: to store binary values
  - Example: set of CPU registers ("register file") to store temporary values

Give the CPU a "scratch space" to perform calculations and keep track of the state it's in.



## CPU so far...

- We can perform arithmetic!
- Storage questions:
  - Where do the ALU input values come from?
  - Where do we store the result?
  - What does this "register" thing mean?





## Memory Circuit Goals: Starting Small

- Store a 0 or 1
- Retrieve the 0 or 1 value on demand (read)
- Set the 0 or 1 value on demand (write)

#### R-S Latch: Stores Value Q

When R and S are both 1: Maintain a value

R and S are never both simultaneously 0



- To write a new value:
  - Set S to 0 momentarily (R stays at 1): to write a 1
  - Set R to 0 momentarily (S stays at 1): to write a 0
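The write behavior can be simulated in Python. This sketch assumes the latch is built from two cross-coupled NAND gates (the circuit diagram is not reproduced here), and the class name `RSLatch` and its `apply` method are made up for illustration.

```python
def nand(a: int, b: int) -> int:
    return 1 - (a & b)


class RSLatch:
    """Two cross-coupled NAND gates; S and R are active when 0."""

    def __init__(self) -> None:
        self.q = 0
        self.q_bar = 1

    def apply(self, s: int, r: int) -> int:
        if s == 0 and r == 0:
            raise ValueError("R and S must never both be 0")
        # Re-evaluate the feedback loop until the outputs stop changing.
        for _ in range(4):
            q = nand(s, self.q_bar)
            q_bar = nand(r, q)
            if (q, q_bar) == (self.q, self.q_bar):
                break
            self.q, self.q_bar = q, q_bar
        return self.q


if __name__ == "__main__":
    latch = RSLatch()
    print(latch.apply(0, 1))  # S momentarily 0: write a 1
    print(latch.apply(1, 1))  # both 1: value is maintained
    print(latch.apply(1, 0))  # R momentarily 0: write a 0
    print(latch.apply(1, 1))  # both 1: value is maintained
```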

## Gated D Latch

Controls R-S latch writing; ensures S & R are never both 0



D: into top NAND, ~D into bottom NAND

WE: write-enable; when set, latch is set to value of D

Latches are used in registers (up next) and SRAM (caches, later): fast, not very dense, expensive

DRAM: capacitor-based



#### An N-bit Register

- Fixed-size storage (8-bit, 32-bit, 64-bit, etc.)
- Gated D latch lets us store one bit
  - Connect N of them to the same write-enable wire!
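A behavioral sketch of this idea: each latch stores one bit and only changes when write-enable is set, and a register is N latches sharing that write-enable wire. The class names and the least-significant-bit-first ordering are assumptions for this example; the gate-level details of the latch are deliberately hidden.

```python
class GatedDLatch:
    """Stores one bit; the stored bit follows D only while WE is 1."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, d: int, we: int) -> int:
        if we:
            self.q = d
        return self.q


class Register:
    """N gated D latches driven by one shared write-enable signal."""

    def __init__(self, n_bits: int = 8) -> None:
        self.latches = [GatedDLatch() for _ in range(n_bits)]

    def write(self, value: int, we: int = 1) -> None:
        for i, latch in enumerate(self.latches):
            latch.update((value >> i) & 1, we)   # bit i goes to latch i

    def read(self) -> int:
        return sum(latch.q << i for i, latch in enumerate(self.latches))


if __name__ == "__main__":
    reg = Register(8)
    reg.write(0b10110010)
    reg.write(0xFF, we=0)   # write-enable off: contents unchanged
    print(bin(reg.read()))
```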







## "Register file"

- A set of registers for the CPU to store temporary values.
- This is (finally) something you will interact with!



- Instructions of form:
  - "add R1 + R2, store result in R3"

## Memory Circuit Summary

- Lots of abstraction going on here!
  - Gates hide the details of transistors.
  - Build R-S Latches out of gates to store one bit.
  - Combining multiple latches gives us N-bit register.
  - Grouping N-bit registers gives us register file.
- Register file's simple interface:
  - Read R<sub>x</sub>'s value, use for calculation
  - Write R<sub>v</sub>'s value to store result
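The register file's interface is small enough to sketch directly. The class below is a hypothetical model (four 8-bit registers are an arbitrary choice), and the last lines act out the "add R1 + R2, store result in R3" style of instruction.

```python
class RegisterFile:
    """A small set of fixed-width registers addressed by index."""

    def __init__(self, n_registers: int = 4, n_bits: int = 8) -> None:
        self.mask = (1 << n_bits) - 1
        self.regs = [0] * n_registers

    def read(self, index: int) -> int:
        return self.regs[index]

    def write(self, index: int, value: int) -> None:
        self.regs[index] = value & self.mask   # keep only n_bits


rf = RegisterFile()
rf.write(1, 5)
rf.write(2, 7)
# "add R1 + R2, store result in R3"
rf.write(3, rf.read(1) + rf.read(2))

if __name__ == "__main__":
    print(rf.read(3))
```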

## CPU so far...

We know how to store data (in register file).

We know how to perform arithmetic on it, by feeding it to ALU.

Remaining questions:

- Which register(s) do we use as input to ALU?
- Which operation should the ALU perform?
- To which register should we store the result?





# Goal: Build a CPU (model)

Three main classifications of hardware circuits:

- 1. ALU: implement arithmetic & logic functionality
  - Example: adder circuit to add two values together
- 2. Storage: to store binary values
  - Example: set of CPU registers ("register file") to store temporary values
- 3. Control: support/coordinate instruction execution
  - Example: circuitry to fetch the next instruction from memory and decode it



# Goal: Build a CPU (model)

Three main classifications of hardware circuits:



- 3. Control: support/coordinate instruction execution
  - Example: circuitry to fetch the next instruction from memory and decode it

Keep track of where we are in the program.

Execute an instruction, move on to the next...

#### Recall: Von Neumann Model



#### CPU Game Plan

- Fetch instruction from memory
- Decode what the instruction is telling us to do
  - Tell the ALU what it should be doing
  - Find the correct operands
- Execute the instruction (arithmetic, etc.)
- Store the result
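The game plan can be sketched as a loop in Python. Everything about the instruction format here is invented for illustration: instructions are tuples of `(op, dst, src1, src2)`, the only operations are `add`, `sub`, and `halt`, and the program lives in a plain list rather than real memory.

```python
def run(program, registers):
    """Fetch, decode, execute, store, repeat until a halt instruction."""
    pc = 0                                   # program counter
    while True:
        ir = program[pc]                     # fetch into the instruction register
        pc += 1                              # move on to the next instruction
        op, dst, src1, src2 = ir             # decode: operation and operands
        if op == "halt":
            return registers
        if op == "add":                      # execute
            result = registers[src1] + registers[src2]
        elif op == "sub":
            result = registers[src1] - registers[src2]
        else:
            raise ValueError(f"unknown operation: {op}")
        registers[dst] = result              # store the result


if __name__ == "__main__":
    program = [
        ("add", 3, 1, 2),    # R3 = R1 + R2
        ("sub", 0, 3, 1),    # R0 = R3 - R1
        ("halt", 0, 0, 0),
    ]
    print(run(program, [0, 5, 7, 0]))
```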

## Program State

Let's add two more special registers (not in register file) to keep track of the program.



## To Recap:

- OR, AND, NOT, XOR gates
- Boolean expressions
- 1-N-bit adders
- 2-N-way multiplexors (MUXs not... MUK)



- ALUs
- Registers

#### Your TODO List

- HW2, Lab 2
- The next 11 weeks: Read the readings before class

## Fetching instructions.

Load IR with the contents of memory at the address stored in the PC.



## Decoding instructions.

Interpret the instruction bits: What operation? Which arguments?




#### Executing instructions.



## Storing results.

We've just computed something. Where do we put it?



### Questions so far?




Why do we need a program counter? Can't we just start executing instruction at address 0 and count up one at a time from there?

- A. We don't, it's there for convenience.
- B. Some instructions might skip the PC forward by more than one.
- C. Some instructions might adjust the PC backwards.
- D. We need the PC for some other reason(s).


## Storing results.



#### Recap CPU Model

Four stages: fetch instruction, decode instruction, execute, store result



#### Fetching instructions.

Load IR with the contents of memory at the address stored in the PC.



#### Decoding instructions.





#### Executing instructions.



## Storing results.

Interpret the instruction bits: Store result in register, memory, PC.



## Clocking

- Need to periodically transition from one instruction to the next.
- It takes time to fetch from memory, for signal to propagate through wires, etc.
  - Too fast: don't fully compute result
  - Too slow: waste time

## Clock Driven System

- Everything in a CPU is driven by a discrete clock
  - clock: an oscillator circuit, generates high/low pulses



- Clock determines how fast system runs
  - Processor can only do one thing per clock cycle
    - Usually just one part of executing an instruction
  - 1 GHz processor: 1 billion cycles/second $\rightarrow$ 1 cycle every nanosecond
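The conversion from clock rate to cycle time is a single division, sketched here in Python; `cycle_time_ns` is an illustrative name.

```python
def cycle_time_ns(clock_hz: float) -> float:
    """Length of one clock cycle in nanoseconds for a given clock rate in Hz."""
    return 1e9 / clock_hz


if __name__ == "__main__":
    print(cycle_time_ns(1e9))   # 1 GHz
    print(cycle_time_ns(2e9))   # 2 GHz: each cycle is half as long
```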

### Cycle Time: Laundry Analogy

- Discrete stages: fetch, decode, execute, store
- Analogy (laundry): washer, dryer, folding, dresser



You have big problems if you have millions of loads of laundry to do....



(6 laundry loads per day)

# Pipelining (Laundry)



Steady state: One load finishes every hour! (Not every four hours like before.)
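The laundry numbers can be worked out with a small sketch. It assumes four stages that each take one hour, which matches "every four hours" without pipelining and "every hour" in steady state; the function names are made up for this example.

```python
STAGES = 4            # washer, dryer, folding, dresser
HOURS_PER_STAGE = 1   # assumed: every stage takes one hour


def sequential_hours(loads: int) -> int:
    """Each load goes through all stages before the next load starts."""
    return loads * STAGES * HOURS_PER_STAGE


def pipelined_hours(loads: int) -> int:
    """A new load starts every hour once the previous one leaves the washer."""
    if loads == 0:
        return 0
    # The first load takes all four stages; each later load adds one more hour.
    return (STAGES + loads - 1) * HOURS_PER_STAGE


if __name__ == "__main__":
    for loads in (1, 6, 1_000_000):
        print(loads, sequential_hours(loads), pipelined_hours(loads))
```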



## Pipelining

(For more details about this and the other things we talked about here, take architecture.)

#### Up next

• Talking to the CPU: Assembly language