Data Structures and Algorithms

Test 3 Study Guide

Tests in this course may contain any material covered in lecture up until this point, although much stronger emphasis will be placed on material for which you have received graded feedback. This page contains a collection of exercises that you may wish to complete to prepare yourself for the test. Note that you do not need to complete all of the offered exercises. More questions appear here than will appear on your test.

Inductive Proofs

You should be prepared to give inductive proofs for simple mathematical summations.

  1. Prove by induction that \(\sum\limits_{i=1}^{n} i \cdot 2^i = 2^{n+1}(n-1) + 2\).

    We have \(P(n) \equiv \sum\limits_{i=1}^{n} i \cdot 2^i = 2^{n+1}(n-1) + 2\).

    The base case, \(P(1)\), is to say \(\sum\limits_{i=1}^{1} i \cdot 2^i = 2^{1+1}(1-1) + 2\). We reduce this as follows:

    \(\sum_{i=1}^{1} i \cdot 2^i\)

    =

    \(2^{1+1}(1-1) + 2\)

    \(1 \cdot 2^1\)

    =

    \(2^{2}(0) + 2\)

    \(2\)

    =

    \(0 + 2\)

    For the inductive case, we show that \(P(k)\) implies \(P(k+1)\) for positive \(k\).

    \(\sum_{i=1}^{k} i \cdot 2^i\)

    =

    \(2^{k+1}(k-1) + 2\)

    \(\left(\sum_{i=1}^{k} i \cdot 2^i\right) + (k+1)\cdot 2^{k+1}\)

    =

    \(2^{k+1}(k-1) + 2 + (k+1)\cdot 2^{k+1}\)

    \(\sum_{i=1}^{k+1} i \cdot 2^i\)

    =

    \(2^{k+1}(k-1) + 2^{k+1}(k+1) + 2\)

    \(\sum_{i=1}^{k+1} i \cdot 2^i\)

    =

    \(2^{k+1}(k-1+k+1) + 2\)

    \(\sum_{i=1}^{k+1} i \cdot 2^i\)

    =

    \(2^{k+1}(2k) + 2\)

    \(\sum_{i=1}^{k+1} i \cdot 2^i\)

    =

    \(2^{k+2}(k) + 2\)

    \(\sum_{i=1}^{k+1} i \cdot 2^i\)

    =

    \(2^{(k+1)+1}((k+1)-1) + 2\)

    As we have shown both the base case and the inductive case, \(P(n)\) must be true for all integer \(n>1\).

  2. Prove by induction that \(\sum\limits_{i=1}^{n} \frac{3}{4^i} = 1 - (\frac{1}{4})^n\).

    We have \(P(n) \equiv \sum\limits_{i=1}^{n} \frac{3}{4^i} = 1 - (\frac{1}{4})^n\).

    The base case, \(P(1)\), is to say \(\sum\limits_{i=1}^{1} \frac{3}{4^i} = 1 - (\frac{1}{4})^1\). We reduce this as follows:

    \(\sum_{i=1}^{1} \frac{3}{4^i}\)

    =

    \(1 - (\frac{1}{4})^1\)

    \(\frac{3}{4^1}\)

    =

    \(1 - (\frac{1}{4})^1\)

    \(\frac{3}{4}\)

    =

    \(1 - \frac{1}{4}\)

    \(\frac{3}{4}\)

    =

    \(\frac{3}{4}\)

    For the inductive case, we show that \(P(k)\) implies \(P(k+1)\) for positive \(k\).

    \(\sum_{i=1}^{k} \frac{3}{4^i}\)

    =

    \(1 - (\frac{1}{4})^k\)

    \(\left(\sum_{i=1}^{k} \frac{3}{4^i}\right) + \frac{3}{4^{k+1}}\)

    =

    \(1 - (\frac{1}{4})^k + \frac{3}{4^{k+1}}\)

    \(\sum_{i=1}^{k+1} \frac{3}{4^i}\)

    =

    \(1 - \frac{1^k}{4^k} + \frac{3}{4^{k+1}}\)

    \(\sum_{i=1}^{k+1} \frac{3}{4^i}\)

    =

    \(1 - \frac{4}{4^{k+1}} + \frac{3}{4^{k+1}}\)

    \(\sum_{i=1}^{k+1} \frac{3}{4^i}\)

    =

    \(1 + \frac{3-4}{4^{k+1}}\)

    \(\sum_{i=1}^{k+1} \frac{3}{4^i}\)

    =

    \(1 - \frac{1}{4^{k+1}}\)

    As we have shown both the base case and the inductive case, \(P(n)\) must be true for all integers \(n>1\).

Dictionary ADT

  1. What kind of information does a Dictionary store?

    A Dictionary stores keys and their associated values.

  2. Identify four operations provided by the Dictionary ADT. Describe what types of arguments they take and what types they return.

    The Dictionary ADT provides

    • contains, which takes a key and returns a boolean

    • get, which takes a key and returns a value

    • insert, which takes a key and a value and returns nothing.

    • remove, which takes a key and returns nothing.

  3. Which four Dictionary implementations did we discuss in class?

    The implementations that we considered were Binary Search Trees, AVL Trees, Hash Tables, and Linear Dictionaries (using vectors).

  4. Compare and contrast the various Dictionary implementations. What run times do we expect on the key operations for each implementation? When would we prefer one implementation over another.

    • Binary Search Trees

      Run time: \(O(h)\) where h is the height of the tree, which in the worst case could be \(O(n)\)

      Because BSTs do not have any height guarantees, and AVL Trees do, we would typically prefer the AVL Tree over a BST.

    • AVL Trees

      Run time: \(O(\log n)\)

      This implementation is preferred when knowing some information about the ordering of the keys is important. For instance we can easily find the min or max key within this implementation.

    • Hash Tables

      Run time: \(O(1)\), on average

      This implementation is the fastest on average, and is thus preferred in most cases. However, it trades space for time, so there might be situtations where you are not willing to make this tradeoff. It also requires the use of a hash function, which for some domains may be non-trivial to create.

    • Linear (using vectors)

      Run time: \(O(n)\)

      This implementation is preferred when the size of the dictionary is known to be small and this simpler approach will have less overhead.

Trees

  1. Define the following terms with respect to tree data structures:

    1. Leaf

      A leaf is a node in a tree that has no children.

    2. Height

      The height of a tree is the number of levels below the root, or the number of edges in the longest path from the root to a leaf, or one less than the number of levels in the tree.

    3. Binary tree

      A binary tree is a tree for which each node has at most one left child and at most one right child.

    4. Binary search tree

      A binary search tree is a binary tree in which, for each node, all keys in the left subtree are less than the node’s key and all keys in the right subtree are greater than the node’s key.

    5. AVL tree

      An AVL tree is a binary search tree in which, for every node, the height of the left subtree and the height of the right subtree differ by no more than one.

    6. Complete binary tree

      A complete binary tree is a binary tree in which all levels are full except the last, where all nodes are packed to the left and all empty spaces are packed to the right.

    7. Max-heap

      A max-heap is a complete binary tree in which every node’s prioiry is greater than or equal to the priorities of its children.

Binary Search Trees

  1. Show the BST that would result from inserting the following keys into an initially empty tree: 12, 8, 5, 17, 1, 9, 22.

    1 5 9 8 22 17 12
  2. Is this the only BST that could contain these keys? If not, give at least one example of another BST that holds these same keys. If so, explain why the resulting BST is unique.

    No, this is not the only BST that could hold these keys. For example: if the keys were inserted in sorted order from 1 through 22, then the tree would be a long line of right branches with 1 at the root and 22 as the only leaf.

  3. Is the following tree a BST? If not, why not?

    1 7 4 17 12 29 30 20 10

    This is not a BST. The key 29 is to the right of the key 30, which violates the BST invariant.

  4. If we tried to find the key 22 in the following tree, where would it have to be?

    12 17 15 44 23 87 99 90 50

    If the key 22 were in this BST, it would have to be to the right of the key 17.

AVL Trees

  1. Draw a diagram showing the "rotate left" operation on binary trees. When is it impossible to rotate a binary tree to the left?

    X Y Z b a

    X Y a Z b

    It is only possible to perform left rotation if the root has a right child; a binary tree with no right child of the root cannot be left-rotated.

  2. Write pseudocode for the AVL rebalancing algorithm.

    Method rebalance(node)
      d <- node.rightChild.height - node.leftChild.height
      If d < -1 Then
        If node.leftChild.leftChild.height < node.leftChild.rightChild.height Then
          node.leftChild <- rotateLeft(node.leftChild)
        EndIf
        Return rotateRight(node)
      Else If d > 1 Then
        If node.rightChild.leftChild.height > node.rightChild.rightChild.height Then
          node.rightChild <- rotateRight(node.rightChild)
        EndIf
        Return rotateLeft(node)
      EndIf
      Return node
    End Function
  3. Consider the following tree:

    2 5 4 10 9 6 13 16 14 19 17 12
    1. Write the heights of each node for the tree above.

      Writing the height of each node above the node itself, we have:

      2 0 5 0 4 1 10 0 9 1 6 2 13 0 16 0 14 1 19 0 17 2 12 3
    2. Consider the addition of the value 1 to the above tree. Which nodes' heights change?

      The heights down the leftmost spine of the tree change.

    3. What will the tree look like after the value 1 is inserted?

      We show the insertion of 1 and the updated heights as follows:

      1 0 2 1 5 0 4 2 10 0 9 1 6 3 13 0 16 0 14 1 19 0 17 2 12 4
    4. What will the tree look like after the value 15 is inserted into the original tree?

      The insertion of 15 into the original tree is drawn below. Note that a rebalancing is necessary because 17 was out of balance after 15 was added.

      2 5 4 10 9 6 13 15 14 19 17 16 12
    5. What will the tree look like after the value 5 is deleted from the original tree?

      The removal of 5 from the original tree is drawn below.

      2 4 10 9 6 13 16 14 19 17 12
    6. What will the tree look like after the value 9 is deleted from the original tree?

      The removal of 9 from the original tree is drawn below.

      2 5 4 10 6 13 16 14 19 17 12
    7. What will the tree look like after the value 6 is deleted from the original tree?

      The removal of 6 from the original tree is drawn below. Note that this is one of two options for how we might remove 6 from the original tree.

      2 4 10 9 5 13 16 14 19 17 12
    8. What will the tree look like after the value 19 is deleted from the original tree?

      The removal of 19 from the original tree is drawn below. After 19 was removed, 17 was out of balance and was rotated right.

      2 5 4 10 9 6 13 16 17 14 12

Priority Queues and Heaps

  1. What is the difference between a queue and a priority queue?

    A priority queue has a priority type as well as an element type. When elements are enqueued into a priority queue, they are paired with a priority (in contrast to a queue, which simply requires the element). In a queue, elements are dequeued in the order that they are enqueued; in a priority queue, they are dequeued in an order based upon their associated priorities.

  2. Consider the following lists of numbers. Write each as a complete binary tree using the representation we discussed in class.

    1. [2, 3, 9, 3, 6, 2]

      3 6 3 2 9 2
    2. [11, 4, 4, 3, 4, 6, 9, 10, 2]

      10 2 3 4 4 6 9 4 11
    3. [9, 12, 3, 14, 2, 1, 6, 8, 8, 8, 9, 14, 7]

      8 8 14 8 9 2 12 14 7 1 6 3 9
  3. Is the following statement true or false? If true, give a justification. If false, give a counterexample. "All BSTs are heaps."

    The statement that "all BSTs are heaps" is false. The following BST is a counter-example.

    3 5 4

    This binary tree is a BST because 3 < 4 (left subtree data are smaller) and 5 > 4 (right subtree data are larger). But it is not a max-heap because 4 < 5 and, in a max-heap, all parents must have a priority greater than or equal to the priority of their children. A similar argument shows that this is also not a min-heap.

  4. Give pseudocode for the bubbleDown heap algorithm.

    Method bubbleDown(index)
      left <- leftChild(index)
      right <- rightChild(index)
      If right >= size Then
        If left < size Then
          If contents[left].priority > contents[index].priority Then
            swap(contents[left], contents[index])
          EndIf
        EndIf
      Else
        If contents[left].priority > contents[right].priority Then
          biggest <- left
        Else
          biggest <- right
        EndIf
        If contents[biggest].priority > contents[index].priority Then
          swap(contents[biggest], contents[index])
          bubbleDown(biggest)
        EndIf
      EndIf
    End Function
  5. Give pseudocode for the heapify heap algorithm to turn these complete binary trees into heaps. Briefly explain why this algorithm is \(O(n)\).

    Method heapify()
      For index <- contents.size-1 Down To 0 Do
        bubbleDown(index)
      EndFor
    End Function

    This algorithm is \(O(n)\) because, although the loop runs \(O(n)\) times and bubbleDown takes \(O(\log n)\) worst-case time, most bubbleDown operations are small — that is, most nodes are closer to the leaves of the tree rather than the root. As a result, the overall number of actual swaps that occurs is \(O(n)\).