CS71 Lab 8: Git Workshop

In which mysteries long hidden are revealed...


The goals for this lab assignment are:


Introduction

Git is an example of a version control system. Other systems are SVN, CVS, and RCS. These systems are important because they allow us to


Version control

A version control system maintains a repository that stores how a set of files has changed over time. Repositories can be local or remote.

Terminology:

A major job of any version control system is to maintain the history of edits to the files in its repository. Some systems keep track of changes on a per-file basis. However, this can become problematic when files are moved, renamed, or deleted. For this reason, Git records the state of the whole repository at each commit, called a snapshot. Snapshots capture edits to files as well as files added to or removed from the repository.
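
To make the snapshot idea concrete, here is a minimal sketch run in a throwaway repository. The directory, file names, and user identity are made up for illustration. It renames a file and then lists the files recorded in each commit, so the rename shows up as a difference between two whole-repository snapshots.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Lab Student"
git config user.email "student@example.com"

echo "hello" > a.txt
echo "world" > b.txt
git add a.txt b.txt
git commit -q -m "Add two files"

# Rename one file; the next commit records the new state of the whole repository.
git mv a.txt renamed.txt
git commit -q -m "Rename a.txt"

# List the top-level files in each snapshot.
before=$(git ls-tree --name-only HEAD~1 | tr '\n' ' ')
after=$(git ls-tree --name-only HEAD | tr '\n' ' ')
echo "before: $before"
echo "after:  $after"
```

The first listing should contain a.txt and b.txt, and the second b.txt and renamed.txt.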

Additional Readings:


Creating a new repository

To start, we will create a new repository. There are two ways you can do this:

  1. Create a new repository on GitHub and then clone it to your local machine
  2. Create a repository locally and then upload it to GitHub

Let's use method #1 to create our repository:

git clone git@github.swarthmore.edu:<USERNAME>/git-practice.git
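
For comparison, method #2 might look roughly like the sketch below. So that it can run anywhere, a local bare repository stands in for the empty repository you would otherwise create on GitHub. The paths, user identity, and README contents are assumptions for illustration only.

```shell
set -e
work=$(mktemp -d)

# Stand-in for the empty repository you would create on GitHub.
git init -q --bare "$work/remote.git"

# Create the repository locally and make a first commit.
mkdir "$work/git-practice"
cd "$work/git-practice"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Lab Student"
git config user.email "student@example.com"
echo "# git-practice" > README.md
git add README.md
git commit -q -m "Initial commit"

# Point the local repository at the remote and upload the history.
git remote add origin "$work/remote.git"
git push -q -u origin master

# The remote's master should now name the same commit as the local master.
local_id=$(git rev-parse master)
remote_id=$(git --git-dir="$work/remote.git" rev-parse master)
echo "local:  $local_id"
echo "remote: $remote_id"
```

With a real GitHub repository, the argument to git remote add would be the repository's SSH or HTTPS URL instead of a local path.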

The following sections contain questions for you to fill out. Put your answers in README.md.


Analyzing an existing Git repository

Git represents histories as a directed acyclic graph (DAG). Each node corresponds to a commit made to the repository.

Let's look at the example from MIT's 6.005 version control lecture.

git clone https://github.com/mit6005/sp16-ex05-hello-git.git hello-git

First, let's configure Git so that logs look pretty when we print them:

git config --global color.branch auto
git config --global color.diff auto
git config --global color.interactive auto
git config --global color.status auto
git config --global color.grep auto

And next, let's create an alias git lol for showing the log.

git config --global alias.lol "log --graph --oneline --decorate --color --all"

Go into your hello-git directory and try running git lol. You should see something like the following (but in color!):

$ git lol
* b0b54b3 (HEAD -> master, origin/master, origin/HEAD) Greeting in Java
*   3e62e60 Merge
|\  
| * 6400936 Greeting in Scheme
* | 82e049e Greeting in Ruby
|/  
* 1255f4e Change the greeting
* 41c4b8f Initial commit

Each node represents a version, i.e., a snapshot of the entire project. It follows that each node corresponds to a commit. Every commit has a unique ID and a pointer to its parent commit (a merge commit, such as 3e62e60 above, has two parents). For example,

We can also view the history using tools like gitg (try it!)
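
The parent pointers can also be inspected from the command line. The sketch below builds a small throwaway repository (hypothetical file name and user identity) and prints the raw commit object with git cat-file, which includes a "parent" line holding the ID of the previous commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Lab Student"
git config user.email "student@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
echo "two" >> notes.txt
git commit -q -a -m "Second commit"

# Print the raw commit object: its tree, parent, author, committer, and message.
git cat-file -p HEAD

# The "parent" line holds the full ID of the previous commit.
parent_line=$(git cat-file -p HEAD | grep '^parent ')
first_id=$(git rev-parse HEAD~1)
echo "$parent_line"
```

The same command works in hello-git; running it on the merge commit should print two parent lines.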

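We are currently on a branch called master, which lives in our local repository. The name origin/master is a remote-tracking branch: it records where the master branch of the remote repository (named origin) pointed the last time we communicated with it. When we git clone, we download a copy of the repository's history graph. Git uses origin/master like a bookmark to remember the state of the repository we cloned from.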

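Git also needs to store information about the contents of all its files. Git does this using tree nodes. Each commit points to a tree node that records the full contents of the project at that version; the human-friendly log message is stored with the commit itself, and diffs are computed by comparing snapshots. If we visualize the tree nodes, our graph now looks like: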

The last important piece of bookkeeping Git performs involves the staging area (also known as the index, stored inside .git). This is where Git stores your file state when you do a git add.
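
As a small illustration of the staging area, the sketch below uses a throwaway repository (hypothetical file contents and user identity) and asks Git which files are staged before and after a git add.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Lab Student"
git config user.email "student@example.com"

echo "v1" > hello.txt
git add hello.txt
git commit -q -m "Add hello.txt"

# Edit the file: so far the change exists only in the working tree.
echo "v2" > hello.txt
staged_before=$(git diff --cached --name-only)

# Stage it: git add copies the current file state into the index.
git add hello.txt
staged_after=$(git diff --cached --name-only)

echo "staged before add: '$staged_before'"
echo "staged after add:  '$staged_after'"
```

Here git diff --cached compares the index against the last commit, so it lists nothing until the edit has been staged.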

The history, tree, and staging area are all pieces of the object graph that Git uses to manage repositories. So far, we've talked about what information is stored in the object graph, but not where it stores this information. The answer is that Git stores everything inside a hidden directory named .git.

hello-git$ ls -la
total 40
drwxrwxr-x 3 alinen alinen 4096 Mar 31 15:25 .
drwxrwxr-x 3 alinen alinen 4096 Mar 31 15:26 ..
-rw-rw-r-- 1 alinen alinen  223 Mar 31 15:25 .classpath
drwxrwxr-x 8 alinen alinen 4096 Mar 31 15:25 .git  <------ location of all .git state
-rw-rw-r-- 1 alinen alinen    5 Mar 31 15:25 .gitignore
-rw-rw-r-- 1 alinen alinen  211 Mar 31 15:25 Hello.java
-rw-rw-r-- 1 alinen alinen   31 Mar 31 15:25 hello.rb
-rw-rw-r-- 1 alinen alinen   36 Mar 31 15:25 hello.scm
-rw-rw-r-- 1 alinen alinen   30 Mar 31 15:25 hello.txt
-rw-rw-r-- 1 alinen alinen  368 Mar 31 15:25 .project
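
To peek inside the .git directory, something like the following sketch can help. It uses a fresh throwaway repository and assumes Git's default on-disk layout; the exact set of files under .git varies between Git versions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master

# Ask Git where it keeps its state for this repository.
git_dir=$(git rev-parse --git-dir)
echo "$git_dir"

# HEAD names the branch that is currently checked out.
head_ref=$(git symbolic-ref HEAD)
echo "$head_ref"
cat .git/HEAD

# Objects (commits, trees, file contents) live under .git/objects,
# and branch pointers live under .git/refs.
ls -d .git/objects .git/refs
```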

Exercise 1: A new commit

Edit hello.txt and commit your changes.

1a. Use the output of git lol to find the ID of your commit
1b. Use git log to find out the time stamp for your commit
1c. Where are the pointers for HEAD in both the local and remote repositories?

The output of git status says:

On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean

1d. What does this message mean: "Your branch is ahead of 'origin/master' by 1 commit."?
1e. Why does Git report: "nothing to commit, working tree clean"?

Exercise 2: Merging

We can see from the graph that Alyssa and Ben both made simultaneous commits. Read the description here to see what happened.

2a. Why is Ben unable to push his commit without merging?
2b. If Ben had pushed his code before Alyssa, how would the history graph change?

Exercise 3: Looking at files

Use the git show command to see the contents of any commit.

$ git show 1255f4e
commit 1255f4e4a5836501c022deb337fda3f8800b02e4
Author: Max Goldman <maxg@mit.edu>
Date:   Mon Sep 14 14:58:40 2015 -0400

    Change the greeting

diff --git a/hello.txt b/hello.txt
index c1106ab..3462165 100644
--- a/hello.txt
+++ b/hello.txt
@@ -1 +1 @@
-Hello, version control!
+Hello again, version control!

Note that Git only shows the diffs by default. If we want to see the files in the commit's snapshot instead, add a : like so:

$ git show 3e62e60:
tree 3e62e60:

hello.rb
hello.scm
hello.txt

And to see the state of an entire file at a given commit:

$ git show b0b54b3:hello.txt
Hello again, version control!

3a. Use git show to see how the file hello.txt changed in each commit. Copy and paste the commands into your README.md.

Exercise 4: Reverting to a previous version

Suppose we wish to undo our changes and return to the state the remote repository was in at our last pull (or clone). DANGER ZONE: THIS WILL CLOBBER YOUR LOCAL COMMITS (LOSES WORK)!

$ git reset --hard origin
HEAD is now at b0b54b3 Greeting in Java
$ git lol
* b0b54b3 (HEAD -> master, origin/master, origin/HEAD) Greeting in Java
*   3e62e60 Merge
|\  
| * 6400936 Greeting in Scheme
* | 82e049e Greeting in Ruby
|/  
* 1255f4e Change the greeting
* 41c4b8f Initial commit
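
One common precaution before a hard reset is to bookmark the current commit with a branch, so the work remains reachable afterwards. The sketch below is one way to do that under some assumptions: a local repository stands in for the remote, and the branch name backup, the paths, and the user identity are all made up for illustration.

```shell
set -e
work=$(mktemp -d)

# Stand-in for the remote repository, with one commit on master.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Lab Student"
git config user.email "student@example.com"
echo "Hello" > hello.txt
git add hello.txt
git commit -q -m "Initial commit"

# Clone it and make a local commit that the remote does not have.
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
git config user.name "Lab Student"
git config user.email "student@example.com"
echo "Hello again" > hello.txt
git commit -q -a -m "Local work"
local_id=$(git rev-parse HEAD)

# Bookmark the local commit, then reset master to the remote state.
git branch backup
git reset -q --hard origin/master

echo "master: $(git rev-parse master)"
echo "backup: $(git rev-parse backup)"
```

After the reset, master matches origin/master again, while backup still points at the local commit. Uncommitted edits are not protected by this; only committed work is.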

Suppose we wish to revert to a previous version without losing all our changes. We can use git checkout <ID>. Try the following:

$ git checkout 1255f4e
Note: checking out '1255f4e'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 1255f4e Change the greeting

4a. What is the output of git lol?
4b. The warning message says that we're not allowed to persist changes to this commit unless we create a new branch. Why might this be? If we make changes to this version, what would happen to the commits at 82e049e and 6400936?
4c. If we call git checkout master, what happens? Use git lol to check.


Submission

Don't forget to check in and push your answers!