1. Due Date
Complete lab: Due by 11:59 p.m., Thursday, May 1, 2025. Pass all tests in the 6 unit test programs.
Checkpoint: Tuesday, April 22. Implement Tuple Nested Loop Join and
pass all the tests in the chkpt
unit tests.
Your lab partner for Lab 8 is listed here: Lab 8 lab partners
Our guidelines for working with partners: working with partners, etiquette and expectations
2. Overview
In this lab you will implement three different versions of the Join operator in the RelOps layer of the SwatDB database management system.
The RelOps layer interacts with the FileManager layer of SwatDB. Your implementations of different Join algorithms will make use of Index and Heap files to access individual records to perform the requested operation, and will store the result in a new Heap.
The Join condition is always EQUALS (i.e., all joins are equijoins). As a result, the result file’s schema will be the outer relation’s schema followed by the inner relation’s schema, without the join columns duplicated.
You will implement three versions of Join: tuple nested loop, block nested loop, and index join. You will not remove any duplicate records from the result.
For this assignment you will not write large amounts of code. However, you will need to spend a significant amount of time reading the starting point code and the SwatDB documentation in order to figure out how to implement your solution. The code you write uses many SwatDB classes, some of which you have used and/or written in past assignments, but many of which are new to this assignment. In addition, reviewing some of the documentation and instructions from Lab 7, the Select and Project Lab, will help.
The primary goal of the SwatDB lab assignments is to gain an understanding of the details of how a relational DBMS works by implementing and testing parts of a relational DBMS.
The SwatDB code base is quite extensive and will require significant reading of its documentation (see Section 3).
2.1. Lab Goals
The main goals of the SwatDB RelOps Lab are:
-
Understanding how the relational operator Join can be implemented in a DBMS.
-
Understanding different ways of implementing the same operation using different access methods to the underlying relational data.
-
Gaining some practice evaluating different implementations of the same relational operation.
-
Developing a testing strategy for a large, complex system; making use of a provided unit testing framework, and debugging tools.
-
Practice working with part of a large code base, much of which you have access to only through its interface definition (i.e.,
.h
files and generated documentation).
2.2. Starting Point Code
If you have not already done so, first create a course directory for this course, and add a lab subdirectory for your lab repos:
mkdir -p cs44
mkdir -p cs44/labs
cd cs44/labs
We will be using git
repos hosted on the college’s
GitHub server for labs in this class.
If you have not used git
or the college’s GitHub server before, here
are some detailed instructions on
using
git for CS44 labs.
Next find your git repo for this lab assignment off the GitHub server for our class: CS44-s25
Clone your git repo (Lab8-userID1-userID2
) containing starting point files into your
labs directory:
cd ~/cs44/labs
git clone [the ssh url to your repo]
cd Lab8-userID1-userID2
If all was successful, you should see the following files:
2.2.1. Lab 8 Files
-
Makefile
- pre-defined. You may edit this file to add extra source files or execution commands. -
README.adoc
- some directions about how to compile and run test programs for the RelOps layer.
The test programs create relation files and these can
be corrupted if your programs exit unexpectedly. There is a
cleanup.sh
file you can run to clean these up in this case.
Running make clean
will also run the clean up script.
2.2.2. RelOps Manager layer Files
Most of the RelOpsManager is implemented for you (most of these
files are for your reference). However, you do need to
implement one method function in the RelOpsManager
class.
You should not add any new data members to
these classes or public methods. You may add private helper
methods for good modular code design.
-
relopsmgr.h
: interface to RelOpsManager, the relational operations manager layer of SwatDB. Defined are interface functions to initiate different operations (specific versions of select, project, and join) on specific relation operand(s). You do not need to modify this file. -
relopsmgr.cpp
: contains some of the main RelOpsManager
methods, including the _createResultFile
method that you call in your implementation of _createJoinRes
to create the result relation file. You do not need to modify this file. -
relopsmgr_join.cpp
: join-specific functions in the RelOpsManager
class. You will complete the _createJoinRes
method that creates a file to store JOIN operation results. It also contains the
join
method that picks a join operation type to perform. You do not need to modify the join
method, but we suggest you read through it to understand what it is doing, and to understand the values it passes to the specific join operation.
2.2.3. Join Operation Files
-
operation.[h,cpp]
: the Operation base class. You do not need to modify this, but it is here for your reference. -
join.[h,cpp]
: the base class for join operations. You do not need to modify this, but it is here for your reference. You may add private helper methods to it (and you may want to add some that are common to all join methods). Do not add any additional data fields to this class.
-
tupleNLJ.cpp
: the tuple nested loop version of the Join operation. You will implement the runOperation
method that implements the tuple nested loop version of Join. -
blockNLJ.cpp
: the block nested loop version of the Join operation. You will implement the runOperation
method that implements the block nested loop version of Join. -
indexNLJ.cpp
: the index nested loop version of the Join operation. You will implement the runOperation
method that implements the index nested loop version of Join.
2.2.4. Join Test Files
We are giving you a lot of unit tests for this assignment.
You are not required to add more tests as part of this assignment. However, you still may want to add some more for further stress testing your solution. If you add new tests, please add verbose comments describing what your test is testing.
Unit tests files:
-
chkpt.cpp
- unit testing code for tuple nested loop join on the small DB (this is one that you can print out to see tuples in relations and results to check your answers). -
smalltests.cpp
- unit testing code for the small DB (this is one that you can print out to see tuples in relations and results to check your answers). -
tnljoin_tests.cpp
- unit testing code for tuple nested loop join. -
bnljoin_tests.cpp
- unit testing code for block nested loop join. This also contains tests of the join result file schema (check the output for these to see that you are correctly not duplicating the join column from the inner relation in the result). -
indexjoin_tests.cpp
- unit testing code for index nested loop join. -
exception_tests.cpp
- unit testing code for exceptions with join operations.
2.2.5. Scripts (.sh files) and Test Databases
This lab creates and uses some SwatDB instances with some relation files.
The instances and files are created in
/scratch/<your user name>/cs44swatDBfiles/
. After you run make
,
you can cat out the .db files to see the relations and their schema.
For example, user tnas
would do this:
cat /scratch/tnas/cs44swatDBfiles/small.db
-
README.adoc
: documentation about the scripts -
Makefile
: the make command runs these scripts for you (you can run them through make rather than running a .sh
file at the command line). -
Make sure the
.sh
files are executable:
ls -l # should list x permission with executable files
chmod 700 *.sh # set permission to rwx for owner if not
-
./cleanup.sh
: cleanup created DB and files from incomplete runs. (make clean
will also do this) -
./mktestconf.sh
followed by ./getfiles.sh
will build the example SwatDB databases for testing. However, you can just run make
to build the .db files (easier than running the scripts by hand).
2.3. Deliverables
The following will be evaluated for your lab grade:
-
A complete and robust implementation of the assigned join operations and RelOps manager methods. This includes adding complete comments, and removing all of the TODO comments we left for you.
-
Passes all unit tests in:
chkpt.cpp
,smalltests.cpp
,tnljoin_tests.cpp
,bnljoin_tests.cpp
,indexjoin_tests.cpp
, and exception_tests.cpp
. The unit test .cpp
files are programs for testing your implementation of the relational operators. You are not required to add additional tests for this lab, but you may want to add some to test your solution, and are welcome to add more to these files (please add descriptive comments to any that you add). -
The class definitions in
.h
files. Only add private helper methods to class definitions to support good modular design of your solution. Do not add public methods or data members to any classes defined in these files. Also note where you can and cannot add private data members to classes (look for comments in .h
files and in the lab write-up). -
Your Lab 8 Questionnaire (TBA), to be completed individually. (This will open on the due date and close after 3 days.)
2.4. Checkpoint
Before the checkpoint due date, you should complete the
functionality to pass all the unit tests in the chkpt
program.
The checkpoint functionality includes:
-
Implementation of the tuple nested loop version of Join
While we recommend dealing with exceptions as you implement the methods, we will not require that exceptions are implemented for the checkpoint.
Also, the tnljoin_tests.cpp
unit tests stress test this join method more.
As a result, after passing the checkpoint you may still need to go back and
debug these methods to pass these later unit tests.
3. SwatDB
This assignment implements parts of the RelOps Manager part of SwatDB, including implementing specific relational operators.
For information about SwatDB, including a link to its on-line code documentation, see this page:
In addition to the .h
files distributed with this lab, the
SwatDB documentation that will be particularly helpful for this lab
includes:
-
schema.h: create a
Schema
object for relation file results -
record.h: get
Record
data from file tuples, create result Record to add to result file. -
key.h: create and use search keys.
-
searchkeyformat.h: need to create a
SearchKeyFormat
object that you use to initialize a Key
used for the select part of a Join operation (finding matching tuples on the join fields). -
filemgr.h: create
HeapFile
relation file for the operation result. -
heapfilescanner.h: scan through records in a HeapFile.
-
hashindexscanner.h: use this to scan through an index to find
Rid
values that match the selection criteria. -
blockhashfilescanner.h: use this to scan through blocks of pages of a HeapFile, to match blocks of pages of records in the outer relation to pages of tuples from the inner relation for the block nested loop join algorithm.
-
Common SwatDB type definitions, defined in swatdb_types.h
-
The Exceptions classes are defined in swatdb_exceptions.h. You may need to catch some, and throw others.
4. Lab Details
All relational operations are invoked through the RelOpsManager
object
(defined in relopsmgr.h
, relopsmgr.cpp, and relopsmgr_joins.cpp
)
that implements the interface to the relational operations manager layer
of SwatDB. The RelOpsManager has interface methods for each of the
main operations: select, project, and join. These methods take parameters
that specify the relation(s) on which to perform the operation, and
some take a parameter that specifies the specific type of algorithm to
use to perform the operation.
Specific relation operations are implemented as classes derived from the
Operation
base class (defined in operation.[h,cpp]
). Start by
investigating the class hierarchy for different versions of relational
operators to understand what different join classes inherit. For
example, IndexNLJ
(in indexNLJ.[h,cpp]
) is derived from Join
(in
join.[h,cpp]
), which is derived from Operation
(in operation.[h,cpp]
).
The on-line documentation is useful for seeing the class hierarchy. You can
view .h
files there, or we have included several of the related .h
files
for you with the lab starting point code so that you can view them in an editor,
for example: vim join.h
What to implement
For this lab, you will not implement a large amount of code, but you will need to spend a fair amount of time reading starting point code and SwatDB class documentation to determine how and where to implement the different join operations.
4.1. RelOpsManager
-
RelOpsManager class
: this class implements the interface to the relational operators level of SwatDB; it has public methods for performing select, project, and join operations. It also has a method, checkFilesEqual
, that can be used to test the results of operations. In addition to public methods, it has several private methods that create the correct result file for the operation. For this assignment, you will implement one of these methods, the one for the join result.
There is nothing for you to implement in the relopsmgr.cpp
parts of
the RelOpsManager
class, but you do need to implement some join-specific
parts in relopsmgr_joins.cpp
.
4.2. relopsmgr_joins.cpp
This file contains the implementation of the join-specific methods of
the RelOpsManager
class (declared in relopsmgr.h
). The two main
methods that invoke join operations (one for non-hash based joins,
and the other for hash-based joins) are implemented for you.
However, you need to implement the function that creates the result
file for join operations:
-
FileId RelOpsManager::_createJoinRes(FileId outer_fid, FileId inner_fid, std::vector<FieldId> o_fields, std::vector<FieldId> i_fields): creates a new HeapFile to store the result of a join operation. Note that the result file’s schema is a subset of the combined schemas of the relation files on which the join operation is performed (all fields from both relations, but without the join fields of the second (inner) relation duplicated).
Some requirements about this:
-
The schema for the result relation is the combination of the outer relation file’s schema followed by the inner relation file’s schema without the join fields duplicated; the result schema should include the join fields of the outer relation’s schema, and not the join fields of the inner relation’s schema.
Note: the
std::unordered_set
class may be useful in your implementation to help you identify the inner relation’s join columns and not add them to the result file’s schema. -
The primary key of the result is the primary key field(s) of the outer relation.
-
The field names of the result schema need to be unique. As a result, if the inner relation has fields with the same name as the outer relation, then those result schema field names are prefixed with the inner relation name to make them unique (and only in this case are they prefixed with the inner relation’s name)
-
After setting up the Schema for the result relation, your code can call the
_createResultFile
method (which we give you in relopsmgr.cpp
) to create the result relation file.
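The schema rules above can be sketched on plain strings. This is only an illustration under stated assumptions, not the SwatDB API: SimpleSchema and joinResultFields are hypothetical names, fields are identified by their position, and the "." separator used for the inner relation prefix is an assumption (check the starting point code and tests for the exact naming format expected).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a relation schema (NOT the SwatDB Schema class):
// just a relation name and an ordered list of field names.
struct SimpleSchema {
  std::string rel_name;
  std::vector<std::string> fields;
};

// Returns the result field names: every outer field, followed by each inner
// field that is not a join field. An inner field whose name collides with an
// outer field name is prefixed with the inner relation name.
std::vector<std::string> joinResultFields(
    const SimpleSchema &outer, const SimpleSchema &inner,
    const std::vector<std::size_t> &i_fields) {
  for (std::size_t f : i_fields) {
    assert(f < inner.fields.size());  // join field ids must be valid
  }
  // Sets give constant-time membership tests for both checks below.
  std::unordered_set<std::size_t> inner_join(i_fields.begin(), i_fields.end());
  std::unordered_set<std::string> outer_names(outer.fields.begin(),
                                              outer.fields.end());

  std::vector<std::string> result(outer.fields);
  for (std::size_t f = 0; f < inner.fields.size(); ++f) {
    if (inner_join.count(f) > 0) {
      continue;  // inner join column: already present via the outer relation
    }
    const std::string &name = inner.fields[f];
    if (outer_names.count(name) > 0) {
      // Assumed prefix format: "<inner relation name>.<field name>".
      result.push_back(inner.rel_name + "." + name);
    } else {
      result.push_back(name);
    }
  }
  return result;
}
```

For example, joining Students(id, name) with Enrolled(sid, course, name) on the inner field sid would give the fields id, name, course, Enrolled.name under these assumptions.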
4.3. Operation
The Operation
class is the base class of all relational operations.
There is nothing for you to implement, but you should look through the
operation.[h,cpp]
files to see the class definition and what some
of its methods do. In particular, note:
-
struct fileState
(defined in operation.h
). This structure stores state for the operand files and indices and for the result file of the particular operation. Some of the fields in this struct can be used to store and manipulate record data as part of the operation. Since every operation has a result file, the
fileState
for the result relation file is a field of the Operation
base class (result_state
), and it is initialized by the constructor of the Operation
class. The
fileState
structs for the source relation files are initialized by the Join
constructor (in join.[h,cpp]
). Look at the starting point code that contains calls to _initState
, which performs this initialization: the Operation::_initState
method is used to initialize a fileState
structure for the source relations (look for file_state
data fields in the class hierarchy of the different Join operation classes). -
The
_initState
method is called by the Operation
constructor and by the constructors of the derived Join
classes. This method initializes the fileState
structs associated with the result and the source files for the operation. -
The
_delState
method cleans up any fileState
structs created for the operation.
4.4. Join
-
Join class
: this is the base class for specific join operations. You do not need to modify this class, but look at its declaration (in join.h
) to see the data members that are part of the base join class, and which you will use to implement different join algorithms.
You do not need to remove duplicates from the result relation.
SQL allows duplicates in result relations. Only when the query
includes DISTINCT, or is a set operation like Union, are
duplicates removed.
You will also need to make use of the Key
and SearchKeyFormat
classes to
create objects that can be passed to and compared with index file entries.
Be sure that any temporary objects you create for this are deleted before exit.
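The idea of comparing records on their join fields can be sketched without any SwatDB classes. In this minimal sketch, SimpleRecord, extractKey, and joinFieldsMatch are hypothetical names, and a record is just a vector of strings. In your solution the Key class plays this role: its setKeyFromRecord and compareDifferent methods do the corresponding work, and any heap-allocated Key or SearchKeyFormat objects must be deleted explicitly, unlike the value types used here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: field values stored as strings, indexed by field id.
using SimpleRecord = std::vector<std::string>;

// Builds a comparison key by copying the listed fields, in order.
std::vector<std::string> extractKey(const SimpleRecord &rec,
                                    const std::vector<std::size_t> &fields) {
  std::vector<std::string> key;
  key.reserve(fields.size());
  for (std::size_t f : fields) {
    assert(f < rec.size());  // field ids must be valid for this record
    key.push_back(rec[f]);
  }
  return key;
}

// True when the outer record's join fields equal the inner record's join
// fields, compared position by position (an equijoin condition).
bool joinFieldsMatch(const SimpleRecord &outer,
                     const std::vector<std::size_t> &o_fields,
                     const SimpleRecord &inner,
                     const std::vector<std::size_t> &i_fields) {
  assert(o_fields.size() == i_fields.size());
  return extractKey(outer, o_fields) == extractKey(inner, i_fields);
}
```

Note that the join fields may sit at different positions in the two relations, which is why each side carries its own list of field ids.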
4.5. TupleNLJ
-
runOperation
: implement tuple nested loop join. The algorithm performs the join operation such that for each tuple in the outer relation, a scan of the whole inner relation is done to find matching result tuples.
Some useful methods:
-
HeapFileScanner
: create one for each relation to scan all records using getNext
. You can delete the inner file scanner after each full inner file scan and create a new one for the next outer tuple. -
Key::setKeyFromRecord
: create search key fields from a record (used to compare records on their join key fields). -
Key::compareDifferent
: compare key fields from the inner and outer relation records. -
Record::combineRecords
: create a result record from an outer and an inner relation record. -
File::insertRecord
: add a result record to the result HeapFile
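The control flow of tuple nested loop join can be sketched on in-memory data. This is a simplified illustration, not the SwatDB implementation: Row, rowsMatch, combineRows, and tupleNLJ are hypothetical names, relations are vectors of string rows, and the range-for loops stand in for the HeapFileScanner objects (re-entering the inner loop corresponds to creating a fresh inner scanner for each outer tuple).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

using Row = std::vector<std::string>;  // hypothetical in-memory record

// True when the rows agree on every pair of join fields.
bool rowsMatch(const Row &o, const Row &i,
               const std::vector<std::size_t> &o_fields,
               const std::vector<std::size_t> &i_fields) {
  assert(o_fields.size() == i_fields.size());
  for (std::size_t k = 0; k < o_fields.size(); ++k) {
    if (o[o_fields[k]] != i[i_fields[k]]) {
      return false;
    }
  }
  return true;
}

// Outer row followed by the inner row without its join fields.
Row combineRows(const Row &o, const Row &i,
                const std::vector<std::size_t> &i_fields) {
  std::unordered_set<std::size_t> skip(i_fields.begin(), i_fields.end());
  Row out(o);
  for (std::size_t f = 0; f < i.size(); ++f) {
    if (skip.count(f) == 0) {
      out.push_back(i[f]);
    }
  }
  return out;
}

// For each outer tuple, scan the whole inner relation for matches.
// Duplicates are kept, as in the lab.
std::vector<Row> tupleNLJ(const std::vector<Row> &outer,
                          const std::vector<Row> &inner,
                          const std::vector<std::size_t> &o_fields,
                          const std::vector<std::size_t> &i_fields) {
  std::vector<Row> result;
  for (const Row &o : outer) {
    for (const Row &i : inner) {  // full inner scan per outer tuple
      if (rowsMatch(o, i, o_fields, i_fields)) {
        result.push_back(combineRows(o, i, i_fields));
      }
    }
  }
  return result;
}
```

The inner relation is read once per outer tuple, which is what makes this the most expensive of the three variants on large inputs.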
4.6. BlockNLJ
-
runOperation
: implement block nested loop join. A block of tuples (multiple pages of tuples) of the outer relation should be compared to all tuples of the inner relation to find matches. The outer relation loads block_size
pages into the buffer pool at a time, and the inner relation is scanned a single block (1 page) at a time. Records on all pages in the outer relation’s block are compared with records on each page of the inner relation (one page at a time), and new result records are created and inserted in the result relation as matches are found on the join condition. You should use the
BlockHeapFileScanner
class to create block-sized scanners over the HeapFiles of the outer and the inner relations. The outer should scan block_size
pages at a time, and for each such block, the inner should be scanned completely, one block (one page) at a time. Records on all pages of the current outer block should be compared to all the records on each block of the inner relation to find matches before the next block of the inner relation is scanned. The
combineRecords
method of the Record
class will be helpful for creating result records. The
setKeyFromRecord
and compareDifferent
methods of the Key
class will be helpful. You will use the
BlockHeapFileScanner
object to access records from each of the two relations’ blocks of pages of records to find and create join results. The methods getNext
and resetBlock
will be helpful.
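The loop structure of block nested loop join can be sketched on in-memory pages. This is an illustration under simplifying assumptions, not the SwatDB implementation: Row, Page, BlockJoinResult, and blockNLJ are hypothetical names, a page is a vector of rows, there is a single join field on each side, and index arithmetic stands in for the BlockHeapFileScanner objects. The sketch also counts full passes over the inner relation to show why larger outer blocks help.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;  // hypothetical in-memory record
using Page = std::vector<Row>;         // hypothetical page of records

struct BlockJoinResult {
  std::vector<Row> rows;        // join result records
  std::size_t inner_scans = 0;  // number of full passes over the inner
};

// Joins on a single field (o_field in the outer, i_field in the inner).
// The outer is consumed block_size pages at a time; for each outer block
// the inner is scanned completely, one page at a time.
BlockJoinResult blockNLJ(const std::vector<Page> &outer_pages,
                         const std::vector<Page> &inner_pages,
                         std::size_t block_size,
                         std::size_t o_field, std::size_t i_field) {
  assert(block_size > 0);
  BlockJoinResult res;
  for (std::size_t start = 0; start < outer_pages.size();
       start += block_size) {
    const std::size_t end =
        std::min(start + block_size, outer_pages.size());
    ++res.inner_scans;  // one complete inner scan per outer block
    for (const Page &inner_page : inner_pages) {
      // Compare every record in the outer block with this inner page
      // before moving on to the next inner page.
      for (std::size_t p = start; p < end; ++p) {
        for (const Row &o : outer_pages[p]) {
          for (const Row &i : inner_page) {
            if (o[o_field] != i[i_field]) {
              continue;
            }
            Row out(o);  // outer fields, then inner minus its join field
            for (std::size_t f = 0; f < i.size(); ++f) {
              if (f != i_field) {
                out.push_back(i[f]);
              }
            }
            res.rows.push_back(out);
          }
        }
      }
    }
  }
  return res;
}
```

With three outer pages and a block size of 2, the inner is scanned twice instead of once per outer tuple. The set of result rows is the same as for tuple nested loop join, but rows may be produced in a different order.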
4.7. indexNLJ
-
IndexNLJ
constructor: implemented for you. Note that it initializes the index_file
field of the IndexNLJ
object. -
runOperation
: implement index nested loop join. On the outer relation, create a HeapFileScanner
object to scan each tuple. Use the setKeyFromRecord
method of the Key
class to extract the key fields to use for the lookup in the index. Create a new HashIndexScanner
associated with all index entries with a matching key value (the index key may not be a primary key, so there may be multiple matching entries for a given outer tuple). For each matching RID from the index, use the getRecord
method on the inner relation file to get the record data. The
combineRecords
method of the Record
class will be helpful for creating result records. The
setKeyFromRecord
and compareDifferent
methods of the Key
class will be helpful.
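The shape of index nested loop join can be sketched with a standard hash container standing in for the hash index. This is an illustration only, not the SwatDB implementation: Row, HashIndex, buildIndex, and indexNLJ are hypothetical names, a std::unordered_multimap from key value to row position plays the role of the index file and its RIDs, and there is a single join field on each side. In the lab the index already exists (the IndexNLJ constructor sets index_file), so buildIndex is needed only to make this sketch self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Row = std::vector<std::string>;  // hypothetical in-memory record
// Stand-in for a hash index: key value -> position of the row in the inner
// relation (playing the role of a RID). Duplicate keys are allowed.
using HashIndex = std::unordered_multimap<std::string, std::size_t>;

HashIndex buildIndex(const std::vector<Row> &inner, std::size_t i_field) {
  HashIndex index;
  for (std::size_t rid = 0; rid < inner.size(); ++rid) {
    assert(i_field < inner[rid].size());
    index.emplace(inner[rid][i_field], rid);
  }
  return index;
}

// For each outer row, look up its join value in the index and fetch only
// the matching inner rows; the inner relation is never scanned in full.
std::vector<Row> indexNLJ(const std::vector<Row> &outer,
                          const std::vector<Row> &inner,
                          const HashIndex &index,
                          std::size_t o_field, std::size_t i_field) {
  std::vector<Row> result;
  for (const Row &o : outer) {
    auto range = index.equal_range(o[o_field]);  // all entries for this key
    for (auto it = range.first; it != range.second; ++it) {
      const Row &i = inner[it->second];  // fetch inner row by its "RID"
      Row out(o);
      for (std::size_t f = 0; f < i.size(); ++f) {
        if (f != i_field) {
          out.push_back(i[f]);
        }
      }
      result.push_back(out);
    }
  }
  return result;
}
```

Because the index key need not be unique, one outer row can produce several result rows. The order in which equal keys come back from the container is unspecified, so compare results as collections rather than by position.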
5. Lab Requirements
In addition to correctly implementing the Join operations, and adding code to test your implementation, you should also:
-
Declare and use variables of the types defined in
swatdb_types.h
as opposed to their underlying type definition. Also use constants and enum types defined in this file; they help make the code more readable. For example, if a method returns a FileId
, declare a variable of type FileId
rather than std::uint32_t
or int
to store its return value: FileId result_fileid;
-
Use good C++ code design and good modular design in your solution. This includes using defined constants and types.
-
Ensure your code is robust to errors. In particular, be sure to test for error handling for exceptions that should be thrown and caught by the buffer manager.
-
Ensure your code is free of valgrind errors.
-
Make sure your code is well-commented, and there is no line wrapping. (See our C++ Style guide link from the Handy Links section.)
-
Your code should be free of all compiler warnings. The one exception is that there is a known deprecation warning with
SHA1_
functions that SwatDB uses for hashing. If you see these, you may ignore them. -
Your submitted code should have all of our TODO comments removed…as you implement a TODO, remove it. These (as well as NOTE comments) are also helpful to find parts of the given code that you need to implement.
6. Testing your code
There are several test files in the starting point code. They use the same unittests framework you used in CS35, and test various relational operator functionality and exceptions:
6.1. unit tests
-
chkpt.cpp
: tuple nested loop join on a very small DB (one for which you can print out all relations and examine results) -
smalltests.cpp
: tuple and block nested loop joins on a very small DB (one for which you can print out all relations and examine results) -
tnljoin_tests
: tuple nested loop join tests on larger DB -
bnljoin_tests
: block nested loop join tests on larger DB. This also has a test suite for testing that the result schema correctly projects out the inner relation’s join columns. Check the output of these runs to see if your join result file is correct. -
indexjoin_tests
: index join tests on larger DB -
exception_tests
: exception join tests on larger DB
You can add additional tests to any of these files by following the examples
in the files (add them as a new test SUITE
separate from the ones
we give you).
6.2. Test DB relations
When you type make
, along with building the unittest executables, the
Makefile rule runs the getfiles.sh
script which creates .db
and
relation files in /scratch/yourusername/cs44swatDBfiles/
directory.
Two DBs are created, small.db
and tables.db
, and are used in the
unit test programs.
When you type make clean
the Makefile is set up to run the cleanup.sh
script to remove these DB files.
The .db
files created are ascii files and are readable in an editor
program. The unit test code also has commented out calls for printing
out the Catalog and relation files in the test code (note: for large
relation files, only the first 50 records are printed).
You should never need to run either script by hand, but you can.
See the README.adoc
file for more information.
6.3. To run unit test programs:
# run all of the unittest test suites
make runtests
# run individual tests
./chkpt
./smalltests
./tnljoin_tests
./bnljoin_tests
./indexjoin_tests
./exception_tests
# or you can run individual test suites alone using -s testSuiteName
# to list the test suites names run with -h, for example:
./bnljoin_tests -h
6.4. Cleaning up corrupted files
Run make clean
or you can explicitly run ./cleanup.sh
to
remove the DB files.
./cleanup.sh
7. Tips and Hints
7.1. General Tips
-
Spend some time reading the starting point source code, and looking at SwatDB docs to get an idea of how the methods you need to implement are called starting from the
RelOpsManager
. There is a fair amount of inheritance here. In addition to some of the File and Index interface functions, take time to understand the Record
,Schema
, andKey
class interfaces as well. -
Implement and test incrementally. Use the
chkpt.cpp
and the smalltests.cpp
to help guide the order and testing that you do. -
Make use of gdb and valgrind to help you as you go.
-
Look at past weekly lab pages for help with C++, gdb, and valgrind.
-
Make use of the
cleanup.sh
script to clean up state from incomplete previous runs. You can also run make clean; make
to clean up and regenerate the source test files. Look at the information in Section 2.2 about the script files and how to use them (or implicitly use them with the make command), and how to view the
.db
file contents for the test SwatDB database instances used by the test code. -
Read the
README.adoc
file about some of the scripts. Also, look at the Makefile
to see what is being built (and cleaned up) and where. See Week 8 for more information about Makefiles to help you read it.
7.2. Suggested Order
Here is a suggestion for an order in which to implement the Join operations:
-
Start by completing the implementation of the
RelOpsManager::_createJoinRes
method in relopsmgr_joins.cpp
. This creates a new HeapFile result file for the join operation. You will need to create a new Schema
for this file that consists of a combination of the outer and inner relations’ schemas with the join fields appearing only once (don’t duplicate the join fields from the inner relation), and with possible field name conflicts resolved. The std::unordered_set
class may be useful in your implementation to help you identify the inner relation’s join columns and not add them to the result file’s schema. -
Next, start with the tuple nested loop version of Join (in
tupleNLJ.[h,cpp]
), and get it to work with the chkpt.cpp
and then with the smalltests.cpp
unit tests. You will want to refer to the base classes in join.[h,cpp]
and operation.[h,cpp]
as you implement. The main steps are: -
Create a new HeapFile for the result (this step is done for you in the starting point).
-
Create new HeapFileScanner objects on the HeapFile sources.
-
Scan each record in the outer relation, and for each one scan every record in the inner relation. If the pair matches the join criteria, create the join result record and insert it in the result file. You can use the
combineRecords
method of the Record
class to create the result record. Note that Join does some things that are similar to project and other things that are similar to select as it identifies and constructs result records. -
Test on the
chkpt.cpp
and then on the smalltests.cpp
unit tests first. -
Come back later to test on
tnljoin_tests.cpp
-
-
Next, implement the block nested loop version of join (in
blockNLJ.[h,cpp]
). -
Create a new HeapFile for the result (this step is done for you in the starting point).
-
Create new
BlockHeapFileScanner
objects on the inner and outer relation files (this step is done for you). Note that the outer relation is using blocks of block_size
pages, while the inner relation is using a single-page block (1
). -
Implement the block nested loop join such that records in each outer block are combined with records in every inner block. As records are found that match the join criteria, create the join result record and insert it in the result file.
-
Test on the
smalltests.cpp
unit tests first. -
Come back later to test on
bnljoin_tests.cpp
-
-
Next, implement the index nested loop version of join (in
indexNLJ.[h,cpp]
). -
Create a new HeapFile for the result (this step is done for you in the starting point).
-
Create a
HeapFileScanner
object on the outer relation, and a HashIndexScanner on the index file. -
For each outer record, use the index scanner to find matches, and fetch the records from the inner file. For each matching record, create a result record from the outer and the inner record (use
combineRecords
method), and add the resulting record to the result file. -
Test on
indexjoin_tests.cpp
-
-
Finally, revisit all implementations and stress test on larger test files and with exception testing. Test with all unittest programs:
chkpt.cpp
,smalltests.cpp
,tnljoin_tests.cpp
,bnljoin_tests.cpp
,indexjoin_tests.cpp
, and exception_tests.cpp
.
8. Submitting your lab
Review the lab deliverables to ensure you have completed all of your work. Before the due date, push your solution from one of your local repos to the GitHub remote repo.
From your local repo (in your ~/cs44/labs/Lab8-userID1-userID2
subdirectory)
make clean
git add *.h *.cpp
git commit -m "my correct and well commented solution for grading"
git push
Be careful not to add binary files to your repo (executables or .o files
that are compiled when you run make). To avoid adding these files to your repo,
add only the files you want by explicitly listing them, as in the git add command above.
Verify that the results appear (e.g., by viewing the repository on CS44-s25). You will receive deductions for submitting code that does not run or repos with merge conflicts. Also note that the time stamp of your final submission is used to verify late days, so please do not update your repo until after the late period has ended.
If that doesn’t work, take a look at the "Troubleshooting" section of the Using git for CS44 labs and the Using git pages. At this point, you should submit the required Lab 8 Questionnaire (TBA); each lab partner must do this.
9. Handy References
-
Information about SwatDB
-
Review in lab exercises from Week 9
-
Some C++ Programming Resources and Links including the C++ Style Guide
-
C++ programming tools compiling, linking, debugging C++
-
C references in Dive into Systems (some useful for C++ programming too) Chapter 2: C pointers, command line arguments; Chapter 3: debugging tools (valgrind, gdb for C)
-
gdb Guide also in Chapt. 3 of Dive into systems
-
Valgrind Guide also in Chapt. 3 of Dive into systems
-
my CS help pages (Unix tools, programming links)
-
Appendix 2: Using Unix from Dive into Systems textbook
-
Using Git more complete Git guide