1. Due Date
Complete lab: Due by 11:59 p.m., Friday, November 20, 2020.
Checkpoint: Wednesday, November 11 by noon.
Your lab partner for Lab 4 is listed here: Lab 4 lab partners
Our guidelines for working with partners: working with partners, etiquette and expectations
2. Overview
In this lab you will implement the SwatDB HeapFile
class for storing
pages of variable-sized records. This is a part of the File Management
layer of SwatDB that implements the abstraction of a heap file (an unordered
collection of records). In the previous lab, you implemented a
HeapPage
for storing one page of records in a heap file. For this
lab you will implement an entire heap file. This implementation of
a heap file organizes pages of records in
a single linked list of HeapPage
pages.
The primary goal of the SwatDB lab assignments is to gain an understanding of the details of how a relational DBMS works by implementing and testing parts of a relational DBMS. The SwatDB code base is quite extensive and will require close reading of its documentation (see Section 3).
2.1. Lab Goals
The main goals of the SwatDB Heap File Lab are:
-
Understanding the structure of a DBMS heap file, and details of implementing its interface.
-
Understanding how the file layer interacts with the buffer manager and disk manager layers of a DBMS.
-
Further practice with manipulating low-level data structures in C++, and mapping types onto raw bytes of memory.
-
Developing a thorough testing methodology for a large system.
-
Understanding the role of abstraction in large systems.
-
Practice working with part of a large code base, much of which you have access to only through its interface definition (i.e.,
.h
files and generated interface documentation).
2.2. Starting Point Code
Find your git repo for this lab assignment off the GitHub server for our class: cs44-f20
Here are some detailed instructions on using git for CS44 labs.
Clone your git repo (Lab4-userID1-userID2
) containing starting point files into your
labs directory:
cd ~/cs44/labs
git clone [the ssh url to your repo]
cd Lab4-userID1-userID2
If all was successful, you should see the following files (highlighted files require modification):
Lab 4 Files
-
Makefile
- pre-defined. You may edit this file to add extra targets. Here are a few of the most common commands that you will use:
make
make clean # rm all built files and any DB files created in test code
make gcov # build a gcov version of unittests (gcovunit)
-
README.md
- some directions about how to compile and run test programs for the HeapFile.
-
heapfile.h
- the SwatDB HeapFile class, and related struct definitions. Do not add any new data members or public methods to these classes. You can add private helper methods for good modular code design.
-
heapfilescanner.h
- the SwatDB HeapFileScanner class. Defines an object for scanning (or iterating) over all records in a HeapFile. Its main method is getNext. This class may be useful in your testing code.
-
heapfile.cpp
- the SwatDB HeapFile class implementation. Most of the code you implement will be in this file. All methods defined in heapfile.h should be implemented here. Make sure to include good function comments in this file too (function comments in .h files alone are not sufficient). For any missing functions, start by copying and pasting function comments from the .h file into here. Then modify the comments to provide information to a reader of the C++ implementation.
-
unittest.cpp
- unit testing code for HeapFile. We are not giving you many complete unit tests with this lab. Instead, use this file as a starting point for adding more complete tests of your implementation. Use the design of the test suites in this file as an example for how to add others.
-
checkpt.cpp
- unit testing code for passing the checkpoint.
-
checksaved.cpp
- test code to test the persistence of heap files. This is written in the style of the sandbox.cpp program rather than using unit tests.
-
sandbox.cpp
- another way to test your code. This is written in a more application-code style than the unit-test infrastructure. You can use this to add your own testing code. This is meant to complement the unit tests and to help with implementation, debugging, or designing new tests.
-
runchkpt.sh, runtests.sh
- scripts to run checkpoint and test programs. They include calls to the cleanup.sh script to clean up any files that are created and left behind from crashed runs.
2.3. Deliverables
The following will be evaluated for your lab grade:
-
The HeapFile class in the heapfile.cpp file. This is the primary file in which you will add code. The class and struct definitions to implement are defined in the heapfile.h file.
-
The class definition in heapfile.h. Only add private helper methods to class definitions to support good modular design of your solution. Do not add public methods or data members to any classes or structs defined in this file.
-
The sandbox.cpp and unittests.cpp programs for testing the HeapFile implementation. You must add code to unittests.cpp, and you can add to both to fully test and debug your solution. Test code added should be well commented, clearly explaining the specific HeapFile functionality it tests.
-
Your Lab 4 Questionnaire to be completed individually (This will open on the due date and close after 3 days)
2.4. Running Test Code
The unit test programs and the sandbox test programs create SwatDB heap files that are stored on disk. The test code is designed to clean up these files before it exits. However, if your code crashes, these files may stick around, and they will cause subsequent runs to crash.
Your starting point code includes a clean-up script that you can run to remove these files:
./checkpt # run the checkpoint unittests
./cleanup.sh # clean-up any state not cleaned up from this program
If the .sh
scripts do not run, make sure they are executable, and
run chmod
to add executable permissions if not:
ls -l *.sh
-rwx------ cleanup.sh # should see x permission (read,write,execute)
chmod 700 cleanup.sh # set to rwx if not
ls -l *.sh
We also have run scripts that run test code programs, and clean-up their state afterwards. These may be easier to use to run your test code:
./runchkpt.sh
./runtests.sh
You can also create your own run scripts, using ours as a starting point:
cp runtests.sh runmine.sh
chmod 700 runmine.sh # set this file to executable
vim runmine.sh # edit the contents to do what you want
Here are some bash shell programming links
2.5. Checkpoint
Before the checkpoint due date, you should complete the HeapFile
functionality necessary to pass all the unit tests in the checkpt
program,
and in the checksaved
program. We recommend that you run and
test these using the run scripts that cleanup state after crashed
runs:
./runchkpt.sh
You can also run each by hand, and use the cleanup.sh
script to clean
up any files left over on incomplete runs (these files end in .rel
or
.db
):
./checkpt # run the checkpoint unittests
./cleanup.sh # clean-up any state not cleaned up from this program
./checksaved # run the check saved test
./cleanup.sh # clean-up any state not cleaned up from this program
While we recommend dealing with exceptions as you implement the methods, we do not require that all exceptions are implemented for the checkpoint.
The checkpoint functionality includes:
-
Most
HeapFile
methods fully implemented, except for deleteRecord.
Much of the functionality of HeapFile
should be implemented for
the checkpoint. Because deleteRecord
is not included, you do not
have to handle the case of removing pages from the file, only adding
new ones.
3. SwatDB
In this assignment you will implement the HeapFile class, part of the File Management layer of SwatDB.
Implementing a relation file requires support for creating and
deleting a file,
and for inserting, deleting, and updating records in the file. These
operations may result in requests to the Buffer Manager layer to get,
release, allocate or deallocate a Page for the file. The implementation
also makes use of the HeapPage
class' interface to perform
operations on individual pages of records.
For information about SwatDB, including a link to its on-line code documentation, see this page:
SwatDB documentation that will be particularly helpful for this lab includes:
-
The Buffer Manager class defines the interface to the buffer manager layer of SwatDB. Note: when calling buffer manager method functions, be sure to appropriately handle or pass through any exceptions it throws.
-
The File Class defines the base class for all File objects in the system. HeapFile is derived from this class. Look at its documentation to see the methods and data members that HeapFile inherits.
-
The Page Class defines the base class for all pages in the system. Its getData method provides access to the raw page data: the PAGE_SIZE bytes of memory space for a page. No derived class of Page can add additional data members; it can only map structure on top of the raw page data.
-
The Record class is a structure for storing record information. It has a data member of type Data, a simple structure that stores the actual record data. Many HeapFile methods are passed pointers to Records as parameters. Look at the test code examples and at the documentation to understand how to access and set information for records inserted or retrieved from the page.
-
The HeapPage Class is used by HeapFile. You should call methods of this class to perform operations on individual pages of the file.
-
Type and constant definitions in swatdb_types.h. PageNum, PageId, RecordId, SlotId, and other types are defined here.
-
The Exception classes defined at the HeapFile, BufferManager and DiskManager layers in exceptions.h that HeapFile methods may need to catch or throw. Check the method function comments in heapfile.h and heapfilescanner.h to see if a particular method needs to throw an exception(s). Exceptions with BufMgr or DiskMgr suffixes are thrown by those layers and are passed through to callers of HeapFile methods. Look at the SwatDB documentation for the exception classes, and look at the SwatDB info page for examples of how to throw and catch exceptions.
4. Lab Details
The HeapFile
class is defined in the
heapfile.h
header file included with the lab starting point code.
Open this file in an editor to read its contents:
vim heapfile.h
NOTE: the SwatDB implementation of HeapFile is different from the version you are implementing here. Thus, we suggest you avoid using SwatDB (web) documentation and stick to reading the .h files to understand the interfaces you need to implement.
The HeapFileScanner
class is defined in the heapfilescanner.h
header file. It is also different from the SwatDB version, which
is why we are giving you its .h
file here. You do not need to implement
this class for this lab assignment, but you may want to use it in your
test code. You can see its interface:
vim heapfilescanner.h
4.1. Implementation Overview
You will implement a heap file that stores all of its pages of records in a single linked list of HeapPage pages. This is different from the design we discussed in class, which used two linked lists of pages: one list of full pages and another of pages with free space. In your version, there is a single linked list of pages, some of which may be full and some of which may have space.
The records in the heap file are stored on HeapPage
type pages.
The next_page
and prev_page
fields in the HeapPageHeader
of
each page link pages together into a doubly-linked list of the
file’s pages.
You will complete the implementations of
the HeapFile
class,
and implement more tests in unittests.cpp
to
test both correctness and robust error handling. The
testing part is described in
more detail in the Testing your code section.
A heap file is organized as a single header page of meta-data about
the heapfile and a linked list of HeapPage
pages that store records.
The heap file’s single header page of metadata includes:
-
head: the PageNum of the first page on the file’s linked list of HeapPages.
-
num_pages: the number of pages in the file
-
num_records: the number of records in the file
as shown in this figure:
HeapFile is derived from the File base class that defines a generic interface to all file types in the system.
Note that the HeapFile
object only
exists when SwatDB is running. However, its underlying set of pages
persists between runs (i.e. its header page and linked list of pages of
HeapPage
structured page data are stored as a file on disk,
managed by the DiskManager
layer).
HeapFile objects are created in SwatDB in response to one of two actions:
-
When SwatDB starts up, any DB relation files that are stored on disk are opened and a new HeapFile object is created and associated with each one.
-
During SwatDB runtime, if a new relation file is created, a HeapFile object is created and associated with the new file.
In the test code with this lab, we provide the code needed to
create HeapFile
object(s) in response to both of these scenarios.
4.2. Heap File Header Page and type re-casting
The header page of the HeapFile should be allocated on a separate
Page
that stores only the heap file meta data.
A struct HeapFileHeader
is
mapped onto the first bytes of the raw data of the Page
.
You
must not add additional data members to the Page base class;
any additional
data members would increase the total number of bytes in the derived class,
but every Page in the system must have exactly the same number of bytes
(the number of bytes declared in the Page
class’s data
array).
Do not store any records in the header page of the heap file. The first
record stored in the file should be stored on a newly allocated HeapPage
.
4.3. HeapFile Class
HeapFile
inherits the following data fields from the File
class:
-
header_id: this is the PageId of the heap file header page. You will allocate a new Page for a HeapFile’s header page, initialize it, and set this field to the new page’s PageId value.
-
buf_mgr: a pointer to the SwatDB BufferManager object that manages the buffer pool. HeapFile methods will need to make calls to allocate/deallocate/get/release file pages using BufferManager methods.
-
file_id: the file’s unique identifier in the system. This is set by the File Manager when the HeapFile object is created.
-
catalog: a pointer to the SwatDB Catalog object that contains information about all files in the system (you do not need to use this object in your implementation, but it is used in the sandbox test code print_fids function).
-
schema: a pointer to the Schema object describing the relation stored in this file (you do not need to use this in your implementation beyond some error checking).
In heapfile.h
is a struct definition that defines the header page
fields. The header page stores heap file-specific metadata about
the file and its structure. It is mapped onto the underlying Page
data bytes.
The heap file (shown in [HeapFileFig]) is organized so that:
-
Page 0 is the file’s header page.
-
Remaining Pages are stored in a doubly linked list of HeapPage pages that store record data. Records can be inserted on any page in the file, and individual HeapPages can be in any order in the linked list (i.e. it is not sorted by PageNum or record field values).
-
Both full pages and pages that have some space are stored in any order in the linked list of pages.
The header page and the record data pages are stored on disk as a file. The DiskManager manages how they are stored on disk.
HeapFile
methods will call
BufferManager
method functions to allocate, deallocate, get, release, and flush pages
from the buffer pool to handle operations on the file. NOTE the
buffer manager flushPage method should not be called on regular heapfile
operations like inserting or removing a record from the file. This method
forces a page to be written to disk, which is useful for rare DB operations,
but for most operations the higher levels should just let the buffer
manager manage the buffer pool, and let its replacement policy decide
when a page in the buffer pool gets written out to disk.
HeapFile
methods will call
HeapPage
to insert/delete/update records on individual pages of the file,
and to set a page’s prev_page and next_page fields to link pages
together in the doubly linked list of pages of record data.
4.3.1. Type recasting
You will need to use type-recasting in a few ways in this assignment.
-
The BufferManager methods return Page *, but your code needs to manipulate the pages as specific types. You can re-cast return values as a pointer to the specific type to do this. For example:
HeapPage *pg;
...
// recast the return type of getPage to (HeapPage *)
pg = (HeapPage *)buf_mgr->getPage(pg_id);
-
We do not give you methods that recast the raw page data of the header page to a HeapFileHeader * in a similar way that we did with the HeapPage lab. However, we suggest that you add one as a private helper method function:
private:
/**
 * @return address of header page recast as
 * a (HeapFileHeader *)
 * TODO: what are the pre and post conditions?
 * TODO: does this throw any exceptions (or pass any through)?
 */
HeapFileHeader *_getHeaderPage();
We recommend that you define and implement this as the first method, and use it throughout your code. Look at the HeapPage lab starting point code for examples of similar methods.
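As a minimal sketch of the recasting idea, the following self-contained example maps a header struct onto the first bytes of a raw page. The Page, HeapFileHeader, and PAGE_SIZE definitions here are simplified stand-ins written for this illustration (the PAGE_SIZE value and the field types are assumptions), and asHeader is a hypothetical helper name; the real declarations are in the lab's .h files and may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Stand-in definitions for illustration only; NOT the SwatDB declarations.
typedef std::uint32_t PageNum;
const std::uint32_t PAGE_SIZE = 1024;  // assumed value

struct Page {
  // Raw page bytes; aligned so a header struct can be mapped at offset 0.
  alignas(8) char data[PAGE_SIZE];
  char *getData() { return data; }
};

struct HeapFileHeader {
  PageNum head;               // first page on the linked list
  std::uint32_t num_pages;    // number of pages in the file
  std::uint32_t num_records;  // number of records in the file
};

// The header is mapped on top of the page bytes, so it has to fit in a
// page and has to be plain data (no virtual functions, no owning members).
static_assert(sizeof(HeapFileHeader) <= PAGE_SIZE,
              "header must fit in one page");
static_assert(std::is_trivially_copyable<HeapFileHeader>::value,
              "header must be plain data");

// Hypothetical helper: view the start of a page's raw bytes as a header.
HeapFileHeader *asHeader(Page *pg) {
  return reinterpret_cast<HeapFileHeader *>(pg->getData());
}

// Writes header fields through one recast pointer, then reads a field
// back through a second recast of the same page.
std::uint32_t demoHeaderRecast() {
  Page pg;
  HeapFileHeader *hdr = asHeader(&pg);
  hdr->head = 7;
  hdr->num_pages = 1;
  hdr->num_records = 0;
  hdr->num_records++;
  return asHeader(&pg)->num_records;
}
```

Both pointers refer to the same bytes, so an update made through one is visible through the other. In the lab the page would come from the buffer manager and stay pinned while the recast pointer is in use.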
4.3.2. HeapFile interface methods
These methods are implemented for you (look at their implementation for some hints at how to call Buffer Manager layer methods):
-
constructor: invokes the File base class constructor and sets the header page to INVALID_PAGE_ID.
-
createHeader: allocates and initializes a new header page for the heap file. This method is only invoked when a new heap file is created in the system, not when an existing heap file is loaded from disk. You do not need to create a header page in your HeapFile code: the SwatDB FileManager creates a HeapFile object, and it sets the HeapFile’s header page or calls this method to create one.
-
flushHeader: flushes the header page to disk.
The following are HeapFile interface methods you need to implement (you may also want to add some private helper method functions too). NOTE: most of these methods call BufferManager and HeapPage interface method functions to accomplish some of the listed subtasks. Be sure you review the interfaces of those two classes before beginning implementation.
-
insertRecord: inserts a record into the file and returns the RecordId of the inserted record. HeapFile metadata is updated to reflect that the record has been inserted, or an exception is thrown if the insert fails.
The passed record is inserted into the first page on the doubly linked list of pages that has enough free space to store it. If no existing page has enough space to store the record, then a new page is allocated for the file, added to the head of the linked list of pages, and the record is stored on that page.
Each Page that needs to be accessed to handle an insertRecord (including the header page and all HeapPages) must first be brought into the buffer pool through the BufferManager. To access a Page, it needs to be pinned (through a call to the getPage or allocatePage BufferManager methods), and it should be unpinned (via a call to the releasePage BufferManager method) when the method is done using the page. The dirty bit should be set appropriately for all pages that are modified by an insert. Any page pinned by this method must be unpinned before the method returns (or before it throws an exception). You will need to keep careful track of when a page is still pinned and when it is no longer needed. If an exception occurs while a page is pinned, unpin the page before throwing or rethrowing the exception.
This method throws exceptions on errors, some of which may come from calls to HeapPage methods or to BufferManager methods that are thrown by the buffer manager or disk manager layer. Look at the function comments for more specific information about these. Any time you call HeapPage or BufferManager methods, be sure to check the documentation to make sure you are aware of potential exceptions you may need to handle.
Here are the main steps of this method (note: calls to the BufferManager layer to getPage, allocatePage and releasePage need to be made in implementing some of these steps):
-
Start the implementation of insertRecord by checking for some error conditions and throwing exceptions when the checks fail.
  -
  If the passed record is too big, throw an InsufficientSpaceHeapPage exception. HINT: MAX_RECORD_SIZE is the largest record a HeapPage can store.
  -
  The passed record’s schema must match this file’s schema. This just requires comparing the two Schema pointer values for equality. Throw an InvalidSchemaHeapFile exception if they do not match. Examine the class data members and the Record interface to access the schemas.
-
Search for a page in the doubly linked list that has space to insert the record. There are two main cases:
  -
  If such a page is found, insert the record on the page (call insertRecord on the HeapPage). Think carefully about pinning/unpinning and handling exceptions.
  -
  If the list is exhausted with no suitable candidate, insertRecord will allocate a new HeapPage that is added at the beginning of the linked list. Insert the record into this new HeapPage.
  Remember that a call to the BufferManager allocatePage method just returns a pointer to a Page of the buffer pool (a pointer to a Page-size chunk of memory). The buffer manager has no idea what type of page the caller wants to map on top of the allocated Page of memory space (it could be an index page, a HeapPage, a header page, …). If you want to use the Page of buffer pool space to store particular state (HeapPage data), you need to initialize that Page of memory space to the appropriate values before you start accessing its contents and interpreting their values as having heap page meaning (the Page returned by the buffer manager is just a page of garbage values in memory). See the HeapPage interface for any useful methods for this.
  Again, you will need to think carefully about pinning/unpinning and handling exceptions. Also examine [HeapFileFig] to identify all of the various updates you need to make to insert a new HeapPage at the beginning of a doubly linked list.
  Note: A HeapFile can only store up to MAX_PAGE_NUM pages of data; throw an InsufficientSpaceFilePage exception if adding a new page exceeds this amount.
-
Update HeapFile header information with results of the insert.
-
getRecord: given a RecordId value and a passed Record to fill, this method constructs a PageId from the passed RecordId and from the file’s FileId value, requests the page from the BufferManager (via a call to the getPage method), and calls the HeapPage getRecord method to copy the record data from the heap page into the passed Record *. The page must be unpinned from the buffer pool (via a call to the releasePage BufferManager method) before this function returns or throws an exception.
This method throws exceptions on errors, some of which may come from calls to HeapPage methods or to BufferManager methods that are thrown by the buffer manager or disk manager layer. Look at the function comments for more specific information about them.
-
updateRecord: updates an existing record to its new passed value. Note: an updated record must retain its RecordId value. This means that an update cannot move a record from one page of the heap file to another.
Like other HeapFile methods, any page of the file that is accessed by the method needs to be pinned before it is accessed (via getPage/allocatePage) and should be unpinned (via releasePage) when the method is done using it. All pages pinned by this method must be unpinned before the method returns (or before it throws an exception). This method throws exceptions, some of which may come from calls to HeapPage or BufferManager methods.
-
deleteRecord: deletes a record, given its RecordId, from the file. This method could shrink the heap file by one page.
Here are some of the big steps deleteRecord must take (you need to refine these):
-
Get the page corresponding to this record from the buffer manager.
-
Determine if the RecordId is valid.
-
Delete the record from the page (using the correct HeapPage method).
-
Determine if the page containing the deleted record can be removed from the file and, if so, remove it from the linked list of pages.
-
Update HeapFile metadata in the header to reflect a successful record delete.
Like other methods, all pages accessed need to be pinned in the buffer pool and all pages pinned by this method need to be unpinned when the method returns (or throws an exception). It may throw exceptions, which may result from its calls to BufferManager or HeapPage method functions.
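The linked-list bookkeeping behind insertRecord and deleteRecord can be sketched on a small in-memory model. Everything below is hypothetical and written for illustration: FileModel, PageInfo, NO_PAGE, and the three functions are not SwatDB names, pages are plain structs in a vector rather than pinned buffer pool pages, and free space is a single number. The sketch only shows the first-fit search, adding a page at the head, and unlinking a page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory model; none of these names are SwatDB names.
typedef std::uint32_t PageNum;
const PageNum NO_PAGE = UINT32_MAX;  // assumed "no such page" sentinel

struct PageInfo {
  PageNum prev_page;
  PageNum next_page;
  std::size_t free_space;
};

struct FileModel {
  PageNum head = NO_PAGE;        // first page on the list
  std::uint32_t num_pages = 0;   // pages currently linked into the list
  std::vector<PageInfo> pages;   // index plays the role of a PageNum
};

// First fit: walk from the head and return the first page with room.
PageNum findPageWithSpace(const FileModel &f, std::size_t need) {
  for (PageNum cur = f.head; cur != NO_PAGE; cur = f.pages[cur].next_page) {
    if (f.pages[cur].free_space >= need) {
      return cur;
    }
  }
  return NO_PAGE;
}

// Link a fresh page in at the head of the list. Three links change:
// the new page's next, the old head's prev, and the file's head.
PageNum addPageAtHead(FileModel &f, std::size_t capacity) {
  PageNum fresh = static_cast<PageNum>(f.pages.size());
  f.pages.push_back({NO_PAGE, f.head, capacity});
  if (f.head != NO_PAGE) {
    f.pages[f.head].prev_page = fresh;
  }
  f.head = fresh;
  f.num_pages++;
  return fresh;
}

// Unlink a page from anywhere in the list (head, middle, or tail).
void unlinkPage(FileModel &f, PageNum victim) {
  assert(victim < f.pages.size());
  PageInfo &v = f.pages[victim];
  if (v.prev_page != NO_PAGE) {
    f.pages[v.prev_page].next_page = v.next_page;
  } else {
    f.head = v.next_page;  // victim was the head
  }
  if (v.next_page != NO_PAGE) {
    f.pages[v.next_page].prev_page = v.prev_page;
  }
  v.prev_page = NO_PAGE;
  v.next_page = NO_PAGE;
  f.num_pages--;
}
```

In the lab, each page touched here (the victim, its neighbors, and the header page) would have to be pinned while it is read or modified, marked dirty if changed, and unpinned afterward, which this model leaves out.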
4.3.3. HeapFile debugging methods
There are several methods already implemented that you can use
for debugging purposes only. See the sandbox.cpp
file for
some examples of their use. You can also call these from gdb
or
use them in debugging printout in unittest.cpp
.
Search for THIS METHOD IS FOR DEBUGGING ONLY
in the comments
to find them in the file.
4.4. Exceptions
For information about how to throw and catch different SwatDB exception
objects, look at the Exceptions Section of
the SwatDB information page. You will need to think carefully about the
cause of these exceptions in order to know how to handle them. Use the
method function comments @throw
as a guide for which exceptions methods
may need to throw or to pass through. These comments will also help you
think about some of the error conditions your code may need to check for
and handle appropriately.
For this lab, the Buffer Manager and Disk Manager layers may throw exceptions in response to calls from HeapFile method functions. Your code needs to determine whether to catch each exception or let it pass through.
Remember that if a method throws (or passes through) an exception, it should clean up any internal state it has partially modified. For this lab, this may include unpinning pages that the method has pinned prior to throwing (or re-throwing) an exception.
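One way to picture the clean-up rule is the catch, unpin, and re-throw pattern sketched below. MockBufferManager is a hypothetical stand-in that only counts pins; its getPage and releasePage take simplified arguments that are assumptions for this example and do not match the real BufferManager signatures. operateOnPage stands in for any HeapPage call that might throw.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in that only tracks how many pages are pinned.
struct MockBufferManager {
  int pinned = 0;
  void getPage() { pinned++; }
  void releasePage(bool dirty) {
    (void)dirty;  // a real buffer manager would record the dirty bit
    assert(pinned > 0);
    pinned--;
  }
};

// Stand-in for a page operation that may fail.
void operateOnPage(bool fail) {
  if (fail) {
    throw std::runtime_error("page operation failed");
  }
}

// Pins a page, works on it, and guarantees the page is unpinned on
// both the normal path and the exception path.
void updateWithCleanup(MockBufferManager &bm, bool fail) {
  bm.getPage();  // page is now pinned
  try {
    operateOnPage(fail);
  } catch (...) {
    bm.releasePage(false);  // unpin; nothing was modified
    throw;                  // pass the same exception on to the caller
  }
  bm.releasePage(true);  // unpin; the page was modified
}
```

After either outcome the pin count is back to where it started, which is the property the debugging methods and unit tests can check for after each HeapFile call.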
5. Lab Requirements
In addition to completing the implementation of the HeapFile
class, and adding many more tests to
unittest.cpp
, you should also:
-
Declare and use variables of the types defined in swatdb_types.h as opposed to their underlying type definition. Also use constants and enum types defined in this file; they help make the code more readable. For example, if a method returns a FrameId, declare a variable of type FrameId rather than std::uint32_t or int to store its return value:
FrameId frame_num;
PageId pg_id;
// get returns a FrameId; storing it as an int compiles but loses valuable
// information about the purpose of frame_num
frame_num = buf_map->get(pg_id);
-
Use good C++ code design and good modular design in your solution. This includes using defined constants and types.
-
Ensure your code is robust to errors. In particular, be sure to test error handling for exceptions that could be thrown by the buffer manager or disk manager layers, and determine if they need to be caught or passed through.
-
Ensure your code is free of valgrind errors.
-
Make sure your code is well-commented, and there is no line wrapping. (See our C++ Style guide link from the Handy Links section).
-
Your submitted code should have all of our TODO comments removed…as you implement a TODO, remove it. These are also helpful to find parts of the given code that you need to implement.
6. Testing your code
There are four test files in the starting point code:
-
checkpt.cpp
: unit tests for the checkpoint
# run all of the checkpt test suites
./checkpt
# or you can run individual test suites alone using -s testSuiteName
./checkpt -s X # run just the X test suite
# to list the test suite names run with -h
./checkpt -h
You can also use the runscripts to run (and cleanup) checkpt and checksaved tests:
./runchkpt.sh
See Section 2.4 for more information about running test code and scripts.
-
checksaved.cpp
: checks the persistence of your heapfile operations. Any modifications to the HeapFile, like inserting records, should persist between runs of SwatDB. This test checks that your solution is implemented so that changes to the HeapFile in memory translate to changes on disk. This is accomplished by your solution calling BufferManager methods correctly.
-
sandbox.cpp
: some heap file test code written in a more programmatic way than the unittest framework.
-
unittest.cpp
: the start of a set of unit tests for the heap file. You are required as part of this lab to develop more unit tests that you must add in this file.
You may use any or all of these to start your HeapFile
implementation,
but you will need to use all of them to verify your program is working
correctly. sandbox.cpp
, checkpt.cpp
, and checksaved.cpp
are
useful for testing early on.
6.1. Required Unit Tests
The unittest.cpp
implements some Heap File test
code using the unit tests framework from CS35 (as does checkpt.cpp
).
You must add additional tests to unittest.cpp
by following the
code examples in this file to help you structure your code.
unittest.cpp
contains a very incomplete set of test functions.
Use this file to add tests beyond those tested by the checkpt.cpp
.
Here are some to consider:
-
tests for exceptions
-
tests that consider boundary testing or stress testing each main operation: insert, delete, update
-
tests that stress test different cases of combinations of operations.
-
tests that produce file sizes that are larger than the size of memory (the file has more total pages than the buffer pool).
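For the exception tests, the core check is "this call throws this exception type". A minimal, framework-independent sketch follows; throwsExpected, TooBigError, and insertStandIn are hypothetical names made up for this example. The unit test framework used in unittest.cpp may already provide its own way to express this check, so look at the existing tests in that file first.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Returns true only if fn() throws an exception of type Expected
// (or a type derived from it). Any other exception, or no exception,
// yields false.
template <typename Expected, typename Fn>
bool throwsExpected(Fn fn) {
  try {
    fn();
  } catch (const Expected &) {
    return true;
  } catch (...) {
    return false;
  }
  return false;
}

// Hypothetical exception type and operation, used only to exercise
// the helper; they stand in for a HeapFile method and its exception.
struct TooBigError : std::runtime_error {
  TooBigError() : std::runtime_error("record too big") {}
};

void insertStandIn(std::size_t size, std::size_t max_size) {
  if (size > max_size) {
    throw TooBigError();
  }
}

// Example use: one boundary case on each side of the limit.
void exampleChecks() {
  assert(throwsExpected<TooBigError>([] { insertStandIn(101, 100); }));
  assert(!throwsExpected<TooBigError>([] { insertStandIn(100, 100); }));
}
```

Pairing each failing case with the nearest passing case, as exampleChecks does at the size limit, also covers the boundary testing listed above.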
unittest.cpp test program
In unittest.cpp
, you can add additional tests to each test suite.
We also have two empty test suites into which you can
add your test code (you don’t have to use these, but they are here
for you to use and as example syntax if you want to add more test suites):
/*
* An empty SUITE for adding some heapfile tests.
*/
SUITE(studentTests1){
//TODO: add tests
}
/*
* A second empty SUITE to add additional tests.
*/
SUITE(studentTests2){
//TODO: add tests
}
You are welcome to add additional test suites beyond these two, and it will make your testing easier if you do. Follow the same structure as these and the other test SUITES in this file to do so.
sandbox test program
sandbox.cpp
is a more programmatic way of designing test code. It includes
an example of calling some of the HeapFile
debugging functions that
are already implemented for you in the starting point of heapfile.cpp
.
Read the code in this file to understand what it is doing, and add your own
tests that follow this model to test a sequence of operations on
your heap file.
You may also want to look at
checksaved.cpp
, which is a test program written in the same style
as sandbox.cpp
. It has some helper functions for inserting and checking
records that you may want
to copy into sandbox.cpp
and use, or use as an example of similar
helper functions you may want to add to sandbox.cpp
.
cleaning up corrupted files
SwatDB maintains several files to allow for persistent storage of data. While that is not a central feature of this lab, a consequence of a program crash is that temporary files do not get properly cleaned up since the DBMS did not shut down cleanly. This can cause problems when you try to rerun the program, so you will need to clean this up. Use one of the two options below:
./cleanup.sh
make clean # make clean also runs cleanup.sh; then just re-build
make
Also see Section 2.4 for running using the runscripts that include calls to the cleanup script for you.
7. Tips and Hints
The following are some tips to help you implement the HeapFile:
-
Run
checkpt
andunittests
often to see which tests you are passing and which fail. This will help you find missing functionality in your code, and some missing cases. These tests are incomplete and part of this assignment is designing and writing more unittests to test your implemenation more thoroughly. -
Implement and test incrementally. Use the checkpt.cpp tests as a guide for what functionality to implement and test first. We suggest this implementation order (as you add post-checkpoint functionality, you should also add more tests to unittests.cpp to completely test functionality):
-
insertRecord
: start by inserting just enough records to fill up one heap page.
-
need to allocate a new page for the first insert
-
call
insertRecord
on the HeapPage
to insert the record on the page.
-
update the header page information to reflect a successful insert.
-
make sure all file pages are unpinned after the call to insertRecord
-
-
getRecord
: implement all functionality, though the exception/error handling can be incomplete at first.
-
updateRecord
: implement all functionality, though some error handling can be missing at first.
-
call
updateRecord
on HeapPage
to update.
-
make sure all file pages are unpinned after the call to updateRecord.
-
-
insertRecord
: add support for inserting enough records to span more than one heap page.
-
This requires traversing the linked list of pages to find a page with enough free space to insert the record.
-
This may require adding in a new page to the linked list of file pages (a new page should be added to the front of the list), and updating appropriate next and prev fields in pages.
-
ensure that header page information is updated correctly.
-
ensure that all modified pages are unpinned before this method returns.
-
-
Run the
checksaved
program to ensure that file changes are persistent.
-
deleteRecord
:-
first try a delete that does not result in removing a page from the file
-
check that the file header information is correct after a delete that succeeds
-
check that none of the pages accessed in deleteRecord are pinned in the buffer pool after the call.
-
next, try a delete that removes the last record on a page. The page should be removed from the linked list of pages, and
deallocatePage
should be called to remove it from the underlying file.
-
-
-
Refer to Wednesday in-lab code examples for C++, gdb, gcov, and valgrind reminders.
-
Remember the
&
operator returns the address of its argument (this is the C-style operator, not the
&
used to specify reference parameter types in C++). Its argument must be an lvalue (a storage location, such as the name of a variable or an array bucket). For example, to get the address of the bucket at index 3 in an array of ints:
int array[20];
you would use the
&
operator like this (in this example I’m assigning its value to an int * variable):
int *ptr;
ptr = &(array[3]);
-
Refer to the recasting part of Section 4 for some example type recasting code. Also look at the private method functions in the starting point for the
HeapPage
lab as another example.
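To tie together the insertRecord steps and the & and recasting tips above, here is a minimal, self-contained sketch. It is not SwatDB code: ToyFile, Page, PageHeader, header, and reserveSpace are hypothetical names made up for this illustration, the "file" is just an in-memory vector, and there is no buffer manager, pinning, or header page. It only shows the shape of the logic: recast the address of a page's first byte to a header pointer, walk the linked list looking for a page with enough free space, and link a new page at the front of the list when none is found.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names here are hypothetical stand-ins, NOT the SwatDB interface.
constexpr std::size_t PAGE_SIZE = 64;
constexpr std::int32_t NO_PAGE = -1;

// Header stored in the first bytes of every page.
struct PageHeader {
  std::int32_t next;         // page number of the next page, or NO_PAGE
  std::int32_t prev;         // page number of the previous page, or NO_PAGE
  std::uint32_t free_bytes;  // bytes still available for records
};

// A page is just raw bytes; alignas keeps the header recast well aligned.
struct Page {
  alignas(PageHeader) char data[PAGE_SIZE];
};

struct ToyFile {
  std::vector<Page> pages;      // stands in for the pages of the file
  std::int32_t head = NO_PAGE;  // first page in the linked list
};

// Take the address of the page's first byte and recast it to a header pointer.
PageHeader *header(ToyFile &f, std::int32_t page_num) {
  return reinterpret_cast<PageHeader *>(&(f.pages[page_num].data[0]));
}

// Reserve len bytes on some page and return that page's number.
std::int32_t reserveSpace(ToyFile &f, std::uint32_t len) {
  assert(len <= PAGE_SIZE - sizeof(PageHeader));
  // 1. Walk the list looking for a page with enough free space.
  for (std::int32_t p = f.head; p != NO_PAGE; p = header(f, p)->next) {
    if (header(f, p)->free_bytes >= len) {
      header(f, p)->free_bytes -= len;
      return p;
    }
  }
  // 2. None found: allocate a new page and link it at the FRONT of the list.
  f.pages.emplace_back();
  std::int32_t fresh = static_cast<std::int32_t>(f.pages.size()) - 1;
  PageHeader *h = header(f, fresh);
  h->next = f.head;
  h->prev = NO_PAGE;
  h->free_bytes =
      static_cast<std::uint32_t>(PAGE_SIZE - sizeof(PageHeader)) - len;
  if (f.head != NO_PAGE) {
    header(f, f.head)->prev = fresh;  // old front page now has a predecessor
  }
  f.head = fresh;
  return fresh;
}
```

Calling reserveSpace on an empty ToyFile creates page 0. Later calls reuse a page while the request fits, and otherwise add a new page whose next field points at the old front page. In your HeapFile the same steps go through the buffer manager, so each page you touch must also be unpinned before the method returns.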
8. Submitting your lab
Review the lab deliverables to ensure you have completed all of your work. Before the due date, push your solution to github from one of your local repos to the GitHub remote repo.
From your local repo (in your ~/cs44/labs/Lab4-userID1-userID2
subdirectory)
make clean
git add *.h *.cpp # only add .h and .cpp files: DO NOT DO git add *
git commit -m "my correct and well commented solution for grading"
git push
Verify that the results appear (e.g., by viewing the repository on cs44-f20). You will receive deductions for submitting code that does not run or repos with merge conflicts. Also note that the time stamp of your final submission is used to verify late days, so please do not update your repo until after the late period has ended.
If that doesn’t work, take a look at the "Troubleshooting" section of the Using git for CS44 labs and the Using git pages. At this point, you should submit the required Lab 4 Questionnaire (each lab partner must do this).
9. Handy References
-
Information about SwatDB
-
debugging unittests and using gcov from weekly labs Week 7, and Week 8.
-
Some C++ Programming Resources and Links including the C++ Style Guide
-
C++ programming tools compiling, linking, debugging C++
-
C references in Dive into Systems (some useful for C++ programming too) Chapter 2: C pointers, command line arguments; Chapter 3: debugging tools (valgrind, gdb for C)
-
my CS help pages (Unix tools, programming links)
-
Using Git more complete Git guide