SwatDB
HashJoin Class Reference

#include <hashjoin.h>

Inheritance diagram for HashJoin: [graph not shown]
Collaboration diagram for HashJoin: [graph not shown]

Public Member Functions

 HashJoin (FileId outer_id, FileId inner_id, FileId result_id, std::vector< FieldId > outer_fields, std::vector< FieldId > inner_fields, std::uint32_t num_buckets, std::string temp_path, Catalog *catalog, BufferManager *buf_mgr, FileManager *file_mgr)
 Constructor for the join operator using a hash join algorithm. Sets up the state for a single join operation from the given arguments.
 
 ~HashJoin ()
 Destructor for the HashJoin class.
 
void runOperation ()
 Performs the join operation using the hash join algorithm.
 
Public Member Functions inherited from Join
 Join (FileId outer_id, FileId inner_id, FileId result_id, std::vector< FieldId > outer_fields, std::vector< FieldId > inner_fields, Catalog *catalog)
 Constructor for Join operation. Join subclasses use this constructor.
 
 ~Join ()
 Destructor for the Join Operation.
 
Public Member Functions inherited from Operation
 Operation (FileId result_id, Catalog *catalog)
 Constructor for the Operation class. Because Operation is an abstract class, an object cannot be created, but this constructor is used by derived classes.
 
virtual ~Operation ()
 Destructor for the Operation class. Cleans up dynamic memory in result state.
 

Protected Member Functions

std::uint32_t hash1 (Record *rec, bool is_outer)
 Applies the first hash function to the given record.
 
void _createTempFiles ()
 Performs the set up phase of hash join, creating temporary hashed files to be used by hash join algorithm.
 
void _firstHash (bool is_outer)
 Performs the initial hashing of each relation for the first step of hash join.
 
void _secondHash ()
 Performs the second hashing of each relation for the second step of hash join.
 
RecordId _part1 (Record *record, bool is_outer, BlockHeapFileScanner *scanner)
 Performs the main looping functionality for first hash.
 
void cleanup ()
 Function that cleans up state and deletes all allocated memory. It will be called once by one thread when any function throws an error.
 
Protected Member Functions inherited from Operation
void _initState (FileId file_id, std::vector< FieldId > fields, fileState *state)
 Performs the file and temporary record setup for relational operators.
 
void _delState (fileState *file_state)
 Deletes objects created in relop structs.
 

Protected Attributes

BufferManager * buf_mgr
 
FileManager * file_mgr
 
std::vector< HeapFile * > outer_partitions
 
std::vector< HeapFile * > inner_partitions
 
std::uint32_t num_buckets
 
std::vector< std::pair< HeapPage *, PageId > > hash_table
 
std::string temp_path
 Holds the path to which the temp files should be saved. "/local/" is recommended for performance.
 
std::uint32_t result_num
 
Protected Attributes inherited from Join
fileState outer
 
fileState inner
 
std::vector< FieldId > outer_fields
 
std::vector< FieldId > inner_fields
 
Protected Attributes inherited from Operation
fileState result_state
 
Catalog * catalog
 

Detailed Description

SwatDB HashJoin Class.

Constructor & Destructor Documentation

◆ HashJoin()

HashJoin::HashJoin ( FileId  outer_id,
FileId  inner_id,
FileId  result_id,
std::vector< FieldId >  outer_fields,
std::vector< FieldId >  inner_fields,
std::uint32_t  num_buckets,
std::string  temp_path,
Catalog *  catalog,
BufferManager *  buf_mgr,
FileManager *  file_mgr 
)

Constructor for the join operator using a hash join algorithm. Sets up the state for a single join operation from the given arguments.

Precondition
The fields being joined on have the same type.
Postcondition
The set up for the hash join is complete.
Parameters
outer_id. FileId of the outer relation file.
inner_id. FileId of the inner relation file.
result_id. FileId of the result relation file.
outer_fields. Vector of FieldIds identifying the join fields in the outer relation.
inner_fields. Vector of FieldIds identifying the join fields in the inner relation.
num_buckets. Number of partitions used for hashing.
temp_path. Path under which temporary files are saved.
catalog. Pointer to the catalog of the SwatDB object.
buf_mgr. Pointer to the buffer manager of the DBMS.
file_mgr. Pointer to the file manager, used for file creation.

Member Function Documentation

◆ _createTempFiles()

void HashJoin::_createTempFiles ( )
protected

Performs the set up phase of hash join, creating temporary hashed files to be used by hash join algorithm.


◆ _firstHash()

void HashJoin::_firstHash ( bool  is_outer)
protected

Performs the initial hashing of each relation for the first step of hash join.

Parameters
is_outer. Boolean indicator of which relation is being hashed.
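
The first hashing step can be pictured with a small in-memory sketch. It is not SwatDB's implementation: the `Row` type, the `partitionRelation` name, and the use of `std::hash` are assumptions made for illustration, and plain vectors stand in for the temporary HeapFiles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Simplified record: (join key, payload). SwatDB's Record is not modeled here.
using Row = std::pair<std::int32_t, std::int32_t>;

// Distributes every row of one relation into num_buckets partitions by
// hashing its join key. Rows with equal keys always land in the same
// partition, which is what lets each partition be joined on its own later.
std::vector<std::vector<Row>> partitionRelation(const std::vector<Row>& relation,
                                                std::uint32_t num_buckets) {
  assert(num_buckets > 0);  // modulo by zero is undefined
  std::vector<std::vector<Row>> partitions(num_buckets);
  for (const Row& row : relation) {
    std::size_t bucket = std::hash<std::int32_t>{}(row.first) % num_buckets;
    partitions[bucket].push_back(row);
  }
  return partitions;
}
```

In this sketch the function would be called once per relation, mirroring the is_outer flag of _firstHash().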

◆ _part1()

RecordId HashJoin::_part1 ( Record *  record,
bool  is_outer,
BlockHeapFileScanner *  scanner 
)
protected

Performs the main looping functionality for first hash.

Parameters
record. Pointer to the Record object that holds the current record.
is_outer. Boolean indicator of the current relation.
scanner. Scanner object that is iterating over the relation.
Returns
rid. Record id

◆ cleanup()

void HashJoin::cleanup ( )
protected

Function that cleans up state and deletes all allocated memory. It will be called once by one thread when any function throws an error.

Postcondition
All allocated state has been deleted.
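
One way to get the "called once" behavior described above is to guard the cleanup with an atomic flag. The sketch below is an illustration only: `JoinState` and its members are hypothetical, the int buffers stand in for whatever HashJoin allocates, and the source does not say how SwatDB enforces the single call.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical holder for join state (not a SwatDB class).
class JoinState {
 public:
  // Allocates n dummy buffers that cleanup() must later release.
  void allocate(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) buffers_.push_back(new int[16]);
  }

  // Frees everything exactly once, even if several callers reach it.
  void cleanup() {
    if (cleaned_.exchange(true)) return;  // an earlier call already cleaned up
    for (int* buffer : buffers_) delete[] buffer;
    buffers_.clear();
    ++cleanup_runs_;
  }

  std::size_t liveBuffers() const { return buffers_.size(); }
  int cleanupRuns() const { return cleanup_runs_; }

  ~JoinState() { cleanup(); }  // no leak if cleanup() was never called

 private:
  std::vector<int*> buffers_;
  std::atomic<bool> cleaned_{false};
  int cleanup_runs_ = 0;
};
```

Because `exchange` returns the previous value, only the first caller sees `false` and performs the deletion; later calls return immediately.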

◆ hash1()

std::uint32_t HashJoin::hash1 ( Record *  rec,
bool  is_outer 
)
protected

Applies the first hash function to the given record.

Returns
std::uint32_t corresponding to the bucket.
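
As a rough illustration of mapping a record to a bucket, the sketch below hashes a single 32-bit integer key and reduces it modulo the bucket count. The source does not state which hash function hash1() uses; `bucketFor` is a hypothetical name and `std::hash` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for hash1(): maps a join key to a partition number
// in the range [0, num_buckets). The real function takes a Record* and a
// flag selecting the outer or inner join fields; here the key is passed in.
std::uint32_t bucketFor(std::int32_t key, std::uint32_t num_buckets) {
  assert(num_buckets > 0);  // modulo by zero is undefined
  std::size_t h = std::hash<std::int32_t>{}(key);
  return static_cast<std::uint32_t>(h % num_buckets);
}
```

Both relations must use the same function and the same bucket count, otherwise matching keys could end up in different partitions.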

◆ runOperation()

void HashJoin::runOperation ( )
virtual

Performs the join operation using the hash join algorithm.

Precondition
Join state correctly initialized
Postcondition
The result HeapFile contains all matching results. The join state is no longer valid, so the object should be destroyed.

Implements Operation.

Reimplemented in ParallelHashJoin.
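
The overall two-step flow (partition both relations, then join matching partitions) can be sketched in memory as follows. This is a simplified model under stated assumptions, not SwatDB's code: `Row`, `Joined`, `splitIntoPartitions`, and `hashJoin` are hypothetical names, vectors replace HeapFiles, and building the per-partition table on the inner relation is an assumption the source does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using Row = std::pair<std::int32_t, std::int32_t>;  // (join key, payload)
using Partitions = std::vector<std::vector<Row>>;

struct Joined {
  std::int32_t key, outer_payload, inner_payload;
};

// Step 1: hash every row of a relation into one of n partitions.
static Partitions splitIntoPartitions(const std::vector<Row>& rel, std::uint32_t n) {
  Partitions parts(n);
  for (const Row& r : rel)
    parts[std::hash<std::int32_t>{}(r.first) % n].push_back(r);
  return parts;
}

std::vector<Joined> hashJoin(const std::vector<Row>& outer,
                             const std::vector<Row>& inner,
                             std::uint32_t num_buckets) {
  assert(num_buckets > 0);
  Partitions outer_parts = splitIntoPartitions(outer, num_buckets);
  Partitions inner_parts = splitIntoPartitions(inner, num_buckets);

  std::vector<Joined> result;
  // Step 2: rows with equal keys share a partition index, so each pair of
  // partitions can be joined independently of the others.
  for (std::uint32_t b = 0; b < num_buckets; ++b) {
    // Build a hash table over the inner partition.
    std::unordered_multimap<std::int32_t, std::int32_t> table;
    for (const Row& r : inner_parts[b]) table.emplace(r.first, r.second);
    // Probe it with every row of the matching outer partition.
    for (const Row& r : outer_parts[b]) {
      auto range = table.equal_range(r.first);
      for (auto it = range.first; it != range.second; ++it)
        result.push_back({r.first, r.second, it->second});
    }
  }
  return result;
}
```

The number of buckets changes how the work is split but not the set of matches, so the result is the same for any positive num_buckets.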

Member Data Documentation

◆ buf_mgr

BufferManager* HashJoin::buf_mgr
protected

Stores a pointer to the buffer manager

◆ file_mgr

FileManager* HashJoin::file_mgr
protected

Stores a pointer to the file manager for file creation

◆ hash_table

std::vector<std::pair<HeapPage *, PageId> > HashJoin::hash_table
protected

Stores the hash table

◆ inner_partitions

std::vector<HeapFile *> HashJoin::inner_partitions
protected

Vector of HeapFile pointers for temporary files corresponding to the inner relation.

◆ num_buckets

std::uint32_t HashJoin::num_buckets
protected

The number of partitions the relations are being hashed into.

◆ outer_partitions

std::vector<HeapFile *> HashJoin::outer_partitions
protected

Vector of HeapFile pointers for temporary files corresponding to the outer relation.

