Building simple hash table

I have this helper function to return truth value of whether a duplicate itm exists before insert into list but it crashes. It's hardcoded insert, but I just want to figure out why my is_duplicate fcn crashes before moving on. The manual insertion works as the for loop outputs each items' key (jersey number), name, team, just the boolean is_duplicate function not working...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
struct hockey_player
{
   hockey_player* next;
   hockey_player* back;
   int num;//NB: this is key (jersey number)
   string name;
   string team;
};

//helper fcn to check if an itm to insert is unique (i.e. diff number(key) AND diff data)
bool is_duplicate(hockey_player* bucket, int cur_bucket, int size, int num_, string name_, string team_)
{
   bool isDuplicate = false;
   if ( bucket[cur_bucket] == NULL )//if list is empty, of course not duplicate
      return isDuplicate;
		
   for ( hockey_player* cur = bucket[cur_bucket]; cur != NULL; cur = cur->next )/**Else traverse bucket to find if itm to insert is a duplicate or not*/
	if ( cur->num == num_ && cur->name == name_ && cur->team == team_ )
	   isDuplicate && false;
	return isDuplicate;
}

int main()
{

	hockey_player* p_1_front = NULL;//pointer to the items in each bucket (ie. the address in each array index to a list)
	hockey_player* p_2_front = NULL;
	hockey_player* p_3_front = NULL;
	hockey_player* p_4_front = NULL;
	hockey_player* p_5_front = NULL;
	hockey_player* p_6_front = NULL;
	hockey_player* p_7_front = NULL;
	hockey_player* p_8_front = NULL;
	hockey_player* p_9_front = NULL;
	hockey_player* p_10_front = NULL;
	hockey_player* p_11_front = NULL;
	hockey_player* p_12_front = NULL;
	hockey_player* p_13_front = NULL;
	hockey_player* p_14_front = NULL;
	hockey_player* p_15_front = NULL;
	
	
	hockey_player* bucket[15] = {p_1_front, p_2_front, p_3_front, p_4_front, p_5_front, p_6_front, p_7_front, p_8_front, p_9_front, p_10_front,
		p_11_front, p_12_front, p_13_front, p_14_front, p_15_front};
	
	int size_ = 15;
	
	hockey_player* newGuy = new hockey_player;
	newGuy->num = 99;
	newGuy->name = "Wayne Gretzky";
	newGuy->team = "EDM";
	newGuy->next = NULL;
	newGuy->back = NULL;
	
	int i = newGuy->num % size_;
	
	bucket[i] = newGuy;
	
	hockey_player* newGuy_2 = new hockey_player;
	newGuy_2->num = 9;
	newGuy_2->name = "Nail Yakupov";
	newGuy_2->team = "EDM";
	newGuy_2->back = NULL;
	newGuy_2->next = bucket[i];
	bucket[i]->back = newGuy_2;
	bucket[i] = newGuy_2;

	
	for ( hockey_player* cur = bucket[i]; cur != NULL; cur = cur->next )
	{
	   cout << "Name: " << cur->name << endl;
	   cout << "Jersey Number: " << cur->num << endl;
	   cout << "Team: " << cur->team << endl;
	   cout << endl;
	}
	
	bool isDuplicate = is_duplicate(&bucket[0], i, size_, 99, "Wayne Gretzky", "EDM");
	
	if ( isDuplicate == 1)
		cout << "Already signed up this player" << endl;
 return 0;
}
Last edited on
Here is the hash table insertion function, it also doesn't run:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
hockey_player* sc_put(hockey_player* bucket, int size, int num_, string name_, string team_)//return NULL if inserting (putting) duplicate itm into bucket
{
	int cur_bucket = num_ % size;
	
	bool isDuplicate = is_duplicate(&bucket[0], cur_bucket, size, num_, name_, team_);
	
	//now insert @ front of list if item to insert is not a duplicate
	if ( !isDuplicate )
	{
		cout << "Unique itm" << endl;
		hockey_player* newPlayer = new hockey_player;
		newPlayer->num = num_;
		newPlayer->name = name_;
		newPlayer->team = team_;
		newPlayer->back = NULL;
	
		if ( !bucket )//if list is empty
		{
			bucket[cur_bucket] = newPlayer;
			newPlayer->next = NULL;
		}
		else//else insert @ front of list for this bucket (i.e. index)
		{
			newPlayer->next = bucket[cur_bucket];
			bucket[cur_bucket]->back = newPlayer;
			bucket[cur_bucket] = newPlayer;
		}
		return newPlayer;
	}
	
	return NULL;
}


Also, why is insert worst case O(1) b/c wouldn't it have to search one bucket with a long linked list? And:
Also, if each key is represented by a small enough number of bits, then, instead of a hash table, one may use the key directly as the index into an array of values. Note that there are no collisions in this case.

(this is quoted from Wikipedia). What does the number of bits have to do w/ performance?
Last edited on
closed account (3TXyhbRD)
I looked a bit on your code, your container isn't really well designed as it accepts only hokey_player entries, when designing containers keep in mind that it should work for any type. STL uses templates to work this out, if you don't want to use classes, but structs instead, pointer to void is a good alternative (and it works in C as well, if implemented correctly).

Also, if each key is represented by a small enough number of bits, then, instead of a hash table, one may use the key directly as the index into an array of values. Note that there are no collisions in this case.

Any type is represented with the use of bytes (that are made out of bits) inside a machine. Having a small amount of bytes to represent all keys means that |U| = constant and in this case your hash table can be a |U| length array (with indexes from 0), assuming the keys also start from 0. If a key already exists in your hash table then the index equal to the key in your hash table (that array) is true (meaning occupied), otherwise false (meaning free), if you're using a bool array.

There isn't a collision because every element contained in the U set (key universe, all possible keys) have a reserved index in the hash table. Also the check if a key exists is done in Θ(1).

See count sort, it uses the same idea.
Last edited on
I'm just experimenting with hash tables since I never build them before, but I'll keep that in mind to not restrict it to hold a certain type (hockey_player). Does anyone know what causes my trivial sc_put (separate chaining put/insert) function to not compile when I call it?

For my second post, I'm building a hash table via separate chaining using linked lists ( I read it can be done with trees since its using pointers but I'll stick w/ linked lists for now).

My hash table insert (i.e. sc_put) function seems like a trivial implementation using separating chaining, but I can't figure out what's wrong with the code preventing it from running. My hash table insert function simply inserts @ front of list btw. As you can see (from the struct in first post), I'm using a doubly linked lists. Can someone also explain why using a dll (doubly linked list) is more efficient than a sll (singly linked list). I know there is extra "work" to have to keep track of a previous pointer in order to use it to "link past" the current pointer like in a delete operation, but isn't there the obvious overhead of the memory occupied by need to reserve space for a back pointer for each dll node? In addition, for bidirectional, it's possible w/ a sll to just keep track of a tail pointer so two pointers (head pointer and tail pointer) and tail pointer will initially point to what head pointer points to when only one node in list, but > 1 nodes, only head pointer gets updated (assuming simply insert @ front of list (not ordered in other words) ).

Sorry to be longwinded, but I know performance is based on the load factor: num elems/table size, BUT is is num elems for a given bucket/table size OR is it ALL elems/table size that dictates performance?

Here is my bool is_duplicate function and it's not working too as I commented out my sc_put function (hash table insert in other words) and the compiler complains as such:


hash_fcn_and_tables.cpp: In function 'bool is_duplicate(hockey_player*, int, int
, int, std::string, std::string)':
hash_fcn_and_tables.cpp:50: error: no match for 'operator==' in '*(bucket + ((un
signed int)(((unsigned int)cur_bucket) * 20u))) == 0'
hash_fcn_and_tables.cpp:53: error: cannot convert 'hockey_player' to 'hockey_pla
yer*' in initialization


Note: below I have placed the code lines from my actual file, why does it complain about the operator '==' which I am using properly in the if statement below...
1
2
3
4
5
6
7
8
9
10
11
/*47*/bool is_duplicate(hockey_player* bucket, int cur_bucket, int size, int num_, string name_, string team_)
/*48*/{
/*49*/	bool isDuplicate = false;
/*50*/	if ( bucket[cur_bucket] == NULL )//if list is empty, of course not duplicate
/*51*/		return isDuplicate;
/*52*/		
/*53*/	for ( hockey_player* cur = bucket[cur_bucket]; cur != NULL; cur = cur->next )/**Else traverse bucket to find if itm to insert is a duplicate or not*/
/*54*/		if ( cur->num == num_ && cur->name == name_ && cur->team == team_ )
/*55*/			return isDuplicate && false;
/*56*/	return isDuplicate;
/*57*/}


I have checked online and it seems the implementations are either using an existing library like STL maps for C++ or Java's built in, but I would like to learn to practice building trivial hash table via separate chaining in this case just as I realize linked lists also have libraries but I can easily create singly and doubly linked lists etc.

I hope someone can take a look, the code is very trivial for a hash table implementation and that's all I want, to get experience before using C++ STL maps for eg.
Last edited on
I tried to code the bool is_duplicate function in main and it works, so why would the same code not run inside the function. So this is what I tested in main (I kept the newGuy_1 and newGuy_2 and newGuy_3 I added is identical item to newGuy_2 to test if the code works, it's very trivial so I can't see why the same code fails to run inside separate function, rather than in main.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
int main()
{
   //try to insert newGuy_3 
   hockey_player* newGuy_3 = new hockey_player;
   newGuy_3->num = 9;
   newGuy_3->name = "Nail Yakupov";
   newGuy_3->team = "EDM";
   newGuy_3->back = NULL;
	
   int cur_bucket = newGuy_3->num % size_;
   bool isDuplicate = false;
	
   if ( bucket[cur_bucket] == NULL )//if list is empty, of course not duplicate
	cout << "Empty list so cannot be a duplicate" << endl;
		
   for ( hockey_player* cur = bucket[cur_bucket]; cur != NULL; cur = cur->next )/**Else traverse bucket to find if itm to insert is a duplicate or not*/
	if ( cur->num == newGuy_3->num && cur->name == newGuy_3->name && cur->team == newGuy_3->team )
	{
		cout << "Duplicate" << endl;
		isDuplicate = true;
		break;
	}
 return 0;


I had thought maybe it was to do with having to pass pointers to pointers, but seeing as bucket (in main) is an array of pointers, I am modifying what the pointers point to which are within the original bucket array...
Last edited on
Topic archived. No new replies allowed.