Hash join

I have two relations, A and B, both with all-integer attributes (A {a1, a2, a3, ...}, B {b1, b2, b3, ...}). How would I hash-join these two in C++? The user will pick the two joining attributes.

What data structure should I use for this? I don't know how to start.

I have only played with an example and tried to do it with arrays, but I don't see it working like this.

Note: the below code wasn't [4][4] in size

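#include <cstdlib>   // system
#include <iostream>
#include <string>
using namespace std;

int main()
{
	// First relation: each row holds a number and a name
	string R1[5][2] = {
		{"27", "Jonah"},
		{"18", "Alan"},
		{"28", "Glory"},
		{"18", "Popeye"},
		{"28", "Alan"}
	};

	// Second relation: the first column holds names that also appear in R1
	string R2[5][2] = {
		{"Jonah", "Whales"},
		{"Jonah", "Spiders"},
		{"Alan", "Ghosts"},
		{"Alan", "Zombies"},
		{"Glory", "Buffy"}
	};

	// No join is performed yet; this only sets up the data
	system("PAUSE");   // Windows-only: keeps the console window open
	return 0;
}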
#include <iostream>
#include <vector>

using namespace std;

int main()
{
    vector<int> A {1, 2, 3, 4, 5};

    vector<int> B {6, 7, 8, 9, 10, 11, 12};

    vector<pair<int, int>> A_merge_B(A.size() < B.size() ? A.size() : B.size());

    for(size_t i = 0; i < A_merge_B.size(); i++)
    {
        A_merge_B[i] = std::make_pair(A[i], B[i]);
    }
    for (auto& elem : A_merge_B)
    {
        std::cout << elem.first << '\t' << elem.second << '\n';
    }

}
@gunnerfunner,

Can you explain the code for me please? If you don't mind.
Two vector<int> objects of different sizes, A and B, are constructed (lines 8 and 10).

A vector<pair<int,int>>, A_merge_B, is declared with as many elements as the smaller of A and B (line 12).

Elements at the same index i are taken from A and B and combined into a pair<int,int> with make_pair() (line 16, right-hand side).

Each pair is then assigned to A_merge_B[i] (line 16, left-hand side).

Finally, each of the pairs is printed out (line 20).

An alternative that might be cheaper for element types that are expensive to copy (std::string, for example), if you don't care about the final states of A and B, is to std::move the elements of A and B directly into A_merge_B: A_merge_B[i] = std::make_pair(move(A[i]), move(B[i])); For int elements, as here, moving is the same as copying, so nothing is gained. For class types, the moved-from elements of A and B are left in a valid but unspecified state after the move.