Hash Function Help

Alright, I am stuck. I need a hash function for this class:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
#ifndef CONNECTOR_H
#define CONNECTOR_H
#include <memory>
#include <utility>
#include <unordered_set>
#include <unordered_map>

class Vertex;

class Connector
{
    public:
        Connector();
        Connector(Vertex*,Vertex*);
        ~Connector();
        Connector(const Connector& other);
        Connector& operator=(const Connector& other);
        std::pair<Vertex*,Vertex*> vertex_pair;
        unsigned int get_state() const {return state;}
        void set_state(unsigned int state) {this->state = state;}
        void coboundary(const unsigned int);
        void update_pointers(std::unordered_multimap<Vertex*,Connector*>,Vertex*,Vertex*);
    protected:
    private:
        unsigned int state;
};

namespace std
{
    template <>
        class hash<Connector>{
        public :
        size_t operator()(const Connector &c ) const{
            return hash<Vertex*>()(c.vertex_pair.first) ^ hash<Vertex*>()(c.vertex_pair.second);
        }
    };
}
#endif // CONNECTOR_H 


The current one, which uses vertex_pair, will not work (it causes a segmentation fault when you attempt to delete a different class that contains an unordered_set of connectors). In other words, I need a way to set up this hash function based on something else- perhaps an integer of some sort given to it randomly? I have no idea.
> perhaps an integer of some sort given to it randomly?

Give a unique integer id to each vertex and each edge; provide a lookup mechanism by which we can get from id to object. Get rid of these pointers which are difficult to maintain (How does one manage to reset all these scattered pointers to an object, when he object's life-time is over?); identify objects by their unique numeric id.

Something along these lines:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
#include <iostream>
#include <unordered_map>
#include <unordered_set>
#include <stdexcept>

template < typename T > class id_helper ;
template < typename T > struct id_
{
    operator std::size_t() const { return value ; }
    const std::size_t value ;

    private:
        id_( std::size_t value ) : value(value) {}

    friend id_helper<T> ;
};

template < typename T > struct id_helper
{
    public:
        std::size_t id() const { return unique_id ; }
        static T* look_up( std::size_t id )
        {
            auto iter = object_map.find(id) ;
            return iter == object_map.end() ? nullptr : iter->second ;
        }

    private:
        id_helper() { object_map[unique_id] = static_cast<T*>(this) ; }
        ~id_helper() { object_map.erase(unique_id) ; }

        static std::unordered_map< std::size_t, T* > object_map ;
        static std::size_t next ;
        id_<T> unique_id{ ++next };

    friend T ;

};

template < typename T > std::unordered_map<std::size_t,T*> id_helper<T>::object_map ;
template < typename T > std::size_t id_helper<T>::next = 0 ;

class Vertex ;

struct Connector : id_helper<Connector>
{
    Connector( std::size_t vertex_from, std::size_t vertex_to ) : vertex_pair{vertex_from,vertex_to} {}
    const std::pair<std::size_t,std::size_t> vertex_pair;

    // ...

};

struct Vertex : id_helper<Vertex>
{
    ~Vertex() ;
    void add_connection( std::size_t connector_id ) ;
    void remove_connection( std::size_t connector_id ) ;
    std::size_t valency() const { return connection_set.size() ; }

    // ...

    private:
        std::unordered_set<std::size_t> connection_set;
        Vertex* adjacent( std::size_t connector_id ) const ;
        // ...
};

Vertex* Vertex::adjacent( std::size_t connector_id ) const
{
    const auto pc = Connector::look_up(connector_id) ;
    if(pc)
    {
        Vertex* v1 = look_up( pc->vertex_pair.first ) ;
        Vertex* v2 = look_up( pc->vertex_pair.second ) ;
        if( v1 != this && v2 != this ) throw std::logic_error("bad connector") ;
        return v1 == this ? v2 : v1 ;
    }
    return nullptr ;
}

void Vertex::add_connection( std::size_t connector_id )
{
    Vertex* adjacent_vertex = adjacent(connector_id) ;
    if(adjacent_vertex)
    {
        connection_set.insert(connector_id) ;
        adjacent_vertex->connection_set.insert(connector_id) ;
    }
}

void Vertex::remove_connection( std::size_t connector_id )
{
    connection_set.erase(connector_id) ;
    Vertex* adjacent_vertex = adjacent(connector_id) ;
    if(adjacent_vertex) adjacent_vertex->connection_set.erase(connector_id) ;
}

Vertex::~Vertex()
{
    for( auto connector_id : connection_set )
    {
        auto adjacent_vertex = adjacent(connector_id) ;
        if(adjacent_vertex) adjacent_vertex->connection_set.erase(connector_id) ;
    }
}

int main()
{
    Vertex v1, v2, v3 ;
    std::cout << v1.valency() << ' ' << v2.valency() << ' ' << v3.valency() << '\n' ;

    Connector c1( v1.id(), v2.id() ) ;
    v1.add_connection( c1.id() ) ;
    std::cout << v1.valency() << ' ' << v2.valency() << ' ' << v3.valency() << '\n' ;

    Connector c2( v1.id(), v3.id() ) ;
    Connector c3( v1.id(), v3.id() ) ;
    v3.add_connection( c2.id() ) ;
    v1.add_connection( c3.id() ) ;
    std::cout << v1.valency() << ' ' << v2.valency() << ' ' << v3.valency() << '\n' ;

    v1.remove_connection( c1.id() ) ;
    v3.remove_connection( c2.id() ) ;
    v3.remove_connection( c3.id() ) ;
    std::cout << v1.valency() << ' ' << v2.valency() << ' ' << v3.valency() << '\n' ;
}

http://coliru.stacked-crooked.com/a/31fc156cbe68bebb
You... managed to do what I was trying to do in a much better way. Thank you- I hope you don't mind me using it. I still have to add one or two modifications- give the vertex and connector objects states, and make this all able to be displayed and be interactive- but otherwise, damn this is far better than mine.
Instead of using a hashtable, why not use array?
> Instead of using a hashtable, why not use array?

I can enumerate a few reasons for favouring a hashtable over an array:
resizeable, elides duplicate entries, amortised constant-time look up.

What are your reasons for favouring an array over a hashtable?
Last edited on
Since OP wants a hash function, therefore implying hashtable, this answer is not really applicable to the question, but as for my reasons for favoring array:

Your method simply involves assigning an incrementing value to each vertex, you can just use those values to index into an array to identify each vertex. To handle connections, depending on if the graph is directed or not, you can choose between a balanced tree or disjoint-set data structure ( http://en.wikipedia.org/wiki/Disjoint-set_data_structure ).

I try to stay away from hashtables because they tend to trade space for efficiency. In some cases this might not be noticeable and actually encouraged but in a something like a graph, hashing every vertex and every connection is calling for trouble as the density of the graph increases.

I suggest balanced tree for directed graph connections but this could be replaced by arrays for small graphs. If performance becomes an issue (i.e. density of graph increases), the storage can be upgraded to a balanced tree (RB tree for example).
Disjoint-set is good for undirected graphs not only because of the performance (with optimizations can perform at amortized O(1)) but because it uses the same space as an array but can perform as well as (if not better than) a balanced binary tree. Another reason why I suggest this is because everything in a connected component (a subset of a disjoint-set) is connected (has an edge) to each other, and this is not desirable in a directed graph

Implementation of disjoint-set (in python)
https://drive.google.com/file/d/0B3YhWUWL5lcgOG9XcU8tTDRiZVU/edit?usp=sharing
Last edited on
> Since your method simply involves assigning an incrementing value to each vertex,
> you can just use those values to index into an array to identify each vertex.

And what would I do when existing vertices and edges are destroyed and new ones are created? Let this array keep growing ad infinitum?


> I try to stay away from hashtables because they tend to trade space for efficiency.

The ones in the standard library do provide max_load_factor() and rehash().
And I would refrain from using rehash() till I know that memory is actually a constraint.

How much memory do you think a hashtable of about one million vertices would take? Without any rehash?
32 MB or so on a 64-bit architecture. http://coliru.stacked-crooked.com/a/0bc96efe4d5dd160


> hashing every vertex and every connection is calling for trouble as the density of the graph increases.

Precisely what kind of trouble?


Last edited on
Topic archived. No new replies allowed.