Use of container of list iterators

Often, it is necessary to search a list. The list does not allow binary search or hash search. However, we can store list iterators in a unordered_map (e.g. C++0x) and do the search there.

Here goes a simplified implementation, which I hope it is useful:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
#pragma once
#include <map>
#include <unordered_map>
#include <vector>
#include <list>
#include <iostream>
using namespace std;
/* this is a const_iterator to travel along a bmap */
template<typename T>
struct slistit
{
public:
    slistit(const typename list<T>::const_iterator& rhs):index(rhs){}
    slistit(const slistit<T>& rhs):index(rhs.index){}
    bool operator==(const slistit&rhs) const
    {
        return index==rhs.index;
    }
    bool operator!=(const slistit&rhs) const
    {
        return index!=rhs.index;
    }
    T operator*() const
    {
        return *index;
    }
    typename list<T>::const_iterator & operator++() // prefix
    {
        ++index;
        return index;
    }
    typename list<T>::const_iterator operator++(int gash) // postfix
    {
        typename list<T>::const_iterator tmp=index;
        index++;
        return tmp;
    }
protected:
    typename list<T>::const_iterator index;
};
/* this is a bmap: a combination of a sorted and non-sorted list */
template<typename T>
struct slist
{
protected:
    friend struct slistit<T>;
    unordered_map<T,typename list<T>::iterator > m;
    list<T> n;
public:
    typedef slistit<T> const_iterator;
    bool empty() const
    {
        return n.empty();
    }
    unsigned size() const
    {
        return n.size();
    }
    slist(){}
    ~slist()
    {
        n.clear();
        m.clear();
    }
    slist(const slist<T>&rhs)
    {
        n=rhs.n;
        for(auto it=n.begin();it!=n.end();++it)
            m[*it]=it;
    }
    slist<T>& operator=(const slist<T>&rhs){
        if(this!=&rhs)
        {
            n.clear();
            m.clear();
            n=rhs.n;
            for(auto it=n.begin();it!=n.end();++it)
                m[*it]=it;
        }
        return(*this);
    }
    bool operator==(const slist<T>&rhs)
    {
        if(this!=&rhs)
        {
            if(n.size()==rhs.n.size())
            {
                auto it=m.begin();
                auto jt=rhs.m.begin();
                for(;(it!=m.end())&&(jt!=rhs.m.end());it++,jt++)
                {
                    if(it->first!=jt->first)return(false);
                }
            }
            else
                return(false);
        }
        return(true);
    }
    bool operator!=(const slist<T>&rhs)
    {
        return(!*this==rhs);
    }
    bool operator<(const slist<T>&rhs)
    {
        if(this!=&rhs)
        {
            if(n.size()==rhs.n.size())
            {
                auto it=m.begin();
                auto jt=rhs.m.begin();
                for(;it!=m.end()&&jt!=rhs.m.end();it++,jt++)
                {
                    if(it->first<jt->first)return(true);
                    if(it->first>jt->first)return(false);
                }
            }
            else
                return(n.size()<rhs.n.size());
        }
        return(false);
    }
    void insert(const T& t)
    {
        if(m.find(t)==m.end())
        {
            n.push_back(t);
            auto it=n.end();
            it--;
            m[t]=it;
        }
    }
    void erase(const T& t)
    {
        if(m.find(t)!=m.end())
        {
            auto it=m[t];
            n.erase(it);
            m.erase(t);
        }
    }
    void replace(const T& told,const T& tnew)
    {
        if(m.find(told)!=m.end())
        {
            auto it=m[told];
            *it=tnew;
            m[tnew]=it;
            m.erase(told);
        }
    }
    void clear()
    {
        n.clear();
        m.clear();
    }
    const_iterator begin() const
    {
        return const_iterator(n.begin());
    }
    const_iterator end() const
    {
        return const_iterator(n.end());
    }
    vector<T> tovector() const
    {
        vector<T> res;
        for(auto it=n.begin();it!=n.end();++it)
            res.push_back(*it);
        return(res);
    }
    vector<T> tosortedvector() const
    {
        vector<T> res;
        for(auto it=m.begin();it!=m.end();++it)
        {
            res.push_back(it->first);
        }
        return res;
    }
};
template<typename T>
ostream& operator<<(ostream&os,const slist<T>& rhs)
{
    os<<rhs.size()<<" ";
    for(auto it=rhs.begin();it!=rhs.end();++it)
        os<<*it<<" ";
    os<<endl;
    return(os);
}
template<typename T>
istream& operator>>(istream&is,slist<T>& rhs)
{
    rhs.clear();
    unsigned size;
    is>>size;
    for(unsigned it=0;it!=size;++it)
    {
        T val;
        is>>val;
        rhs.insert(val);
    }
    return(is);
}
Wrong. A std::list most certainly does allow a binary search. You an use the binary_search and lower_bound algorithms with list iterators. Those algorithms do not require random access iterators, only forward iterators. Although the iterations must be linear the comparisons are still logarithmic as the documentation of those algorithms clearly articulates.
No! you are wrong. A list only allows a binary search IF it is sorted.

If it is NOT sorted, you have to sort it, which can become expensive if there are a lot of insertions and removals. You have to sort every time you insert or remove OR when you need to perform a binary search. In addition, you lose the initial ordering of insertion, which may be undesirable. These two problems disappear with the above implementation.


The manuals are very explicit:

http://www.sgi.com/tech/stl/binary_search.html

1
2
3
4
5
Preconditions

For the first version:
[first, last) is a valid range.
[first, last) is ordered in ascending order according to operator<. That is, for every pair of iterators i and j in [first, last) such that i precedes j, *j < *i is false.



In addition, I've inserted both map and unordered_map since some operations are faster in one or the other.
Last edited on
See boost::multi_index_container.

Yes, I am aware of that (infinitely better) solution in Boost. Mine is just a hack, and actually a poor one. However, I tend to have a hard time understanding the headers of Boost and feel that C++ is becoming a very elitist and often ridiculously complex language. I start to be attracted by the Objective-C language and its dynamism and syntax simplicity. However, my pathetic hack would probably be impossible with Objective-C. In addition, I tried it but miss the overloads and the self-contained completion of C++. Perhaps a hybrid language will emerge soon. C#4.0 has now the "dynamic" features, but I am very uncomfortable about the syntax.
Has anyone thought about this discussion of inserting multimethods OR binary search in the member selection (currently going on on comp.lang.c++.moderated) OR highly advanced visitor pattern is just exposing lacunae in C++? Adding more syntax to C++ will be a suicide, I suspect.


Returning to my hack, please just use std::map above. The tosortedvector member function doesn't make sense with the std::unordered_map.
Last edited on
<troll mode on>
Objective-C. LOL. A language for writing applications that "fart" for $1. If Apple didn't make it the preferred choice for iOS development, almost nobody would use it.
</troll mode off>

Dynamic languages scale worse than statically typed languages. It is fast to write a simple 100 line app in them, but the larger the application is, the slower it is to develop, and finally it is much slower to develop than a properly designed C++ app. Maybe Objective-C is different, but in Python this is a huge problem (and from what I've heard Python is much cleaner and better designed language than Objective-C - because it didn't aim at being compatible with something older, e.g. C).

Lack of static typing means you usually have to provide unit tests for every case the compiler would normally check for you (or type errors will manifest as runtime errors, and there will be many, providing good half of errors detected by my compiler in my programs *are* type errors). This is lots of work, and slows down the development. Additionally Objective-C is not a speed-daemon. The OOP part of the language is probably an order of magnitude (or sometimes two) slower than in statically typed languages. I doubt a C++ programmer would like Objective-C (would you leave the messy, but powerful C++'s template type system, behind?).

Last edited on
I touched Objective-C once.

The next time I touched it was with an 8 light-minute pole. [/personalfeelings]

-Albatross
Last edited on
Yes, the boost headers can be formidable to understand, but it's not really any fault of the library developer.

The best way to learn the boost library is to read the documentation on www.boost.org. Usually you get enough information to understand how to use the library without having to resort to looking at the source.
I doubt a C++ programmer would like Objective-C

I like it. :0P

I also like C# but I hate the .net framework.
I agree, C# is actually a rather nice language in terms of its syntax and such, and with the addition of Mono, it actually becomes usable as a language.
Yeah I used C# a bit and it wasn't horrible, it was only horrible that we had to mix C++/C#/.net. Ugh that was a huge pain converting crap back and forth...
IMHO C# got better generic type system than C++. E.g. there is no way to express higher-kinded types in C++. I mean saying
"this function takes a vector<T> where T is a subclass of some other class." Due to this fact, you have almost no support for code completion when editing generic code (the IDE sees only some abstract T, it does not know what are its fields and methods).
If you had such IDE support for C++ as there is in C#, coding in C++ would be much more pleasant.
Last edited on
I'm going to play with C#/Mono for a bit...
No! you are wrong. A list only allows a binary search IF it is sorted.


As do all sequence containers. In your initial post you made it seem like a list can never be searched using a binary_search. I recommend doing a better job of explaining what your solution is for and how it is used. There is practically no reasonable documentation that would allow someone to know what this is for and sorting a list isn't always inefficient. It depends on what is in the list. If you use the lower_bound algorithm you can insert into the list such that it is always maintained in the proper order. You can also use a std::set which takes care of that automatically.
Last edited on

a list can never be searched using a binary_search.


I think he claimed that a list can never be searched efficiently using a binary_search. The complexity is still O(n) compared to searching in a tree O(log n) or a hash table O(1).
Kempofighter:

Perhaps this is clear:

A set is ordered by construction (usually a Red-Black tree) but the order of insertion becomes unknown.
A map is also ordered by construction (the keys are) but the same applies.

A list or a vector are generally not ordered by construction but can be filled to retain the order of insertion, and this is what our friend push_back makes . In a list you can insert using lower_bound. But you lose the intended position.

Before using binary search, you have to order the list or vector.
All Standard Library manuals explicitly state that requirement.

That is why, our good friends contributing to boost have this nice boost::multi_index container, which can be overkill for some applications.

An example:

You have a polygon p with N_p nodes. Each node is identified by an unsigned number I_i with i=0,N_p-1. For each node I_i you have 2 coordinates X[I_i] and Y[I_i].

A specific quadrilateral can be given as (e.g.): I_0,I_1,I_2,I_3 with I_0=3, I_1=1, I_2=5, I_3=99. Since the order of insertion is important (some orders 3,1,99,5, 3,99,1,5, etc give rise to different quadrilaterals with crossed edges) you cannot build it using a set since it may result in a different quadrilateral.

However, you MAY want to know if a quadrilateral with nodes 1,3,5,99 exists, regardless of the positions. You may also want to know if node 99 belongs to a given quadrilateral using binary search etc.

You may wish to remove a node and obtain a triangle. Or add a node and obtain a pentagon.

To achieve this, you can (or should) keep two lists: one corresponding to your intended order and another to perform the binary search.

So what do you do? Use a list and a set. However, if you just duplicate the quantities, you still have to perform a LINEAR search in the list or perform an ordering of the list before the search every time a new value is inserted or removed. The solution is to store the list iterators in the set. Yes, they keep their values with insertions and removals.

That way you keep the values ordered for searching AND the original intended order.

As stated before, that is why boost::multi_index exists.

And every manual mentions this property of list iterators and we take advantage of it.

P. Areias
Last edited on
I think he claimed that a list can never be searched efficiently using a binary_search. The complexity is still O(n) compared to searching in a tree O(log n) or a hash table O(1).


The complexity of iteration might be linear but the complexity of comparisons is still logarithmic. Anyway I appreciate the updated explanation by the OP. The original post seemed vague to me and I was not clear on the purpose of the program. The original statement that a list doesn't allow binary search is incorrect by itself. You have to qualify that statement with some details in order for it to be accurate which the OP eventually did.
Thank you. I am still improving the C++ abridged card, thanks to the feedback. A lot of what is going on on C++.moderated is also very useful since we can "feel" that C++ seems to be syntactically dividing in two: modern C++0x with auto, decltype, etc and the ancient C++98. The sensation is similar to what happened to Fortran in the 77/90 transition era. Guessworking, I would be rather surprised if a new source extension wouldn't come up to facilitate the parsing.
Topic archived. No new replies allowed.