Question about STL

Pages: 12
Hi all,

I want to get numbers from vector SrcVec, and store in a new vector DestVec,
but if the remove the duplicated numbers.
like SrvVec: 1,2,3,4,5,2,3
And the DestVec should like this: 1,2,3,4,5

How to implement this by using STL?
Thanks!

Here is what I did:
1
2
3
4
5
6
7
8
for (vector<int>::iterator it = inputNumbers.begin(); it != inputNumbers.end(); it++)
{
vector<int>::iterator  it2 = find(targertNumbers.begin(), targertNumbers.end(), *it);
	if (it2 == targertNumbers.end())
	{
	  targertNumbers.push_back(*it);
	}
}


I need your help to implement more elegantly.
Last edited on
You can always do a few things like using a range based for loop to make it more readable:
1
2
3
4
5
for(auto& x: inputNumbers) {
    if(!std::count(targetNumbers.begin(), targetNumbers.end(), x)) {
        targetNumbers.push_back(x);
    }
}

Also, please wrap your code in code tags (highlight it and use the <> format button), it makes it more readable. For the record, !std::count() and std::find == vector.end() are the same, but I prefer count because it is shorter and doesn't require an entire comparison statement.
Last edited on
If you don't mind the order being sorted you could first sort the vector and then use std::unique to get rid of the repeated values.

1
2
3
targertNumbers = inputNumbers;
std::sort(targertNumbers.begin(), targertNumbers.end());
targertNumbers.erase(std::unique(targertNumbers.begin(), targertNumbers.end()), targertNumbers.end());

http://en.cppreference.com/w/cpp/algorithm/sort
http://en.cppreference.com/w/cpp/algorithm/unique

shadowmouse wrote:
For the record, !std::count() and std::find == vector.end() are the same, but I prefer count because it is shorter and doesn't require an entire comparison statement.

The disadvantage of using std::count is that it always has to search the whole vector.
Last edited on
@shadowmouse:
Thanks for your advice. And are there any other solutions?
@Peter87
Thank you all the same,but I need to keep the order of the numbers.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
#include <iostream>
#include <vector>
#include <unordered_set>

int main()
{
    const std::vector<int> srce { 1, 4, 1, 3, 5, 4, 2, 3, 1, 2, 3 } ;
    std::vector<int> dest ;

    {
        // retain the order of non-duplicated elements: try inserting each number in srce into a set
        //                          if successful (the number was not seen earlier) append it to dest
        std::unordered_set<int> set ;
        for( int v : srce ) if( set.insert(v).second ) dest.push_back(v) ;
    }

    for( int v : dest ) std::cout << v << ' ' ;
    std::cout << '\n' ;
}

http://coliru.stacked-crooked.com/a/22e71fc3e14517a8
Last edited on
There are other ways, such as using an std::set or various other, fairly large functions, but I'd say they're overkill, though if you want to look into them, just google c++ remove duplicates from vector, there are a lot of posts on forums like stack overflow about it. And as Peter87 said, while count and find have approximately the same performance when the object you're looking for doesn't exist, find is faster when it does, so I'd stick to using find, although there's no need to create an iterator to store the result and then check it, rather than just checking the result in the if statement itself.
1
2
3
4
5
for(auto& x: inputNumbers) {
    if(std::find(targetNumbers.begin(), targetNumbers.end(), x) == targetNumbers.end()) {
        targetNumbers.push_back(x);
    }
}
Last edited on
> while count and find have approximately the same performance
> when the object you're looking for doesn't exist, find is faster when it does

I got curious about this, and got interesting results: while both take quadratic time,
std::count may be faster than std::find for this (because of loop vectorisation?)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#include <iostream>
#include <vector>
#include <unordered_set>
#include <algorithm>
#include <ctime>

int main()
{
    const int N = 100'000 ;
    std::vector<int> srce ;
    for( int i = 0 ; i < N ; ++i ) srce.push_back( i % (N/2) ) ;

    {
        const auto start = std::clock() ;
        std::vector<int> dest ;
        std::unordered_set<int> set ;
        for( int v : srce ) if( set.insert(v).second ) dest.push_back(v) ;
        const auto end = std::clock() ;
        std::cout << "    set O(N): " << (end-start) * 1000.0 / CLOCKS_PER_SEC << " milliseconds\n" ;
    }

    {
        const auto start = std::clock() ;
        std::vector<int> dest ;
        for( int v : srce ) if( std::find( dest.begin(), dest.end(), v ) == dest.end() ) dest.push_back(v) ;
        const auto end = std::clock() ;
        std::cout << " find O(N*N): " << (end-start) * 1000.0 / CLOCKS_PER_SEC << " milliseconds\n" ;
    }

    {
        const auto start = std::clock() ;
        std::vector<int> dest ;
        for( int v : srce ) if( std::count( dest.begin(), dest.end(), v ) == 0 ) dest.push_back(v) ;
        const auto end = std::clock() ;
        std::cout << "count O(N*N): " << (end-start) * 1000.0 / CLOCKS_PER_SEC << " milliseconds\n" ;
    }
} 

--------- clang++/libc++ -----------

    set O(N): 0 milliseconds
 find O(N*N): 2910 milliseconds
count O(N*N): 2240 milliseconds

--------- g++/libstdc++ ------------

    set O(N): 10 milliseconds
 find O(N*N): 1960 milliseconds
count O(N*N): 1570 milliseconds

http://coliru.stacked-crooked.com/a/bbbdf230b1e86d51

Microsoft:
    set O(N): 14 milliseconds
 find O(N*N): 2207 milliseconds
count O(N*N): 1254 milliseconds

http://rextester.com/EAOW59435
Last edited on
Apparently depends on where you run it.. Here on a Xeon L5520 @ 2.27GHz

g++ 5.3.1 (-O3 -march=native -std=c++14)
    set O(N): 0 milliseconds
 find O(N*N): 1250 milliseconds
count O(N*N): 1820 milliseconds

clang++ 3.8.0 (-O3 -march=native -std=c++14, with GNU libstdc++)
    set O(N): 0 milliseconds
 find O(N*N): 1260 milliseconds
count O(N*N): 1390 milliseconds

intel 16.0.0 (-Ofast -xHost -std=c++14, same GNU libstdc++)
    set O(N): 0 milliseconds
 find O(N*N): 760 milliseconds
count O(N*N): 1370 milliseconds


and on a E5-2660 @ 2.20
gcc:
    set O(N): 0 milliseconds
 find O(N*N): 1190 milliseconds
count O(N*N): 1610 milliseconds
clang:
    set O(N): 0 milliseconds
 find O(N*N): 1180 milliseconds
count O(N*N): 1290 milliseconds
intel:
    set O(N): 0 milliseconds
 find O(N*N): 630 milliseconds
count O(N*N): 1280 milliseconds


(nice demo for why Intel compilers aren't extinct)
Last edited on
[edit -- oops!]
You know, there's an O(N) standard algorithm that does this: unique_copy().
Time-wise, it blows everything else here out of the water.

1
2
3
4
5
6
7
8
    {
        const auto start = std::clock() ;
        std::vector<int> dest( srce.size() ) ;
        auto dend = std::unique_copy( srce.begin(), srce.end(), dest.begin() ) ;
        dest.erase( dend, dest.end() ) ;
        const auto end = std::clock() ;
        std::cout << "unique O(N): " << (end-start) * 1000.0 / CLOCKS_PER_SEC << " milliseconds\n" ;
    }
Last edited on
Except for the little detail were the container needs to be sorted.
[edit -- oops, understood]
Except for the little detail that the OP did not specify any need to sort.

Peter87 suggested it in his solution, and his is the only one presented here that performs any sorting.
Last edited on
`unique_copy()' requires sorting, and the OP specified
I need to keep the order of the numbers.
unique_copy:
3 3 3 1 1 1 4 4 4 2 2 2
unique_copy
3 1 4 2


3 1 4 2 1 2 3 3 4 4 1 2
unique_copy
3 1 4 2 1 2 3 4 1 2

http://coliru.stacked-crooked.com/a/902addd4fe580fca
Hi all,

I try to implement the function like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
void GetTargetNumbers(vector<int> & inputNumbers, vector<int> & targetNumbers)
{
	for_each(inputNumbers.begin(), inputNumbers.end(),
		[&, targetNumbers](int i) {
		auto  it = find(targetNumbers.begin(), targetNumbers.end(), i);
		if (it == targetNumbers.end())
		{
			cout << "Not Found,Add " << i << endl;
			targetNumbers.push_back(i);
		}
		else
		{
			cout << "Found,Skip " << i << endl;
		}
	});		
}


But the compiler(vs2015) says there is an error in targetNumbers.push_back(i);
Did I make something wrong?
Thanks!

Last edited on
You want to modify targetNumbers inside the lambda expression, capture it by reference.

1
2
3
4
5
6
7
// void GetTargetNumbers(vector<int> & inputNumbers, vector<int> & targetNumbers)
void GetTargetNumbers( const vector<int> & inputNumbers, vector<int> & targetNumbers)
{
	for_each(inputNumbers.begin(), inputNumbers.end(),
		// [&, targetNumbers](int i) {
                [&](int i) { // capture all variables by reference
 // ... 
@JLBorges
I want pass targetNumbers to the lambda expression, and deal with it like an argument.
But I don't know how make it work.
1
2
3
4
5
6
7
8
9
void GetTargetNumbers( const std::vector<int>& inputNumbers, std::vector<int>& targetNumbers )
{
    std::for_each( inputNumbers.begin(), inputNumbers.end(),
                    [&](int i)
                    {
                        const auto it = std::find( targetNumbers.begin(), targetNumbers.end(), i );
                        if( it == targetNumbers.end() ) targetNumbers.push_back(i);
                    } ) ;
}
Hi JLBorges,
You modify [&, targetNumbers](int i) to [&](int i),then the lambda expression can use targetNumbers freely.
And you still add const before const std::vector<int>& inputNumbers.
> then the lambda expression can use targetNumbers freely.

The lambda expression should be able to modify targetNumbers (use targetNumbers freely) for targetNumbers.push_back(i) to be viable.


> And you still add const before const std::vector<int>& inputNumbers.

The lambda expression (or the function itself) does not need to modify inputNumbers in any way.
Then what's the difference between [&, targetNumbers](int i) and [&](int i),
Does [&, targetNumbers](int i) pass targetNumbers like const targetNumbers ?
I am confused about this.
Thanks!
Pages: 12