check vector element within another vector

I had to write a code which checks elements of vector A within vectors B
such as I choose int just demonstration, it might be string, object etc:
the content of vector B has :

4,4,3,1,2,4,2,0,5,8,4,4,3,1,1,1,1,8,4,4,3,0,0,8,4

and the content of vector A is : 8,4,4,3

the problem here is to find the position of vector A within vector B
so the output should be as follows :
output :
0 = 4,4,3
9 = 8,4,4,3
17 = 8,4,4,3
23 = 8,4

thank you
Given your example, wouldn't the follow also be outputs?
0 = { 4 }
1 = { 4 }
5 = { 4 }
11 = { 4 }
...
0 = { 4, 4 }
11 = {4, 4 }

You're asking to find the position of every substring of vec_A that's in vec_B?
Is that correct?
Last edited on
the problem here is to find the position of vector A within vector B
so the output should be as follows :

The 4,4,3 and 8,4 do show that you allow partial matches.
However, the 2,4,2 does not produce: 5 = 4

The question is thus, what is the exact rule for a partial match?

Furthermore, if the A has a repeat, like 4,3,4,3 and the B has:
4,3,4,3,4,3,4,3
Which of these are correct:
0 = 4,3
0 = 4,3,4,3
2 = 4,3,4,3
4 = 4,3,4,3
6 = 4,3


Nevertheless, lets put the vectors side-by-side -- "aligned", if you will -- with a little "offset":
*,*,*,4,4,3,1,2,4,2,0,5,8,4,4,3,1,1,1,1,8,4,4,3,0,0,8,4
8,4,4,3

then move A forward by 1:
  *,*,4,4,3,1,2,4,2,0,5,8,4,4,3,1,1,1,1,8,4,4,3,0,0,8,4
  8,4,4,3

and repeat moving forward, until:
4,4,3,1,2,4,2,0,5,8,4,4,3,1,1,1,1,8,4,4,3,0,0,8,4,*,*,*
                                                8,4,4,3

Among those B.size() + A.size() - 1 "alignments" there were the four expected ones.

You could make a predicate function:
pair<bool,int,int,int> match( const vector<T>& A, const vector<T>& B, int offset );
It returns false, if there is no match.
If a match, then the true offset, and range in A are also interesting.

For example:
1
2
3
4
5
auto foo = match( A, B, -1 );
if ( std::get<0>(foo) ) {
  cout << std::get<1>(foo) << " = ";
  for ( int e = std::get<2>(foo); e < std::get<3>(foo); ++e ) cout << ' ' << A[e];
}

Should print:
0 =  4 4 3



There are probably more efficient approaches.
Reminds me of doing a convolution with zero-padding, nice catch keskiverto, I didn't even notice that only the endpoints are where azdoud's length drops below 4.

So clearly what I"m about to post is not a complete solution for azdoud, but might help him a bit:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// Example program
#include <iostream>
#include <vector>
#include <algorithm>

template <typename T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& vec)
{
    if (vec.size() == 0)
        return os;
        
    for (size_t i = 0; i < vec.size()-1; i++)
    {
        os << vec[i] << ",";   
    }
    return os << vec.back();
}

int main()
{
    std::vector<int> vec_B { 4,4,3,1,2,4,2,0,5,8,4,4,3,1,1,1,1,8,4,4,3,0,0,8,4 };
    std::vector<int> vec_A { 8,4,4,3 };
    
    std::vector<int>::iterator it;
    it = std::search(vec_B.begin(), vec_B.end(), vec_A.begin(), vec_A.end());

    if (it != vec_B.end())
    {
        std::cout << it - vec_B.begin() << " = " << vec_A << std::endl;
    }
    else
    {
        std::cout << "vector not found" << std::endl;   
    }
}
I'm sure someone can do better, but...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
// Example program
#include <iostream>
#include <vector>
#include <algorithm>

using std::size_t;

template <typename T>
std::ostream& print_elements(std::ostream& os, const std::vector<T>& vec, size_t start_index, size_t end_index)
{
    if (vec.size() == 0 || end_index == 0)
        return os;
        
    for (size_t i = start_index; i < end_index-1; i++)
    {
        os << vec[i] << ",";
    }
    return os << vec[end_index-1];
}

template <typename T>
std::ostream& print_elements(std::ostream& os, const std::vector<T>& vec)
{
    return print_elements(os, vec, 0, vec.size());
}

template <typename T>
bool match_begin(std::vector<T> vec_A, std::vector<T> vec_B, size_t A_offset)
{
    for (size_t i = 0; i < vec_A.size() - A_offset; i++)
    {  
        if (vec_A[i + A_offset] != vec_B[i])
            return false;
    }
    return true;
}

//
template <typename T>
bool match_end(std::vector<T> vec_A, std::vector<T> vec_B, size_t A_offset)
{
    size_t start_offset = vec_B.size() - vec_A.size() + A_offset;
    for (size_t i = start_offset; i < vec_B.size(); i++)
    {
        if (vec_A[i - start_offset] != vec_B[i])
            return false;
    }
    return true;
}

template <typename T>
bool match_middle(std::vector<T> vec_A, std::vector<T> vec_B, size_t A_offset)
{
    for (size_t i = 0; i < vec_A.size(); i++)
    {
        if (vec_A[i] != vec_B[i + A_offset])
            return false;
    }
    return true;
}

void print_matches(const std::vector<int>& vec_A, const std::vector<int>& vec_B)
{
    for (size_t i = 1; i < vec_A.size(); i++) // i [ 1, 2, 3 ]
    {
        if (match_begin(vec_A, vec_B, i))
        {
            std::cout << 0 << " = ";
            print_elements(std::cout, vec_A, i, vec_A.size()) << "\n";
        }
    }
    for (size_t i = 0; i < vec_B.size() - vec_A.size(); i++)
    {
        if (match_middle(vec_A, vec_B, i))
        {
            std::cout << i << " = ";
            print_elements(std::cout, vec_A) << "\n";
        }
    }
    for (size_t i = 1; i < vec_A.size(); i++)
    {
        if (match_end(vec_A, vec_B, i))
        {
            std::cout << vec_B.size() - vec_A.size() + i << " = ";
            print_elements(std::cout, vec_A, 0, vec_A.size() - i) << "\n";
        }
    }
}

int main()
{
                         //  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
    std::vector<int> vec_B { 4,4,3,1,2,4,2,0,5,8, 4, 4, 3, 1, 1, 1, 1, 8, 4, 4, 3, 0, 0, 8, 4 };
    std::vector<int> vec_A { 8,4,4,3 };
    
    print_matches(vec_A, vec_B);
}


0 = 4,4,3
9 = 8,4,4,3
17 = 8,4,4,3
23 = 8,4
Last edited on
Instructions unclear... something something got caught in a ceiling fan ;D

Why indeed is he arbitrarily ruling out substring-like subsets with length 1?
And why ignore perfectly good subsets like {4, 4} or {4, 3} which are present in B?
Last edited on
I think keskiverto's interpretation is correct, it nails it down pretty well. OP should be more clear in the future, though.
The question is thus, what is the exact rule for a partial match?


- here the rule is the order of the elements in vector A does not change, if I look for 8,4,4,3 I have to find the exact order.

Furthermore, if the A has a repeat, like 4,3,4,3 and the B has:
4,3,4,3,4,3,4,3
Which of these are correct:
0 = 4,3
0 = 4,3,4,3
2 = 4,3,4,3
4 = 4,3,4,3
6 = 4,3


- here the output should be
0 = 4,3,4,3
4 = 4,3,4,3
Instructions unclear... something got caught in a ceiling fan ;D

Why indeed is he arbitrarily ruling out substring-like subsets with length 1?
And why ignore perfectly good subsets like {4, 4} or {4, 3} which are present in B?


subsets like {4, 4} or {4, 3} we consider them as noise information
I have to accept
0 = 4,4,3 ===> I get it because it is in the front
23 = 8,4 ==> I get it cause it is located at the end

but if they 4,4,3 and 8,4 are in the middle I have to ignore them because they do not match the vector 8,4,4,3

I used vector<int> here just for the explanation but in real life, I deal with objects.
That implies that overlaps of matches are not allowed.

Why is
0 = 4,3,4,3
4 = 4,3,4,3

more acceptable than
0 = 4,3
2 = 4,3,4,3
6 = 4,3

Your first example did show that partial matches are acceptable at least at the beginning and end.

"input X should show Y" is not a rule. It is merely the result of the rule in one case. A rule applies to every input. A program must implement the rule.
azdoud wrote:
0 = 4,4,3 ===> I get it because it is in the front
23 = 8,4 ==> I get it cause it is located at the end

Why do you accept these? 4,4,3 in the front is not necessarily 8,4,4,3. What's the rule ? What if first character was just "3"? Would you accept it because it could be 8,4,4,3 ?

As far as I can tell, the last a.size()-1 characters are misleading for sure, and can lead to false matches. Partial substring matches at the front seem misleading too. For the rest, yeah, std libraries can provide various search/match/find methods.
keskiverto wrote:

That implies that overlaps of matches are not allowed.


imagine vector A like a sequence of frames
vector<frame> = {f1,f2,f3,f4}
and vector B like a stream
vector<frame> = {fr1,fr2,.....,frN}

I accept this because we can consider it as a repetition of the sequence within the stream,

icy wrote :

Why do you accept these? 4,4,3 in the front is not necessarily 8,4,4,3. What's the rule? What if the first character was just "3"? Would you accept it because it could be 8,4,4,3?


I accept them, it's like a part the last frames of the sequence of vector A caught at the beginning of the stream, similarly for the second case, the first frames of the sequences of vector A are caught at the end of the stream.

If the character "3", otherwise the object "frame" in this example we had to catch it too, as I said it represents the last frame of the sequence within the stream, I don't know if you get the idea.

that's why the order of the object is important.

by the way, I enjoy this interaction with you guys, thank you.
;)
Last edited on
I accept them, it's like a part the last frames

Please define last elements. How many is that?

Your A has A.size() elements. How many elements from end of A are required for a match? 1? 2? 50% of A.size()?

How many elements from begin of A are required for a match against the last elements of a stream? 1? 2? 50% of A.size()?

You must have exact, computable rules for these. A number.


Lets say that a match has been found. It starts from element k of B. The last element of A, the A[A.size()-1] therefore pairs with B[k+A.size()-1].
What is the next element of B where the next match could start?
Is it B[k+1]?
Is it B[k+A.size()]?
Is it something different?

The answer is obviously B[k+N], but please state that N.


You say that B is a stream. Do you copy B to a vector and then compare A against the vector, or do want to keep only minimal amount of data in memory and process each element directly from the stream?
is this an oversimplified example problem or the real thing? If its single digit elements (or similar small number of possible values for one element?) ... there is a better way.
keskiverto wrote:

How many elements from end of A are required for a match? 1? 2? 50% of A.size()?


if we find at least one element of A from the beginning of B, that's enough, we can output that we have found the last frame in A at the beginning of B. the elements are unique and the match verification will not be just a simple == operator. in fact, it is a couple of if conditions checking.

How many elements from begin of A are required for a match against the last elements of a stream? 1? 2? 50% of A.size()?


if we find just one frame at the beginning of vector A in the last element of B, we have to output the last frame is located at the end of B.

Lets say that a match has been found. It starts from element k of B. The last element of A, the A[A.size()-1] therefore pairs with B[k+A.size()-1].
What is the next element of B where the next match could start?


for this N, I have to explain

vector A is a small stream, let's say of 50 frames, in vector B I put a stream of 1000 frames

vector<frames> B = {F4,ST1,ST2,ST3,F1,F2,F3,F4,...ST,ST,ST,ST,F1,F2,F3,F4,......,ST,ST,ST,F1,F2}
vector<frames> A = {F1,F2,F3,F4}

I have to find the small stream in which place it occurs fully and partially within the large one
did you get the idea ?


Last edited on
can you do it with strings, via substring find type logic? That code is pretty well optimized for doing what you ask.

So... does my post ( http://www.cplusplus.com/forum/beginner/237656/#msg1061297 ) work for you if you replace int with <your type>? If not, what behavior is wrong?
To my:
"How many elements? You must have exact, computable rules for these. A number."
The answer seems to be "... one ..."
1
Now that part is clear.

azdoud wrote:
for this N, I have to explain

Explanation is not a number. 1 would be a number. A.size() would be a number. "I have to find" is not a number. The N must be an integer.
  0123456789
B 1000000001

A = 000
First match is at index 1. Where does the second match start from? Index 2? k+A.size(), i.e. 1+3, index 4?

azdoud wrote:
small stream in which place it occurs fully and partially within the large

Partially? Up to now we were talking about full matches. The begin and end are special cases, so lets ignore them here.

Do you now state that partial matches within B are ok?
B 1234567890
   ||| |  |
A  2349
C      6328

The A matches 3/4 and the C 2/4.


A is a stream? Should it not be a vector so that you can read it over and over again as your search progresses along B?


@Ganado:
Looks sensible, except for the N (that I try to pry), and if B is indeed a stream and not a vector.
Last edited on
Topic archived. No new replies allowed.