Need Approach with better time complexity?

Pages: 12
Given 2 sorted arrays A and B, if we have to select A[i] and B[j] such that i <= j, find the Kth minimum value of (A[i] + B[j]).
Can anybody tell the approach better than O(n^2)?

can the kth min val repeat or are they distinct?

if both arrays only had the value 1 in each of 100 locations, what is K=3 value, or does that exist?

if the answer is that it exists and its 2, then you exploit the fact that both arrays are sorted.
wouldnt it just be A[0] + B[k]?
or literally, the first K values is just {B[0]+A[0], B1+A0, B2+A0, ...} right? This is O(N) or really O(1) for a single value, O(N) for K values?

If this is true, you probably mis-stated the problem, because its too easy.
Last edited on
Didn't test (shame shame), but the scheme hopefully makes sense. You basically advance an offset in the array, maintaining the rule that the offset into A is <= the offset into B. Do that k times, and make the advancement that makes the smallest impact to the sum.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
// treat Kth value as 1-based
// i.e. 1st means first value
//      2nd means second value
// "0th" value not given meaning here
int FindKthMin(int* A, int* B, int szA, int szB, unsigned k)
{
	int aOff = 0; //offset into Array A
	int bOff = 0; //offset into Array B
	
	for(int 1; i < k; i++) //perform the offset advancement scheme k times
	{
		if(aOff == bOff) // the offset is equal, must advance B offset to maintain aOff <= bOff property
		{
			bOff++;
			if(bOff >= szB) // oops, we've advanced out of bounds
			{
				return -1; //not enough array items to find a Kth solution
			}
		}
		else if(aOff < bOff) // we can advance either offset and maintain the aOff <= bOff property
		{
			/**** scenarios where an advancement could leave us out of bounds ****/
			if(aOff + 1 >= szA && bOff + 1 >= szB) // advancing either offset puts us out of bounds
			{
				return -1; //not enough aray items to find a Kth solution
			}
			else if(aOff + 1 >= szA) // advancing A would leave us out of bounds, so advance B
			{
				bOff++;
				continue;
			}
			else if(bOff + 1 >= szB) // advancing B would leave us out of bounds, so advance A
			{
				aOff++;
				continue;
			}
			/**** end of scenarios where an advancement could leave us out of bounds ****/
			
			/**** scenario where any advancement leaves us in bounds ****/
			// let's advance the offset for the array that has the smallest item up next
			if(A[aOff + 1] < B[bOff + 1])
			{
				aOff++;
			}
			else
			{
				bOff++;
			}
			/**** end of scenario where any advancement leaves us in bounds ****/
		}
	}
	
	return A[aOff] + B[bOff];
}
@booradley60, That looks about right. It can be simplified a little. (Still not tested. :-( )

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
int FindKthMinSum(int* A, int* B, int szA, int szB, int k) {
    int a = 0, b = 0;
    
    for (int i = 1; i < k; ++i) {       // k is 1-based
        if (a == b || a + 1 >= szA) {   // must incrememt b
            if (++b >= szB)
                return NO_ANSWER;       // return an unused value (e.g, INT_MIN)
        }
        else if (b + 1 >= szB)          // must increment a
            a++;
        else if (A[a + 1] < B[b + 1])   // otherwise choose smallest
            a++;
        else
            b++;
    }
    
    return A[a] + B[b];
}

Last edited on
@dutch @booradley60

We have to select an index from A such that this selected index is always less than or equal to index j from B.

For ex:A=[1,2,3,4] (1 based indexing)
B=[5,6,7,8]
Acoording to your solutions guyz, kth min occur at index 3 of A and index of 1 at B i.e. Element 3 from A and 5 from B which leads to 8.

But this will not fulfill the constraint as index(3) of A>index(1) of B.But we have to select i<=j.

Can you guyz ellaborate it?
Firstly, never use 1-based indexing for arrays unless it's absolutely necessary.
We are interpreting k as "1-based", not the arrays.

Secondly, that's the exact opposite of what happens. The answer will be index 0 of A and index 2 of B, which is presumably what you want.
It seems that you don't understand the problem
1
2
3
4
5
for a in range(len(A)):
	for b in range(a, len(B)):
		C.append(A[a]+B[b])
sort(C)
return C[K] //¿what's this value? 
you may notice that len(C) ~= len(A)*len(B), that is, O(n^2)
so K is in range [0; n^2]

now look at the code that you've proposed
1
2
3
4
5
6
7
8
9
10
11
12
    for (int i = 1; i < k; ++i) {
        if (a == b || a + 1 >= szA) {
            if (++b >= szB)
                return NO_ANSWER;
        }
        else if (b + 1 >= szB)
            a++;
        else if (A[a + 1] < B[b + 1])
            a++;
        else
            b++;
    }
¿how many times would that loop execute?
in each iteration you increment either `a' or `b', and when you reach szA or szB you'll break the loop
so at most szA+szB, wich is a lot less than szA*szB

> wouldnt it just be A[0] + B[k]?
> or literally, the first K values is just {B[0]+A[0], B1+A0, B2+A0, ...} right?
no
¿how are you so sure that B2+A0 < B1+A1?


@OP: ¿what are the limits? ¿are A[i], B[i] integers?
I think I've got a O(n lg n lg(AN+BN)) solution
hints:
- given one number x, ¿in which position will end in the sorted C array? (O(n lg n))
- binary search (O(lg(AN+BN)))

actually, no need for integers, just make x an element of the array, do a binary search on the index a, and then on the index b.
Last edited on
C.append(A[a]+B[b]) <- where did that come from?
the OP's question does not indicate this at all. it just says find the k smallest sum, where the index in a is < index in b. that is absolutely a[0] + b[k-1] (k-1 in c arrays). Are you reading the real problem on some site and the OP gave a bad summary?
when you reach szA or szB you'll break the loop

Think again.

And we definitely seem to disagree on problem interpretation.

However, that code is not my idea. It's a simplification of the code proffered by booradley60.
I thought it seemed reasonable, but jonnin is suggesting that the solution is much simpler indeed.

And I don't see anything wrong with his idea.

EDIT: Actually, what about this example. What is the 3rd smallest?
A: 1 2 3
B: 1 2 99
Sums in order: 2 3 4 100 101 102
Last edited on
1
2
3
for a in range(len(A)):
	for b in range(a, len(B)):
		C.append(A[a]+B[b])
gives you all possible sums where `a' <= `b' *

sort(C) now all the possible sums are in order
so C[0] is the smallest, C[1] is the second smallest, C[2] is the third smallest, ..., C[K-1] is the kth smallest

no, you cannot simply do a[0] + b[k-1]
as you may see in dutch's example A[1]+B[1] < A[0]+B[2]
smallest: A[0]+B[0] = 2
second smalles: A[0] + B[1] = 3
third smallest: A[1] + B[1] = 4
fourth smallest: A[0] + B[2] = 100
fifth smallest: A[1] + B[2] = 101
sixth smallest (aka biggest): A[2] + B[2] = 102


* something that I didn't consider before, ¿do you need to count only unique sums?


> Are you reading the real problem on some site and the OP gave a bad summary?
no, that's my reading of the op
but I agree, the original problem and some testcases will help.
Last edited on
I don't know about the duplicates. I'm ignoring them for now.

This gets the right answer on my little example.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#include <iostream>
#include <limits>

constexpr auto NoAnswer = std::numeric_limits<int>::min();

int FindKthMinSum(int* A, int* B, int szA, int szB, int k) {
    int a = 0, b = 0;
    
    for (int i = 1; i < k; ++i) {       // k is 1-based
        if (a == b || a + 1 >= szA) {   // must incrememt b
            if (++b >= szB)
                return NoAnswer;
        }
        else if (b + 1 >= szB)          // must increment a
            a++;
        else if (A[a + 1] < B[b + 1])   // otherwise choose smallest
            a++;
        else
            b++;
    }
    
    return A[a] + B[b];
}

int main() {
    int A[] {1, 2,  3};
    int B[] {1, 2, 99};
    
    std::cout << FindKthMinSum(A, B, 3, 3, 3) << '\n';  // prints 4
}


Edit: Actually, it doesn't get the 4th one right. For 4 it prints 101 but it should be 100.
Last edited on
@dutch,
I think the problem is that you are always INCREASING a and b ... for the 4th min sum you actually need to go backwards in a: you can't keep increasing them. When a "catches" b then a new B[b] value goes into the mix, so you need to reconsider A[0] again.

k=1, sum is 2,  from a=0, b=0
k=2, sum is 3,  from a=0, b=1
k=3, sum is 4,  from a=1, b=1    <==== can't see any a priori reason to exclude a check on a=0, b=2
k=4, sum is 100,  from a=0, b=2  <==== a needs to go back here
Last edited on
@dutch @ne555 @lastchance

Actually i faced this problem during my interviews and whatever i understood i came up with a solution as ne555 suggested above but the interviewer asked me to suggest the solution better than O(n^2).

For the time being I am considering both the arrays of same size.

The solution that i suggested to him is:
A=[1,8,13,15]
B=[4,12,17,20]

And create a new array C=[5,13,18,21,20,25,28,30,33,35] in O(n^2)
then i sort the array and extract the C[k-1] as the answer.

TC of solution=O(n^2)+klogk where k=No of elments in C.

According to dutch's solution,the 3rd minimum will be 20 in my example.
But the answer is 18

Am i wrong?
Last edited on
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#include <iostream>
using namespace std;

int kthsum( int A[], int na, int B[], int nb, int k )
{
   int a = 0, b = 0;
   for ( int nth = 2; nth <= k; nth++ )
   {
      bool ok1 = ( a + 1 < na ) && ( a < b );
      bool ok2 = b + 1 < nb;
      if ( !( ok1 || ok2 ) ) return -1;  // whatever
      if ( !ok1 )
      {
         a = 0;
         b++;
      }
      else if ( !ok2 )
      {
         a++;
      }
      else
      {
         if ( A[a+1] + B[b] <= A[0] + B[b+1] )
         {
            a++;
         }
         else
         {
            a = 0;
            b++;
         }
      }
   }
   return A[a] + B[b];
}

int main()
{
   int A[] = { 1, 2, 3 };
   int B[] = { 1, 2, 99 };
   for ( int k = 1; k <= 7; k++ ) cout << kthsum( A, 3, B, 3, k ) << '\n';
}
EDIT: Actually, what about this example. What is the 3rd smallest?

ah. That makes it harder.
1: 1+1 =2
2: 1+2 = 3
3: 2+2 = 4. //my approach isnt sufficient here. good to show.

It still should not be N*N. But its not O(1) either!
Last edited on
SparkXV wrote:
According to dutch's solution

It's not my solution. It's a cleanup of booradley60's attempt.
In fact, it isn't a solution at all.
It doesn't work.

@lastchance,
Your attempt doesn't work in general. (And is it purposely ignoring duplicates?) Try:

1
2
    int A[] { 3, 5, 13, 14, 22, 23};
    int B[] { 1, 2,  3,  6,  9, 10, 11, 13, 15, 18};

The 4th minimum is 7. Your code returns 8.
And returning -1 for "no answer" is not going to cut it since negative values (and therefore negative answers) are presumably allowed ... but that's nitpicking. :-)

Look at the output of this. It shows how the required indices change. The parts where the sum is repeated could be in a different order.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include <iostream>
#include <iomanip>
#include <vector>
#include <algorithm>

template<typename T, size_t Sz> size_t SizeOf(T(&)[Sz]) { return Sz; }

struct Record {
    int i, j, ai, bj, sum;
    Record(int i, int j, int ai, int bj, int sum)
        : i(i), j(j), ai(ai), bj(bj), sum(sum) {}
};

std::ostream& operator<<(std::ostream& os, const Record& o) {
    auto W = std::setw;
    return os << W(5) << o.i  << W(5) << o.j
              << W(5) << o.ai << W(5) << o.bj
              << W(6) << o.sum;
}

int main() {
    int A[] { 3, 5, 13, 14, 22, 23};
    int B[] { 1, 2,  3,  6,  9, 10, 11, 13, 15, 18};

    std::vector<Record> v;
    for (int i = 0; i < (int)SizeOf(A); ++i)
        for (int j = i; j < (int)SizeOf(B); ++j)
            v.push_back(Record(i, j, A[i], B[j], A[i] + B[j]));

    std::sort(v.begin(), v.end(),
        [](const Record& a, const Record& b){
            if (a.sum == b.sum) return a.i < b.i;
            return a.sum < b.sum;
        });

    std::cout << "  k    i    j  A[i] B[j]  sum\n";
    int k = 0;
    for (const auto& o: v)
        std::cout << std::setw(3) << ++k << o << '\n';
    std::cout << '\n';
}

  k    i    j  A[i] B[j]  sum
  1    0    0    3    1     4
  2    0    1    3    2     5
  3    0    2    3    3     6
  4    1    1    5    2     7
  5    1    2    5    3     8
  6    0    3    3    6     9
  7    1    3    5    6    11
  8    0    4    3    9    12
  9    0    5    3   10    13
 10    0    6    3   11    14
 11    1    4    5    9    14
 12    1    5    5   10    15
 13    0    7    3   13    16
 14    1    6    5   11    16
 15    2    2   13    3    16
 16    0    8    3   15    18
 17    1    7    5   13    18
 18    2    3   13    6    19
 19    1    8    5   15    20
 20    3    3   14    6    20
 21    0    9    3   18    21
 22    2    4   13    9    22
 23    1    9    5   18    23
 24    2    5   13   10    23
 25    3    4   14    9    23
 26    2    6   13   11    24
 27    3    5   14   10    24
 28    3    6   14   11    25
 29    2    7   13   13    26
 30    3    7   14   13    27
 31    2    8   13   15    28
 32    3    8   14   15    29
 33    2    9   13   18    31
 34    4    4   22    9    31
 35    3    9   14   18    32
 36    4    5   22   10    32
 37    4    6   22   11    33
 38    5    5   23   10    33
 39    5    6   23   11    34
 40    4    7   22   13    35
 41    5    7   23   13    36
 42    4    8   22   15    37
 43    5    8   23   15    38
 44    4    9   22   18    40
 45    5    9   23   18    41
Last edited on
> TC of solution=O(n^2)+klogk where k=No of elments in C.
k is n^2, so you have O(n^2 lg n)

for a less than O(n^2) solution you cant actually construct C
given one number x, ¿in which position will end in the sorted C array? O(n lg n)
¿did you think of something?
Considering that C starts as {A0+B0, A0+B1}, ¿in which position does A1+B1 end? ¿what about A2+B2? ¿and A42+B42?
@Dutch was right - my previous one didn't always work because it wasn't allowing b to get smaller.

I don't know whether this is either right or observes asymptotic limits.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;


int kthsum( const vector<int> &A, const vector<int> B, int k )
{
   int na = A.size(), nb = B.size();
   int largest  = A[na-1] + B[nb-1];
   int smallest = A[0] + B[0];      

   // Reject impossible k values
   if (   ( na <= nb && na * ( nb + nb - na + 1 ) / 2 < k )   ||
          ( na >  nb && nb * ( nb + 1           ) / 2 < k )      ) return smallest - 1;

   int amax, sum;

   // Current "front" ... the largest values of b currently used for a given a
   vector<int> lastb(na);
   for ( int a = 0; a < na; a++ ) lastb[a] = a - 1;

   // First sum
   sum = smallest;
   lastb[0] = 0;
   amax = 0;

   for ( int nth = 2; nth <= k; nth++ )
   {
      // Find minimum sum on the current "front" moved up 1 in the b direction
      sum = largest + 1;
      int aval;
      for ( int a = 0; a <= min( amax + 1, na - 1 ); a++ )
      {
         int b = lastb[a] + 1;   if ( b >= nb ) continue;
         int test = A[a] + B[b];
         if ( test <= sum )
         {
            aval = a;
            sum = test;
         }
      }
      if ( aval > amax ) amax = aval;
      lastb[aval]++;
   }
   return sum;
}


int main()
{
   vector<int> A = { 3, 5, 13, 14, 22, 23};
   vector<int> B = { 1, 2,  3,  6,  9, 10, 11, 13, 15, 18};
   int smallest = A[0] + B[0];

   for ( int k = 1; k <= 46; k++ )
   {
      int ksum = kthsum( A, B, k );
      if ( ksum >= smallest ) cout << k << '\t' << ksum         << '\n';
      else                    cout << k << '\t' << "Impossible" << '\n';
   }
}


1	4
2	5
3	6
4	7
5	8
6	9
7	11
8	12
9	13
10	14
11	14
12	15
13	16
14	16
15	16
16	18
17	18
18	19
19	20
20	20
21	21
22	22
23	23
24	23
25	23
26	24
27	24
28	25
29	26
30	27
31	28
32	29
33	31
34	31
35	32
36	32
37	33
38	33
39	34
40	35
41	36
42	37
43	38
44	40
45	41
46	Impossible
hmm if that is right, then is this right?
you don't need any logic.
generate a0 * {all of b}, a1*{b1-bn}, a2*{b2-bn} and sort that.
you need asize(asize+1)/2 - (asize-bsize)(asize-bsize+1)/2 (10+11)/2 - (4+5)/2 == 45)

if N were bsize, 10, n*n is 100.
if N were asize, 6, n*n is 36, and we did more work.
Not sure what to call N here. It feels like significantly less than N*N work ... my gut feeling is N should be asize+bsize but I am rusty on how to pick it here.

whole program is just (shameless lifting from above code)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
 #include <iostream>
#include <ctime>
#include <string>
#include <cstdlib>
#include <vector>
#include <algorithm>
using namespace std;
int main()
{  
  
   vector<int> A = { 3, 5, 13, 14, 22, 23};
   vector<int> B = { 1, 2,  3,  6,  9, 10, 11, 13, 15, 18};
   vector<int> res; //reserve space for performance, ignored here. 

   for(int i = 0; i < A.size(); i++)
    for(int j =i; j < B.size(); j++)	   
      res.push_back(A[i]+B[j]);   
    
    sort(res.begin(), res.end()); 
	for(int i = 0; i < res.size(); i++)
		cout << i+1 << " " << res[i] << endl;
   
}


gives same output as above sans the impossible.
technically the sort is doing more 'work' than is strictly necessary. If the concern is to eliminate the sort and optimize that effort down to the least possible effort, then what is already here is a better crack at that. But I would bet that sort is cheaper than the effort to avoid it costs in real clock time, even if the sort is doing more actual things. ?? What say you guys..?
Last edited on
@jonnin, That's the brute force method, like I used to generate the table above. The complexity is, I think, n + nlogn. The idea is to try to do better than that. And I agree that N is asize + bsize (actually, if asize is > bsize then it is basically clamped to bsize).

@lastchance, Looks like it solves it. Some extra machinery was needed -- an extra vector (of asize) and an inner loop. I'm not sure what the complexity is.
Last edited on
Pages: 12