Need Approach with better time complexity?

Pages: 12
From the loops the complexity is order k^2. Guess it depends how big k can be compared with n.
Yea, that was what I thought too. But I suspect it takes a very large pair of vectors before the extra logic can pull ahead of the sort on the wall clock.

you *have* to generate the values, all 45 in this case, so the double for loop is unavoidable, but the loop generates them out of the desired order. How to order them without excessive logic is the key.

and possibly, as it gets bigger, find the kth without the previous ones ... generating them all may not be the goal.
Last edited on
A sort that terminates as soon as the kth element is in place would do somewhat better on average.

In practice this is pretty quick and I think easy to understand, although it's probably N2log(n) worst case. This uses the same test data as lastchance.

I create a set that contains the K smallest sums seen so far. It fills it with the first K sums, then after each insert:
- remove the last item in the set. This keeps the size at K.
- if the removed value is the same as the inserted value, then break out of the inner loop because all subsequent sums will be larger.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include <iostream>
#include <vector>
#include <set>

using namespace std;

int kthsum(const vector<int> &a, const vector<int> &b, unsigned k)
{
    multiset<int> sums;
    for (size_t i=0; i<a.size(); ++i) {
	for (size_t j=i; j<b.size(); ++j) {
	    sums.insert(a[i]+b[j]);
	    if (sums.size() > k) {
		auto iter = sums.end();
		--iter;
		int removed = *iter;
		// cout << "removing " << removed << '\n';
		sums.erase(iter);
		if (removed == a[i]+b[j]) break;
	    }
	}
    }
    if (sums.size() != k) return -1; // not enough sums
    auto iter = sums.end();
    --iter;
    return *iter;
}


int
main()
{
   vector<int> A = { 3, 5, 13, 14, 22, 23};
   vector<int> B = { 1, 2,  3,  6,  9, 10, 11, 13, 15, 18};

   for ( int k = 1; k <= 46; k++ )
   {
      int ksum = kthsum( A, B, k );
      cout << k << '\t' << ksum << '\n';
   }
}

Last edited on
A sort that terminates as soon as the kth element is in place would do somewhat better on average.

yes.
but another thing to note is that the array generated with the dumb loop is generally going to be 'sort of sorted'. A couple of the sorting algorithms perform much better under this condition; I believe shell is one of them. So we are looking at some sort of K*K+ "a wee bit more than N and a fair bit less than NlgN". (Yes, that is horribly stated, but the numbers are going to be one of those weirdos and I don't know what it will be).

I feel like I am missing a big something though. It looks like sorting is what we want to do conceptually but not mechanically. Can we get down to just K*K...
@lastchange: iirc, you are doing a merge-sort with several lists

> From the loops the complexity is order k^2. Guess it depends how big k can be compared with n.
for the third time, k is the order of n^2
I don't know how you've got that complexity, outher loop is k (aka n^2), inner loop is n, so you end up with O(n^3)

> you don't need any logic.
> generate a0 * {all of b}, a1*{b1-bn}, a2*{b2-bn} and sort that.
I don't know why you are multiplying, but yes

> Not sure what to call N here. It feels like significantly less than N*N work
> ... my gut feeling is N should be asize+bsize but I am rusty on how to pick
> it here.
take whatever you want, you'll get the same order (it would only change the constant)
and again, the complexity is O(n^2 lg n)
i don't know why i said * either.
if you generate the loop into a 2d vector instead, each of those is sorted.
then you can do dumb-merge (not merge sort, but 'take the smallest off the top of each bucket' ) fairly cheaply. That cuts the effort way down, your 'sorting' never moves any data, and only compares a relatively small # of times. Still does not feel right for optimal.
Sorry, @ness, k is the k in "kth minimum sum". Is n the size of a[] or b[] or k: no idea.

I have no idea why you say k is the order of n^2. k is a quantity defined in the question (as far as I can see).

Within kthsum() the outer loop is O(k) and the inner loop is O(k), making O(k^2). I could have avoided repeatedly computing the sums on what I called the "front" (by storing them), but that doesn't change the complexity.

The loop in main() is just to check answers against @dutch's master list and play no role in the complexity. I'm well aware that successive sums are calculated in kthsum() anyway.

Last edited on
let's say size(a) == size(b) == n
going back to
1
2
3
for a in range(len(A)):
	for b in range(a, len(B)):
		C.append(A[a]+B[b])
or jonnin's code
1
2
3
for(int i = 0; i < A.size(); i++)
	for(int j =i; j < B.size(); j++)	   
		res.push_back(A[i]+B[j]);
you may see tht there are n*(n+1)/2 sums
if you want the kth minimum sum, then k is in the range [0; n*(n+1)/2), that is, is in the order of n^2
(as an example, if both arrays have 100 elements, then k may be bigger than 500'000)

> Not sure that I recognise "merge sort" in my last code, either.
as jonnin pointed, it's actually just merge
*result++ = (*j<*k)? *j++ : *k++;
you store in `lastb' all the iterators
dereferencing the iterator is doing A[a]+B[b]
then the conditional *j<*k is your loop in lines 33--42
and so you increase the proper iterator in line 44
you don't bother with `result' until you reach the target.
Thank-you for that; I begin to see what you meant. I thought n referred to the size of a[] and b[] and k was a fixed number independent of those. But, yes, there are meaningful sums up to O(n^2).

But in that case I have to admit defeat - I definitely can't beat O(n^2) and that was the OP's original question.
Last edited on
my approach in, I think, O(n lg^3 n) (pseudocode)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
array a, b
sorted_sum C(a,b)
print(C[k])


class sorted_sum:
    //"generates" a sorted array with the values (a_j+b_k) such that j<=k

    def sorted_sum(a, b):
        this.a = a
        this.b = b

    def value(j, k): //O(1)
        if j<=k:
            return a[j]+b[k]
        return NaN

    def operator[](index): //O(n lg^3 n)
        //returns the value in the index position of the sorted array
        f = lambda j,k: where_am_i(value(j,k))
        [j,k] = triang_matrix_binary_search(f, 0, a.size(), 0, b.size(), index)
        return value(j,k)

    def where_am_i(value): //O(n lg n)
        //gives the "lower_bound" position of value in the sorted array
        position = 0
        for x in a:
            position += b.end() - lower_bound(array=b, val=value-x)
        return position

    def triang_matrix_binary_search(m, rbeg, rend, cbeg, cend, target): //O(lg^2 n)
        //returns the coordinates of target on the m "matrix"
        if rbeg == rend or cbeg == cend: //value not found
            return nil

        rmid = rbeg + (rend-rbeg)/2
        f = lambda x: m(rmid, x)
        column = lower_bound(f, lower=cbeg, upper=cend, val=target)

        if m(rmid, column) == target:
            return rmid, column
        //touching the diagonal
        //discard the bottom because they are all greater (or invalid with j>k)
        if column == rmid:
            return triang_matrix_binary_search(m, rbeg, rmid, cbeg, cend, target):
        
        //recurse
        //here we part the matrix in 4
        //the br quadrant has values that are all greater that what we want
        //the ul quadrant has values that are all greater that what we want
        //so we look in bl and ur
        return
            triang_matrix_binary_search(m, rbeg, rmid, column, cend, target): //ur
            or triang_matrix_binary_search(m, rmid, rend, cbeg, column, target): //bl 


the core is the where_am_i() function, kind of like the opposing function that we are looking for,
C[K] = value, given a `value', we obtain `K' (or a lower_bound)
first thought on simply doing a binary search on [a0+b0, an+bn], but that may not give me elements in the matrix (unless a and b hold integers)
so did a matrix binary search based on this https://ronzii.wordpress.com/2012/03/27/binary-search-in-a-matrix/
our matrix here is C_{jk} = a_j + b_k but with the bottom triangle filled with zeros
Last edited on
Registered users can post here. Sign in or register to post.
Pages: 12