Dynamic Sorting Predicate

Pages: 12
Hello,

I'm working on a type of sequencing problem, where the ideal position of an element is decided by nearby elements. As a type of experiment, I'm trying to use a built-on sorting algorithm to do this.

To do this, I need the sorting predicate to access elements other than "i and j". The only thing I know about this other elements is their position in the array/vector relative to i and j. For example, given (i, j), I wish to access the element located in the position before i.

My first thought was to wrap the elements into a class that keep a pointer to the elements I need. Following the example above, each element would keep a pointer to the previous item. The problem is that I can't update these pointers, as I wish to use a built-in sorting algorithm [for "good" reasons].

Thus, to fix my problem, I need either:
a) a way to keep "relative" pointers: *prev is locked to the position BEFORE *this. Given that the vector containing the elements won't change, any and all pointers should remain valid during the sorting. I'm thinking of something along the lines of:
element* getPrevious() { return (this-(sizeof(element))); }
I have no idea how valid this is, but it looks terrible and probably is.

b) a data structure that, given only an element i, can find the predecessor of i. A doubly-linked list seems perfect for this, but I can't figure out how the iterator magic would work.

Anyone have an idea?
std::list. It has the "iterator magic" you need. Also, about this code:
element* getPrevious() { return (this-(sizeof(element))); }

If element is an int, this would give the element 4 places befind, if it were a double, it would be 8 places before. Why? The compiler has acces to RTTI, which allows it to determine the size of a pointer's element. So mypointer-n isn't n bytes before mypointer, but n*sizeof(*mypointer) bytes before!
Last edited on
I may be looking at this the wrong way, but it looks to me like you're overcomplicating this. Why not use a 2D array to store i and j, and to access the predecessor of i do:
myElementArray[i-1][j]
That uses to much memory. A std::list is just fine!
If you already have the iterator, use static_cast<_List_node<T>*>(it._M_node->_M_prev)->_M_data, if not, use static_cast<_List_node<T>*>(find(a.begin(), a.end(), x)._M_node->_M_prev)->_M_data if T is the type, it is the iterator, a is the list and x is the element.
@Major Tom
I may be looking at this the wrong way, but it looks to me like you're overcomplicating this. Why not use a 2D array to store i and j, and to access the predecessor of i do:
myElementArray[i-1][j]

I'm not sure what you're suggesting here; either I'm misinterpreting your suggestion, or you've misunderstood my problem.

In short, given a sequence of elements (represented by ints here):
5 9 20 1 4 6
I wish to reorganize ("sort") these elements, based on a "relation" function. For example, if (relation(5, 20) > relation (5, 9)), then 9 and 20 are swapped. The actual function is a bit longer, but the idea is the same. The relation function isn't guaranteed to be symmetric (relation (5, 9) != relation (9, 5)) and isn't even guaranteed to be consistent. Technically, it shouldn't give any problems, but the result might vary depending on the starting order.

The problem is that:
a) The predicate takes two elements ("9" and "20"), but needs more information. The other information is stored in nearby cells of the array that is being "sorted". Thus, starting from an element, I need to find a position, which is then used to access a nearby element.
b) The nearby elements change too. When an element changes position, its nearby cells contain different elements.

I think viliml's suggestion might get me there; I'm trying to work it out as we speak.
Right. So why not store the elements in an array (or list)? For instance, store all the elements in the array, and loop through them, running code recursively on them until they are all sorted.

Yeah, it's very possible I'm misunderstanding your problem. It's rather confusing :P
The problem is that I wish to use a built-in sorting method (just humor me on this; I have my odd reasons).

A built-on sorting method (std::sort) uses a predicate function, being the "objective function" of how they need to be sorted. This predicate takes two elements from the array (i and j) and returns a bool.

The difficulty lies in that the predicate takes i and j "out of context"; therefore, the predicate doesn't know anything about the elements before and after i and j, because by definition an ordinary predicate should only know i and j to decide which goes where.

My problem needs a way to, for example, find array[i+1] given array[i] (you don't know the index; you only have the element!).
your problem is finding array[i+1] given array[i]? LOL! array[i]=*(find(array, array+n, array[i+1])-1)
I'm happy to be corrected but I don't think the C++ standard dictates that the sorting has to be done in place, therefore, there may be times during the sort (ie during a predicate call) where there is no such thing as a next or previous element.
std::sort DOES do it in place!
@viliml
your problem is finding array[i+1] given array[i]? LOL! array[i]=*(find(array, array+n, array[i+1])-1)


The point is: I don't have 'array'; I have (a copy of?) array[i] and array[j]. That's the entire point of this question. Given ONLY two elements, IS it possible to find the rest of the array?

My reason for doubting the pointer magic is that the predicate most likely gets a temporary copy, not the actual element, so I'm not sure taking the address is a legal operation. Is there a way around this (e.g. does passing a reference keep the address of the original element)?

I was thinking there isn't, hence why I was asking for a possible data structure to get around it.

An std::list wouldn't work (I think), because it's an element wrapped in a list-node. As only the element is passed to the predicate, it doesn't have access to the connecting pointers of the std::list.

A self-made list where the elements are list-nodes might work, but then std::sort won't know how to work with it.
OK, how about *(&array[i]+1)
The point is: I don't have 'array'; I have (a copy of?) array[i] and array[j]. That's the entire point of this question. Given ONLY two elements, IS it possible to find the rest of the array?


You can use a functor for this. Something like:

1
2
3
4
5
6
7
8
9
10
class WeirdPredicate
{
private:
    const std::vector<int> &data_;
public:
    WeirdPredicate(const std::vector<int> &data) : data_(data) {}
    bool operator()(const int i, const int j) const {
        //implement your predicate here, i and j are the relevant elements, data_ is the data being sorted
    }
};


Obviously you want to template the container and use Container::value_type as the parameter types to the call operator in your implementation.

1
2
3
4
//To use
std::vector<int> myData = getSomeData();
WeirdPredicate wp(myData);
std::sort(myData.begin(), myData.end(), wp);


Note that we're holding a reference to a stack object here - which is possibly bad practice. I do this a lot, but normally put the declaration of the wp object in an inner scope to guarantee it is destroyed before the myData object is. In this instance you could use a temporary instead of wp, but that doesn't always work.

std::sort DOES do it in place!


Do you have a reference to guarantee that, I know different c++ libraries do some pretty funky things.
Just my 5 cents worth!!

With your predicate function, can you get it to take args that are iterators to describe your inclusive range of objects. Then you should be able to manipulate these to get at the other obj's you need to decide on what is swapped.

As I understand it, you can write whatever you want in your custom compare function.

Maybe you can do this using the object as a comparison, rather than a function as a comparison.

Not sure whether this was any help at all.
Messed around a bit and ran into trouble.

The predicate's operator() MUST take (int i, int j) as arguments to work in std::sort. This means no iterator magic.
No iterator magic means it's impossible (?) to find the previous of i, as even with the predicate object holding the list, you can't find the actual location of element i in that list, aside from actually running find() (which would run into troubles in case of duplicates).
> The relation function isn't guaranteed to be symmetric (relation (5, 9) != relation (9, 5))
> and isn't even guaranteed to be consistent.

> The problem is that I wish to use a built-in sorting method (just humor me on this; I have my odd reasons).
> A built-on sorting method (std::sort) uses a predicate function

The predicate function used with standard sorting algorithms must be one that imposes a strict weak ordering. Would your "relation" function yield a relation that is guaranteed to be asymetric, irreflexive and transitive?
http://www.sgi.com/tech/stl/StrictWeakOrdering.html
I assume you are only trying to do this purely to demonstrate that a sort algorithm requiring strict weak ordering is NOT SUITABLE for your problem. As JLBorges says above.

Anyway...

The predicate's operator() MUST take (int i, int j)


That's not true, the predicate can also take a const reference to the object, and you can take the address of the reference which is the address of the actual object.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#include <vector>
#include <iostream>

struct MyLt
{
	bool operator()(const int &i1, const int &i2) const {
		std::cout << "comparing: " << &i1 << " " << &i2 << std::endl;
		return i1<i2;
	}
};

int main() {
	std::vector<int> i;
	i.push_back(35);
	i.push_back(93);
	i.push_back(22);
	i.push_back(35);
	i.push_back(93);
	i.push_back(22);

	std::cout << "vector: " << &i << std::endl;

	for(int x=0;x!=i.size();++x) {
		std::cout << x << " " << &(i[x]) << std::endl;
	}
	std::cout << std::endl;

	MyLt lt;
	std::sort(i.begin(), i.end(), lt);

	return 0;
}


Interestingly, when running this, at least with my STL, the sorting is clearly not done completely in place, see thee address of i1, which is obviously a stack address (it's near the address of the stack allocated vector). This brings you back to the first issue I highlighted of there not necessarily being a next and previous element.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
vector: 0x7fff6ceffb68
0 0x10d4008d0
1 0x10d4008d4
2 0x10d4008d8
3 0x10d4008dc
4 0x10d4008e0
5 0x10d4008e4

comparing: 0x7fff6ceff9c4 0x10d4008d0
comparing: 0x7fff6ceff984 0x10d4008d0
comparing: 0x7fff6ceff9c4 0x10d4008d0
comparing: 0x7fff6ceff9c4 0x10d4008d0
comparing: 0x7fff6ceff984 0x10d4008d8
comparing: 0x7fff6ceff984 0x10d4008d4
comparing: 0x7fff6ceff9c4 0x10d4008d0
comparing: 0x7fff6ceff984 0x10d4008dc
comparing: 0x7fff6ceff9c4 0x10d4008d0
comparing: 0x7fff6ceff984 0x10d4008e0
comparing: 0x7fff6ceff984 0x10d4008dc
comparing: 0x7fff6ceff984 0x10d4008d8
comparing: 0x7fff6ceff984 0x10d4008d4
comparing: 0x7fff6ceff984 0x10d4008d0
Actually, my goal would be to bring some variability in the result, depending on the starting order of the elements. I could rework the predicate to be SWO, but it would lose some of its value.

Anyway, problem remains: can I access elements beyond the two provided as arguments to the predicate?

I'm thinking of making the predicate do some work (e.g. alter states of the elements), but it would defeat the purpose of using built-in sort methods.
Anyway, problem remains: can I access elements beyond the two provided as arguments to the predicate?


Yes and no. As there may be times when the data being compared isn't in the array. Assuming you can deal with this situation then you can create a map that gives the position of an element based on it's address, so you could say:

i1idx = lookupindex[&i1];

and then if &i1 is in the map, then the previous element and next elements to i1 can be found at i1idx-1 and i1idx+1.
Pages: 12