Sort algorithm on vector of structures

Pages: 12
Hello people!

I've found a few posts here and there about people having the same problem as me but I couldn't quite make sense of them.

Let's say I have structure as such:
1
2
3
4
5
struct PersonalInfo
{
	string name;
	int age;
};


And I want to create a vector of PersonalInfo structures to store data:
vector<PersonalInfo> vInfo;

How do I use the sort algorithm included in the STL to sort the vector of structures in order to have the first vector member be the youngest person, and the last be the oldest person?

My problem is that I can make sense of the sort algorithm with a simple vector of ints or doubles for instance but I don't quite know how to deal with structures or classes in this case.

Thank you in advance for any explanation or advice!

Have a good morning, day or evening!
I've found a few posts here and there about people having the same problem as me but I couldn't quite make sense of them.

Did you find and study any documentation for std::sort?

http://en.cppreference.com/w/cpp/algorithm/sort
http://www.cplusplus.com/reference/algorithm/sort/

You'll need to either define a comparison object or implement the operator< for your struct/class.



Did you find and study any documentation for std::sort?

http://en.cppreference.com/w/cpp/algorithm/sort
http://www.cplusplus.com/reference/algorithm/sort/


In fact I did read through these two sources earlier and I don't quite understand the comp argument of the sort algorithm. It's supposed to return a bool, and it's supposed to compare the two structure variables, but I don't grasp how it works especially if I have to overload an operator for the structure.

I'll try and go through it again and see if I can make more sense out of it!
It's supposed to return a bool,

Correct.

and it's supposed to compare the two structure variables,

Yes it should compare the two parameters for less than (<).

but I don't grasp how it works especially if I have to overload an operator for the structure.

Then perhaps you need to study up on operator overloading, the less than operator< to be specific.

Are you not allowed to build your own sort? Here's an example of what I have.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
#include <iostream>
#include <vector>
#include <string>

struct person
{
    std::string name;
    int age;
};

void bubbleSort(std::vector<person> &vectorObject) // sort vector
{
    for(int i = 1; i < vectorObject.size(); i++)
    {
        for(int j = 0; j < vectorObject.size()-i; j++)
        {
            if(vectorObject.at(j).age > vectorObject.at(j+1).age)
            {
                person temp = vectorObject.at(j);
                vectorObject.at(j) = vectorObject.at(j+1);
                vectorObject.at(j+1) = temp;
            }
        }
    }
}

int main()
{
    std::vector<person> myObject;

    // create some people...
    myObject.push_back({"jim", 20});
    myObject.push_back({"goku", 1000});
    myObject.push_back({"vegetable", 32});
    myObject.push_back({"coffee", 321});

    bubbleSort(myObject); // sort the people according to age

    // print vector
    for(int i = 0; i < myObject.size(); i++) // print the sorted vector
    {
        std::cout << "Name: " << myObject.at(i).name << "\t\t" << "Age: " << myObject.at(i).age << std::endl;
    }
    return 0;
}


Edited: to add, I didn't read your post fully. I don't know if you're just wanting to sort it in general, or wanting to learn how to use the sort function from STL...
Last edited on
yes, you can build your own sort. Of course. Do you want to do that instead?

Otherwise, if you want to sort by multiple things (first this, then that, then the third) type things there are 2 ways to do it.
1) sort multiple times, using stable sorts.
2) figure out a way to tie the comparison together so it does it naturally and sort once. String concat of the fields is one way, or making your home comparison operator handle it is another. This is the preferred method, as stable sorts are slower and sorting multiple times is inefficient, so unless you have some really weird reason you can't do this, avoid #1.

Last edited on
Here are 2 ways to sort your vector - one uses a function object(functor) the other one a function.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <cstdlib>
#include <cstdio>

using namespace std;

#include <iostream>
#include <vector>
#include <string>

struct Person
{
  std::string name;
  int age;
};

struct PersonSorter
{
  bool operator()(const Person& p1, const Person& p2)
  {
    // sort by name
    return p1.name < p2.name; 
    //sort by age 
    //return p1.age < p2.age;
  }
};

bool PersonComparer(const Person& p1, const Person& p2)
{
  // sort by name
  return p1.name < p2.name;
  //sort by age 
  //return p1.age < p2.age;
}

int main()
{
  std::vector<Person> people =
  {
    {"Zoe", 33},
    {"Jennifer", 29},
    {"Anna", 19},
    {"Lisa", 44}
  };
       
  //sort(people.begin(), people.end(), PersonSorter());
  sort(people.begin(), people.end(), PersonComparer);

  for (Person p: people) // print the sorted vector
  {
    std::cout << "Name: " << p.name << "\t\t" << "Age: " << p.age << std::endl;
  }
}
If you don't understand the comp argument, then look at std::for_each:
http://www.cplusplus.com/reference/algorithm/for_each/

Just like the std::sort takes comp, the std::for_each takes fn.
Both algorithms assume that the argument can be used like a function; called.

The for_each calls function fn with one argument and ignores the returned value.
The sort calls function comp with two arguments and uses the returned value (that should be a bool).

Okay, somewhere in the code of std::sort there is something like:
bool aIsLessThanb = comp( a, b );

If you do call the other version of std::sort that does not take argument comp, the corresponding code is:
bool aIsLessThanb = (a < b);


Can you write:
1
2
3
Person lhs;
Person rhs;
bool foo = (lhs < rhs);

Yes, if you define an operator< that takes two Person objects as arguments.

Problem is, there can be only one such operator and therefore the std::sort that uses < can sort in one way only.


The std::pair is a struct like Person that has two members. It does have operator< that looks at both members, if necessary.
See http://www.cplusplus.com/reference/utility/pair/operators/


C++11 did add lambda syntax that lets us write code of comp in the call of sort:
1
2
3
sort( people.begin(), people.end(),
      [](const Person& p1, const Person& p2){ return p1.age < p2.age; }
    );
(@jlb)
Then perhaps you need to study up on operator overloading, the less than operator< to be specific.

I have indeed studied for classes, I'll however need to have a look into it for structures. Does it mean, though, that for each structure object in the vector, the sort function will compare the age of the structure object that's currently being looked at and the age of the following one, and change their place if they're not in the right order.

In other words, what I don't get is that it sounds like a bubble sort without a nested for loop.


(@fiji885)
Are you not allowed to build your own sort? Here's an example of what I have.

Yes I could of course build my own sort, just like yours, which would work totally fine, however I'd like to one understand better how I can utilise the sort algorithm from STL and two I'm planning on using this in a game, so the shorter my functions the better. Thank you for the suggestion though!

(@jonnin)
yes, you can build your own sort. Of course. Do you want to do that instead

Not really, I'd rather code something short and precise which still does the job fine!


(@Thomas1965)
Here are 2 ways to sort your vector - one uses a function object(functor) the other one a function.

Thank you, this is actually the sort of explanation/illustration I was looking for, It makes more sense now. One question remains though: with the use of the functor sort(people.begin(), people.end(), PersonSorter()) wouldn't you need to pass argument to PersonSorter()? Or is the sort function automatically doing it by itself?


(@keskiverto)
Yes, if you define an operator< that takes two Person objects as arguments.

What would be your interest in overloading the < operator if it's only to perform a comparison inside the comp function which only purpose is to return a bool? How would the default < operator not perform the same comparison task?

One question remains though: with the use of the functor sort(people.begin(), people.end(), PersonSorter()) wouldn't you need to pass argument to PersonSorter()? Or is the sort function automatically doing it by itself?

Yes the sort function will provide the two Person objects for comparison.
Yes the sort function will provide the two Person objects for comparison.

Thank you!
What would be your interest in overloading the < operator if it's only to perform a comparison inside the comp function which only purpose is to return a bool? How would the default < operator not perform the same comparison task?

Q: What does the "default < operator for objects of type Person" do?

A: What operator? There is no such operator. The language cannot possibly know how you want to compare Persons "by default".
What operator? There is no such operator. The language cannot possibly know how you want to compare Persons "by default"

Oh right, sorry, I was misunderstood and miss-read. I thought you were referring to overloading the < operator within the comp functor which compares two numbers. In this case, dealing with person objects, I can see how bool foo = (lhs < rhs); has to use an overloaded < operator.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;

//======================================

struct Person
{
   string surname;
   int age;
};

//======================================

enum SortBy{ SURNAME, AGE } sortby;

//======================================

bool operator < ( Person p, Person q )                     // Comparator
{
   switch( sortby )
   {
      case SURNAME: return p.surname < q.surname;
      case AGE    : return p.age     < q.age    ;
   }
}

//======================================

ostream & operator << ( ostream &strm, Person p ) 
{ 
   return strm << left << setw( 10 ) << p.surname << setw( 3 ) << p.age;
}

//======================================

template <typename T> void print( const vector<T> &vec )
{
   for ( T e : vec ) cout << e << endl;
}

//======================================

int main()
{
   vector<Person> v = { { "Brown", 44 }, { "Jones", 17 }, { "Ashley", 70 }, { "Humphrey", 29 } };

   cout << "\nOriginals:\n";
   print( v );

   cout << "\nSorted by name:\n";
   sortby = SURNAME;
   sort( v.begin(), v.end() );
   print( v );

   cout << "\nSorted by age:\n";
   sortby = AGE;
   sort( v.begin(), v.end() );
   print( v );
}


Originals:
Brown     44 
Jones     17 
Ashley    70 
Humphrey  29 

Sorted by name:
Ashley    70 
Brown     44 
Humphrey  29 
Jones     17 

Sorted by age:
Jones     17 
Humphrey  29 
Brown     44 
Ashley    70 
I have indeed studied for classes, I'll however need to have a look into it for structures.

The only real difference between a class and a struct is the default access modes, a structure has public access by default where a class has private access by default. If you always implicitly state the access mode there is no difference between the two.

Does it mean, though, that for each structure object in the vector, the sort function will compare the age of the structure object that's currently being looked at and the age of the following one, and change their place if they're not in the right order.

If you have overloaded the operator< to use the age then yes.

Here is an example of overloading the operator< to sort on the name using a structure with only the one function and the default access modes.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
#include <iostream>
#include <vector>
#include <string>
#include <iomanip>
#include <algorithm>

struct Person
{
    bool operator<(const Person& p) const
    {
        return (name < p.name);
    }

    std::string name;
    int age;
};


std::ostream& operator<<(std::ostream& out, const Person& p)
{
    out << "Name: " << std::left << std::setw(12) << p.name
        << std::right << "Age: " << p.age;

    return out;
}

int main()
{
    std::vector<Person> people =
    {
        {"Zoe", 33},
        {"Jennifer", 29},
        {"Anna", 19},
        {"Lisa", 44}
    };

    // Note since I overloaded the operator< I don't need to use a comparison function, it will use the overloaded operator< instead.
    std::sort(people.begin(), people.end());
    for(Person p : people)
    {
        std::cout << p << std::endl;
    }
}


Here is the same basic program implicitly stating the access modes (with private data).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include <vector>
#include <string>
#include <iomanip>
#include <algorithm>

struct Person
{
    friend std::ostream& operator<<(std::ostream& out, const Person& p)
    { // Overload the output operator for easier printing.
        out << "Name: " << std::left << std::setw(12) << p.name
            << std::right << "Age: " << p.age;

        return out;
    }

    public:
        // Now need a couple of constructors because of the private data members.
        Person() = default;
        Person(const std::string& name, int age) : name(name), age(age) {}
        bool operator<(const Person& p) const {return (name < p.name);}
    private:
        std::string name;
        int age;
};

int main()
{
    std::vector<Person> people =
    {
        {"Zoe", 33},
        {"Jennifer", 29},
        {"Anna", 19},
        {"Lisa", 44}
    };

    std::sort(people.begin(), people.end());
    for(Person p : people)
    {
        std::cout << p << std::endl;
    }
}


Also remember that you can reverse the sort order by using reverse iterators but to sort on different data members you will need to define a comparison function to use.

1
2
3
4
5
6
    // Sort ascending.
    std::sort(people.rbegin(), people.rend());
    for(Person p : people)
    {
        std::cout << p << std::endl;
    }



Last edited on
@lastchance & jlb

Thank you a lot for your explanations and the example codes you provided, it makes much more sense now.

There is however still two minor things which I'm not familiar with so I will ask merely our of curiosity:

1_
1
2
3
4
for(Person p : people)
    {
        std::cout << p << std::endl;
    }


What does Person p : people mean? is this a condition? (I'm familiar with for loops that are somewhat like this: for (int iterator ; condition ; iterator ++/--), in other words what does this colon mean, is this an operator?


2_enum SortBy{ SURNAME, AGE } sortby;

What is the difference between this and defining an enum by simply doing this: enum SortBy{ SURNAME, AGE };?



Thank you a lot to all who responded and helped!


EDIT: typos.
Last edited on
for(Person p : people)
This is a range-based for loop. It came in with C++11 (in case you have an older textbook). It basically means "for all items in people do the following ..." and you would use it typically where you wanted to loop through all the elements of some container.


enum SortBy{ SURNAME, AGE };
This defines a type (and, in this case, the set of values it can take). It can be followed by a further one-line statement (watch out for case):
SortBy sortby;
which is not all that different to you writing
int i;
for "type", then "name of variable".

Here, I simply put them together as one line, designating a type and naming a variable of that type at the same time:
enum SortBy{ SURNAME, AGE } sortby;


To be fair, if I'm sorting I usually define a comparator function and use the 3-argument form of std::sort(). Quite often I use lambda functions (also C++11; see @Keskiverto's first post in this thread) and define the function in-place. However, thanks to your post, I wanted to explore ways that I could redefine the operator < to use the two-parameter form of sort() so that I only needed one comparator function, instead of different ones for sorting on name or age. The only slight downside is having to carry a global variable, sortby.

It's also possible to define the < operator as a member function (as in @jlb's version) where the left-hand side of the < will be an actual object of that struct/class, or as a separate function taking two arguments of that struct/class (as in my version). If it needs to access private members to do the comparison then it will need to be made a "friend" function to get at them (see @jlb's example); however, for a struct with default public access that isn't necessary.

Last edited on
This is a range-based for loop. It came in with C++11 (in case you have an older textbook). It basically means "for all items in people do the following ..." and you would use it typically where you wanted to loop through all the elements of some container.

Oh waw that's interesting, I should really dig into the C++11 ressource to know all that can be done in the newer version. And yes my textbook is older so I only have a few sneak peeks into c++ 11 through it.


This defines a type (and, in this case, the set of values it can take)

Oh yes I was a fool not to have seen it, it's pretty obvious indeed.


Well I'm glad you could get something out of this post too! I can see why having int, double global variables can be problematic at times therefore the preference to avoid global variable. How could a global enum variable could be bad for your program though?


EDIT: typos.
Last edited on
Hi @hoogo, I learnt from a C++98 textbook too (and one that I know is distinctly frowned upon in this forum!) Nearly all the C++11 I picked up from reading posts on this forum, including those by the other contributors to this thread.

Global variables are sometimes a necessary evil, but, by definition, every function has access to and can change them. Imagine you have two classes: Students and Teachers. Which one would you like to have unrestricted access to variables called "draft_exam_paper" and "final_grade"?
I learnt from a C++98 textbook too (and one that I know is distinctly frowned upon in this forum!)

Which one is it if I may ask and why would it be frowned upon if you managed to acquire the knowledge anyway?


Ah. Yes I see what you mean, I didn't think so far as class functions but it could indeed be a mess if, as in your example above, the Students and Teachers classes could access variables that are meant to be used by the other class.
Pages: 12