how does the sort from algorithm work?

I understand that it sorts a vector in ascending order and you can write your own comparator method to specify the sort order, but how does the sort work behind the scenes, does it use quick sort or bubble sort or some other sort? Is there a code I can refer to on how the sort works?
http://www.cplusplus.com/reference/algorithm/sort/

Complexity
On average, linearithmic in the distance between first and last: Performs approximately N*log2(N) (where N is this distance) comparisons of elements, and up to that many element swaps (or moves).

The standard just characterises the expected performance complexity. It's down to an implementer to make an actual choice.

If you have a freedom compiler (like GCC or Clang), then you can usually find the source code for the implementation of your C++ library.

Being NLogN, that rules out bubble sort (which is awful).
Quicksort has some pathological worst-case performance.

https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms

The C++ standard does not dictate what sorting algorithm must be used by std::sort. It makes the following demands:
1 Effects: Sorts the elements in the range [first,last).
2 Requires: RandomAccessIterator shall satisfy the requirements of ValueSwappable (17.6.3.2). The typeof *first shall satisfy the requirements of MoveConstructible (Table20) and of MoveAssignable (Table 22).
3 Complexity: O(N log(N)) (where N == last - first) comparisons.


So whoever wrote the library you're using needed only to meet those requirements.

As you can see here, there are a number of possible sorting algorithms that could meet these requirements : https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms

Commonly, Introsort is the algorithm used to implement the standard library sort function.

For some actual source code: http://www.cplusplus.com/forum/general/219746/
Last edited on
Here are links to the files that implement std::sort for each library:

libstdc++: https://gcc.gnu.org/viewcvs/gcc/trunk/libstdc%2B%2B-v3/include/bits/stl_algo.h?view=markup#l2022

libc++: https://github.com/llvm/llvm-project/blob/956892433f7c0ae4520232b07d442fedbcc14cb2/libcxx/include/algorithm#L3899

Interestingly, both of them use insertion sort at some point in their hybrid sorting algorithms.

-Albatross
Last edited on
@Repeater points out the common sort algorithm is Introsort, and as @Albatross brings up, the tail is the insertion sort.

Introsort is actually two sort algorithms combined. It starts out with a classic Quicksort, known to be generally the "fastest" of the algorithms given random data, but it is also known to suffer loss of efficiency at the late stage(s).

Quicksort works by dividing an array roughly in half, running each left and right half through a recursive algorithm which also divides each in half.

If the volume starts out in the thousands to millions or more, there can be a substantial number of small sections generated by this approach (in theory, the algorithm is supposed to continue until subdivision makes no sense, and only two or 3 are to be compared - so 1 million entries could produce a few hundred thousand subdivisions of 2 entries each).

However, Introsort switches to insert sort (sometimes mergesort) when the volume of a subdivision reaches an arbitrarily small volume. This is typically anywhere from 7 to 30 entries in the code I've reviewed over the years (dozens upon dozens of implementations).

One might view Introsort as using Quicksort to set up thousands of small insert sorts (or merge sorts).

However, Introsort switches to insert sort (sometimes mergesort) when the volume of a subdivision reaches an arbitrarily small volume.

Ive seen arguments to use shell here and use a much larger cutoff. The trouble with using it in the standard one is the requirement vs the complex analysis. The arbitrary nlgn is possibly preventing a slightly better algorithm.
Repeater wrote:
The C++ standard does not dictate what sorting algorithm must be used by std::sort. It makes the following demands:

...which demands basically mean “Introsort”.

Introsort was created specifically for the C++ Standard Library. After that the required performance characteristics were updated to reflect what Introsort offered.

Properly implemented, it is a combination of three different sort algorithms:

  • insertion sort — the de facto fastest sort for N ≤ sizeof(L1)
  • quicksort — the de facto fastest sort (typically) for N > sizeof(L1)
  • heap sort — a typically high-performing sort

Long ago I wrote the FAQ for it: http://www.cplusplus.com/faq/sequences/sequencing/sort-algorithms/introsort/

Read it for all the gory details, see that page — it will flesh out the reason behind some of the details already mentioned in this thread, including the performance trick with the insertion sort at the end.

@Niccolo
The number of elements to leave unsorted for the insertion sort to handle is properly supposed to be tuned to the hardware that the quicksort is running on — default library implementations are typically rather conservative, and a lot of extant code has not (IFAIK) been properly updated to handle modern hardware capabilities... alas.

But only seven elements, LOL. That’s stupidly small even for a Z80...
You old man. :->
@Niccolo
The number of elements to leave unsorted for the insertion sort to handle is properly supposed to be tuned to the hardware that the quicksort is running on

I would argue that the data volume expected could affect it too if you were trying to DIY for a high speed program (not general purpose).
Niccolo understands everything about my comment, but you have missed something important.

Yes, a larger amount of data will always affect any sort (which is kind of a given) if for no other reason having to deal with inevitable page swaps.

But the important point about the insertion sort is explained in the FAQ link: as long as the maximum shift-rotate distance does not exceed a certain N (well within L1’s capacity), you’re golden — it will outperform anything else.

(I should clarify that my statement in the last post I made is very broad; N is typically smaller than L1’s capacity. Only performance metrics over the data type being sorted can give you a clear performance value. This is true of ALL sorts...)
Yea I see. I have to read things a few times these days. The values are just smaller than I expected, a LOT smaller. When I was writing sorts (Pre useful STL era, pre intro sort, long ago) we cut quicksort short much earlier than this.

I haven't spent nearly enough time on intro... that may be my next playtime project.
Heh, welcome to the old geezers club. :O)

I enjoyed playing with sorting algorithms. Interestingly, the sort I least like is merge sort, even though I really like the simplicity of the merge operation itself...

The one I enjoyed playing with most is the counting sort...
Topic archived. No new replies allowed.