Asymptotic Notation

I have somethings I need clarification on when it comes to asymptotic notation. Correct me if I'm wrong, but my understanding of Big O is that it is an upper bound. An algorithm of O(n^2) signifies that the algorithm cannot exceed a running time over n^2 (the algorithm can never be n^2 lg n, n^3m, etc...). Similarly, Big Omega means an algorithm of omega(n^2) cannot be less than n^2 ( n lg n, n, lg n, etc...). Big theta, however, is more precise as an algorithm of theta(n^2) signifies the algorithm will always perform at a running time of n^2. Am I correct with this line of thinking.

Also I am confused as to how one finds big omega and theta from a piece of code. From most of the examples that I've seen, the running time equation is always expressed as Big O notation. The rule for finding the worse case big O from a piece of code is to drop the least significant terms, along with all the coefficients. But what are the rules for Big Omega and Big theta?
That sounds right. I forget which term is which case but presumably you looked it up and have it right.

Code analysis can be easy or difficult. Take bubblesort: its 2 nested loops over N items, N*N, pretty basic. Then take a look at shellsort (a personal favorite) which is only a tiny bit more complex and just a few more lines of code, but you need a PHD degree to unravel its run time study, and that varies based off your choice of presort values!! (It could be N*N for bad values, N^(x+1/x) where the constant x varies by your choice (6 is normal?), and a couple of other possible run times as well).

Theta is not so bad. It is useful for constant difficulty (not constant run time) problems. Consider matrix multiply, for example. If you do that with a brute force algorithm, it does the same amount of work (based off input sizes) no matter what. Specialized sparse matrix multiply or something might vary based off % of zero entries, but the more general case is pretty much a fixed complexity.

Big omega is just the best case analysis. Consider a binary search: if it finds the item on the first try, the best case is 1. It can't do less than one comparison to find the value. Or back to bubble sort: an already sorted list, you must still do at least one loop over the whole list to see that it IS sorted, and that costs at least N comparisons, you can't do it in less.




Last edited on
@jonnin, thanks for responding. From what I have picked up so far about asymptotic notations is that they have nothing to do with worst case, best case, or average case. Rather they merely state the bounds of the algorithm. An algorithm could have a Big Omega of Omega(n^2). This only means the algorithm has a lower bound of n^2; this could be the algorithms worst case/best case. am I wrong in thinking this?
:) the bounds ARE the best and worst case. EG the best for a binary search is 1 try, worst is lg(N) tries, those are both the best & worst cases AND the bounds. They mean the same thing.

Your def of omega is the minimum complexity which is the lower bound or the best case, all those are saying the same thing. An omega of N*N means that the very best it can do is N*N, and it could be even slower.

This looks like it lays out which notation means which bound and all in pretty good plain English (most of this stuff, they get excited and make it hard to read)

https://www2.cs.arizona.edu/classes/cs345/summer14/files/bigO.pdf

in practice most only care about the worst case/ upper because they want to know if the algorithm can go awol and take all day to run. They may also care for the average case, as most of the time it performs near that. Few care about the best case, but you can sometimes arrange data to use the best case more often making it worth knowing (eg hashing with collisions but only very rarely lives mostly in the best-case space).

Last edited on
No, that's not quite right.

Complexity analysis is a class of functions that represent an algorithm’s behavior in relation to the size of its input. Its utility is more consequent for large input.

Big O is a tight upper bound == worst case behavior.
Big Ω is a tight lower bound == best case behavior.
Big Θ is a tight upper and lower bound, meaning Big O == Big Ω.

Example for bubble sort with early quit after no-swap pass:
 • Big O = n² — input is in reverse order: n passes over n elements.
 • Big Ω = n — input is already sorted: 1 pass over n elements.
 • Big Θ = does not exist — there is no tight bound over bubble sort’s behavior.

Example for heap sort:
 • Big O = n log n — n “sink” operations, each “sink” costs log n.
 • Big Ω = n log n — n “sink” operations, each “sink” costs log n.
 • Big Θ = n log n — n “sink” operations, each “sink” costs log n (Big O == Big Ω).
   (In other words, heap sort behaves the same no matter what the input sort order.)

In all cases, “Big” notation indicates a tight limit, meaning that the representative function is very close (if not identity) to the class function. For example, n²+2n is very tight to n². (n² dominates 2n as n→∞. In other words, n² represents the class of all functions where the dominant term is n².)

For a loose coupling, you use “Little” notation. This is rarely useful, and typically only applied when the underlying function is too complex or too poorly understood and a simpler function can be used as limit. For example, you could say heap sort has a Little o time behavior of o(n²). It also typically implies that f(n) < o(f(n)). And again, O/o is a strictly upper limit; Ω/ω is a strictly lower limit; and little ϑ does not exist.

Computing f(n)

This is the trick: loops matter.

Example 1: constant time.

1
2
3
4
int identity( int n )
{
  return n;
}

This is pretty easy. There are no loops. No matter what n’s value, the function always returns immediately.

It does not actually matter what other things the function does; as long as the function does not need to perform any computation that depends on the size of n, it is O(1). We can even discount O(n) stuff if n is really small.

1
2
3
4
5
6
7
8
9
int quux( int n )
{
  std::cout << "The value of n is " << n << ".\n";  // T(C+9+S(n)+4)
  std::this_thread::sleep_for( 2s );                // T(2 seconds)
  return n;                                         // T(1)
}
// C is whatever it takes to get cout working
// 9+4 is the number of characters in the string literals
// S(n) is the cost to convert n to a string (which is actually O(n), for n < 32) 

For our purposes, this function is O(1). Remember, we can discount S(n) since n is tiny.

This idea that all constant values do not matter is important!

Example 2: Linear time

1
2
3
4
5
6
7
8
// Find index of first x in xs[n], else -1
int index_of( int x, int *xs, int n )
{
  for (int index = 0; index < n; index++)
    if (xs[ index ] == x)
      return index;
  return -1;
}

There is one loop. Worst case is that x is not found in xs[], so we will look at every element of xs[] before returning -1. That is O(n).

The best case is that x is the first element in xs[], meaning that we perform exactly 1 operation before returning. That is Ω(1).

Since n ≠ 1, there is no Θ. (Remember, ‘n’ is important as n→∞, so only for arrays of size 1 is the function a degenerate constant case.)

Example 3: exponential time

1
2
3
4
5
6
7
void bubblesort( int xs[], int n )
{
  for (int i = 0; i < n; i++)
    for (int j = 1; j < n; j++)
      if (xs[j] < xs[j-1])
        swap( xs[j], xs[j-1] );
}

The question you should ask yourself is: “How many times does this function loop?”

On line 3 we loop n times.
Inside that loop on line 4 we also loop n times. (n-1 is the same as n as n→∞.)
Lines 5 and 6 are O(1).

So that is n * (n + 2 + C), again dropping constants we get n². So this particular version of bubble sort is Θ(n²).

If we wish to add the “stop if no swaps” trick, that is easy enough:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
void bubblesort( int xs[], int n )
{
  bool swapped = true;
  while (swapped)
  {
    swapped = false;
    for (int j = 1; j < n; j++)
      if (xs[j] < xs[j-1])
      {
        std::swap( xs[j], xs[j-1] );
        swapped = true;
      }
  }
}

Analysis is similar. The number of times that the outer loop loops is entirely dependent on the sortedness of xs[], so the best case is that it is sorted so the inner loop only runs once: Ω(n), and the worst case is that the inner loop runs n times because the element at x[0] must move to x[n-1] or vice versa: O(n²).

Well, hope this helps.
I have found some sources that claim that cases are not the same as bounded.

https://www.quora.com/What-is-the-difference-between-Big-O-notation-and-Worst-case-Analysis-of-an-Algorithm

https://stackoverflow.com/questions/15420848/big-o-for-worst-case-running-time-and-%CE%A9-is-for-the-best-case-but-why-is-%CE%A9-used

https://www.reddit.com/r/learnprogramming/comments/3qtgsh/how_is_big_o_not_the_same_as_worst_case_or_big/

Am I missing something?

@Duthomhas, thank you for the examples. Could you highlight parts of my original post that I misunderstood.

So this particular version of bubble sort is Θ(n²)

Why did you use Big theta notation? Every example I have seen uses Big O to arrive at an answer when dealing with code analysis (From my point of view it looks arbitrary). When dealing with real world code analysis can you explain when to use (Big O, Big Omega, and Big Theta) ?



Last edited on
Duthomhas seems to have a better grasp of it but I will go out on a limb for the last question..

In the real world, unless you are creating a new algorithm for a new problem and writing a paper on it, timing the code tends to give better results. Typically people want to know if the code can execute in X amount of time on current hardware platforms for some typical use case data size/load. The acceptance/testing team runs it and if it feels sluggish and has slowness when doing some tasks, they send it back and say that.

The main reason I say that is that this notation is not aware of TIME. It is tangentially related, but a good implementation of N*N might be faster than a bad implementation of N*lg(N). If the better complexity takes 10 seconds per iteration and the N*N takes 1/10 a second per iteration, it takes a while to outperform it, maybe not even in the expected values of N. But timed code is closer to truth... if it takes 10 seconds to do something that you think should take 1, you look at that code to see why, and that is a check for {bad algorithm, bad code, bugs, interactions with other code(blocking?)} etc. Since slow code gets a pass over looking to see if the algorithm choice is bad, its still covered. If the algorithm is unclear or unknown, that is when you need to do the analysis, or if you know what it should be doing, hit google and find the best algorithm for the job and re-code it with that approach (analysis done for you already, as is often the case for so many problems)...

Every time I had to do analysis on slow code, the problem was often figuring out what the programmer was trying to do in the first place, then seeing that their code had a hidden N*N or (more often) a piece of bad code (like tight loop memory alloc/deallocs or unnecessary copies etc), and fixing it. Less theoretical than schoolwork, and more hands-on digging, in other words. Most of the time, it was 'bad code' not 'bad algorithm'.




Choice of O,Θ,Ω

Remember that if O = Ω, then you can use Θ. The first bubble sort does not have the short-circuit behavior to quit when all necessary swaps are done. Instead, it loops n * n times no matter what. So its best case behavior is Ω(n²), and its worst case behavior is also O(n²). Since n² = n², we can say that its behavior is always Θ(n²).

The improved bubble sort changed that by noticing if no swaps were made on a pass then it need not loop any more and could simply quit. The best case behavior then changes to a single pass over the data, making it Ω(n). Since n ≠ n², then Θ no longer exists, and you must specify both O and Ω.


Re: Bound != Actual Behavior

The author of the Quora response poorly words it; all of them are splitting hairs in a non-obvious way.

Big-O/Θ/Ω notations describe the upper/string/lower bound of a function’s growth behavior.
What this means is that we are comparing the function’s resource use to itself as n increases.

If the type of resource is not specified we mean time. We can also measure memory use.
 
What this does not mean is that all O(n²) functions behave the same or use the same amount of time.

A perfect example is the introsort, developed specifically for the C++ Standard Library. The introsort — “introspective sort” is composed of three sort algorithms for best behavior.

 • quicksort, O(n²), Ω(n log n)
 • heapsort, Θ(n log n)
 • insertion sort, O(n²)

All of these are O(1) memory. However, they each have special characteristics that make them better in specific circumstances.

 • Insertion sort is bar-none the fastest sort algorithm for less than 60-100 elements (or whatever will fit into your processor’s L1 cache and still beat quicksort), even though it is an algorithm.

 • Quicksort is, for most cases, an n log n algorithm. But of all n log n algorithms, it has always been the fastest. This is due to its structure. Even in today’s computing environments that structure makes it by far the superior choice in terms of absolute (not relative) speed.

 • However, quicksort has a dirty secret: degenerate cases exist (and can be easily constructed) that bog its behavior down to O(n²) and make it slow relative to other algorithms. This is a vulnerability.

 • Heapsort is also an n log n algorithm. It does not work as fast as quicksort, but it has no worst case scenario like quicksort. Given data that would bog quicksort down, heapsort trumps it easily.
 
So, knowing the growth complexity of these algorithms is only part of the picture. Having actual time analysis of these three sorting algorithms allowed Mr Musser to combine them in a way that utilizes their strengths and minimizes their weaknesses. The basic construct is this:

    1 Start with a quicksort.
    2 If quicksort gets to the point where it is about to degenerate into a worst case
      scenario, switch to heapsort.
      
This prevents the worst case behavior from ever getting worse than a 2n log n → Ω(n log n) function. 
    3 Otherwise, when the quicksort subpartition size gets smaller than some k number of
      elements (typically 60-100, depending on the programmer — some libraries even allow
      you to calibrate it), it switches to an insertion sort.

There is no Big O/Θ/Ω analysis that describes this. The only useful analysis is a piecewise function over actual amortized time for specific ranges of n. (Amortized to discount the cost of comparing different types of data, say int vs string.)


The Real World

In the real world you use multiple measures to give you information about an algorithm. No matter how awesome your algorithm may be, if it has an n² behavior you probably won’t find much use for it when handling billions of data.

For example, the Floyd-Warshall algorithm (to find all shortest paths to some vertex) is Θ(n³). But it works swimmingly on any adjacency matrix. For graphs stored as an adjacency list, it can be adapted or supplanted by another algorithm that takes better advantage of the sparse matrix representation for better time behavior over a large graph.

Choose the correct tool for the problem. Complexity analysis just gives you information about your tools.

Hope this helps.
Registered users can post here. Sign in or register to post.