Computational Complexity

Hello everyone. I want to study algorithms, but first I need to learn algorithm analysis so that I can find an arbitrary algorithm's complexity (Big-Oh, Big-Omega and Big-Theta notations).

I know about the lower bound f(N) = Big-Omega(g(N)), the upper bound f(N) = O(g(N)) and the "same growth rate" f(N) = Big-Theta(g(N)).
However, I am at a loss about taking this further. I have watched some videos and read parts of some books, but I can't find anything straightforward. I don't want to read stories, just the pure mathematical/computational method for this.

Do you think you could help me by linking me to a good tutorial/book/video/anything else? Thank you very much!

Kind regards,
Raul.
Due to the Halting Problem there is no such mathematical/computational method, unless you limit the factors. Even then, though, it is still quite complex and challenging to describe in a mathematical/computational sense.

It's like knowing whether a photograph is a picture of a bird or not (http://xkcd.com/1425/ ) - it's easier to just have a human do it.
Thanks for the reply, and sorry for the misunderstanding: I meant that I am looking for a full description of how to do this myself. I am currently unable to tell what the efficiency of an arbitrary algorithm is. I am looking for a tutorial or something that explains how to do that and how to deal with recursion.

Thanks in advance.

EDIT: I do not intend to write a program that does this for me (and I am sorry for how I put it in the main post, it was misleading). I want to be able to compute the efficiency myself (on a piece of paper with a pen), but I cannot find good tutorials/books/etc. on how to do that. I just know the notations, but not how to use them in practice. I have also read formal definitions like:
f(N) = O(g(N)) <=> there exist constants c > 0 and N0 such that f(N) <= c * g(N) for all N > N0.
However, I still don't know how to actually use this information to calculate the efficiency of an arbitrary given algorithm.

Kind regards,
Raul.
For an algorithm involving random values to decide what to do next (e.g. 0 to stop looping or 1 to continue looping), the worst case scenario (if you're not careful) is generally infinite time, and the best case scenario is generally constant time. How you determine the average time depends on the source of the random data.

If you're talking about an algorithm that generates random numbers, it would be constant time since the input data set is always the same (just the seed). Constant time doesn't mean it will take the same time to execute each time, however, just that the size of the input data set doesn't matter (because technically there is no input).
Well, I know that. I have studied ARM assembly language (fundamentals of computer architecture) and enough hardware (fundamentals of computer engineering) to know that the CPU runs in real time, hence the timing differences between executions. But I am talking about the Turing machine, or the other computational models we use in software engineering to compute the time and memory complexity of an algorithm.

Take the Roy-Floyd-Warshall algorithm as an example. I know its time complexity is O(N³) because it uses 3 nested loops which each go from 1 to N, and inside them one if statement containing one instruction. The complexity would be O(1) for that instruction, and O(1) for the if statement. That O(1) work is executed N times by the innermost loop, which runs N times in the middle loop, which runs N times in the outer loop (counting from inside out), thus 1 * N * N * N = N³. This leads to the conclusion that the Roy-Floyd-Warshall algorithm has an upper bound for time efficiency that can be described by the notation O(N³).

However, what I asked is: how do I find the time complexity of an algorithm that I write myself? Let's say I invent a new algorithm that I think is better than Dijkstra's; I need to prove that. For easy-to-read algorithms like Roy-Floyd-Warshall it's indeed easy, but what about a large algorithm, or one that uses divide and conquer, such as Quick Sort?

Thanks again, hope this clarifies what I wanted to ask.

Kind regards,
Raul.
LB wrote:
Due to the Halting Problem there is no such mathematical/computational method, unless you limit the factors. Even then, though, it is still quite complex and challenging to describe in a mathematical/computational sense.

What?

The study of algorithmic computational complexity is an integral part of computer science. Knuth wrote huge volumes on this stuff that we still use today.

Whether or not you can get a computer to identify objects in a picture is different from analyzing the complexity of such a task -- which can be done.

@jumper007
Google around "algorithm analysis". One of the first links looks like a fairly good introduction: http://discrete.gr/complexity/

Also, a good book helps: http://www.amazon.com/dp/0262033844

The basic principle is that of asymptotic complexity. That is, you are interested in orders of growth of related functions.

In practice, we toss all terms and factors that are dominated by the remaining term. Given, for example, a linear term and a constant term:

    lim (n)/1 = ∞ ::: the top factor dominates
    n→∞

equivalently:

    lim 1/(n) = 0 ::: the bottom factor dominates
    n→∞

The only thing you are interested in, as a rough measure of complexity (that is, the time and memory spent computing), is the dominant term. Hence:

    1 < log n < n < n log n < n²

Become comfortable with logarithms, if you aren't already.

To say we are interested in asymptotic complexity is to say we are interested in how the algorithm behaves as n→∞.

The bounds (rate of growth) are true for functions g(n) and f(n) such that n > k, where k << n. For example, quicksort O(n log n) has superior time over insertion sort O(n²) for n > k, but for some n <= k the insertion sort will actually perform better. (We know this by experience, nothing more.) That's why a well-written quicksort will bottom out at k items (instead of 1) and perform a single insertion sort on the final sequence -- it runs faster that way.

The magic letters you need to know are, in order from strongest upper bound to strongest lower bound:

    f(n) = o(g(n)) means that for every constant c > 0, f(n) < cg(n) for all n > k.

    f(n) = O(g(n)) means there is a constant c > 0 such that f(n) <= cg(n) for all n > k.

    f(n) = Θ(g(n)) means there are constants c1, c2 > 0 such that c1g(n) <= f(n) <= c2g(n) for all n > k.

    f(n) = Ω(g(n)) means there is a constant c > 0 such that cg(n) <= f(n) for all n > k.

    f(n) = ω(g(n)) means that for every constant c > 0, cg(n) < f(n) for all n > k.

You will often see algorithms expressed with big-O notation because a tight upper bound is pretty representative of the algorithm's worst-case performance. In stuff written for the hoi polloi, you will also see big-O notation used where another Greek letter would have better been placed...


So, here's an example with merge sort. The lower bound for merging two sorted lists, A and B (with M and N elements, respectively), is M+N-1 comparisons in the worst case. (Do you know why?)

Since both M and N are linear, we'll just take them to be the same value and write it as 2n-1 comparisons.

Since 2n (linear term) dominates 1 (constant term), we toss all but the linear term and write it as 2n.

Since 2 is a constant (again dominated by the linear factor, n) we'll just write it as n.

All done! A lower bound for merge sort is Ω(n) comparisons. Nice!


There is, of course, much more to it than that, but that should get you over the first hurdle to understanding the math behind algorithms analysis.

Oh, before I forget, rewrite recursion as a loop. :O)

Hope this helps.
@Duoas: I'm not sure how the halting problem doesn't apply.
Many many thanks @Duoas. That's exactly what I was looking for!
And thanks @LB for bringing up the Halting Problem. I'll actually take some time to read about that one too!

Kind regards,
Raul.
The Halting Problem applies to developing an algorithm that can deterministically analyze whether any given program will run to completion (and halt).

More generally, it is a problem of asking the universe to prove every truth. (Some things can't be proven true, even though they are.*)

It's a totally different thing than analyzing what it would take to do such a thing. And, for the purposes of people analyzing the design of specific algorithms with specific constraints, irrelevant.

Hope this helps.

*Yep, that's actual math for you. See http://en.wikipedia.org/wiki/G%C3%B6del's_incompleteness_theorems


[edit] fixed to make better sense
@Duoas Thanks again, very helpful!! :)