### The Most Repeated Character In a String - Google Interview Question

I am starting to practice programming puzzles to prepare myself for my future interview questions.

I saw a question on YouTube about finding the most repeated character in a string in O(n), and not O(n^2).

It is easy to find O(n^2) algorithm, you simply have to compare every single character with other characters and increment the count, then return the most repeated character.

I found this algorithm online:

 ``123456789101112131415161718192021222324252627282930313233343536373839`` `````` #include #include using namespace std; string mostFrequent(string text){ int max = 0; int count = 0; string maxCharacter; for(char q=' ';q<='~';q++) { count = 0; for(unsigned int i=0; imax) { max=count; maxCharacter=q; } } return maxCharacter; } int main(){ cout << mostFrequent("DBCABAEEE"); return 0; } ``````

It appears the time complexity to be O(n^2) since there are two for loops.

Is there any way we can implement this algorithm in O(n) running time?
Last edited on
 It appears the time complexity to be O(n^2) since there are two for loops.

There are two loops, but each does not loop n times. The outer loop happens a constant number of times. The inner loop `text.length()` times. So it is O(n).
Last edited on
Just because there's a loop inside another doesn't mean that the time complexity is O(n^2). The time complexity of this algorithm is O(1):
 ``1234`` ``````for (int i = 0; i < 1000000; i++){ for (int j = 0; j < 1000000; j++){ } }``````

The inner loop runs for text.length() iterations, and the outer loop runs for k iterations. In total, that's k*text.length(), or just k*n. Since big O notation compares the asymptotic behavior of algorithms, constant factors are ignored, therefore the time complexity is just O(n).

These two algorithms have the same time complexity:
 `` `` ``for (int i = 0; i < n; i++);``

 ``12`` ``````for (int i = 0; i < n; i++) for (int j = 0; j < 1000000; j++);``````
Ok, but the outer for loop is running the ASCII character and symbols (we can consider this set of ASCII characters and symbols to be of a certain size n). That indicates for every iteration of outer loop we are comparing the second loop text.length()-1 times, so I think it is O(n^2).

Could you please explain what you mean by
 "outer loop happens a constant number of times"
?
Last edited on
 Ok, but the outer for loop is running the ASCII character and symbols. we can consider this set of ASCII characters and symbols to be of a certain size n

About 95 of them. Not n. In this case, n is the variable size of the string to be examined.

So the outer loop will run 95 times. Not n. Not anyhting to do with length of the text. A fixed amount.

So we will loop 95*n times in total, which is O(n).
Last edited on
I don't know why my edited response went lower, but I think I am understanding what you mean.

@helois
I believe you mean every loop is running on its own, so that is O(n). Can you explain to me how would the algorithm in the question would become O(n^2)? What conditions make the time complexity to change?

@Repeater, yeah that is true, some running times are like 3n^2 + 2n + 5, and we only take the greatest leading variable so n^2, and we ignore constants. I understand what you mean now :)

I just started studying about running times and efficiency, so I'm getting there.
Last edited on
The Standard Library really is your friend. O(n):
 ``1234567891011121314151617181920`` ``````#include #include #include int main() { std::string s; getline( std::cin, s ); std::unordered_map histogram; for (char c : s) histogram[ c ] += 1; // O(n) × O(1) std::size_t count = 0; for (auto p : histogram) // O(n) if (p.second > count) { count = p.second; s = p.first; } else if (p.second == count) { s += p.first; } if (s.size() != 1) std::cout << "No one character most common.\n"; else std::cout << "Most common character is '" << s << "'.\n"; }``````
Last edited on
I believe std::unordered_map has O(n) lookup.
 It is easy to find O(n^2) algorithm, you simply have to compare every single character with other characters and increment the count, then return the most repeated character.
That isn't O(n^2) because the second search isn't dependent on n, it's dependent on the number of characters in your alphabet, so it's O(n).
It seems fairly easy to do it in O(n).

 ``1234567891011121314151617`` ``````int occurenceTable[0xff] = {0}; //O(1) space complexity for(char c: someString) //O(n) loop ++occurenceTable[int(c)]; char mostOccurent; int max = 0; for(unsigned i = 0u; i < 0xff; ++i) //find max in O(1) { if(occurenceTable[i] > max) { max = occurenceTable[i]; mostOccurent = char(i); } } ``````

Also, OP, your algo is O(n)
 ``123456789101112131415161718192021`` `````` for(char q=' ';q<='~';q++) // this is constant so O(1) { count = 0; for(unsigned int i=0; imax) { max=count; maxCharacter=q; } }``````
Last edited on
> I believe std::unordered_map has O(n) lookup.

In pathogenic cases. It should be f(1) normally. (It is supposed to be a hash table lookup.)

Everyone is falling into a too-tricky/too-neat problem: What is the most common character in

abba

?