How to solve in less than O(n2) where N=100000 (length of string) and time limit 1 Sec

Pages: 123
We have to find the product of the length of all substring of a given string such that substring formed is prefix to given string. Since the answer can be very large print it modulo 10^9+7.

For ex.

S=ababaab

substrings are: a,ab,aba,abab,ababa,ababaa,ababaab,a,ab,aba,a,a,ab

answer: (1*2*3*4*5*6*7*1*2*3*1*1*2)%1000000007
Last edited on
I don't understand, why do you repeat a,ab,aba,a,ab? Is the answer not just n! ?
Last edited on
Ganado wrote:
I don't understand, why do you repeat a,ab,aba,a,ab? Is the answer not just n! ?


Doesn't his example show why the answer is not n!? For this particular example the answer is 7!+3!+2!

I'm having trouble thinking of a way to give a hint for this one without just explaining the answer.

Maybe this post is itself a hint.

Edit: Those substrings that you say are "repeated" are the ones that start at indices 2 and 5.

ababaab
..^^...
ababaab
..^.^..
ababaab
.....^^
Last edited on
Oh, I see what it's saying now. All the substrings that are prefixes of the full string, with duplicates being distinct.
(Originally, I had thought that "prefix" had meant it had to start at index 0.)

Edit: Not that it affects the final answer in the example, but wouldn't there be 4 'A's then, since we're counting duplicates at different starting indices as unique?
Last edited on
Yes, four 'a's. I just chose to ignore them for brevity since length 1 sequences don't matter, but perhaps it would have been clearer to include them.
for the multiply after, use powers, maybe..?
if you can find the substrings, and put them in something with (string, count) then after finding the substrings, youll have something like (ab) 7 (pow(ab,7)) stored.


Substrings, should be one pass + doctoring?

ababaab
pass 1
a, ab, aba, abab, ababa, ababaa, ababaab (just concat next letter to last substring for each)
pass 2:
take pass 1 and knock the first letter off:
null, b,ba,bab,...

but you don't need to actually do the passes, just divide your existing passes by a.
then divide your existing passes by ab (or, the 2nd entry of pass 1).


is that leading to a good place?

…. the multiply idea isn't even needed.
just do pass 1, multiply that up.
then take that value divided by pow(a, #terms in pass 1)
then take that value divided by pow(ab, #terms -1)

last one is just b … its the whole thing divided by the whole thing with the final b removed ….
Last edited on
@jonnin your solutions seem to be in O(n2) and will give time limit
Forgive me if what I post seems vague. I firmly believe that people benefit much more from these problems without having them explained in full detail. I would really like to see some feedback from OP before giving the answer away.

Correct me if I'm wrong, as I'm not familiar with exactly how big O notation works, but checking every substring is O(n^2) because the number of substrings is a quadratic function of n. Any solution which checks every substring is at least O(n^2).

Therefore my advice is to figure out which substrings can be ignored.
@Browni3141 If I had known the answer then I would have not asked do I?
MyCreativityDiedWithEngg wrote:
@Browni3141 If I had known the answer then I would have not asked do I?


Nope, but this is not a Q/A forum. This problem doesn't challenge me, so solving it doesn't benefit me very much. Nor would me solving the problem and giving you the answer benefit you very much.

This is not meant to be a brag. I have only reached the point where I find problems like this easy by struggling to solve similar ones on my own, and there are many problems I struggle with that other people find easy. You need to struggle a little bit to learn. I will help guide you but I won't give you the answer.

What approaches have you tried so far? Does this give you any ideas?
browni3141 wrote:
figure out which substrings can be ignored.

I can't think of a solution less than O(n2), even if I ignore unwanted substring.
A more clear direction of hint would be helpful
Last edited on
O(n) solution.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
len = strlen(str);
fact[0] = len;
for (i=1; i<n; i++)
{
    if (str[i] != str[0]) 
    {
        fact[i] = 1;
    }
    else if (str[i] != str[i-1])
    {
        fact[i] = len-i-2;      
    }
    else
    {
        fact[i] = len-i;    
    }
}

product = factorial(fact[0]);
for (i=1; i<len; i++)
{
    product *= factorial(fact[i]);
}


But I doubt how can we calculate factorial of number greater than 19.

Can I see your O(n2) solution where N=100000 ?
Homy18, your solution doesn't work for input abcaxyz. It thinks the second 'a' contributes 2!
my solution only required 1 pass that did any work.
first pass does work splitting out substrings.
every subsequent iteration does one pow and one division. if you kept the terms from pass 1, you don't even need the multiply in the secondary iterations.
the amount of work in all the passes except the first one is so tiny that if it blows out the time limit, I would be shocked. It is, technically, 2n, but its not 2n of substring nonsense, its 2n of a couple of VERY fast math algorithms. Eliminate pow with one of your favorite IPOW functions and it will eat it alive.
@jonnin,
the substring will most likely use a loop so you still have nested loops + additional memory allocations and deallocations. We don't know about the numbers of test cases but if they are large it still will be a problem.

Have a look here about the complexity required - around min. 7
https://www.youtube.com/watch?v=2frmG3Z_PRo
OP: Show what you've tried so far at least, and post a link to the problem.

jonnin, maybe I'm misunderstanding what you're method is, but aren't you just iterating over every substring?

I wrote a simple brute force method to play with.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
#include <iostream>
#include <string>
#include <chrono>

using namespace std;
using namespace chrono;

const uint64_t MOD = 1000000007;

uint64_t factorial(int n)
{
    return (n <= 1 ? 1 : n*factorial(n-1));
}

int main()
{
    string str;
    do
    {
        cin >> str;
        steady_clock::time_point start = steady_clock::now();
        uint64_t total = 1;
        for (int pos = 0; pos < str.length(); pos++)
            for (int len = 2; pos+len <= str.length(); len++)
                if (str.substr(0, len) == str.substr(pos, len))
                    total = (total*len)%MOD;
        cout << total << '\n';
        cout << "Time: " << duration_cast<duration<double>>(steady_clock::now()-start).count() << " seconds.\n";
    } while (str != "quit");
    return 0;
}
no, I am not iterating substrings.
lets see...
1234
pass 1, substrings and product:
1, 12, 123, 1234, product is 288 //you have to do the substrings on pass 1.
pass 2 is just 288 * 288 because first term is 1 (bad example maybe, 288/ 1 to the 4th) 82944
pass 3 is just 82944 * 288/2^3 = 82944*36 (2^3 is second term multiplied, terms left power)
and so on.
just 2 math steps per iteration after first.

pass 2 is
2,23,234 product 288 because of the ones
pass 3 is
3,34 product 36....

4 is 288 /(2^3 * 3^2 ) 3 is third term, 2 terms left..
but the division is a running number, its 36 (preveious) / the 3*3... is 4

is the pattern making sense?







Last edited on
Jonnin, I'm having trouble understanding your algorithm.
1234
pass 1, substrings and product:
1, 12, 123, 1234, product is 288

I assume you're talking about the string "1234". Then there are 4 prefix strings (1, 12, 123, 1234), they have lengths 1,2,3,4 and the product is 24. Why are you multiplying the digits together?

And here's another odd thing. The subject includes
N=100000 (length of string)
If the string is 100,000 characters then the prefixes that start with the first character will contribute 100,000! to the result. That's a mighty big number. Suddenly the time to do the arithmetic sounds like it will dwarf the time to find prefix strings.

Maybe I'm missing something?
maybe I misread the problem. I read it as all substrings multiplied up, not just the last one?

scaling the problem is for the OP to worry about. I would think its just using some big int class, which should be fairly efficient, though slower than atomics.
Pages: 123