what is optimistic(less time complexity) way to solve this problem?

please help?
how to count common characters (character by character) in two strings in cpp?
what is optimistic(less time complexity) way to solve this problem?
here is my code but its take more time so this gives TLE.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
 

string s11="abb",s22="bbc";

 for(int i=0; i<r; i++)
            {
                
                    if(s11[i]==s22[i])
                        count1++;
                    
                
                
                
            }

output is 1.
Last edited on
Why is your code spacing so bizarre?
Give the link to the problem.
Given two strings, in which you have to count the number of time both string have the same character in the same place, there's no way that doesn't involve at least the complexity of at least the size_of_shorter_string comparisons.
plz provide a pseudo code @repeater
@Repeater the example is sample test case. assume that there is 10^5*10^5 length of strings?
Last edited on
What's the full question?
don't compare it at all with if.

for(
count += (s11[i]==s22[i]); //no jump on if, just a bool conversion of 0 or 1 to add zero or 1 to count, which is more efficient.

it would be even more efficient to load a large register in the cpu with a chunk of the string and sum up several of those, but I dunno if its worth trying to express that in c++ instead of pure assembler. Ideally you could do them in parallel if you really dug into it, at least 2 at a time, maybe 4, per cpu.


the actual complexity won't change, its N where N is the length of the shortest string. But how you do it can get you much more speed, little things matter. You need about a 2gb long string to do your timings with, to really test it, or count actual cpu clocks, this is too little an example to check.
Last edited on
@Repeater
THE FULL QUESTION IS THIS , please tell the optimised way to solve this question.

Chef has N strings with him. He knows his strings very well. A very intelligent guy named Frank comes along and asked him a few questions about the strings. Chef is busy right now, so you have to help him answer the queries by Frank There are Q queries each containing 3 integers L,R,C and 1 character P. You are supposed to tell how many strings from L to R have the character P at position C. String contains lowercase letters only.

Note: Use one based indexing

Input:
First line will contain N, number of strings. Followed by N lines each containing a string each
Next line contains Q, number of queries.
Next Q lines contain 3 integers and 1 character, denoting L,R,C,P respectively.
Output:
For each query, output the answer in a new line.

Constraints
1≤N≤100000
1≤Q≤100000
1≤L,R≤N
1≤|S|≤20
Subtasks
30 points : 1≤N,Q≤1000
70 points : Original Constraints
Sample Input:
3
chef
cut
frank
1
1 3 1 c
Sample Output:
2
EXPLANATION:
There are 2 words with the letter c on 1st position.
There's no way to do it without having examined, at least once, the characters at the requested location in each word within the range.

The optimisation will, I guess, be to do it without examining each character more than once either. To remember what you found so that you don't have to go through the list of words over and over. Perhaps there's a way to run through the list once and build a set of data that holds all the answers to all possible queries, without having to look at the words again.
Last edited on
dunno if its worth it..
transposed input normalize the lengths with zeros? (I used + for zero for visual clarity).

1
2
3
4
5
ccf
hur
eta
f+n
++k

sum of first row for cs is 2. This would be slick if the strings in all the questions are the same set (nevermind the left/right starting points, just the total set of strings). A little extra work up front to get quick answers? And you just iterate start to finish subset of course.

another idea is a giant map of all the strings:
key1: string id #
key2: position in string
result/ data element: character at that position

so chef is mapped for
1,1 = c
1,2 = h
1,3 = e
1,4 = f

and cut is
2,1 = c
2,2 = u
2,3 = t

and so on. then just extract for the query? It doesn't even waste space.
the trouble with these (both) is up-front setup time has a cost, and the idea is pay a little more up front to beat the back end efficiently. I think the first one will perform better, but that is just gut feeling.
Last edited on
Has anyone solved SIGABRT error
@jonin, in the 2nd method, wouldn't we again have to check for the given range for eg, from l to r, we will have to check the map from l to r ??

And in the 1st method, can you please elaborate , I didn't quite understood your logic?
yes, in the map one, you would have to iterate the map using the string id (as L to R). That is why its one of the keys.

The first one is just storage into a 2-d vector of char (2-d logically, 1-d really, for speed). you would store them transposed letter by letter. You pad unused letters. Its slightly more expensive to store, but pretty much hands you the answer on a platter.


say your longest word is 10, and you have 10 words.

really roughly

vector<char> twod(100);
string word;



for..
for..
twod[ current_letter_num*maxletters +currentword_ID_num] = word[current_letter_num]; //if exists. prefill with pad character, and the inner loop can stop on word's length.


and doing a query
const int offset = query_letter_num*maxletters; //factor out a multiply
for(x = l; x <= r; x++)
count += (twod[offset+x]
==queryletter); //as I said originally see notes on that

does that make sense?
storing cut in the example, you get word is 1 (from 0, its the second)
so it puts letter 0 (c) *maxletter+wordid = [1].
it puts letter 1 (u) *10+1 = 11
it puts letter 2 (t) *10 +1 = 21

and pulling out the 2nd letters in a query for example, looking for say 'x' (there are none) …

count += queryletter(1, second letter from 0) *10 + left to right or
twod[11+0], twod[11+1], and twod[11+2]

note than 11+0 was loaded with u from cut, second letter, is it x? … see?

The idea is just to load up a super simple, sequential memory access iteration to resolve the queries. Only memory allocation and destruction is one single program lifespan vector.

Im not saying there is no clever way to resolve the queries on a single look method (remembers previous look at data somehow). This may not be the best possible answer, but I don't see anything better yet. Everything I can think of would cost more real cpu time than this does unless you have millions of queries.
Last edited on
Topic archived. No new replies allowed.