Text parsing per time stamp.

Hi everyone. I am trying to figure out how to parse a log file in a specific way. What I want it to be able to do is to go line by line and count how many times a specific word appears per time stamp.

For example the log file is set up in this way:


[18:46:48] You attack
[18:46:48] You attack
[18:46:50] You attack
[18:46:52] You attack
[18:46:52] You attack
[18:46:54] You attack
[18:46:54] You attack
[18:46:56] You attack
[18:46:56] You attack
[18:46:56] You attack
[18:46:58] You attack
[18:46:58] You attack
[18:47:00] You attack
[18:47:01] You attack
[18:47:01] You attack
[18:47:01] You attack
[18:47:01] You attack

Now for example for the time stamp: [18:47:01] I want the code to look at how many times the word "attack" appears.

This would be 4 obviously.

Now for the time stamp: [18:46:56] The word "attack" appears 3 times.

Throughout the log file the word "attack" can appear 1, 2, 3, or 4 times per time stamp.

What I want the code to do is to count how many 1's, 2's, 3's, or 4's there are per time stamp in the entire file and to print out for example:

1's == 60
2's == 35
3's == 29
4's == 14

I am most familiar with c++ and currently taking a class in c++, so I would like to use this language. Is this possible to do in c++? And could you give me suggestions on how to do this? I don't want any completed code because I want to figure out the majority of this on my own. Thanks for the help!
open a text file for input: std::ifstream file( "whatever.txt" ) ;

read lines one by one from the file: http://www.cplusplus.com/reference/string/string/getline/
    std::string line ;
    while( std::getline( file, line) ) // for each line read from the file
    {
        // do something with the line
    }


check if the line contains the string 'attack': http://www.cplusplus.com/reference/string/string/find/
if( line.find( "attack" ) != std::string::npos )

get the timestamp from the line: http://www.cplusplus.com/reference/string/string/substr/
const std::string time_stamp = line.substr( 0, line.find_first_of( " \t" ) ) ;

validate the time stamp: do this later
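When you do get to the validation step, one possible shape for it is sketched below. It assumes every time stamp looks exactly like [HH:MM:SS], as in the sample log; the function name is_valid_time_stamp is made up for illustration and is not part of any library.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (assumption: stamps are always of the form "[HH:MM:SS]").
// Returns true only if s has that exact shape and the fields are in range.
bool is_valid_time_stamp( const std::string& s )
{
    // exactly ten characters, enclosed in square brackets
    if( s.size() != 10 || s.front() != '[' || s.back() != ']' ) return false ;

    for( std::size_t i = 1 ; i < 9 ; ++i )
    {
        const bool colon_expected = ( i == 3 || i == 6 ) ;
        if( colon_expected ) { if( s[i] != ':' ) return false ; }
        else if( !std::isdigit( static_cast<unsigned char>( s[i] ) ) ) return false ;
    }

    // range checks: hours 00-23, minutes and seconds 00-59
    const int hh = std::stoi( s.substr( 1, 2 ) ) ;
    const int mm = std::stoi( s.substr( 4, 2 ) ) ;
    const int ss = std::stoi( s.substr( 7, 2 ) ) ;
    return hh < 24 && mm < 60 && ss < 60 ;
}
```

Lines whose first token fails this check could simply be skipped, or counted separately so that you notice a malformed log.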

keep a count of lines with "attack" for a particular time stamp:
    // use a std::map where the time stamp is the key and the count is the mapped value.
    std::map< std::string, int > counts ; // http://www.cprogramming.com/tutorial/stl/stlmap.html

each time a line containing "attack" is read, get the time stamp and increment the count:
++counts[time_stamp] ;

iterate through the map; the second member of each key/mapped-value pair is the count
    // http://www.stroustrup.com/C++11FAQ.html#for
    for( const auto& pair : counts ) // for each key-value pair in the map
    {
        const int this_count = pair.second ;
        // consolidate the counts ...
    }


consolidate counts for the entire file:
    const int MAXCNTS = 4 ; // int, so that the comparison with this_count is not signed/unsigned
    // num_counts[0] == count of 1's, num_counts[1] == count of 2's etc.
    int num_counts[MAXCNTS] = {0} ; // initialise to all zeroes
    int bad_cnts_gt_4 = 0 ; // counts greater than four (unexpected)
    
    // for the aforementioned this_count (inside the loop over the map);
    // note <= rather than <, so that a count of exactly 4 is accepted
    if( this_count <= MAXCNTS ) ++num_counts[ this_count-1 ] ;
    else ++bad_cnts_gt_4 ;
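
The consolidation step could also be wrapped in a small function, roughly like the sketch below. The names tally and consolidate are invented for this example; the only assumption carried over from the question is that a count is expected to be between 1 and 4.

```cpp
#include <array>
#include <map>
#include <string>

// Hypothetical result type: how many time stamps had 1, 2, 3 or 4 attacks.
struct tally
{
    std::array< int, 4 > num_counts {} ; // num_counts[0] == number of 1's, etc. (zero-initialised)
    int bad_counts = 0 ;                 // counts outside 1..4 (unexpected)
};

// Walks the per-time-stamp map and counts how often each count occurs.
tally consolidate( const std::map< std::string, int >& counts )
{
    tally result ;
    for( const auto& pair : counts ) // for each key-value pair in the map
    {
        const int this_count = pair.second ;
        if( this_count >= 1 && this_count <= 4 ) ++result.num_counts[ this_count-1 ] ;
        else ++result.bad_counts ;
    }
    return result ;
}
```

For the 17 sample lines in the question, this should report two 1's, four 2's, one 3 and one 4; printing them in the "1's == ..." format is then a short loop over num_counts.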
thanks very much for the help!