Working with strings, files and time-tag calculation

Hello CPP community.

I'd like to write a small program for work, where I'm an apprentice, to help me carry out some necessary checks.

In short, I need to take a series of start and end times, work out the duration of each, sum the durations, and see whether the total is above a certain value or not, for each particular instance.

My goal is to provide a prepared text file with time tags such as this:

Mon 19-Nov-2012 09:12 	Mon 19-Nov-2012 09:34
Wed 21-Nov-2012 13:14	Wed 21-Nov-2012 17:11
Fri 07-Dec-2012 15:21	Fri 07-Dec-2012 15:26
=============================================
Mon 26-Nov-2012 12:50	Tue 27-Nov-2012 15:29
Wed 12-Dec-2012 13:07	Wed 12-Dec-2012 14:58
Fri 14-Dec-2012 14:22	Fri 14-Dec-2012 14:29


The program should then calculate the total time for each instance (instances are separated by a line of '=').
Each instance should be identified by a number or something similar, and a text file should be generated with the total time printed for each instance,

e.g.
Instance 1: 00h22m + 03h57m + 00h05m = 04h24m
Instance 2: 04h15m + 07h44m + 01h51m + 00h07m = 14h04m


I'm currently working on the logic to calculate time within the ranges I'd like, based on several parameters. The problem I'd like some help with is the following:

Are there any references I can use when it comes to working with strings in order to seek and extract these values in order to work with them? The documentation available on this website, despite being very informative, does not show practical applications of said class and I'm at a loss on how to implement the functionality.

Thanks for any tips and information.
You should try using boost::date_time library: http://www.boost.org/doc/libs/1_52_0/doc/html/date_time.html

It has all the necessary functionality for reading/writing dates and times in various formats and for calculating differences. I think it's the better approach than writing all code yourself.
Thanks for your reply. I'll be looking into checking whatever interesting functionality that library has.

Upon a quick look, however, it appears it's already "set" in its way of tackling dates and times, whereas I must stick to the ones provided above in that exact format since these are extracted from a system used at work and cannot be in any way altered (I have absolutely no control of that).

If however I missed the ability of defining what kind of date format I'd like to use, possibly initialize some sort of object which allows me to define my date + time formats, that would be great.

Regardless, it does appear to have some rather easy variable type conversions integrated, so I'll be sure to give it a whirl once I'm back home and can actually install such things

Edit: I might actually be completely wrong. Is this going to allow me to dictate to the library how I'll be providing the date&time format:

http://www.boost.org/doc/libs/1_52_0/doc/html/date_time/date_time_io.html#date_time.format_flags

Or is that only used for output, rather than manipulation?
Alright, I now believe Boost is what I really need. The problem is that I've never worked with any kind of library in any language. I'll keep reading as much as possible and testing out different things, but if anyone is familiar with it and can get me started in the right direction, this is a more complete description of what I'm trying to go for:

Each instance must be handled separately. Instances will be acquired from a text file in the format mentioned in first post. Instances have to account for the following dynamic possibilities:

* Holidays which will be omitted from the calculations. (Determined by me beforehand in the program itself, or loaded from a textfile)
* Working times during which calculations are valid, otherwise it's considered as an off-time. (There are 2 possibilities: [7:45 to 17:15] OR [8:00 to 14:00]) This must be somehow indicated in the textfile before each instance and the program recognizes and adjusts accordingly.
* Possibly adding the time at the end of the time tags in that same text file, and adding the Total time just before the next instance starts, although that's just for fanciness :P


Any help on where I should direct my attention to making any of this happen, or maybe some examples you know of that use Boost which relate to my task that might help me?

Thanks kindly for any help.
You might want to consider using regular expressions:
http://www.zytrax.com/tech/web/regex.htm

Either std::regex or boost::regex

Can you please elaborate on how that will be of any help, and whether Boost can accept dates/times in a format I predetermine, rather than a particular standard it's forced to stick to?

Thanks.
Your holidays etc part is making things complicated, pace yourself.

The original task is already somewhat complicated:

#include <fstream>
#include <iostream>
#include <locale>
#include <numeric>
#include <sstream>
#include <string>
#include <vector>
#include <boost/date_time.hpp>
namespace bpt = boost::posix_time;

void print_results(std::vector<bpt::time_duration>& durs, unsigned int instance)
{
   std::cout << "Instance " << instance << ": ";
   if(durs.empty())
   {
      std::cout << "No times found\n";
      return;
   }
   
   std::cout << durs[0];
   for(size_t n = 1; n < durs.size(); ++n)
      std::cout << " + " << durs[n];
   std::cout << " = " << std::accumulate(durs.begin(), durs.end(), bpt::time_duration()) << '\n';
   durs.clear();
}

int main()
{
    std::ifstream file("test.txt");

    // set up cout for printing durations the way you wanted
    bpt::time_facet* df = new bpt::time_facet;
    df->time_duration_format("%Hh%Mm");
    std::cout.imbue(std::locale(std::cout.getloc(), df));

    unsigned int instance = 0;
    std::vector<bpt::time_duration> durs;
    // read the file line by line
    for(std::string line; getline(file, line); )
    {
        // separator
        if(line.find_first_not_of("=") == std::string::npos)
        { 
            print_results(durs, ++instance);
            continue;
        }

        // two time values
        std::istringstream buf(line);
        buf.imbue(std::locale(file.getloc(), new bpt::time_input_facet("%a %d-%b-%Y %H:%M")));
        bpt::ptime t1, t2;
        if(buf >> t1 >> t2)
            durs.push_back(t2 - t1);
        else
            std::cout << "Cannot parse the line " << line << '\n';
    }
    print_results(durs, ++instance);
}

live demo: http://liveworkspace.org/code/4uNKEo
output:
Instance 1: 00h22m + 03h57m + 00h05m = 04h24m
Instance 2: 26h39m + 01h51m + 00h07m = 28h37m


(I didn't break up 26h39m into 04h15m + 07h44m because you didn't show how the valid intervals would be indicated in the text file. I would use boost::posix_time::time_period's intersection() to do that, if I were doing it.)

This could be done a bit prettier and more organized, if I had more time, I'm just showing how to use boost to parse the inputs and format the outputs since it may not be obvious to someone who's reading their docs for the first time.
> Can you please elaborate on how that will be of any help

Regular expressions are useful in parsing patterns in strings - for instance, extracting date or time present in strings in assorted formats. Nothing more (not for date time computations).

Here is a very simple example:

#include <sstream>
#include <string>
#include <boost/regex.hpp>
#include <iostream>

int to_int( const std::string& str )
{
    int v ;
    std::istringstream stm(str) ;
    return stm >> v && stm.eof() ? v : -1 ;
}

int main()
{
    const boost::regex regex( "([01]?[0-9]|2[0-3])[:,h]([0-5][0-9])m?" ) ;

    const std::string time_strings[] =
    {
        "start time: 10:56",
        "repeat 10h56m",
        "job X started at 09h22 and",
        "the end time was: 16h04m07s",
        "the last step ended at 8:32 and some seconds."
    };

    for( const std::string& str : time_strings )
    {
        boost::smatch results ;
        if( boost::regex_search( str, results, regex ) )
        {
            int hour = to_int( results[1] ) ;
            int minute = to_int( results[2] ) ;
            std::cout << str << "\n\thour == " << hour << " & "
                      << "minute == " << minute << '\n' ;
        }
    }
}

Output:
start time: 10:56
        hour == 10 & minute == 56
repeat 10h56m
        hour == 10 & minute == 56
job X started at 09h22 and
        hour == 9 & minute == 22
the end time was: 16h04m07s
        hour == 16 & minute == 4
the last step ended at 8:32 and some seconds.
        hour == 8 & minute == 32
Thanks for your replies.

JLBorges:
I have just finished watching a video regarding regex, which is the following, for reference: http://www.youtube.com/watch?v=mUZL-PRWMeg

I feel it helped greatly in acquiring some footing in this confusing domain of grammar scanning and manipulation, however I don't know the exact syntax in detail so your regex is still a bit hazy for me, would appreciate if you could explain your regex logic.

It appears this whole regex thing is rather otherworldly especially with the replace or format. One cool thing I find useful is the to_int(results[n]);. It takes the whole option to a whole other level. I can not only rewrite them in a neater, and more compact format but also work directly with them.

Cubbi:
I apologize for the rather foggy examples of mine. In truth I don't really want to separate the times in multiple sections (that was just my way of calculating data which falls within 2 separate work days)

Essentially your program calculated the full elapsed time. What I didn't mention, and which is to come later, is that the calculated times must fall within a work day (7:45 to 17:15); anything outside that range is omitted. Additionally, a holiday is never counted, which is something I will have to provide to the program myself, since holidays in my country are not set in stone and change year by year.

So essentially, retaking:

Mon 26-Nov-2012 12:50 Tue 27-Nov-2012 15:29
I have to calculate Mon 12:50 to 17:15 and Tue 7:45 to 15:29 and add them up together (I may have made a mistake in my original calculation, btw).

These times will eventually also become variable, based on some parameters entered by the user in the initial line of '=====' but that's for later.

Right now I need to be able to read and calculate times within the 7:45 to 17:15 range during weekdays only (not weekends, nor holidays [although we can omit that for now as you mentioned]).

Your program sample appears to work, although once again I don't really want to print the whole instance thing and such, I was rather hoping to insert the times in the same text file, or clone it to a new one while updating the data.

I'd rather avoid Console output except for debugging purposes. I would prefer to make the user use only text files which can be easily used as evidence for results as opposed to a program (for security reasons).

Most importantly however, could you please explain the following lines in your code:
void print_results(std::vector<bpt::time_duration>& durs, unsigned int instance) // What's the intention of each argument?

std::accumulate(durs.begin(), durs.end(), bpt::time_duration()) // No idea what's happening here

bpt::time_facet // What's the use of time_facet?  What does it do, roughly?

std::cout.imbue(std::locale(std::cout.getloc(), df)); // Once again no idea what cout.imbue does

bpt::ptime t1, t2; // Is this a variable declaration or not?  Since directly below it:
if(buf >> t1 >> t2)  // You used what appear to be 2 freshly declared variables in an if condition. 


Sorry for not being able to fully follow what happened in there, a lot of these things I've never heard of, and I've never worked with Boost either.

Regardless, you've all been extremely helpful thus far, I genuinely appreciate your support and effort with writing code to explain this task better!
> Most importantly however, could you please explain the following lines in your code:

That's why I wrote an example: boost time I/O takes time getting used to

void print_results(std::vector<bpt::time_duration>& durs, unsigned int instance) // What's the intention of each argument?

The first one is a vector of durations, the second one is the instance counter. The function prints the durations, and their sum, and it also wipes out the vector of durations (technically a bad idea to give this function that power, but I was in a hurry to write a demo), which is why it has to take the vector by reference.

std::accumulate(durs.begin(), durs.end(), bpt::time_duration()) // No idea what's happening here

It's the standard C++ function accumulate(), which is summing up the durations, it's no different from
vector<int> v = {1,2,3};
cout << accumulate(v.begin(), v.end(), 0); // prints 6 

This is the pay-off for learning how to use boost (or any other) date/time library: It's being able to handle durations, intervals, calendar days, time points, etc with simple operations like +.

bpt::time_facet // What's the use of time_facet? What does it do, roughly?

It holds the rules for formatting a time point and a duration as a string.

std::cout.imbue(std::locale(std::cout.getloc(), df)); // Once again no idea what cout.imbue does

imbue() applies a locale to a stream. A locale is a container of facets. A facet is a set of rules for parsing, formatting, character classification, etc. I'd have to write a book to explain the C++ I/O library, but I think Josuttis did a good job already. In this case, to be specific, I pull the locale that's already in cout, with cout.getloc(), then I add my time output facet, and I shove it back in with imbue().

bpt::ptime t1, t2; // Is this a variable declaration or not? Since directly below it:
if(buf >> t1 >> t2) // You used what appear to be 2 freshly declared variables in an if condition.

It is a variable declaration and a read from a stream. Same as
int n, m;
cin >> n >> m;
"([01]?[0-9]|2[0-3])[:,h]([0-5][0-9])m?"

[01] - either 0 or 1

[01]? - which is optional (repeats 0 or 1 times)

[0-9] - a decimal digit

[01]?[0-9]|2[0-3] - a decimal number in the range 00-23 with an optional leading 0 => the hour field

([01]?[0-9]|2[0-3]) - the hour field as a grouped sub-expression

[:,h] - any one of : , or h => the separator

([0-5][0-9]) - two digit number in the range 00-59 => the minute field as a grouped sub-expression

([01]?[0-9]|2[0-3])[:,h]([0-5][0-9])m? - (hour)separator(minute) or (hour)separator(minute)m

if( boost::regex_search( str, results, regex ) ) // if the regex was found in str
{
    // results[0] is the complete matched string eg. 7h42m 07:42 7,42 07h42m etc.
    // results[1] is the string holding the first grouped sub-expression ie. 7 
    // results[2] is the string holding the second grouped sub-expression ie. 42
}


Play around with regular expressions
with a regex tutorial eg. http://www.zytrax.com/tech/web/regex.htm at hand,
and a regex tester eg. http://regexpal.com/ to try things out
Many thanks JLBorges for explaining it out, I believe I pretty much got the hang of it (although I'm guessing this is just a basic example, it's still very useful :D)

Additional thanks to Cubbi, apparently there is some standard C++ stuff I've never met, such as imbue() and accumulate(). I'll read up a little bit about these too.

If it's not a problem I plan on using both of your code samples as the skeleton to build upon, since there's already some good work thrown in there, and I don't have any alternatives right now.

EDIT:Borges, when compiling your code to test it, the last for loop during which output occurs resulted in several errors on VS 10 compiler. However running it on liveworkspace.org no such issues were encountered and the expected output worked fine.

The errors were along the lines of:
Error	1	error C2143: syntax error : missing ',' before ':'	d:\data archive\source code\testpad\testpad\testpad.cpp	26
Error	2	error C2530: 'str' : references must be initialized	d:\data archive\source code\testpad\testpad\testpad.cpp	26
Error	3	error C2143: syntax error : missing ';' before '{'	d:\data archive\source code\testpad\testpad\testpad.cpp	27


Any idea why that might be? I can't quite figure out why the for loop only has 1 parameter, I always thought it required 3.

Thanks,

L.
> the last for loop during which output occurs resulted in several errors on VS 10 compiler.
> However running it on liveworkspace.org no such issues were encountered...
> Any idea why that might be?

It is a C++11 range-based for loop. http://www.stroustrup.com/C++11FAQ.html#for

For an old compiler, just replace it with a classic for-loop, iterating over the time_strings[] array.

Thanks Borges, it's all set now.

One quick question to you, Cubbi:
        if(line.find_first_not_of("=") == std::string::npos)
        { 
            print_results(durs, ++instance);
            continue;
        }


I'm unsure if I got this right (regarding the "if" statement), but am I correct in interpreting it as the program taking in a whole line, checking for the first character which is not an "=" and finding out whether it's past the end of the string (string::npos), in which case it just prints the details, since that indicates the end of a particular instance and the start of a new one?

However, if the first non-"=" character is found and it's not beyond the end of the current line being processed, it skips the if condition and moves on to the rest of the operation?

Thanks,

L.
Luponius wrote:
I'm unsure if I got this right

You got that one right.
Apologies for bringing this back up, but I'm still working on this whenever I have some free time, so I'd like to request further instruction on the following if possible, Cubbi:

std::istringstream buf(line);
buf.imbue(std::locale(file.getloc(), new bpt::time_input_facet("%a %d-%b-%Y %H:%M")));
bpt::ptime t1, t2;
if(buf >> t1 >> t2)
    durs.push_back(t2 - t1);
else
    std::cout << "Cannot parse the line " << line << '\n';


You already mentioned that in this case you're declaring two variables t1 and t2 of type ptime, I'm with you on that.

The problem is the if() with buf>>t1>>t2.

Will the if() trigger positively as long as anything is in either t1 or t2 (or both) otherwise it triggers the else if nothing went through to t1 and t2?

Additionally, how is the istringstream handling the lines? I note that you're populating two inputs at once: one line is being read which contains both times, but I don't see how you're splitting them into t1 and t2. Could you please slowly walk me through this?

Thanks,

Lupo.
It's the same as with any other type, it's how stream I/O works.

Take ints for example:

#include <sstream>
#include <iostream>
int main()
{
    std::istringstream buf("1 2 3 4");
    int n, m;
    buf >> n >> m; // 1 and 2 are consumed, stream still has " 3 4"
    std::cout << "n = " << n << " m = " << m << '\n';
    buf >> n >> m;
    std::cout << "n = " << n << " m = " << m << '\n';
}


The stream does the splitting. If something fails to parse, the stream fails:

#include <sstream>
#include <iostream>
int main()
{
    std::istringstream buf("1 2 aaa 4");
    int n, m;
    if(buf >> n >> m)
        std::cout << "n = " << n << " m = " << m << '\n';
    else
        std::cout << "Parsing error\n";
    if(buf >> n >> m)
        std::cout << "n = " << n << " m = " << m << '\n';
    else
        std::cout << "Parsing error\n";
}
Thanks for the quick reply.

Sadly I've never used any sort of stream. Despite undergoing programming lessons of some form or other for 8 years, never once were these mentioned... makes me want to tear my hair out. Everything else I've been taught is of just about no consequence, apparently, heh.

I'll read up on ifstream and istringstream. I believe this thread can quietly go to sleep. I'll be starting up another soon regarding the same problem once I've progressed somewhat further.

Once again thanks a lot Cubbi and JLBorges. You've taught me more in these few posts than what I've learnt thus far in programming.

L.