Input Speeds

Why is scanf and printf quicker than their streams counterparts?
Because streams are inherently slower, but easier to work with. They provide features that cstdio does not. Skim through this thread:
http://www.cplusplus.com/forum/lounge/48796/
The thread-safety issue arises because iostream output is spread across several calls, whereas printf is supposedly done in one go.

Each individual << operation may be atomic, but even if that's the case, an output may consist of several <<, and there's no way all of them can be one atomic operation.

Consider the following:

void ThreadOne()
{
    cout << "a";
}

void ThreadTwo()
{
    cout << "b" << "c";
}



If those run at the same time, logically you'd expect either "abc" or "bca" as output. However, since Thread 2 splits its output across multiple << operations, "bac" is entirely possible.

A more realistic example:

int myvar = 5;
void Foo()
{
    cout << "myvar = " << myvar << endl;
}



This function being run simultaneously in 2 threads would be a problem. You might get some mangled output like:

myvar = myvar = 5
5



or even

myvar = myvar = 55

This one helped me. Pasted so that others too could find that here itself ;).




Now a newer problem :
What is atomicity in Computer Science?
Also I have studied C++ for two years at school and C for the only semester I have spent at college. I have planned to go through this book :
McGraw-Hill - C - The Complete Reference, 4th Ed.
Some suggestions about books to follow other than this, please.

I also want to practice a lot of output problems so any good site will be of great help.
> Why is scanf and printf quicker than their streams counterparts?

To put it crudely, because much of iostream processing is distributed over several facets, and because sentries are involved in formatted i/o.

However:
The Standard IOStreams library has a well-earned reputation of being inefficient. Most of this reputation is, however, due to misinformation and naïve implementation of this library component. - PDTR18015

For instance Dietmar Kühl's cxxrt implementation is said to be on par with C stdio performance.

For more information, see Chapter 3 of ISO/IEC PDTR 18015 'Technical Report on C++ Performance'
http://www.open-std.org/jtc1/sc22/wg21/docs/PDTR18015.pdf
To be fair, int n; scanf("%d", &n); and int n; std::cin >> n; are specified by their respective standards to do the exact same job: they apply the same locale facets to the same characters in the same order (LC_CTYPE/std::ctype to skip leading whitespace and LC_NUMERIC/num_get to parse the integer). The only significant difference is that C locales are system-provided global variables, while C++ locales are user-extensible stream members. It is a quality of implementation issue, as the TR points out.
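
To make the "same job" point concrete, here is a small sketch that parses the same text both ways. It uses sscanf and std::istringstream as stand-ins for scanf and std::cin so that it does not depend on console input; parse_with_stdio and parse_with_stream are hypothetical names chosen for illustration.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Parse one int with C stdio (sscanf is the string-reading sibling of scanf).
// "%d" skips leading whitespace, then converts the digits.
bool parse_with_stdio(const std::string& text, int& value)
{
    return std::sscanf(text.c_str(), "%d", &value) == 1;
}

// Parse one int with a C++ stream (istringstream stands in for std::cin).
// operator>> also skips leading whitespace, then converts the digits.
bool parse_with_stream(const std::string& text, int& value)
{
    std::istringstream in(text);
    return static_cast<bool>(in >> value);
}
```

Both functions accept "   42" and reject "abc"; any speed difference comes from how a given library implements these steps, not from what they are required to do.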

If you are seriously interested in fast input (much faster than scanf()), use boost.spirit or another parser that's built for speed rather than for internationalization.
> Why is scanf and printf quicker than their streams counterparts?
They are not (gcc) http://codeforces.com/blog/entry/925
> I also want to practice a lot of output problems so any good site will be of great help.

There's the ever popular http://projecteuler.net/about
but I don't like it so much because only computed answers are submitted, which may be obtained by any means.

I like more rigorous sites such as http://www.codechef.com
At this site you submit your code for the solution in your chosen language.
The site compiles and runs your solution. There are several reasons that a solution can be rejected. A program giving a correct result can be rejected for TLE (time limit exceeded), allowance for which is language dependent.
Note: Programs written in C++ face the tightest time restrictions.
Check out the practice section. I have succeeded in solving only 8 of the "Easy" problems so far. I won't get much further until I buckle down on some algorithms study (a copy of Introduction to Algorithms 3rd ed is gathering dust over here).
Codechef also hosts regular competitions.

EDIT: You may also wish to see this lounge thread http://www.cplusplus.com/forum/lounge/88674/
@Everyone Thanks a lot.
@fun2code
I have been on codechef for some 2 months. I was able to solve only six problems so far, and got TLE on many.
Maybe now they will be solved.

@(JLBORGE&&CUBBI&&ne555) thanks a lot sir/s. You all seem to be omniscient.

I could not understand most of those tricky comments, but I am now making sincere efforts to work through them to the best of my abilities.


What I have understood so far is :


1. iostream output is spanned across several commands and hence it is slower.

2. boost.spirit is real quick.
Wow. That's an amazing link, thanks a lot sir. I guess now I know the most basic reason for cin's slow performance.

Can you please give me some more basic tips which will help me quicken my programs?
If you are researching this because you suspect it's behind the TLE results you are getting at codechef, I'd say the algorithm used for your solution is a much more likely cause, though they do recommend using scanf and printf over cin and cout for speed reasons.

The main reason for TLE is to force the implementation of sufficiently efficient algorithms. Fortunately there are tutorials accompanying some problems, such as for Sumtrian (find maximum sum of elements visited on a path from top to bottom of a triangle) which explains the (rather dirty sounding) bottom up DP approach. I couldn't have gotten this problem otherwise.
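
For readers who have not met the bottom-up approach, a rough sketch follows. It assumes the usual rule for this kind of problem (from each number you may step to the one directly below or below and to the right) and a well-formed triangle in which row r holds r + 1 numbers; max_path_sum is a hypothetical name, and this is not the tutorial's own code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bottom-up DP: starting from the second-to-last row, replace each cell with
// its value plus the better of its two children. When the loop finishes, the
// apex holds the best top-to-bottom sum. Takes the rows by value because it
// overwrites them.
int max_path_sum(std::vector<std::vector<int>> rows)
{
    if (rows.empty())
        return 0;
    for (std::size_t r = rows.size() - 1; r-- > 0; )
        for (std::size_t c = 0; c < rows[r].size(); ++c)
            rows[r][c] += std::max(rows[r + 1][c], rows[r + 1][c + 1]);
    return rows[0][0];
}
```

Each cell is visited once, so the work grows with the number of cells in the triangle rather than with the number of paths, which doubles with every extra row.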
I guess it was the cause in my case, because I got two ACs just by changing the cins in my code. Thanks for the help, though.
> 1. iostream output is spanned across several commands and hence it is slower.
>
> 2. boost.spirit is real quick.


When I said that boost.spirit is the fastest (for input speed, which was the title of the thread), I meant the code that has to do with converting the input (sequence of characters) to data (ints, doubles, structs, etc), which is where the significant differences between C++ and C streams are. The steps from an external device into a program-accessible sequence of characters are the same or can be made the same with simple steps (sync_with_stdio(false), setbuf, memory mapping, etc).

I felt like giving it a test: my input is a string holding 10,000,000 random doubles in -10.0 .. 10.0 range, separated by varying amounts of whitespace (1..10 spaces), as generated by this code:

    const unsigned MAX = 10000000;
    std::string data; // the input string that every parser below receives

    std::random_device rd;
    std::mt19937 mt(rd());
    std::uniform_real_distribution<> value_d(-10.0, 10.0);
    std::uniform_int_distribution<> space_d(1, 10);
    for(unsigned n = 0; n < MAX; ++n)
        data += std::string(space_d(mt), ' ')
              + std::to_string(value_d(mt));

The size of the string was 139,993,617 bytes

Each of the following functions received a pre-allocated vector<double>

Boost.spirit (personal favorite)
void use_boost_spirit(const std::string& data, std::vector<double>& result)
{
    qi::phrase_parse(data.begin(), data.end(), *qi::double_, ascii::space, result);
}


C stdio (crowd favorite)
void use_stdio(const std::string& data, std::vector<double>& result)
{
    const char* p = data.c_str();
    char* p2;
    double val;
    while( val = std::strtod(p, &p2), p != p2 )
    {
        result.push_back(val);
        p = p2;
    }
}


C++ streams: run-of-the-mill stringstream
void use_stringstream(const std::string& data, std::vector<double>& result)
{
    std::istringstream buf(data);
    result.assign( std::istream_iterator<double>(buf),
                   std::istream_iterator<double>() );
}


C++ streams: istrstream (old-timer's delight)
void use_strstream(const std::string& data, std::vector<double>& result)
{
    std::istrstream buf(data.c_str());
    result.assign( std::istream_iterator<double>(buf),
                   std::istream_iterator<double>() );
}


C++ streams: boost replacement for istrstream
void use_boost_stream(const std::string& data, std::vector<double>& result)
{
    io::stream_buffer<io::array_source> boost_buf(data.c_str(), data.size());
    std::istream buf(&boost_buf);
    result.assign( std::istream_iterator<double>(buf),
                   std::istream_iterator<double>() );
}


Complete program: http://liveworkspace.org/code/30FxYn (also shows the output metrics on whatever system LWS uses)
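
For anyone who wants to repeat the measurement locally, a timing helper along the following lines would do. This is only a sketch of one way to average over several runs with std::chrono, not the harness used in the linked program; average_seconds is a hypothetical name.

```cpp
#include <chrono>

// Hypothetical helper: call `work` `runs` times and return the mean
// wall-clock time of one call, in seconds.
template <typename F>
double average_seconds(F work, int runs)
{
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    for (int i = 0; i < runs; ++i)
    {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        total += std::chrono::duration<double>(stop - start).count();
    }
    return runs > 0 ? total / runs : 0.0;
}
```

A call could look like average_seconds([&]{ result.clear(); use_stdio(data, result); }, 10), clearing the vector inside the lambda so that every run starts from the same state.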

Results: time in seconds, average over 10 runs:

                             spirit  strtod  boost::iostream strstream stringstream
on intel,
intel-13.0.1 -Ofast -xHost   0.901   1.765      4.765        4.551        4.981
gcc-4.7.2 -O3 -march=native  1.423   1.817      4.808        4.853        5.048
clang-3.1 -O3 -march=native  1.492   1.814      6.790        6.830        7.039
on ibm,
gcc-4.7.2 -O3 -mcpu=power6   0.651   1.396      >1 min 
xlC-11.1 -O3 -qarch=pwr6             0.882      6.290        6.341        6.330
on sun,
CC-5.10 -fast -library=stlport4      0.306     11.650       11.760       12.070
Aren't you making a copy of the string?
http://cplusplus.com/reference/sstream/istringstream/istringstream/
the stream's buffer is initialized with the content of the string object str as if a call to member str.

> aren't you making a copy of the string?

Yes, that's the drawback of stringstream, which is why strstream is still part of C++11, even though it was already deprecated in 1998.
Cubbi, would a real_parser<> with customized real_policies (input does not have exponent, nan or inf) make a difference in spirit performance?
@JLBorges using real_parser<double> with exponent, inf, and nan turned off, I am getting the same 0.90 seconds with intel, but a slight improvement with gcc (1.30 seconds instead of 1.42).
Thank you, Cubbi.