cout.rdbuf() and printf()

I assigned a char buffer to cout in order to speed things up, and as expected it was extremely efficient: iterating over all numbers from 0 up to the unsigned short upper limit dropped from 11.5 seconds to around 0.1. But to my surprise, printf() got faster as well just from assigning cout a buffer: performing the same task, the printf() timer dropped from 3.04 to about 0.1 too. To make sure it wasn't an isolated case I ran the test multiple times and with varying numbers, both with and without std::ios::sync_with_stdio(false), and the result was the same: a major speed-up for both. In fact, they both had about the same running time. So the question is: how exactly do cout's member functions rdbuf() and pubsetbuf() affect printf()? What's the connection between cout and printf besides their purpose?

Here's the code I used to test them on Visual Studio 2013:


#include <iostream>
#include <cstdio>
#include <chrono>
#include <conio.h>

int main(void)
{
    char* buffer = new char[1024];

    std::cout.rdbuf()->pubsetbuf(buffer, 1024);

    std::chrono::time_point<std::chrono::system_clock> start_cout =
        std::chrono::system_clock::now();

    for (auto i = 0; i != 0xFFFF; ++i)
        std::cout << "abcd";

    std::chrono::time_point<std::chrono::system_clock> finish_cout =
        std::chrono::system_clock::now();

    std::chrono::duration<double> timed_cout = finish_cout - start_cout;

    std::cout << "\n" << timed_cout.count();
    _getch();

    std::chrono::time_point<std::chrono::system_clock> start_printf =
        std::chrono::system_clock::now();

    for (auto i = 0; i != 0xFFFF; ++i)
        printf("abcd");

    std::chrono::time_point<std::chrono::system_clock> finish_printf =
        std::chrono::system_clock::now();
    std::chrono::duration<double> timed_printf = finish_printf - start_printf;

    std::cout << "\n" << timed_printf.count();

    _getch();
    return 0;
}
Are you sure that std::ios::sync_with_stdio(false) was the first line in your program (before changing the buffer)?
It was the first line after declaring 'buffer', so yes, it came before I changed the buffer.
1) You are measuring wall-clock time. You are heavily affected by OS buffering and I/O throughput, so you are unlikely to get any significant results from these measurements. Measure the CPU time actually spent in your program.

My tests:
#include<iostream>
#include<cstdio>
#include<ctime>

int main()
{
    std::ios::sync_with_stdio(false);
    char* buffer = new char[1024];

    std::cout.rdbuf()->pubsetbuf(buffer, 1024);

    auto start_cout = clock();
    for (auto i = 0; i !=0xFFFF; ++i)
        std::cout<<"abcd";
    auto finish_cout = clock();
    auto timed_cout = double(finish_cout - start_cout) / CLOCKS_PER_SEC;
    std::cout.flush();

    auto start_printf = clock();
    for (auto i = 0; i !=0xFFFF; ++i)
        printf("abcd");
    auto finish_printf = clock();
    auto timed_printf = double(finish_printf - start_printf) / CLOCKS_PER_SEC;
    fflush(stdout);

    std::cout<<"\n" <<timed_cout;
    std::cout<<"\n" <<timed_printf;
}
Results:
/*...*/
0.01
0.11
As you can see, cout was even faster in this case (better inlining than in monolithic printf).

With the pubsetbuf call commented out:
/*...*/
0
0.11
As you can see, printf speed remained the same.

Then again, intermixing printf and iostreams in one program after turning off synchronisation is undefined. I need to redo the tests. Wait a minute.
New program (with an increased number of iterations to improve precision) and results:
#include<iostream>
#include<cstdio>
#include<ctime>

// Comment this define out to test cout instead of printf
#define PRINTF

int main()
{
    std::ios::sync_with_stdio(false);
    char* buffer = new char[1024];

    std::cout.rdbuf()->pubsetbuf(buffer, 1024);

    auto start = clock();
#ifndef PRINTF
    for (auto i = 0; i !=0xFFFFF; ++i)
        std::cout<<"abcd";
    std::cout.flush();
#else
    for (auto i = 0; i !=0xFFFFF; ++i)
        printf("abcd");
    fflush(stdout);
#endif
    auto finish = clock();
    auto time = double(finish - start) / CLOCKS_PER_SEC;

    std::cout << "\n" << time;
}
        w/ buf  w/o buf
cout    0.06    0.06
printf  1.82    1.81

Edit: a test checking wall time showed similar results as soon as I stopped using horribly slow screen I/O and redirected the output into a file.
I'd redirect stdout to /dev/null and use stderr to display the time.

I don't see such an incredible difference; both are around 70 ms.


> Then again intermixing printf and iostreams in one program after turning off synchronisation is undefined.
Isn't your last snippet still doing that? Or does the flush solve the problem?
> isn't your last snippet still doing that?
It compiles either one part or the other depending on whether the PRINTF macro is defined. You have to compile and run the program several times with different configurations.

> Don't have such incredible difference, both are around 70 ms.
I actually expected that. I believe my results are because MinGW links dynamically with the C runtime library (MSVCRT). All the tests I have seen show comparable performance for C++ and C I/O.
> It compiles either one or another part depending on if PRINTF macro is defined
Yes, but in both cases you end with a `cout` to present the time.
Ah, you are right. I need to fix that too. The behavior of mixing the different calls is actually not undefined, but the separate buffers can lead to problems (in the previous snippet the results were buried in the middle of the file because of printf buffering).