std::chrono always gives me 0 under compile flag -O3?

Here I scan a very large vector, which has more than 1 << 16 elements.
vector<vector<vector<uint64_t>>> table_portions: a vector of tables (here a vector of 2 tables; these tables are just parts of a bigger table, so they have the same schema)
vector<vector<uint64_t>>: a single table with 13 columns
vector<uint64_t>: a single column with size 1 << 16

vector<vector<vector<uint64_t>>> table_portions(2, vector<vector<uint64_t>>(13));
for (auto &vec : table_portions) {
  for (auto &v : vec) {
    v.resize(1 << 16);
  }
}


Problem:
I just want to scan this vector with some restrictions and measure the time needed.
But the duration is always 0, and I cannot understand why.
I did turn on the -O3 flag while compiling.
Without -O3 the scans take some time.


The important part:
for (size_t scan = 0; scan < scans; ++scan) {
  uint64_t count_if = 0;
  for (auto &vec_vec : table_portions) {
    for (size_t i = 0; i < vec_vec[0].size(); ++i) {
      if (vec_vec[10][i] >= l_shipdate_left && vec_vec[10][i] < l_shipdate_right
          && vec_vec[6][i] >= l_discount_left && vec_vec[6][i] <= l_discount_right
          && vec_vec[4][i] < l_quantity) ++count_if;
    }
    // std::cout << count_if << std::endl;
  }
}
auto end_time = std::chrono::high_resolution_clock::now();
auto time_vec = std::chrono::duration_cast<std::chrono::microseconds>(end_time - start_time).count();


The whole function:
void generateQuery6Csv_vector(vector<vector<vector<uint64_t>>> &table_portions) {
  std::ofstream myfile;
  myfile.open ("../data/scan_compare_query6_tpch_vector.csv");
  myfile << "scans,std::vector\n";
  uint64_t l_shipdate_left = 19940101;  // [10]
  uint64_t l_shipdate_right = 19950101;
  uint64_t l_discount_left = 5;  // [6]
  uint64_t l_discount_right = 7;
  uint64_t l_quantity = 2400;  // [4]
  for (size_t scans = 200; scans < 1000; scans += 200) {
    auto start_time = std::chrono::high_resolution_clock::now();
    for (size_t scan = 0; scan < scans; ++scan) {
      // Query_6(table_portions);
      uint64_t count_if = 0;
      for (auto &vec_vec : table_portions) {
        for (size_t i = 0; i < vec_vec[0].size(); ++i) {
          if (vec_vec[10][i] >= l_shipdate_left && vec_vec[10][i] < l_shipdate_right
              && vec_vec[6][i] >= l_discount_left && vec_vec[6][i] <= l_discount_right
              && vec_vec[4][i] < l_quantity) ++count_if;
        }
        // std::cout << count_if << std::endl;
      }
    }
    auto end_time = std::chrono::high_resolution_clock::now();
    auto time_vec = std::chrono::duration_cast<std::chrono::microseconds>(end_time - start_time).count();
    myfile << scans << "," << time_vec << "\n";
  }
  myfile.close();
}
This function is modelled on TPC-H query 6 in SQL.
-- TPC-H Query 6

select
        count(*)
from
        lineitem
where
        l_shipdate >= date '1994-01-01'
        and l_shipdate < date '1995-01-01'
        and l_discount between 0.06 - 0.01 and 0.06 + 0.01
        and l_quantity < 24
In the CSV file that I generated, the time is always 0, which would mean that I scan 1000000 times in 0 microseconds.
Part of the file:

scans,std::vector
2000,0
4000,0
6000,0
8000,0
10000,0



Without the -O3:
scans,std::vector
2000,2612241
4000,6067853
6000,10377604
8000,15985652
10000,20316525
With -O3, the compiler can see you never use the result of your hard work, so it just optimises it all away.

Do something like this to prove it needs to do the work.
cout << count_if;
Wow, indeed.
The compiler can get rid of the work at compile time.
Hmm, I debugged for a long time before finding that the flag was the cause!

Hi salem c.
But I have thought about this again.
I want to measure how long it takes to scan 2000 times.
How can I do that?
Using cout would have a negative impact on the performance.
Put any I/O outside of the start/end time measurements of the portion of code you're trying to measure (unless it's the I/O that you're specifically trying to measure).

Ask yourself, why are you calculating the value of count_if if you never use the result of the value?
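As a rough sketch of that advice (not the original poster's code): the scan is moved into a function that returns its count, the counts are summed inside the timed region, and the sum is only printed or written to a file after the end time has been taken. The names Table, ScanTiming, scan_once and time_scans are made up for this illustration, and std::chrono::steady_clock is swapped in for high_resolution_clock because it is guaranteed to be monotonic.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Table = std::vector<std::vector<std::uint64_t>>;  // hypothetical alias

// Hypothetical result type: elapsed time plus the value the scans produced.
struct ScanTiming {
  long long microseconds;
  std::uint64_t matches;
};

// One scan over all table portions; returns the number of matching rows.
std::uint64_t scan_once(const std::vector<Table> &table_portions) {
  const std::uint64_t l_shipdate_left = 19940101, l_shipdate_right = 19950101;
  const std::uint64_t l_discount_left = 5, l_discount_right = 7;
  const std::uint64_t l_quantity = 2400;
  std::uint64_t count_if = 0;
  for (const auto &t : table_portions) {
    assert(t.size() > 10);  // columns 4, 6 and 10 must exist
    for (std::size_t i = 0; i < t[0].size(); ++i) {
      if (t[10][i] >= l_shipdate_left && t[10][i] < l_shipdate_right
          && t[6][i] >= l_discount_left && t[6][i] <= l_discount_right
          && t[4][i] < l_quantity) ++count_if;
    }
  }
  return count_if;
}

// Times `scans` scans. No I/O happens between the two clock reads;
// the caller prints or stores `matches` afterwards, so the result is used.
ScanTiming time_scans(const std::vector<Table> &table_portions, std::size_t scans) {
  std::uint64_t total = 0;
  const auto start_time = std::chrono::steady_clock::now();
  for (std::size_t scan = 0; scan < scans; ++scan) {
    total += scan_once(table_portions);
  }
  const auto end_time = std::chrono::steady_clock::now();
  const auto us =
      std::chrono::duration_cast<std::chrono::microseconds>(end_time - start_time).count();
  return ScanTiming{static_cast<long long>(us), total};
}
```

A caller would then write something like myfile << scans << "," << r.microseconds << "," << r.matches << "\n"; after time_scans returns. One caveat: every scan reads the same unchanged data, so an optimizer is still allowed to notice that each iteration gives the same count. If the timings look suspiciously small, that is the next thing to rule out.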
I was being a bit of a dummy.
But it's good to know that -O3 is powerful!
You can avoid most of cout's time penalty by redirecting the output to a file.
You can also factor it out: run the scan 2000 times, take the time for the total (not per iteration), and cout only after the timed portion is completely done, so the output has no effect on the measurement.

As for running it in a tight loop, be warned: running something 2000 times in a loop is very different from running it 2000 times between other calls in the real program. If there is a lot of other work between the calls, you lose some of the efficiency that loops have (caches and registers that don't have to be swapped out, memory access patterns, and more come into play).

Time it in a loop to tweak it and beat its run time per iteration down.
Run it in the real code alongside the other real code to see how it really performs and whether it's good enough.
In a separate source file, put
void dummy(uint64_t x) {}

In your inner loop, change
              if (vec_vec[10][i] >= l_shipdate_left && vec_vec[10][i] < l_shipdate_right 
                  && vec_vec[6][i] >= l_discount_left && vec_vec[6][i] <= l_discount_right
                  && vec_vec[4][i] < l_quantity) ++count_if;

to
              if (vec_vec[10][i] >= l_shipdate_left && vec_vec[10][i] < l_shipdate_right 
                  && vec_vec[6][i] >= l_discount_left && vec_vec[6][i] <= l_discount_right
                  && vec_vec[4][i] < l_quantity) {
                     ++count_if;
                     extern void dummy(uint64_t);
                     dummy(count_if);
              }


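The call to dummy() will force the compiler to compute each value of count_if, since it doesn't know what dummy() does. This relies on dummy() being compiled separately: with link-time optimisation enabled, the compiler may see that dummy() is empty and remove the call again.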
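If a second source file is inconvenient, a different technique that fits in a single file is to store the result in a volatile variable. This is not the dummy() approach described above, just a sketch of an alternative: writes to a volatile object count as observable behaviour, so the compiler may not drop the store, and the count that feeds it has to be produced. The names sink and count_below are invented for this example, and the predicate is simplified to a single column.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical global "sink": stores to a volatile object cannot be optimised away.
volatile std::uint64_t sink = 0;

// Counts the elements of `col` that are smaller than `limit`
// and publishes the result through the volatile sink.
std::uint64_t count_below(const std::vector<std::uint64_t> &col, std::uint64_t limit) {
  std::uint64_t count_if = 0;
  for (const std::uint64_t v : col) {
    if (v < limit) ++count_if;
  }
  sink = count_if;  // one volatile store per scan, not one call per matching row
  return count_if;
}
```

Storing once per scan keeps the overhead outside the inner loop, whereas calling dummy() for every matching row adds a function call to the code being measured.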