How Should I Combine One-Per-Row Entries

I have a set of data like this:

  A  B  Mon Tue ...
  _________________
1|c  d   2   0  ...
2|c  d   0   8  ...


I am trying to consolidate rows 1 and 2 into a single row, because their values in columns A and B are the same.
Example (using the above):

  A B Mon Tue ...
  _______________
1|c d  2   8  ...


Currently, I am using the following code:
printHeaders();
string ID = "";
string currentID = "";
uint ind = 1; // Row 0 is the header row
for(uint i = 1; i < 2dVec.size(); i++) {
    currentID = (2dVec[i][rowA] + 2dVec[i][rowB]);
    for(uint j = 1; j < 2dVec[ind].size(); j++) {
        if(ID != currentID) ind = i;
        //Print out the corresponding values
    }
}

2dVec is a .csv file saved into a vector< vector<string> >
The data is sorted based on A (Last Name) then B (First Name), and ID and currentID are concatenations of A and B.
ID = initial ID (runs are possible)
currentID = ID currently being looked at (at 2dVec[i])
This is not working and seems overly complicated. Any suggestions?
Is Excel an option? Doing this in C++ might be creating extra work for yourself.

If Excel isn't viable, it looks like you have the right approach: load the file into a 2D construct, find the groups of rows whose A and B values match, and consolidate them. I don't think your solution is overly complicated, but it may be slow for HUGE files.

What is it doing wrong?
It seems to me you would want some slight changes...
    do
    {
        currentID = A + B of the current row;
        while(rows left to do && current row ID == currentID)
        {
            process row and advance to the next row;
        }
    } while(rows left to do);


That is, it looks wrong to me to iterate over EVERY row in the outer loop.
Imagine
1 2 x
1 2 y
1 2 z
1 2 c
2 3 m
etc.
You don't want to visit each of those rows, set the current ID, and process everything after it. After doing all the 1 2 rows, you want to skip straight to the 2 3 m row and start again there. You also don't want to keep looping once the key changes; that wastes processing time.
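
The skip-ahead idea above could be sketched roughly as follows. This is only an illustration under some assumptions: the header row has already been removed, the rows are sorted by the first two columns, every row has at least those two cells, and an empty cell or "0" means "no entry". The name merge_sorted_runs is made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Row   = std::vector<std::string>;
using Table = std::vector<Row>;

// Hypothetical helper: merge consecutive rows that share the same key
// (columns 0 and 1). Assumes sorted input with at least two cells per row.
Table merge_sorted_runs(const Table& rows)
{
    Table merged;
    std::size_t i = 0;
    while (i < rows.size()) {
        Row combined = rows[i];               // first row of the current run
        std::size_t j = i + 1;
        // Consume every following row that has the same key.
        while (j < rows.size() &&
               rows[j][0] == combined[0] && rows[j][1] == combined[1]) {
            const Row& next = rows[j];
            if (next.size() > combined.size()) combined.resize(next.size());
            for (std::size_t c = 2; c < next.size(); ++c) {
                // Assumption: empty or "0" means there is no entry yet,
                // so the value from the later row is taken instead.
                if (combined[c].empty() || combined[c] == "0") combined[c] = next[c];
            }
            ++j;
        }
        merged.push_back(std::move(combined));
        i = j;                                // jump straight to the next key
    }
    return merged;
}
```

Calling it on the two sample rows from the question (c d 2 0 and c d 0 8) should give the single row c d 2 8. Each input row is visited once, so the work grows linearly with the number of rows.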


Brute force:

#include <algorithm> // std::max
#include <iostream>
#include <map>
#include <string>
#include <utility>   // std::pair, std::move
#include <vector>

using vec2d = std::vector< std::vector<std::string> > ;
using row_map = std::map< std::pair<std::string,std::string>, std::vector<std::string> > ;

// merge row b into row a, cell by cell (b is taken by value because it is resized)
// note: std::max on strings compares lexicographically ("9" > "10"), which is fine
// as long as at most one of the two cells holds a non-zero value
std::vector<std::string>& combine( std::vector<std::string>& a, std::vector<std::string> b )
{
    std::size_t sz = std::max( a.size(), b.size() ) ;
    a.resize(sz) ;
    b.resize(sz) ;
    for( std::size_t i = 0 ; i < sz ; ++i ) a[i] = std::max( a[i], b[i] ) ;
    return a ;
}

// key each row on its first two columns; rows with the same key are combined
row_map make_row_map( const vec2d& vec )
{
    row_map result ;
    for( const auto& v : vec )
        if( v.size() > 2 ) combine( result[ { v[0], v[1] } ], v ) ;
    return result ;
}

vec2d combine_rows( const vec2d& vec )
{
    auto rows = make_row_map(vec) ; // not const: the rows are moved out below

    vec2d result ;
    for( auto& pair : rows ) result.push_back( std::move(pair.second) ) ;
    return result ; // ordered by key, because std::map keeps its keys sorted
}

int main() // trivial test driver
{
   const vec2d original =
   {
       { "Stroustrup", "Bjarne", "4" },
       { "Koenig", "Andrew", "0", "0", "9" },
       { "Stroustrup", "Bjarne", "0", "0", "0", "8" },
       { "Koenig", "Andrew", "6" },
       { "Stroustrup", "Bjarne", "0", "2" },
       { "Koenig", "Andrew", "0", "1", "0", "5" },
       { "Stroustrup", "Bjarne", "0", "0", "6" },
   };

   for( const auto& row : combine_rows(original) )
   {
       for( const auto& str : row ) std::cout << str << ' ' ;
       std::cout << '\n' ;
   }
}

http://coliru.stacked-crooked.com/a/18429555794f00e6
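
One caveat with the brute-force version: std::max on std::string compares lexicographically, so "9" is considered greater than "10". If two rows with the same key can both hold a non-zero value for the same day, a numeric comparison might be safer. A minimal sketch, assuming every non-empty cell holds a valid integer (cell_value and larger_cell are made-up names, not part of the code above):

```cpp
#include <string>

// Hypothetical helper: parse a cell as an integer, treating an empty cell as 0.
// Assumes non-empty cells hold valid integers; std::stol throws otherwise.
long cell_value(const std::string& cell)
{
    return cell.empty() ? 0L : std::stol(cell);
}

// Return whichever of the two cells holds the larger number.
// When the values are equal, the first cell is returned.
std::string larger_cell(const std::string& a, const std::string& b)
{
    return cell_value(a) < cell_value(b) ? b : a;
}
```

Inside combine, the line a[i] = std::max( a[i], b[i] ) could then become a[i] = larger_cell( a[i], b[i] ), but only for the value columns; the two key columns are names, not numbers, and would make std::stol throw.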