Split a big dat file into small files

Hi,
I have a large dat file like this:

# a b
# c d e f
1 2 3 4
5 6 7 8
# g h
# i j k l
9 10 11
33 6 7

I need to split this file into two files, based on the information that comes after the first "#".
Like these:

# a b
# c d e f
1 2 3 4
5 6 7 8

and

# g h
# i j k l
9 10 11
33 6 7

I want to use this information for the file names.
I would appreciate it if someone could help me.

Do you have a clear idea of exactly what logic triggers a new file?
It looks as simple as this:
Read a line of the original file and use it as the file name (or the first 2 lines, or the first N, or all the rows starting with #?).
Keep reading the original file and writing to the new file until you find a line that triggers a break.
On a break, close the new file, make the next file name, and repeat until the original file is exhausted.
Do you need to protect yourself from two files getting the same name because of the data?
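
On the duplicate-name question, here is a minimal sketch of one way to guard against it. The helper name unique_name and the _2, _3 suffix scheme are assumptions made for illustration, not something the original poster asked for. It sticks to pre-C++11 syntax so it should also build on older compilers.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: returns `base` the first time it is requested and
// base_2, base_3, ... on later requests. Every name handed out is
// recorded in `used` so it is never returned twice.
std::string unique_name(const std::string& base, std::set<std::string>& used) {
    std::string candidate = base;
    // insert() reports through .second whether the name was new.
    for (int n = 2; !used.insert(candidate).second; ++n) {
        std::ostringstream next;
        next << base << '_' << n;
        candidate = next.str();
    }
    return candidate;
}
```

The caller keeps one std::set<std::string> for the whole run and passes every proposed file name through the helper before opening the file.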

Each new file should start with two # lines and run until the next pair of # lines. In other words, every new file should begin with two lines starting with #. For the example above, the new files should be named a_b and g_h.
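
The naming rule described here (first header line "# a b" becomes a_b) could be sketched as a small function. The name name_from_header is hypothetical, and collapsing repeated spaces or tabs into a single underscore is an assumption beyond what was stated.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: turns a header line such as "# a b" into "a_b".
// Leading '#' characters and whitespace are skipped, and the remaining
// words are joined with underscores.
std::string name_from_header(const std::string& header) {
    std::string::size_type start = header.find_first_not_of("# \t");
    if (start == std::string::npos) return std::string(); // nothing after '#'
    std::istringstream words(header.substr(start));
    std::string word, name;
    while (words >> word) {            // >> splits on any run of whitespace
        if (!name.empty()) name += '_';
        name += word;
    }
    return name;
}
```

An empty result means the header carried no usable name, so the caller would need some fallback in that case.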
Something like this might work.

#include <iostream>
#include <sstream>
#include <fstream>
#include <string>

int main() {
    std::ifstream fin("input_file");
    std::ofstream fout;
    bool firstline = true;  // true when the next '#' line starts a new block
    std::string line;
    while (std::getline(fin, line)) {
        if (line[0] == '#') {
            if (firstline) {
                firstline = false;
                std::string filename(line.substr(2));  // drop the leading "# "
                for (auto& ch: filename) if (ch == ' ') ch = '_';
                if (fout.is_open()) fout.close();
                fout.open(filename);
            }
            else
                firstline = true;  // second '#' line: the next one begins a new block
        }
        fout << line << '\n';
    }
}

Many thanks.

I get an error when running it: Error: illegal label name auto& ch: main.cxx:16:
The auto keyword was repurposed in C++11. Before that it was a storage-class specifier inherited from C.
https://en.cppreference.com/w/cpp/language/auto

If you have a reasonably modern GCC or Clang, then you need to specify something like
g++ -std=c++11 prog.cpp
cc1plus: error: unrecognized command line option ‘-std=c++11’
Provide more details about your setup.
It's going to be a long thread if we have to keep making guesses as to what you're seeing on screen and typing at your keyboard.

For example, what compiler are you actually using?
$ g++ --version
g++-5.real (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.



E.g.
$ cat baz.cpp
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>

int main() {
    std::ifstream fin("input_file");
    std::ofstream fout;
    bool firstline = true;
    std::string line;
    while (std::getline(fin, line)) {
        if (line[0] == '#') {
            if (firstline) {
                firstline = false;
                std::string filename(line.substr(2));
                for (auto& ch: filename) if (ch == ' ') ch = '_';
                if (fout.is_open()) fout.close();
                fout.open(filename);
            }
            else
                firstline = true;
        }
        fout << line << '\n';
    }
}

$ g++ baz.cpp
baz.cpp: In function ‘int main()’:
baz.cpp:16:28: error: ISO C++ forbids declaration of ‘ch’ with no type [-fpermissive]
                 for (auto& ch: filename) if (ch == ' ') ch = '_';
                            ^
baz.cpp:16:32: warning: range-based ‘for’ loops only available with -std=c++11 or -std=gnu++11
                 for (auto& ch: filename) if (ch == ' ') ch = '_';
                                ^
baz.cpp:18:35: error: no matching function for call to ‘std::basic_ofstream<char>::open(std::__cxx11::string&)’
                 fout.open(filename);
                                   ^
In file included from baz.cpp:3:0:
/usr/include/c++/5/fstream:799:7: note: candidate: void std::basic_ofstream<_CharT, _Traits>::open(const char*, std::ios_base::openmode) [with _CharT = char; _Traits = std::char_traits<char>; std::ios_base::openmode = std::_Ios_Openmode]
       open(const char* __s,
       ^
/usr/include/c++/5/fstream:799:7: note:   no known conversion for argument 1 from ‘std::__cxx11::string {aka std::__cxx11::basic_string<char>}’ to ‘const char*’

$ g++ -std=c++11 baz.cpp
$ g++ --version
g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Mmm, according to https://www.gnu.org/software/gcc/projects/cxx-status.html#cxx11
C++11 support was still somewhat experimental prior to GCC 4.7.

Perhaps -std=c++0x would work for your old compiler?


I suppose you could just re-write for (auto& ch: filename) using more traditional syntax.
Try g++ -std=c++0x


If that doesn't work:
Change fout.open(filename); to fout.open(filename.c_str());

Change for (auto& ch: filename) if (ch == ' ') ch = '_'; to
for (size_t i = 0; i < filename.length(); i++) if (filename[i] == ' ') filename[i] = '_';
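
Putting both changes together, a pre-C++11 version of the splitting loop might look like the sketch below. Wrapping it in a function called split_blocks is an assumption made for illustration, and so are the extra guards: lines that appear before the first # header are skipped, and a bare "#" line is handled instead of passed to substr(2).

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical wrapper around the loop from the earlier post, written
// without C++11 features. Returns the number of output files opened.
int split_blocks(const char* input_path) {
    std::ifstream fin(input_path);
    std::ofstream fout;
    bool firstline = true;  // true when the next '#' line starts a new block
    int files = 0;
    std::string line;
    while (std::getline(fin, line)) {
        if (!line.empty() && line[0] == '#') {
            if (firstline) {
                firstline = false;
                // Drop the leading "# "; a bare "#" yields an empty name.
                std::string filename = line.size() > 2 ? line.substr(2) : std::string();
                for (std::size_t i = 0; i < filename.length(); ++i)
                    if (filename[i] == ' ') filename[i] = '_';
                if (fout.is_open()) fout.close();
                fout.clear();                 // reset flags left by the previous file
                fout.open(filename.c_str());  // const char* overload exists before C++11
                if (fout.is_open()) ++files;
            } else {
                firstline = true;  // second '#' line of the pair
            }
        }
        if (fout.is_open()) fout << line << '\n';  // skip lines before any header
    }
    return files;
}
```

Calling split_blocks("input_file") from main on the sample data from the first post should create the files a_b and g_h in the current directory.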
Thanks, it works