List words of text one-by-one

Hello

I want to do something, but I'm not sure how to start.
If I have text in a text file,
example.txt:

The example programs of the previous sections provided little interaction with the user, if any at all. They simply printed simple values on screen, but the standard library provides many additional ways to interact with the user via its input/output features. This section will present a short introduction to some of the most useful.


I want to list all those words beneath each other in another file,

result.txt:

The
Example
Programs
Of
The
Previous
...


How would I do that?

Thanks for reading,
Niely
closed account (48T7M4Gy)
First you need to read the file in line by line, then tokenize it and capitalise the first letter of each word.

Tokenizer ... http://www.cplusplus.com/reference/cstring/strtok/?kw=strtok
Here is the code I used:

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using std::ifstream;
using std::ofstream;
using std::getline;
using std::string;
using std::vector;
using std::cin;

int main() {
	vector<string> lines;
	vector<string> newlines;
	string path = "C:\\Shared\\Test.txt";
	string newpath = "C:\\Shared\\Test2.txt";

	// Read the whole file, one line per element.
	// Testing getline() itself avoids the extra empty line that
	// looping on !file.eof() would add.
	ifstream file(path.c_str());
	string temp;
	while (getline(file, temp)) {
		lines.push_back(temp);
	}
	file.close();

	// Split every line (not just the first) at spaces.
	for (vector<string>::size_type x = 0; x < lines.size(); x++) {
		string rest = lines[x];
		string::size_type a = rest.find(' ');
		while (a != string::npos) {
			if (a > 0) {	// skip empty words caused by repeated spaces
				newlines.push_back(rest.substr(0, a));
			}
			rest = rest.substr(a + 1);
			a = rest.find(' ');
		}
		if (!rest.empty()) {	// last word on the line
			newlines.push_back(rest);
		}
	}

	// Write one word per line.
	ofstream newfile(newpath.c_str());
	for (vector<string>::size_type x = 0; x < newlines.size(); x++) {
		newfile << newlines[x] << '\n';
	}
	newfile.close();
	cin.get();	// keep the console window open
	return 0;
}


If you'd like me to explain what parts of it mean then I would be glad to.
@Kemort: Thanks a lot for your reply, helped a lot!
But can you explain this code in a bit more detail, please? Also the %s etcetera.
Do you realize that if all you want to do is write a new file with every word from an existing file on a new line, you could do this in one loop? Consider reading the file word by word, using the extraction operator>>, instead of reading an entire line.



closed account (48T7M4Gy)
Worth a try, jlb, but the problem might be commas, full stops, slashes etc. in the original text. strtok overcomes this problem if the word (token) delimiter is not just whitespace.
Yes, strtok() may be a better solution in a C program, or in a C++ program that uses C-strings. But if this is a C++ program, std::string should be used and strtok() avoided, since std::string has member functions for parsing lines of text.

And if you look at the first post you will see that the primary delimiter seems to be the space character. That any punctuation must also be removed has not been stated by the OP, but this can be handled separately.
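
For what it's worth, handling punctuation separately could look roughly like the sketch below. The helper name strip_punctuation is made up for illustration (it is not from any post in this thread), and it assumes plain single-byte text and a C++11 compiler for the lambda.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not from the thread): returns a copy of `word`
// with every punctuation character removed.
std::string strip_punctuation(std::string word) {
    word.erase(std::remove_if(word.begin(), word.end(),
                              // unsigned char keeps std::ispunct well-defined
                              [](unsigned char c) { return std::ispunct(c) != 0; }),
               word.end());
    return word;
}
```

You would call it on each word right after reading it with operator>>, and skip the word if the result is empty. Note that it removes punctuation rather than splitting on it, so "input/output" becomes "inputoutput"; if that should be two words, a delimiter-based tokenizer is the better fit.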

Do you realize that if all you want to do is write a new file with every word from an existing file on a new line, you could do this in one loop? Consider reading the file word by word, using the extraction operator>>, instead of reading an entire line.


How would you do that?

Also, can someone explain this code line-by-line?
It's exactly what I want:
#include <stdio.h>
#include <string.h>
#include <iostream>

using namespace std;
int main ()
{
  char str[] ="- This, a sample string test; word-1 word-2 under_line, . j ^ _ ° _ù⁼ test.";
  char * pch;
  printf("Splitting string %s \n",str);
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}
What language are you using to write your program? C? C++?

^C++.
I also need to know how to hook a text file up to that piece of code.
That code is essentially C.

You have two things that should be kept separate: the echo and the true nature of the streams.

Let's do echo:
void echo( std::istream & in, std::ostream & out ) {
  std::string word;
  while ( /*read from in into word*/ ) {
    /*write word into out*/
    out << '\n';
  }
}

The code that you should put there instead of the comments uses operator>> and operator<<.

Why have a fancy function? What to do with it?
int main( int argc, char* argv[] ) {
  if ( 2 <= argc ) {
    std::ifstream fin( argv[1] );
    echo( fin, std::cout );
  }
  else {
    echo( std::cin, std::cout );
  }
  return 0;
}

What does that do? If you give at least one command-line argument when running the program, the first argument is used as the name of the input file. If run without arguments, std::cin is used. In both cases the output goes to std::cout.

Therefore, either of these would be ok:
a.out < example.txt > result.txt
a.out example.txt > result.txt

You could, obviously, go one step further and make the program use two arguments, one for input and the other for output filename.

The above code depends on
#include <iostream>
#include <fstream>
#include <string> 
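
The two-argument variant mentioned above might be sketched like this. The helper echo_files is a made-up name for illustration, and the echo body is one possible way of filling in the placeholders.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// One possible completion of echo: one word per output line.
void echo( std::istream & in, std::ostream & out ) {
  std::string word;
  while ( in >> word ) {
    out << word << '\n';
  }
}

// Hypothetical helper: opens both files by name and echoes words between them.
// Returns false if either file cannot be opened.
bool echo_files( const std::string & in_name, const std::string & out_name ) {
  std::ifstream fin( in_name.c_str() );
  if ( !fin ) return false;    // checked first, so a bad input name
                               // does not create or truncate the output file
  std::ofstream fout( out_name.c_str() );
  if ( !fout ) return false;
  echo( fin, fout );
  return true;
}
```

main would then call echo_files( argv[1], argv[2] ) when argc is at least 3, and fall back to the one-argument or no-argument behaviour otherwise.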

Sorry, but I didn't understand that.
How does C++ have echo?

I just want simple C++ code that takes the content of one text file and puts it, word by word, into another...
Just keep it simple;
closed account (48T7M4Gy)
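#include <stdio.h>
#include <string.h>

const int MAX_LINE_LENGTH = 100;

int main()
{
	char line[ MAX_LINE_LENGTH ];
	const char separators[] = "?!. ,\t\n";
	char* token;

	// const is required: a string literal cannot be bound to a plain char* in C++11
	const char* format = "+%s+ ";

	FILE* source = fopen( "data.txt", "r" );
	if( source == NULL )
	{
		printf( "cannot open data.txt\n" );
		return 1;
	}

	FILE* destination = fopen( "output.txt", "w" );
	if( destination == NULL )
	{
		printf( "cannot open output.txt\n" );
		fclose( source );
		return 1;
	}

	// Note: a line longer than MAX_LINE_LENGTH - 1 characters is read in
	// several pieces, so a word that straddles a piece boundary gets split.
	while( fgets( line, MAX_LINE_LENGTH, source ) != NULL )
	{
		// first call takes the buffer, later calls take NULL to continue
		token = strtok( line, separators );
		while( token != NULL )
		{
			printf( format, token );
			fprintf( destination, format, token );
			token = strtok( NULL, separators );
		}
	}
	printf("\n +++ ENDS +++\n");

	fclose( source );
	fclose( destination );

	return 0;
}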


With a bit of luck it should be self-explanatory: read each line, separate out each word by setting a pointer called 'token', process the token, then move along the line and then down the file.

This does something very close to what you want. Sure it uses C-strings but C++ according to Stroustrup is an extension of C not an alternative. But, sure, if you like do it in purist mode. :-)

I just want simple C++ code that takes the content of one text file and puts it, word by word, into another...
Just keep it simple;


So what have you tried? You may want to study the following tutorial: http://www.cplusplus.com/doc/tutorial/files/
It should explain basic C++ file IO.

Just keep it simple;

Boring, rigid, not reusable, but whatever:
#include <iostream>
#include <fstream>
#include <string>

void echo( std::istream & in, std::ostream & out ) {
  std::string word;
  while ( /*read from in into word*/ ) {
    /*write word into out*/
    out << '\n';
  }
}

int main() {
  std::ifstream fin( "example.txt" );
  std::ofstream fout( "result.txt" );
  echo( fin, fout );
  return 0;
}

All you need to do is fix lines 7 and 8 of the code (the two placeholder comments in the loop).
This does something very close to what you want. Sure it uses C-strings but C++ according to Stroustrup is an extension of C not an alternative. But, sure, if you like do it in purist mode. :-)

Love to see that quote, but I suspect it doesn't exist.

A relevant quote might be from Stroustrup's faq:
I have never seen a program that could be expressed better in C than in C++ (and I don't think such a program could exist - every construct in C has an obvious C++ equivalent).

From: http://www.stroustrup.com/bs_faq.html#difference

A C++ solution might look like this:
#include <vector>
#include <string>
#include <iostream>
#include <fstream>
#include <sstream>

std::vector<std::string> tokenize(const std::string& s, const std::string& delim)
{
    std::vector<std::string> result;

    std::size_t prev = 0, next = 0;

    while ((next = s.find_first_of(delim, prev)) != std::string::npos)
    {
        if (std::size_t length = next - prev)
            result.push_back(s.substr(prev, length));

        prev = next + 1;
    }

    if (prev != s.size())
        result.push_back(s.substr(prev));

    return result;
}

int main()
{
    const char* in_file = "example.txt";
    const char* out_file = "result.txt";

    std::ifstream in(in_file);
    if (in.is_open())
    {
        std::ostringstream os;
        os << in.rdbuf();

        std::vector<std::string> tokens = tokenize(os.str(), " \n\t,./\\-!");

        std::ofstream out(out_file);
        if (out.is_open())
        {
            for (auto& token : tokens)
                out << token << '\n';

            //If no C++11 support for ranged for loops use the following loop instead:
            //for (std::vector<std::string>::iterator it = tokens.begin(); it != tokens.end(); ++it)
            //    out << *it << '\n';
        }
        else
            std::cerr << "Unable to open file " << out_file << " for output.\n";
    }
    else
        std::cerr << "Unable to open file " << in_file << " for input.\n";
}
My solution is short!
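#include <cctype>
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main() {
    string s;
    ifstream in ("example.txt");
    ofstream out ("result.txt");

    if(in.is_open() && out.is_open())
      {
        // operator>> skips whitespace, so each pass reads exactly one word
        while(in >> s)
            {
             // capitalise the first letter; the cast keeps toupper well-defined
             s[0] = static_cast<char>(toupper(static_cast<unsigned char>(s[0])));
             out << s << '\n';
            }
        in.close();
        out.close();
      }
    else cout << "Unable to open file.\n";

    return 0;
}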
^Thanks a lot! :)
Works like a charm and I fully understand it.

Thanks everyone!!!