File reading

Hi!

I have a problem with file reading. In particular, I have to find certain keywords in a text file and then perform different operations.
Here is an example. Consider the file:

keyword_1 word {
keyword_2 name_1,name_2
keyword_3 name_3
}

keyword_4 : name_3 keyword_5 name_2

I don't know how to read this file: should I read it line by line, character by character, or is a hybrid approach possible?
If I read line by line, how can I separate the different words of a line? How can I ignore ' ' (blank space)?
The best way is to read token-by-token.

Assuming your punctuators are always single-character, maybe something like:

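#include <cctype>
#include <fstream>
#include <iostream>
#include <string>

// The casts to unsigned char avoid undefined behaviour for negative char values.
bool myisspace(char ch) { return std::isspace(static_cast<unsigned char>(ch)) != 0; }

// Punctuation that ends a word; '_' is treated as part of a word.
bool myispunct(char ch) {
    return std::ispunct(static_cast<unsigned char>(ch)) && ch != '_';
}

// Reads the next token (a word or a single punctuator) into token.
// Returns false when there are no tokens left.
bool get_token(std::istream& in, std::string& token) {
    token.clear();
    char ch;
    while (in.get(ch) && myisspace(ch)) ;   // skip leading whitespace
    if (!in) return false;                  // end of input: ch holds no new character
    token += ch;
    if (!myispunct(ch))                     // a word: read up to a space or punctuator
        for ( ; in.get(ch); token += ch)
            if (myisspace(ch) || myispunct(ch)) {
                if (myispunct(ch)) in.putback(ch);  // keep it for the next call
                break;
            }
    return true;
}

int main() {
    std::ifstream fin("input_file");
    std::string token;
    while (get_token(fin, token))
        std::cout << '[' << token << "]\n";
}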


[keyword_1]
[word]
[{]
[keyword_2]
[name_1]
[,]
[name_2]
[keyword_3]
[name_3]
[}]
[keyword_4]
[:]
[name_3]
[keyword_5]
[name_2]

Sorry, can you tell me what this program does, line by line?
I'm having difficulty understanding it at all.
can you tell me what this program does, line by line?

No.
Hello ema897, no need to report a post if you're not happy about someone's answer, it's not really a dislike button.

That being said, what dutch wrote for you is exactly what you asked for:
If i read line by line, how can I separate different words of a line ? How can I ignore ' ' (blank space)?
He's extracting every token (a 'word' or a punctuation character) from the file; tokens are separated either by punctuation or by blank spaces.
I just reported you for being "lazy".
You're in trouble now!

Seriously, though, if you read through carefully and OF COURSE look up any functions that you don't understand, you should be able to figure out how it works. Why should I exert myself if you won't?
H00G0: Sorry, but I did not report dutch. He helped me, so why would I report him?
Moreover, I only saw this post 5 minutes ago, after I had asked for an explanation of the code. When would I have reported it?
dutch: I'm not lazy, I just asked for help because I tried to understand it by myself and couldn't, for example the function myispunct.
But it's ok, sorry if I hurt you.
That might have been someone else reporting dutch then, in which case it's all a big misunderstanding...
Refer to the reference section of this site at:
http://www.cplusplus.com/reference/cctype/ispunct/?kw=ispunct

ispunct is a (dreaded C) function that checks if a character is a punctuation character
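
Since the myispunct helper from the earlier post caused confusion, here is a minimal sketch of what it is for. In the default "C" locale the underscore counts as punctuation, so plain ispunct would split keyword_1 into pieces; the helper makes an exception for it. The names is_separator_punct and classify are hypothetical and exist only for this illustration.

```cpp
#include <cctype>
#include <string>

// Same idea as myispunct in the earlier post: punctuation, except '_'.
// The cast to unsigned char avoids undefined behaviour for negative chars.
bool is_separator_punct(char ch) {
    return std::ispunct(static_cast<unsigned char>(ch)) && ch != '_';
}

// Hypothetical helper for illustration: names the class of a character.
std::string classify(char ch) {
    if (std::isspace(static_cast<unsigned char>(ch))) return "space";
    if (is_separator_punct(ch)) return "punctuator";
    return "word character";
}
```

With this, classify('{') and classify(',') report a punctuator, while classify('_') and classify('a') report a word character.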

Another function that's useful is strtok() - I suggest you look it up - there is a sample program and you can modify the separators (delimiters) as required.

Reading line by line is a good move. But there again you can read the whole file in as a string and analyse that.
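
Both approaches can be sketched roughly as follows. This is only an illustration under assumptions: the function names words_line_by_line and slurp are made up here, and the split is on whitespace only, so name_1,name_2 stays a single word unless the comma is handled separately.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads the stream line by line and splits each line on whitespace.
// operator>> skips any run of blanks, so no manual space handling is needed.
std::vector<std::string> words_line_by_line(std::istream& in) {
    std::vector<std::string> words;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string word;
        while (fields >> word) words.push_back(word);
    }
    return words;
}

// Alternative: read the whole stream into one string for later analysis.
std::string slurp(std::istream& in) {
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

Either function can be handed a std::ifstream, since it is also a std::istream.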

If you stay with C functions then one you could easily need is strcmp() - look that up too because it enables you to check whether a token (read: word) is equal to the one you are looking for. (With C++ std::string from <string>, just use a == b.)
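
The two ways of comparing a token with a keyword look roughly like this. The function names is_keyword_c and is_keyword_cpp are hypothetical, chosen only for this sketch. Note that using == on two char pointers compares addresses, not contents, which is why the C version needs strcmp().

```cpp
#include <cstring>
#include <string>

// C style: strcmp returns 0 when the two null-terminated strings are equal.
bool is_keyword_c(const char* token, const char* keyword) {
    return std::strcmp(token, keyword) == 0;
}

// C++ style: std::string compares contents with ==.
bool is_keyword_cpp(const std::string& token, const std::string& keyword) {
    return token == keyword;
}
```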

File input/output is another area - http://www.cplusplus.com/doc/tutorial/files/

(C++ does more or less the same but the reference material isn't quite so handy.)

But it's ok, sorry if I hurt you.

No harm done.
"Reporting" is basically a no-op (i.e., it does nothing).
Don't worry about it.
I modified your code a little, because I don't need the function myispunct().
I have a class with a _my_file attribute:
bool myClass :: getToken(string& token) {
    token.clear();
    char ch;
    //_my_file.open(_filename.c_str(), ios::in);
    while (_my_file.get(ch) && isspace(ch)); 
    if (_my_file) word += ch;
    if (!ispunct(ch)) 
        for ( ; _my_file.get(ch); word += ch)
            if (isspace(ch) || ispunct(ch)) { 
                if (ispunct(ch)) _my_file.putback(ch);
                break;
            }
     return word.size() !=0;
}

int main() {
    string token;
    myClass object;
    object.setFileName(); // receives a string as input and sets _filename, a class member

    while (object.getToken(token)){
        cout << '[' << token << "]" <<endl;
    }
    return 0;
}


It gives me no output, but if I uncomment _my_file.open(…), it gives me only the first word of the text file.
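
One likely explanation, sketched under assumptions: with the open() call commented out the stream is never opened, so every read fails. With it inside getToken(), the first call opens the file and reads one token, but the second call invokes open() on a stream that is already open, which fails and sets failbit, so nothing more is read. The function also appends to word while the caller passes token. A minimal sketch that opens the file once might look like this; TokenReader, open, isSpace and isPunct are hypothetical names, not the original class.

```cpp
#include <cctype>
#include <fstream>
#include <string>

// Hypothetical stand-in for the class in the post above; only the parts
// needed to show the idea are included.
class TokenReader {
public:
    // Open the file exactly once. Calling open() again on a stream that is
    // already open fails and sets failbit, so later reads would stop.
    bool open(const std::string& filename) {
        _my_file.open(filename.c_str());
        return _my_file.is_open();
    }

    bool getToken(std::string& token) {
        token.clear();
        char ch;
        while (_my_file.get(ch) && isSpace(ch)) ;   // skip whitespace
        if (!_my_file) return false;                // nothing left to read
        token += ch;                                // append to the parameter
        if (!isPunct(ch))
            while (_my_file.get(ch)) {
                if (isSpace(ch)) break;
                if (isPunct(ch)) { _my_file.putback(ch); break; }
                token += ch;
            }
        return true;
    }

private:
    static bool isSpace(char ch) { return std::isspace(static_cast<unsigned char>(ch)) != 0; }
    static bool isPunct(char ch) { return std::ispunct(static_cast<unsigned char>(ch)) != 0; }
    std::ifstream _my_file;
};
```

Be aware that plain ispunct treats '_' as punctuation, so with this version keyword_1 comes out as three tokens (keyword, _ and 1). That is the reason the original answer used myispunct.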

Here is another possibility, to especially demonstrate the use of the reference material and tutorials which form part of this site.

strtok() and a few other functions are easily found there.

Most of the code below is a direct liftout/copy of the examples.

The only job still to be done is filter out which words are keywords and which aren't. As a starting point strcmp() might be useful.

BTW The purists will say this uses C-style strings (each line is read into a std::string and copied into a char array via the c_str() function) and is therefore treasonous. But ignore that: strtok() conveniently accepts multiple delimiters, which saves a lot of stuffing around with streams and strings. Fill your boots with those if you want to, though.

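#include <cstring>   // strncpy, strtok
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main ()
{
    string line;
    char buffer[200];
    char * pch;
    
    ifstream myfile ("keyword_search.txt");
    if (myfile.is_open())
    {
        while ( getline (myfile, line ) )
        {
            // strtok modifies its input, so work on a writable copy.
            // Lines longer than 199 characters are truncated, not overflowed.
            strncpy( buffer, line.c_str(), sizeof buffer - 1 );
            buffer[sizeof buffer - 1] = '\0';

            // Split on space, comma, full stop and hyphen.
            pch = strtok (buffer," ,.-");
            
            while (pch != NULL)
            {
                cout << pch << '\n';
                pch = strtok (NULL, " ,.-");   // NULL continues with the same buffer
            }
        }
        myfile.close();
    }
    else
        cout << "Unable to open file\n";
    
    return 0;
}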


Please don't forget to use the reference and tutorial material.
PS

This is the output of unfiltered words I get on my machine:

keyword_1
word
{
keyword_2
name_1
name_2
keyword_3
name_3
}
keyword_4
:
name_3
keyword_5
name_2
Program ended with exit code: 0
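
The remaining step mentioned earlier, deciding which of these words are keywords, could be sketched as follows. This is only one possible approach, using a std::set lookup instead of repeated strcmp() calls. The keyword names are taken from the example file in the first post, and the function names keywords, is_keyword and keywords_only are hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Keywords from the example file in the first post; adjust as needed.
const std::set<std::string>& keywords() {
    static const std::set<std::string> table = {
        "keyword_1", "keyword_2", "keyword_3", "keyword_4", "keyword_5"};
    return table;
}

bool is_keyword(const std::string& token) {
    return keywords().count(token) != 0;
}

// Keeps only the tokens that are keywords, in their original order.
std::vector<std::string> keywords_only(const std::vector<std::string>& tokens) {
    std::vector<std::string> found;
    for (const std::string& token : tokens)
        if (is_keyword(token)) found.push_back(token);
    return found;
}
```

Once is_keyword() reports a match, the calling code can branch on the token and perform whatever operation belongs to that keyword.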