Finding words in a file

I have a function buildTree that reads text input (contained in the file named in argv[1]). I open the file and read it character by character; if there is a newline ("if (token == '\n')"), I keep track of the line number and store it in a vector to access it later. Next, it should break the input into a sequence of words (using any character other than a digit or an alphabetical symbol as the terminator). This is where I'm getting an error. I am trying to add each character to a string and then, when the token is a digit or an alphabetical symbol, push the string into a vector so I can access it later. Is my logic right? And can you also help with my error when pushing each word into a vector?

Sorry if this is confusing.

BinarySearchTree buildTree (char *argv[]){

    ifstream file;
    vector<char *> V;
    int line = 0;
    vector<int> LineNumber;
    file.open(argv[1],ios::in);
    
    char token;
    string word[] = {};


    if (file.is_open()){
        token = file.get();//reads the next character from a stream
        if (token == '\n')
            line++;
        LineNumber.push_back(line);
        while (token!= ' ' || '0' || '1' || '2' || '3' || '4' || '5' ||'6' || '7' || '8' || '9'){
        //while character is not space, digit, or non-alphabetic character
            word += token;//adds character to string array *error here
        }
        V.push_back(word);//adds word to vector *error here
    }
}
Is my logic right?

Not exactly.

If the only place a character is read is line 14 (token = file.get();), do you think that line should occur within a loop of some sort? Or do you just want to process one character?

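Line 18 (the while condition) is incorrect. In token != ' ' || '0' || '1' || ..., only the space is compared against token; each remaining operand, such as '0', is a nonzero character constant and is therefore true on its own. The condition could be rewritten equivalently as while(true), which you will notice is an infinite loop. It should be: while (token != ' ' && token != '0' && token != '1' && ...)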
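
As an alternative to spelling out every comparison, the test could be wrapped in a small function. This is only a sketch: is_word_char is a hypothetical name, and it assumes the rule stated in the question (a word consists of letters and digits, anything else terminates it).

```cpp
#include <cctype>

// The original condition parses as (token != ' ') || '0' || '1' || ...
// and is always true, because '0' is a nonzero value.
// Hypothetical replacement: true only for letters and digits.
bool is_word_char( char token )
{
    // The cast avoids undefined behaviour when char is signed and
    // token holds a negative value.
    return std::isalnum( static_cast<unsigned char>(token) ) != 0 ;
}
```

A loop written as while( is_word_char(token) ) then stops at a space, a newline, or punctuation, provided the loop body also reads the next character.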

If line 22 (V.push_back(word);) is supposed to push a word into a vector, do you think that line should occur within a loop of some sort? Or do you just want to process one word?

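As to the error, a string is not a char*, nor is it implicitly convertible to one. I would suggest changing line 4 to vector<string> V; Also note that line 10 (string word[] = {};) declares an array of strings rather than a single string, so word += token on line 20 cannot compile; string word; is probably what was intended.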
#include <iostream>
#include <cctype>
#include <string>
#include <vector>
#include <utility>
#include <fstream>

// breaks a line into a sequence of words
// (using any character other than a digit or an alphabetical symbol as the terminator)
std::vector<std::string> words_in( const std::string& line )
{
    std::vector<std::string> result ;
    std::string word ;

    for( char c : line )
    {
        // http://en.cppreference.com/w/cpp/string/byte/isalnum
        // (cast to unsigned char: the behaviour is undefined for negative char values)
        if( std::isalnum( static_cast<unsigned char>(c) ) ) word += c ;

        else
        {
            if( !word.empty() )
            {
                result.push_back(word) ;
                word.clear() ;
            }
        }
    }

    if( !word.empty() ) result.push_back(word) ; // the line may end in the middle of a word
    return result ;
}

// std::pair< int, std::vector<std::string> > => pair( line number, words in that line )
// http://en.cppreference.com/w/cpp/utility/pair
std::vector< std::pair< int, std::vector<std::string> > > lines_and_words_in( std::string path_to_file )
{
    std::vector< std::pair< int, std::vector<std::string> > > result ;

    std::ifstream file(path_to_file) ;
    int line_number = 0 ; // int, to match the first member of the pair
    std::string line ;

    while( std::getline( file, line ) ) // http://en.cppreference.com/w/cpp/string/basic_string/getline
    {
        ++line_number ;
        auto words = words_in(line) ; // not const, so that it can be moved from
        // http://en.cppreference.com/w/cpp/container/vector/emplace_back
        if( !words.empty() ) result.emplace_back( line_number, std::move(words) ) ;
    }

    return result ;
}

int main()
{
    for( const auto& pair : lines_and_words_in( __FILE__ ) )
    {
        std::cout << "line number " << pair.first << "\n\twords: " ;
        for( const std::string& word : pair.second ) std::cout << word << ' ' ; // avoid copying each word
        std::cout << "\n\n" ;
    }
}

http://coliru.stacked-crooked.com/a/9e0c82ebb238a4d2