Making GREP with C++

Good day everyone. I have been trying to write a cross-platform grep using C++11 and the Boost libraries.

Here is what I have so far: http://ideone.com/taP2U3 .

Now for my issues. I am planning to accept up to five flags as command-line arguments: -i for case-insensitive matching; -V to invert the match, so that only lines not matching the regular expression are selected; -X for an exact (whole-line) match; -L to output the names of files with no match; and -l to print the names of files with matches. Some grep documentation: http://unixhelp.ed.ac.uk/CGI/man-cgi?grep

Since a regex object has to be initialized with all its flags, I have to know beforehand which flags to assign, which is not a problem. The problem is: once I know which flags to assign, how do I assign them all? See below:

// Count how often each flag was given. Requires <string>.
unsigned short iCount=0, VCount=0, XCount=0, LCount=0, lCount=0;
// Assumption: the last two arguments are the pattern and the path,
// so the flags are argv[1] .. argv[argc-3].
for(int count=1; count<argc-2; count++)
{
    const std::string arg = argv[count]; // compare the text, not the char pointers
    if(arg=="-i")
    {
        iCount++;
        continue;
    }
    if(arg=="-V")
    {
        VCount++;
        continue;
    }
    if(arg=="-X")
    {
        XCount++;
        continue;
    }
    if(arg=="-L")
    {
        LCount++;
        continue;
    }
    if(arg=="-l")
    {
        lCount++;
        continue;
    }
}
//Here is the problem: I don't know how to assign the flags. The .assign() function overwrites
//the previous regular expression and flags, so I have no clue how to combine them.


Also, does anyone know how to iterate over all the files in a directory? The pattern "C:\Users\admin\Desktop\*.txt" means: search all .txt files on the desktop. I have already separated the root from the rest of it. How do I now iterate over the entire directory, so that I can match the provided extension against each file?

Thanks in advance
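
On the flags question: the syntax options form a bitmask, so one way is to accumulate them in a variable with |= and construct (or assign) the regex once, after all arguments have been read. The sketch below assumes std::regex from C++11; boost::regex has a flag_type and an icase constant of the same names, but this was only written against the standard library. The names grep_options, make_regex and line_selected are made up for illustration, and only -i, -V and -X are covered.

```cpp
#include <regex>
#include <string>

// Hypothetical option set mirroring three of the flags described in the post.
struct grep_options
{
    bool ignore_case = false;  // -i
    bool invert      = false;  // -V
    bool exact_line  = false;  // -X
};

// Build the regex once, after all flags are known.
std::regex make_regex(const std::string& pattern, const grep_options& opts)
{
    std::regex::flag_type flags = std::regex::ECMAScript;  // the default grammar
    if (opts.ignore_case) flags |= std::regex::icase;      // further options can be OR-ed in the same way
    return std::regex(pattern, flags);
}

// -X needs the whole line to match (regex_match); otherwise any substring will do (regex_search).
// -V then flips the result.
bool line_selected(const std::string& line, const std::regex& re, const grep_options& opts)
{
    const bool matched = opts.exact_line ? std::regex_match(line, re)
                                         : std::regex_search(line, re);
    return matched != opts.invert;
}
```

A loop over the lines of a file would then only need to call line_selected for each line; -l and -L could be decided per file from whether any line was selected.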
While we wait for the filesystem TS to become widely available
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4100.pdf

we can use boost::filesystem.
http://www.boost.org/doc/libs/1_57_0/libs/filesystem/doc/index.htm

#include <iostream>
#include <string>
#include <vector>
#include <boost/filesystem.hpp>

namespace fs = boost::filesystem ;

// Returns the absolute path strings of 'path' itself and, if it is a directory, of its contents.
// With recursive == true, subdirectories are descended into as well.
std::vector<std::string> files_in_directory( fs::path path = ".", bool recursive = true )
{
    std::vector<std::string> files ;

    try
    {
        if( fs::exists(path) ) files.push_back( fs::system_complete(path).string() ) ;

        if( fs::is_directory(path) )
        {
            using iterator = fs::directory_iterator ;
            for( iterator iter(path) ; iter != iterator() ; ++iter )
            {
                if( recursive )
                {
                    // the recursive call already lists iter->path() itself as its first entry,
                    // so it is not pushed separately here (that would duplicate it)
                    for( std::string& p : files_in_directory( iter->path(), true ) )
                        files.push_back( std::move(p) ) ;
                }
                else files.push_back( fs::system_complete( iter->path() ).string() ) ;
            }
        }
    }
    catch( const std::exception& ) { /* error: return what was collected so far */ }

    return files ;
}

int main()
{
    // print every entry below /usr/local/share, one path per line
    for( const std::string& path_str : files_in_directory( "/usr/local/share", true ) )
        std::cout << path_str << '\n' ;
}

http://coliru.stacked-crooked.com/a/f236e9b7687d5409
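
To narrow the returned vector down to a wildcard such as *.txt, one option is to compare the extension of each path string. This is a minimal sketch using only std::string, so it can be applied to the output of files_in_directory without further Boost calls. extension_of and filter_by_extension are hypothetical helpers; the comparison is case-sensitive, and a name that starts with its only dot (like .bashrc) is treated as having no extension, which is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: extension of the last path component, including the dot ("" if none).
std::string extension_of(const std::string& path)
{
    const std::size_t sep = path.find_last_of("/\\");   // accept both kinds of separator
    const std::size_t name_begin = (sep == std::string::npos) ? 0 : sep + 1;
    const std::size_t dot = path.find_last_of('.');
    // No dot at all, a dot that belongs to a parent directory, or a name starting with the dot.
    if (dot == std::string::npos || dot <= name_begin) return "";
    return path.substr(dot);
}

// Hypothetical helper: keeps the paths whose extension equals ext (for example ".txt").
std::vector<std::string> filter_by_extension(const std::vector<std::string>& paths,
                                             const std::string& ext)
{
    assert(!ext.empty() && ext[0] == '.');   // the extension is expected with its leading dot
    std::vector<std::string> result;
    std::copy_if(paths.begin(), paths.end(), std::back_inserter(result),
                 [&ext](const std::string& p) { return extension_of(p) == ext; });
    return result;
}
```

Something like filter_by_extension( files_in_directory( dir, false ), ".txt" ) would then give the candidates to search. A directory that happens to be named x.txt would pass this filter too, so a check such as fs::is_regular_file may still be wanted before opening each path.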
@JLBorges, thank you for the help. I never fully got to grips with the filesystem library; I am very new to Boost, about 4 days or so. But I have a question.

Why make recursive a parameter? Does that mean that if recursive were passed as false, only the first file in the directory would be placed in the vector?

Also, in main you have:

// I am not sure what this condition means
for( const std::string& path_str : files_in_directory( "/usr/local/share", true ) )
    std::cout << path_str << '\n' ;


Where is the returned vector stored?
> does that mean that if recursive was passed false, only the first file in the directory would be placed in the vector?

If recursive is false, the returned vector contains the path strings of the directory itself and of its top-level contents only.

If /A is a directory, which contains B (file), C (directory) and D (link),
files_in_directory( "/A", false ) would return a vector with "/A", "/A/B", "/A/C" and "/A/D"

If the directory /A/C contains files E and F,
files_in_directory( "/A", true ) would return a vector with "/A", "/A/B", "/A/C", "/A/C/E", "/A/C/F" and "/A/D"
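
For comparison, the same two listings can be sketched with std::filesystem, which is what the filesystem TS linked earlier became in C++17. This assumes a C++17 compiler, so it is not an option under a strict C++11 requirement. list_entries is a hypothetical name; unlike files_in_directory it only handles directories, returns the paths as given rather than made absolute, lets errors propagate as exceptions, and sorts the result because directory iteration order is unspecified.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <vector>

namespace stdfs = std::filesystem;

// Hypothetical helper: the directory itself plus its contents, top level only or recursively.
std::vector<std::string> list_entries(const stdfs::path& dir, bool recursive)
{
    std::vector<std::string> entries;
    if (!stdfs::is_directory(dir)) return entries;   // missing path or not a directory: nothing to list

    entries.push_back(dir.string());                 // the directory itself comes first
    if (recursive)
    {
        // descends into subdirectories on its own, no explicit recursion needed
        for (const stdfs::directory_entry& e : stdfs::recursive_directory_iterator(dir))
            entries.push_back(e.path().string());
    }
    else
    {
        for (const stdfs::directory_entry& e : stdfs::directory_iterator(dir))
            entries.push_back(e.path().string());
    }
    std::sort(entries.begin(), entries.end());       // iteration order is unspecified
    return entries;
}
```

With the /A layout described above, list_entries( "/A", false ) should give the four top-level strings and list_entries( "/A", true ) the six that include E and F.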


> Where is the returned vector stored?

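The function call is a prvalue expression. The range-based for loop binds the resulting temporary vector to a hidden reference, which keeps it alive until the loop ends; it is not stored in any named variable.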

To use it as an lvalue:
const auto returned_paths = files_in_directory( "/usr/local/share", true )  ; // 'store' it

std::cout << returned_paths.size() << " entries were returned.\n" ; // use it

for( const std::string& path_str : returned_paths ) { /*...*/ }
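
Put side by side, the two forms look like this. This is a small sketch with a stand-in function make_paths (hypothetical, it only returns three fixed strings) in place of files_in_directory, so that it runs without Boost. In the first function the temporary vector lives exactly as long as the loop; in the second it is given a name and can be used again afterwards.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for files_in_directory: returns a vector by value.
std::vector<std::string> make_paths()
{
    return { "/A", "/A/B", "/A/C" };
}

// Iterates directly over the returned temporary.
std::size_t total_length_of_temporary()
{
    std::size_t total = 0;
    // the temporary vector is kept alive by the loop until the loop finishes
    for (const std::string& p : make_paths()) total += p.size();
    return total;   // the temporary no longer exists at this point
}

// Stores the returned vector in a named variable first.
std::size_t total_length_of_named()
{
    const auto paths = make_paths();   // 'store' it
    std::size_t total = 0;
    for (const std::string& p : paths) total += p.size();
    return total + 0 * paths.size();   // paths is still usable here, e.g. for its size
}
```

Both functions compute the same sum; the only difference is whether the vector can be referred to after the loop.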
@JLBorges, thank you, it was very helpful.