How to use regex_search with dir->path().extension()

I am trying to write a program that scans a folder and lists all the C++ files when the user passes 'c' as argv[1].

For some reason I can't use dir->path().extension() with regex_search on line 84:

if (regex_search(dir->path().extension(), extensions))



  /* 





ressource: http://en.cppreference.com/w/cpp/fileysystem

*/

# include <iostream>
# include <locale>
# include <fstream>
# include <regex>

using namespace std;

#include <filesystem>
using namespace  std::experimental::filesystem;

void scan(path const& folder);

//function to scan the current folder and its subfolders

void rscan(path const& folder);

int main(int argc, char* argv[])
{
	// path stores its text as std::basic_string<value_type>; value_type is wchar_t on Windows
	// and char on POSIX systems

	// 1. create a path object that begins with the current folder
	// "." represents the current folder

	path  current = argv[2];
	
	string switch = "";

	switch = argv[1]; 
		 


	cout << "normal scan" << endl;

	scan(current);

	cout << "recursive scan" << endl;

	rscan(current);

	//cout << relative_path << endl; 

	//scan(relative_path); 


	system("pause");
}


// function to scan the current folder 

void scan( string switch , path const& folder)
{


	cout << "\n Scanning current folder: \n";
	directory_iterator dir(folder);   // points to the beginning of the folder
	directory_iterator  end;  // points to the end



	while (dir != end)
	{
		cout << dir->path();

		if (is_directory(dir->status()))
		{


			if (switch == 'c')
			{
				regex extensions("\\.(cpp|c|h|hpp$");
				if (regex_search(dir->path().extension(), extensions))
				{
					cout << "[dir]";

					cout << "ext  - " << dir->path().extension() << endl;

					cout << "filename =  " << dir->path().filename() << endl;


				}		

		}
		++dir;


	}
	cout << endl;

}



void rscan2(path const& folder)
{
	cout << "\n Scanning current folder: \n";
	recursive_directory_iterator dir(folder);   // points to the beginning of the folder
	recursive_directory_iterator  end;  // points to the end



	while (dir != end)
	{
		cout << dir->path();

		if (is_directory(dir->status()))
		{

			cout << "[dir]";

			cout << "ext  - " << dir->path().extension() << endl;

			cout << "filename =  " << dir->path().filename() << endl;

		}
		++dir;


	}
	cout << endl;

}



void rscan(path const& f)
{
	cout << "\nScanning current folder and its subfolders now:\n";
	cout << "Recursive scan, Version 1:\n";

	//create a recursive directory iterator passing it the folder object
	//dir points to the first directory in the folder, the root of the search
	recursive_directory_iterator dir(f);

	//Create another recursive directory iterator passing it no value
	//end points to the end point of the search, end of any folder
	recursive_directory_iterator end;

	//let's go into the folder
	while (dir != end)
	{
		//Print path, if dir or not, and ext, and filename		
		cout << dir->path();

		if (is_directory(dir->status()))
		{
			cout << " [dir]";
		}
		else
		{
			cout << "";
		}

		cout << " ext = " << dir->path().extension() << endl;
		cout << " filename = " << dir->path().filename() << endl;

		++dir;
	}
}


What do you mean by "i can't use dir->path.extension() with regex_search on line 84"? If you're getting an error message, what does it say?

In any case, as written, this program contains an invalid regex: the group opened by '(' is never closed.
regex extensions("\\.(cpp|c|h|hpp$");
gcc/libstdc++ throws std::regex_error saying "Parenthesis is not closed."
clang/libc++ throws std::regex_error saying "The expression contained mismatched ( and )."
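
A minimal sketch of the failure and one possible fix. It assumes the intent is to accept an extension ending in .cpp, .c, .h or .hpp; the names compiles, broken_pattern, fixed_pattern and is_cpp_extension are hypothetical and do not appear in the original program.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if 'pattern' compiles as an ECMAScript regex.
bool compiles(const std::string& pattern)
{
    try {
        std::regex re(pattern);   // throws std::regex_error on a malformed pattern
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

// The pattern from the original post: the group opened by '(' is never closed.
const std::string broken_pattern = "\\.(cpp|c|h|hpp$";

// One possible correction (an assumption about the intent): close the group
// and keep the '$' anchor outside it.
const std::string fixed_pattern = "\\.(cpp|c|h|hpp)$";

// True if 'extension' (for example ".cpp") is accepted by the corrected pattern.
bool is_cpp_extension(const std::string& extension)
{
    static const std::regex re(fixed_pattern);
    return std::regex_search(extension, re);
}
```

Since the regex constructor reports a malformed pattern by throwing, constructing the regex inside a try block (or once, as a static object) keeps the failure in one easy-to-find place.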

I fixed the error by converting the path to a u8string, but my code still has lots of other errors.

regex extensions("\\.(cpp|c|h|hpp$");

string ppath = dir->path().extension().u8string();


if (regex_search(ppath, extensions))
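
Why the conversion helps, as a hedged sketch: std::regex_search is a function template whose overloads take a std::basic_string, a character pointer or an iterator pair, and template argument deduction does not apply a path's conversions, so passing a path directly finds no matching overload. The example below assumes C++17's std::filesystem instead of the std::experimental::filesystem used in this thread, uses a pattern with the parenthesis closed, and the helper name has_cpp_extension is hypothetical.

```cpp
#include <filesystem>
#include <regex>
#include <string>

namespace fs = std::filesystem;   // assumption: C++17; the thread uses std::experimental::filesystem

// Hypothetical helper: true if the extension of 'p' is .cpp, .c, .h or .hpp.
bool has_cpp_extension(const fs::path& p)
{
    static const std::regex extensions("\\.(cpp|c|h|hpp)$");

    // Convert explicitly: regex_search accepts a std::string, not a path.
    const std::string ext = p.extension().string();
    return std::regex_search(ext, extensions);
}
```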



For example, on line 38:

string switch = "";
I am getting "expected an identifier".
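
A likely cause of that message: switch is a reserved keyword in C++, so it cannot be used as a variable name. The sketch below renames the variable to mode (a hypothetical name) and compares it with the string "c", because std::string has no operator== that takes a single char such as 'c'. The function name wants_cpp_listing is also hypothetical.

```cpp
#include <string>

// 'switch' is a reserved keyword, so the variable needs another name;
// 'mode' is a hypothetical choice.
// Returns true when the first command-line argument asks for the C++ file listing.
bool wants_cpp_listing(int argc, const char* const argv[])
{
    if (argc < 2) return false;          // no switch supplied on the command line
    const std::string mode = argv[1];
    return mode == "c";                  // compare with the string "c", not the char 'c'
}
```

Checking argc before reading argv[1] also avoids undefined behaviour when the program is started without arguments.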
Something like this, perhaps:

#include <iostream>
#include <string>
#include <vector>
#include <experimental/filesystem>
#include <regex>
#include <type_traits>

namespace fs = std::experimental::filesystem ;

// list of paths of all files under the directory 'dir' whose extension matches the regex
// file_list<true> searches recursively into sub-directories; file_list<false> searches only the specified directory
template < bool RECURSIVE > std::vector<fs::path> file_list( fs::path dir, std::regex ext_pattern )
{
    std::vector<fs::path> result ;

    // 'typename' is needed before C++20 because ::type depends on the template parameter RECURSIVE
    using iterator = typename std::conditional< RECURSIVE, 
                                                fs::recursive_directory_iterator, fs::directory_iterator >::type ;

    const iterator end ;
    for( iterator iter { dir } ; iter != end ; ++iter )
    {
        const std::string extension = iter->path().extension().string() ;
        if( fs::is_regular_file(*iter) && std::regex_match( extension, ext_pattern ) ) result.push_back( *iter ) ;
    }
    
    return result ;
}

// literal '.' followed by one of "cpp", "cc", "cxx", "h", "hh", "hpp" or "hxx"
// note: ?: indicates that it is a non-capturing group
static const std::regex cpp_files( "\\.(?:cpp|cc|cxx|h|hh|hpp|hxx)" ) ;

// non-recursive scan for c++ files: if dir is omitted, current directory is scanned 
std::vector<fs::path> scan_cpp_files( fs::path dir = "." ) { return file_list<false>( dir, cpp_files ) ; }

// recursive scan for c++ files: if dir is omitted, current directory is scanned 
std::vector<fs::path> rscan_cpp_files( fs::path dir = "." ) { return file_list<true>( dir, cpp_files ) ; }


Usage example:

int main()
{
    const fs::path win_dir = "C:/Windows/" ; 
    
    // print files with extension ".log" or ".ini" in the directory win_dir (non-recursive)
    for( const auto& file_path : file_list<false>( win_dir, std::regex( "\\.(?:log|ini)" ) ) )
         std::cout << file_path << '\n' ;
    std::cout << "\n----------------------------------------------\n" ;

    // print files with extension ".log" or ".ini" in the directory "C:/Windows/Logs" (recursive)
    for( const auto& file_path : file_list<true>( win_dir/"Logs", std::regex( "\\.(?:log|ini)" ) ) )
        std::cout << file_path << '\n' ;
    std::cout << "\n----------------------------------------------\n" ;

    // print all c++ files in the current directory (non-recursive)
    for( const auto& file_path : scan_cpp_files() )
        std::cout << file_path << '\n' ;
    std::cout << "\n----------------------------------------------\n" ;

    const fs::path vc_crt_src = "C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/crt/src" ;
    // print all c++ files in the directory vc_crt_src (recursive)
    for( const auto& file_path : rscan_cpp_files(vc_crt_src) )
        std::cout << file_path.parent_path().stem() / file_path.filename() << '\n' ;
    std::cout << "\n----------------------------------------------\n" ;
}

http://rextester.com/SAY51073
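
A note on why the patterns above need no '$' anchor: std::regex_match succeeds only when the entire string matches, whereas std::regex_search succeeds when any substring matches. The following sketch illustrates the difference with a shortened pattern; the names ext_pattern, whole_match and partial_match are hypothetical.

```cpp
#include <regex>
#include <string>

// Same style of pattern as cpp_files: no '^' or '$' anchors.
static const std::regex ext_pattern("\\.(?:cpp|h)");

// regex_match succeeds only if the whole string matches the pattern.
bool whole_match(const std::string& ext) { return std::regex_match(ext, ext_pattern); }

// regex_search succeeds if any substring matches the pattern.
bool partial_match(const std::string& ext) { return std::regex_search(ext, ext_pattern); }
```

With this pattern, an extension such as ".hpp" is rejected by whole_match but accepted by partial_match, because ".h" occurs at the start of it. That is why a pattern used with regex_search needs explicit anchors to behave the same way.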


for( iterator iter { dir } ; iter != end ; ++iter )
{
    const std::string extension = iter->path().extension().string() ;
    if( fs::is_regular_file(*iter) && std::regex_match( extension, ext_pattern ) ) result.push_back( *iter ) ;
}


How were you able to dereference the iterator iter with "*", as in *iter? I have defined a similar iterator using

recursive_directory_iterator dir(f);

but I cannot dereference the iterator "dir" using *dir.

In my debugger I am getting: no operator "*" matches these operands. I wish I could include a picture to show this.
> i have defined a similar iterator using
> recursive_directory_iterator dir(f);
> but i cannot dereference the iterator "dir" using *dir.

I am not able to reproduce the error; there should be no error, because any iterator that has not reached the end can be dereferenced.

int main()
{
    namespace fs = std::experimental::filesystem ;

    const fs::path f = "C:/Windows/Logs" ; 
    const std::regex re( "\\.(?:log|ini)" ) ;

    fs::recursive_directory_iterator dir(f);
    fs::recursive_directory_iterator end;

    for( ; dir != end ; ++dir )
    {
        const std::string extension = dir->path().extension().string() ;
        
        if( fs::is_regular_file( *dir ) && std::regex_match( extension, re ) ) 
            std::cout << *dir << '\n' ;
    }
}

http://rextester.com/BXDS86094


> I wish i can include a picture to show this.

You could try out the code on rextester (Visual Studio): http://rextester.com/l/cpp_online_compiler_visual
and then post the link.

You may want to consider changing the name of the iterator from 'dir' to something else, say 'dir_iter'.
I am just curious: what do we get if we dereference a recursive_directory_iterator? The path of the file?
The value type of the filesystem library iterators is directory_entry:
http://en.cppreference.com/w/cpp/filesystem/directory_entry

When we dereference the iterator, we get a const directory_entry&:
http://en.cppreference.com/w/cpp/filesystem/recursive_directory_iterator/operator*

Code like is_regular_file(*iter) or std::cout << *iter, which expects a path, works
because directory_entry has an implicit conversion to const path&:
http://en.cppreference.com/w/cpp/filesystem/directory_entry/path