Complex boost regex does not match string

I have the following simple program that aims to replace every line starting with --, optionally preceded by whitespace, with a single space.

/*
 *
 * Read text files and remove all lines starting with -- or <space>*--
 * Clean text is passed to cout.
 * 
 * Compile and test:
 * clear && clang++ -Wall -std=c++11 comment_regex.cpp -lboost_regex -o comment_regex && ./comment_regex tst.sql
 */


#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <boost/regex.hpp>
#include <boost/algorithm/string/replace.hpp>

int main(int argc, char *argv[]) {

	// Read file to stringstream
	std::ifstream file( argv[1] );

	if ( file )
	{
		std::stringstream buffer;

		buffer << file.rdbuf();

		file.close();

		// Create a string variable to apply boost::regex
		std::string readText;
		readText = buffer.str();

		// Create flags for regex matching
		// https://searchcode.com/file/35899467/libs/regex/example/snippets/regex_search_example.cpp

		// Define regex to match lines with comments
		static const boost::regex re_comment("^([[:space:]]{1,}|)--.*(\\n|\\r|)", boost::regex::extended);

		// Replace via regex replace
		std::string result = boost::regex_replace(readText, re_comment, " ");

		// Show clean text
		std::cout << "\nClean text:\n" << result << std::endl;

		return 0;
	}
	
}


The utilised regex:
 
^(\s*|)-{2}.*(\n|\r|\z)

works when tested on regex101 (https://regex101.com/r/MOx6t4/8). However, the compiled program prints an empty string.
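In many regex engines ^ and $ do not match at line breaks by default; they match only the beginning and end of the entire string. (Boost.Regex's Perl syntax treats them as line anchors unless no_mod_m is set, but matching the line breaks yourself works either way.)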

Match whatever begins and ends your lines in groups separate from the line content. Then put those captured line ends back in your substitution expression.

Also, be careful how you match stuff.

 • “{1,}” is the same as “+”: it matches one or more of the preceding item.
   If I understood you correctly, you indicated leading spaces are optional?
   Then “\s*” (zero or more) says that directly.

 • “.*” is greedy: it will match to the end of the file if it can
   (in Boost.Regex “.” matches newlines by default).
   Either be specific about what you match (instead of any character),
   or, in Perl-style syntax, turn off greedy matching by adding a question mark: “.*?”.
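To make the greedy point concrete, here is a minimal sketch. It uses std::regex instead of Boost so that it builds without extra libraries, which is a swap on my part and not what the original program uses. In std::regex's default ECMAScript syntax "." does not match a newline, so "(.|\n)" stands in for Boost's default "any character". The helper first_match and the sample text are made up for illustration.

```cpp
#include <regex>
#include <string>

// Returns the first substring of `text` matched by `pattern`,
// or an empty string when nothing matches. (Hypothetical helper.)
std::string first_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);  // ECMAScript syntax by default
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str() : std::string();
}

// Hypothetical sample input: one comment line between two ordinary lines.
const std::string sample = "keep\n-- c1\nalso keep\n";

// "Any character, newlines included", repeated greedily:
// the match runs from the comment to the end of the text.
std::string greedy() { return first_match(sample, "--(.|\\n)*"); }

// Specific: anything except a newline, so the match stops at the end of the line.
std::string specific() { return first_match(sample, "--[^\\n]*"); }

// Lazy: "*?" takes as little as possible before the first line break.
std::string lazy() { return first_match(sample, "--(.|\\n)*?\\n"); }
```

With this sample, greedy() swallows the comment and everything after it, while specific() and lazy() stop at the end of the comment line (lazy() also includes the line break).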

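		// Define regex to match comment lines together with their surrounding line breaks:
		// group 1 = start of text or preceding line break, group 3 = following line break or end
		static const boost::regex re_comment("(^|\n)(\\s*--[^\n]*)(\n|$)", boost::regex::extended);

		// Replace the comment with a space, putting the captured line breaks back
		std::string result = boost::regex_replace(readText, re_comment, "\\1 \\3");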
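For anyone who wants to try the idea without linking Boost, here is a rough sketch of the same approach with std::regex. This is a variant, not the code above: because "[^\n]*" never consumes the trailing line break, only the leading one has to be captured and written back. The function name strip_comment_lines is made up, and the replacement uses ECMAScript's "$1" where Boost's format string above uses "\\1".

```cpp
#include <regex>
#include <string>

// Blanks out every line that starts with optional spaces/tabs followed by "--".
// (Hypothetical helper, written with std::regex rather than Boost.)
std::string strip_comment_lines(const std::string& text) {
    // Group 1 is the start of the text or the line break before the comment.
    // [ \t]* allows leading indentation without swallowing earlier blank lines;
    // [^\n]* stops at the end of the line, so the trailing line break is untouched.
    static const std::regex re_comment("(^|\\n)[ \\t]*--[^\\n]*");

    // "$1" writes the captured line break back; the comment becomes one space.
    return std::regex_replace(text, re_comment, "$1 ");
}
```

Calling strip_comment_lines(readText) in place of the regex_replace call should leave ordinary lines alone, including lines with a trailing "-- ..." after real code, since the pattern only starts at the beginning of the text or right after a line break.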

Hope this helps.
Duthomhas, your solution works as advertised. Thank you very much.