Multiple matches from a regex


Hi,

I use the Boost libraries because the gcc 4.7.2 on my Debian doesn't support the latest C++11 standard.

My question is: why can't I retrieve all the matches of a regex in a string? (I have read a lot about tokens but don't get it; I started coding in C++ 3 days ago.)


    typedef std::istreambuf_iterator<char> iter;
    string c;
    std::ifstream input_file("myfile.txt");

    iter file_begin(input_file);
    iter file_end;

    static const boost::regex first_regex("(tommy=^[\"](\w)*[\"]$)*");
    boost::smatch str_matches;
    for (iter i = file_begin; i != file_end; ++i)
        c += *i;

    if (boost::regex_search(c, str_matches, first_regex))
    {
        cout << "ok";
    }

Here is the content of myfile.txt:

tommy="hello";tommy="byebye";

I should get two "ok", but I only get one...

Why?

Thanks,

Larry

Larry2 (34)

This might help:

    std::string text("coucou='abc' coucou abf lol abd");
    boost::regex regex("coucou=[']*(ab[cz])[']*.");

    boost::sregex_token_iterator iter(text.begin(), text.end(), regex, 0);
    boost::sregex_token_iterator end;

    for(; iter != end; ++iter ) {
        std::cout<<regex<<'\n';
    }

But I cannot retrieve only the capture...

cire (8284)

#include <string>
#include <iostream>
#include <boost/regex.hpp>

int main()
{
    std::string text("coucou='abc' coucou abf lol abd");
    boost::regex regex("coucou=[']*(ab[cz])[']*.");

    boost::sregex_token_iterator iter(text.begin(), text.end(), regex, 0);
    boost::sregex_token_iterator end;

    for(; iter != end; ++iter ) {
        std::cout << *iter << '\n' ;
        // std::cout<<regex<<'\n';
    }
}

In your first post, you call regex_search only once, and each call reports at most one match. Thus, you can only get one "ok".

Larry2 (34)

Thanks cire,

The cout << regex was a typo of mine, too much code :)

I am now sharpening my sword on a real-world example: some simple HTML parsing.

<img src=\"myfirst123\"/><img src=\"mysec567\"/>

My code:

    boost::regex e("\<img src\=(?=.*\"([a-zA-Z])*([0-9])*\")(?!.*(/.*>))");

    string input = "<img src=\"myfirst123\"/><img src=\"mysec567\"/>";

    boost::match_results<std::string::const_iterator> what;
    boost::regex_search(input, what, e);

    if (what[0].matched)
    {
        cout << what.suffix();
    }
    else
        cout << "nope";

I am trying to extract only myfirst123 and mysec567.

My code prints everything from myfirst123 onward, namely:

"myfirst123"/><img src="mysec567"/>

The first one is there, but obviously the second one is not isolated.

When I replace what.suffix() with what[1] or what[2], I only get "7"!

This is becoming so difficult to manage...

Any help ?

Larry

cire (8284)

#include <string>
#include <iostream>
#include <boost/regex.hpp>

int main()
{
    try {
        boost::regex exp( "<img src=(\"[[:alnum:]]*\")/>" ) ;

        std::string input = "<img src=\"myfirst123\"/><img src=\"mysec567\"/>";

        boost::match_results<std::string::const_iterator> what;

        std::string::const_iterator start = input.begin() ;

        while ( boost::regex_search(start, input.cend(), what, exp) )
        {
            std::cout << "Sub-match : " << what[1] << " found in full match: " << what[0] <<  '\n' ;
            start = what[0].second ;
        }
    }
    catch ( boost::bad_expression & ex )
    {
        std::cout << ex.what() ;
    }
}

Larry2 (34)

Wonderful, cire!

I had to replace input.cend() with a variable named end, declared as

std::string::const_iterator end = input.end() ;

to make it compile with my gcc version on Debian (C++11 not enabled).

Many thanks for sharing your knowledge,

Larry
