Searching a MultiMap with Regex

closed account (18hRX9L8)
Hello all,

I have a multimap with over 300k entries defined like so:
std::multimap<std::string, std::string> filedata;

Using the following code and std::multimap::equal_range, I am able to successfully search for a word in the multimap and get needed data:
// Data with strings.
data = std::vector<std::string>();
// Get iterators to matched pairs.
std::pair <std::multimap<std::string, std::string>::iterator, std::multimap<std::string, std::string>::iterator> dat = filedata.equal_range(word);
// Go through each matched pair and get needed info.
for (std::multimap<std::string, std::string>::iterator iter = dat.first; iter != dat.second; iter++) {
    data.push_back(iter->second);
}


Now I would like to search the multimap using regular expressions (e.g. std::regex("\\b[a-z][a-e]h\\b")). What is the fastest way to do this? Hypothetically, the code might look like:
std::pair <std::multimap<std::string, std::string>::iterator, std::multimap<std::string, std::string>::iterator> dat = filedata.equal_range_with_regex(std::regex("\\b" + word + "\\b"));

Pseudo-code or a description of an algorithm will be enough.

Thank you so much,
~Usandfriends
By fastest, do you mean getting the work done fast or performance? Whatever people post, you know you'll have to profile to confirm what is or isn't the fastest approach in terms of performance. Can you do it on your own at all? If not, then first you need to do it somehow. Then worry about optimizing it.

By the way, typedef is your friend and mine too. It'll make your example code much easier to read.
typedef std::multimap<std::string, std::string> TStrStrMM;
std::pair<TStrStrMM::iterator, TStrStrMM::iterator> dat;


I don't really see how equal_range could work with a regex, to be honest with you. A regex match is not the same thing as equivalence between two strings, and the algorithm is designed to work on a sorted range, whereas the keys that match a regex are not necessarily adjacent in that order.
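To illustrate the point, here is a small sketch, assuming C++11 <regex>. The helper name match_flags is made up for this example. It reports, for each key in sorted order, whether the whole key matches the pattern:

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

typedef std::multimap<std::string, std::string> TStrStrMM;

// Hypothetical helper (not part of the standard library):
// returns one flag per entry, in the multimap's sorted key order,
// set to true when the whole key matches the pattern.
std::vector<bool> match_flags(const TStrStrMM& m, const std::regex& re) {
    std::vector<bool> flags;
    for (TStrStrMM::const_iterator it = m.begin(); it != m.end(); ++it) {
        flags.push_back(std::regex_match(it->first, re));
    }
    return flags;
}
```

With the keys "aah", "abc" and "bah" and the pattern "[a-z][a-e]h", the first and last keys match but the one between them does not. The matches do not form one contiguous range, so a single pair of iterators cannot describe them.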
closed account (18hRX9L8)
Hi kempofighter,

I meant getting the work done fast (chrono). Thank you for your input and suggestions! Like you said, equal_range was not really useful in this case, so I was forced to do it the old-fashioned, brute-force way: iterate through the whole multimap and use std::regex_match to check each pair...
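A minimal sketch of that brute-force scan, assuming C++11 <regex> and the typedef suggested above. The function name find_by_regex is illustrative and not taken from the actual program:

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

typedef std::multimap<std::string, std::string> TStrStrMM;

// Illustrative helper (hypothetical name): collects the mapped value of
// every entry whose whole key matches the pattern. This is a linear scan,
// so each of the multimap's keys is tested exactly once.
std::vector<std::string> find_by_regex(const TStrStrMM& filedata,
                                       const std::regex& re) {
    std::vector<std::string> data;
    for (TStrStrMM::const_iterator iter = filedata.begin();
         iter != filedata.end(); ++iter) {
        if (std::regex_match(iter->first, re)) {
            data.push_back(iter->second);
        }
    }
    return data;
}
```

The std::regex is passed in by reference so that it is constructed once, outside the loop. Building the regex object is usually far more expensive than a single match, so with 300k entries this is probably the first thing to check when profiling.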

Thank you for your time,
~Usandfriends