string.find method problem

hey guys. this segment of code tries to store only the input found inside the <body>stuff to be stored</body> part. see the comments below for the problem.
  string html_input;

    cout<<"Enter html input: ";
    getline(cin,html_input,'\n');
    int body_begin=html_input.find("<body>",0);///body_begin index to be used later
    cout<<"body_begin: "<<body_begin<<endl;
    int body_end=html_input.find("</body>",(body_begin));///body_begin acts as zero.why???
    cout<<"body_end: "<<body_end<<endl;
    string only_body=html_input.substr((body_begin+6),(body_end-6));
    cout<<only_body;


if my input is:
 
<html><body>games</body></html>///  "games</body" is printed which apparently shouldn't happen 


in the code above, while finding body_end, body_begin seems to act as if it were 0. do the
arguments passed to the .find method need to be known at compile time?
any help would be appreciated!
It doesn't act as zero unless it already is zero. In fact, you don't need an argument there at all if you don't want to supply one. When you look at the parameters for std::string::find, you will notice that the last parameter has a default value of zero. This means that if you don't supply a different value, find will start searching for the string at index zero.

In your code, body_begin has the value 6 so the find function will start at index 6 to find the next string you are looking for.
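To make the point about the start position concrete, here is a minimal sketch. The helper name find_body_end is made up for illustration and is not part of the original code. It shows that the second argument to find only tells it where to begin looking; the index it returns is still counted from the start of the string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): returns the index of the
// first "</body>" found at or after the first "<body>", or std::string::npos
// if either tag is missing.
std::size_t find_body_end(const std::string& html)
{
    const std::size_t body_begin = html.find("<body>");
    if (body_begin == std::string::npos)
        return std::string::npos;

    // The second argument only skips the characters before body_begin.
    // The result is an absolute index into html, not an offset from body_begin.
    return html.find("</body>", body_begin);
}
```

For "<html><body>games</body></html>" this returns 17, the same value you get when searching from index 0.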

For the substr method, the last parameter is actually asking for the number of characters to read starting at the position specified in the first parameter.
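Since substr takes a start position and a count rather than a start and an end, one way to avoid the mix-up is a small wrapper that converts a [first, last) index pair into a count. The name slice is an assumption made for this sketch; std::string has no such member.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies the characters of s in the half-open
// index range [first, last). Returns an empty string for an invalid range.
std::string slice(const std::string& s, std::size_t first, std::size_t last)
{
    if (first > s.size() || last < first)
        return std::string();

    // substr(pos, count): the second argument is a length, not an end index.
    return s.substr(first, last - first);
}
```

With the input from the question, slice(html_input, 12, 17) gives "games", while html_input.substr(12, 11) reads 11 characters starting at index 12 and gives "games</body".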

Not sure why std::string methods are defined in such a weird way, but at least now you know.

http://cpp.sh/9ydo
In your code, body_begin has the value 6 so the find function will start at index 6 to find the next string you are looking for.

that is what i want the program to do but it is not working that way.


actually i usually used your approach of finding </body> from index 0, which worked as well. but i was thinking of searching for </body> starting from the index of <body>, since that was already found and it would avoid searching through the string from the beginning again, particularly if the string is quite big and <body> first appears towards the end of the html_input string.

why isn't the original code working?


One of the problems I see with the code is the use of an int to hold the return value from std::string::find(). This function returns a size_t (std::string::size_type), which can usually hold a value much larger than an int. You also seem to be unaware that on "failure" this function returns std::string::npos, which is the largest value that can be held in a size_t. You should also ensure that your body_begin is less than the length of the string. And don't forget that in your substr() call, to get the number of characters you need, you must also take into account where the substring starts.

string only_body = html_input.substr((body_begin + 6),(body_end - 6 - body_begin));
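Putting the points above together (size_t for positions, checking for npos, and computing a length for substr), the extraction could be sketched as below. The function name extract_body and the bool-plus-output-parameter interface are choices made for this example only, and the sketch assumes the tags appear literally as "<body>" and "</body>" with no attributes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: stores the text between the first "<body>" and the
// following "</body>" in out. Returns false if either tag is not found.
bool extract_body(const std::string& html, std::string& out)
{
    const std::string open_tag = "<body>";
    const std::string close_tag = "</body>";

    const std::size_t open_pos = html.find(open_tag);
    if (open_pos == std::string::npos)
        return false;  // no opening tag

    // First character after the opening tag.
    const std::size_t content_begin = open_pos + open_tag.size();

    // Search onward from the content; the result is still an absolute index.
    const std::size_t close_pos = html.find(close_tag, content_begin);
    if (close_pos == std::string::npos)
        return false;  // no closing tag

    // substr wants a length, so subtract the start from the end.
    out = html.substr(content_begin, close_pos - content_begin);
    return true;
}
```

Using open_tag.size() instead of a hard-coded 6 keeps the arithmetic in one place, so the start offset and the length cannot drift apart.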

@jlb changed the 'int' to 'size_t'. still didn't work.

size_t body_begin=html_input.find("<body>",0);///changed int to size_t
size_t body_end=html_input.find("</body>",(body_begin));///changed int to size_t. still didn't work!

It works for me:

#include <iostream>
#include <string>

using namespace std;

int main()
{
   string html_input = "<html><body>games</body></html>";

   size_t body_begin = html_input.find("<body>", 0);///body_begin index to be used later
   if(body_begin == string::npos)
   { // find() returns npos on failure; npos + 6 would wrap around, so test for npos directly.
      cerr << "Error \"<body>\" not found.\n";
      return(1);
   }
   size_t body_end = html_input.find("</body>", (body_begin));
   string only_body = html_input.substr((body_begin+6), (body_end - 6 - body_begin));  //// Notice this change!

   cout << html_input << endl;
   cout << "body_begin: " << body_begin << endl;
   cout << "body_end: " << body_end << endl;
   cout << only_body;

}


Output:

<html><body>games</body></html>
body_begin: 6
body_end: 17
games


a great misunderstanding! i initially thought that after using 'body_begin' as the 2nd argument in
size_t body_end = html_input.find("</body>", (body_begin));
body_end would contain the distance from body_begin, not the distance from zero (0).
thanks a great deal! :O