Get Images From a Web, VB13 (Best of how to include LibcURL, POCO, Google Image Search API)

I can answer that. A browser's user agent is a string, sent in the User-Agent header of every HTTP request, that identifies which browser is making the request. A web server can read the user agent to see what browser is trying to view the website.
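To show where that string actually travels, here is a rough sketch of a minimal HTTP/1.1 request with a User-Agent header in it. The helper name build_request is made up for illustration and is not part of curl or any other library. A real client sends more headers than this.

```c
#include <stdio.h>

/* Hypothetical helper (not a library function): writes a minimal
   HTTP/1.1 GET request for `host` into `buf`, carrying `user_agent`
   in the User-Agent header.
   Returns the number of characters written, or -1 if `buf` is too small. */
int build_request(char *buf, size_t size, const char *host, const char *user_agent)
{
    int n = snprintf(buf, size,
                     "GET / HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "User-Agent: %s\r\n"
                     "\r\n",
                     host, user_agent);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Calling build_request(buf, sizeof buf, "example.com", "curl/7.9.8") fills buf with a request whose third line is the User-Agent header. That header is the only place the server learns what the client claims to be.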

Here is the user agent for the latest version of Firefox:

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1

And here is the user agent for the latest version of Google Chrome:

Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36

And here is an example of the user agent sent by the curl command-line tool (libcurl on its own sends no User-Agent header unless you set one):

curl/7.9.8 (i686-pc-linux-gnu) libcurl 7.9.8 (OpenSSL 0.9.6b) (ipv6 enabled)

There is a lot of irrelevant stuff in there, but notice that each string names Firefox, Chrome, or curl at some point. Now, let's say you are a website owner, and you don't want any bots to gain access to your website. One way you could do this is to check the user agent of whoever is making an HTTP request, and deny the request if it is curl's user agent (or display a different page, or something else).

Now, let's say you, the bot programmer, need to get around this, and make your bot look like it's a normal browser. Curl can do this in one function call. Simply put this in your code...

curl_easy_setopt(curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1");

...and curl will now use Firefox's user agent! That website will now think you are trying to view it from Firefox, and it will show you the normal HTML, instead of denying the request, or showing you an alternate version for bots.

Be careful with this, however. Many websites do not tolerate user agent spoofing, and may ban your account (if you have one) or your IP address if they find out that you are spoofing your user agent. Others may not care. Still others won't display different webpages to different user agents, so there is no point in spoofing the user agent in the first place.
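For a feel of what the website owner's side might look like, here is a minimal sketch of such a check. The function name is_curl_user_agent is hypothetical, and matching on the "curl/" prefix is just an assumption about how a site might do it. Real sites often use more signals than the user agent.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical server-side check (an assumption, not any real site's code):
   returns true when the User-Agent value starts with "curl/".
   A missing header (NULL) is not treated as curl here. */
bool is_curl_user_agent(const char *user_agent)
{
    if (user_agent == NULL)
        return false;
    return strncmp(user_agent, "curl/", 5) == 0;
}
```

Fed the three user agents from above, this returns true only for the curl one. It also shows why spoofing works against a check like this: once the bot sends the Firefox string, the server has nothing left to match on.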

ALL OF THIS IS PROBABLY IRRELEVANT, HOWEVER. If I had to guess, you copied the code from this curl example: http://curl.haxx.se/libcurl/c/simple.html. That code will give you the HTML of this page: http://example.com/. This is not the same as the HTML of the iana.org page that you posted. I guess you clicked the "more information" button on example.com, and thought that page was what curl was looking at. It is not. You told curl to give you the HTML for example.com, and it did. If you go to example.com and view its source from your browser, you will see it is exactly the same as what the curl program prints out. There are no problems here, and there is no need to change your user agent. You were just looking at the wrong webpage in your browser.

Hope this helps. If you have any other questions about curl, feel free to ask. I have been struggling with this library for the past year, so I hope I can save someone else that hardship. It is a pain to get working, but once you do get it to work, it works really well.

kkhalaf (35)

Wyboth,

Impressive as always. My focus drifted a little far from libcURL during this research because of urgent incoming projects. But this page is all that anyone needs for working with libcURL. You have been the savior here and we are very thankful, truly. Hats off, sir.

Topic archived. No new replies allowed.
