Downloading a webpage as .html

Hello,

I have very little experience with network- or socket-related programming. I'm hoping someone can give me some resources to study.

There's a website I frequent which streams a daily .mp3 via jPlayer. A new .mp3 is uploaded every 24 hours, and the old one expires. Access to the stream requires an account, which I have.

The actual .mp3 files are hosted on CloudFront, but they're easy to download if you're using Chrome (right click > View page source): the direct URL to the file is easy to find.

I'd like to write a program which automates the process for me. This entails logging into the website and either downloading the webpage as an .html file (to extract the URL) or reading the page's source directly (the way Chrome apparently does) to find the file's location. Once it has the URL, the program should also download the file.
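
The URL-extraction step can be sketched without any networking code. This is a minimal sketch, assuming the page source contains the link as a plain absolute URL; `find_mp3_url` and its pattern are hypothetical and not taken from the site in question.

```cpp
#include <regex>
#include <string>

// Returns the first absolute http(s) URL containing ".mp3" found in `html`,
// or an empty string if there is none.
// ASSUMPTION: the link appears in the source as a plain URL delimited by
// quotes, whitespace, or angle brackets.
std::string find_mp3_url(const std::string& html) {
    static const std::regex pattern(
        R"(https?://[^\s"'<>]+\.mp3[^\s"'<>]*)", std::regex::icase);
    std::smatch match;
    if (std::regex_search(html, match, pattern)) {
        return match.str();
    }
    return std::string();
}
```

Any query string after `.mp3` is kept as part of the match. If the page escapes `&` as `&amp;`, the result would need unescaping before use.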

If it makes a difference, the login form posts to this:

<form method="post" action="https://[...].com/users/login_ajax" class="ajaxForm">

I've replaced the domain with [...].

I'm also using Windows.

Thanks for any help. I've heard of curl, not sure if that's what I need, or if there's a more lightweight solution. Will any of this require HTTP requests or cookies? I don't know much when it comes to this...

Duthomhas (13126)

Get libcurl.
http://curl.haxx.se/libcurl/

Then browse the examples.
http://curl.haxx.se/libcurl/c/example.html

Hope this helps.

xismn (961)

Hi Duoas, thanks for the help.

I'm using libcurl now and I'm almost done with the project. What a good recommendation! There is one final problem I can't seem to circumvent. Do you have a lot of experience with libcurl? If so, maybe you can help me further.

I'm able to log in to the website and find the .mp3's direct URL using my program, but I can't seem to "download" it. I can't find much documentation on this matter either because everyone seems to assume I want to download a webpage and parse the HTML source (which I don't want to do, I just want to download an .mp3).

How might I do this using libcurl?

Thanks.

Hippogriff (727)

I don't use libcurl, but these seem to be what you are after:

http://stackoverflow.com/questions/1636333/download-file-using-libcurl-in-c-c
http://stackoverflow.com/questions/3471122/saving-a-file-using-libcurl-in-c?lq=1

So it looks like a combination of CURLOPT_URL, CURLOPT_WRITEFUNCTION, and CURLOPT_WRITEDATA.

Lots more examples online.

xismn (961)

Hi James2250,

Unfortunately, the examples you provided are the same ones I exhausted first. I should have provided some of my code to begin with.
I'll tinker with it some more today, and post any results.

EDIT: it seems my problem specifically has to do with trying to download a file from CloudFront, something about authentication and the HTTPS protocol. Ugh...
