Get data from an Internet file into a char*

This is what I am trying to do now, using WinINet.
I have this function that downloads a file from the Internet. It works fine: it gets the bytes of the remote file and stores them in a local file.
What I want to do is modify it so that it puts the downloaded data into a char* variable instead of storing it in a file and returning a Boolean.

My function is
bool DownloadFile(char* szUrl, char* szPath) {
	HINTERNET hOpen = NULL;
	HINTERNET hFile = NULL;
	HANDLE hOut = NULL;
	char* lpBuffer = NULL;
	DWORD dwBytesRead = 0;
	DWORD dwBytesWritten = 0;

	hOpen = InternetOpen("MyAgent", NULL, NULL, NULL, NULL);
	if(!hOpen) return false;

	hFile = InternetOpenUrl(hOpen, szUrl, NULL, NULL, INTERNET_FLAG_RELOAD | INTERNET_FLAG_DONT_CACHE, NULL);
	if(!hFile) {
		InternetCloseHandle(hOpen);
		return false;
	}

	hOut = CreateFile(szPath, GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, CREATE_ALWAYS, NULL, NULL);
	if (hOut == INVALID_HANDLE_VALUE) {
		InternetCloseHandle(hFile);
		InternetCloseHandle(hOpen);
		return false;
	}

	do {
		lpBuffer = new char[2000];
		ZeroMemory(lpBuffer, 2000);
		InternetReadFile(hFile, (LPVOID)lpBuffer, 2000, &dwBytesRead);
		WriteFile(hOut, &lpBuffer[0], dwBytesRead, &dwBytesWritten, NULL);
		delete[] lpBuffer;
		lpBuffer = NULL;
	} while (dwBytesRead);

	CloseHandle(hOut);
	InternetCloseHandle(hFile);
	InternetCloseHandle(hOpen);
	return true;
}


Instead of storing the bytes into a local file I want it to return them as a char* (I suppose char* is used for handling bytes)

Something like this
char* DownloadBytes(char* szUrl) {
	HINTERNET hOpen = NULL;
	HINTERNET hFile = NULL;
	HANDLE hOut = NULL;
	char* lpBuffer = NULL;
	DWORD dwBytesRead = 0;
	DWORD dwBytesWritten = 0;

	hOpen = InternetOpen("MyAgent", NULL, NULL, NULL, NULL);
	if(!hOpen) return (char*)"";

	hFile = InternetOpenUrl(hOpen, szUrl, NULL, NULL, INTERNET_FLAG_RELOAD | INTERNET_FLAG_DONT_CACHE, NULL);
	if(!hFile) {
		InternetCloseHandle(hOpen);
		return (char*)"";
	}

	// hOut = CreateFile(szPath, GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, CREATE_ALWAYS, NULL, NULL);
	//if (hOut == INVALID_HANDLE_VALUE) {
	//	InternetCloseHandle(hFile);
	//	InternetCloseHandle(hOpen);
	//	return (char*)"";
	//}

	do {
		lpBuffer = new char[2000];
		ZeroMemory(lpBuffer, 2000);
		InternetReadFile(hFile, (LPVOID)lpBuffer, 2000, &dwBytesRead);
		//WriteFile(hOut, &lpBuffer[0], dwBytesRead, &dwBytesWritten, NULL);
		//delete[] lpBuffer;
		//lpBuffer = NULL;
	} while (dwBytesRead);

	//CloseHandle(hOut);
	InternetCloseHandle(hFile);
	InternetCloseHandle(hOpen);
	return lpBuffer;
}


I don't know how to make lpBuffer get the size of the downloaded information, store it and return it. I get an empty char as a result.
You use a fixed buffer for InternetReadFile(), and then you append that data to a larger buffer that you dynamically allocate. The simplest would be:

char *DownloadBytes(const char * const szUrl)
{
    //Pointer to dynamic buffer.
    char *data = 0;
    //Dynamic data size.
    DWORD dataSize = 0;
    ...
    do
    {
        char buffer[2000];
        InternetReadFile(hFile, (LPVOID) buffer, _countof(buffer), &dwBytesRead);
        //Allocate more space.
        char *tempData = new char[dataSize + dwBytesRead];
        //Copy the already-fetched data into the new buffer.
        memcpy(tempData, data, dataSize);
        //Now copy the new chunk of data.
        memcpy(tempData + dataSize, buffer, dwBytesRead);
        //Now update the permanent variables
        delete[] data;
        data = tempData;
        dataSize += dwBytesRead;
    } while (dwBytesRead);
    ...
    //You must also return the data size because the caller of the function needs to know that.
    //How?  Simplest modification would be to accept a DWORD parameter by reference.
    //I personally would create a simple struct:  struct Buffer { char *data; DWORD size; };
    return data;
}


That should do it.
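
The grow-and-copy idea above, together with the suggested struct for returning the size, might be sketched as follows. To keep it runnable without a network connection, the chunks come from a hypothetical in-memory `ChunkSource` instead of InternetReadFile(); `Buffer`, `ChunkSource`, `readChunk` and `readAll` are illustrative names and not part of WinINet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical result type: the bytes plus their count.
struct Buffer {
    char*       data;
    std::size_t size;
};

// Hypothetical stand-in for an open download handle.
struct ChunkSource {
    const char* bytes;
    std::size_t total;
    std::size_t pos;
};

// Mimics the InternetReadFile() pattern: fills 'out' with up to 'capacity'
// bytes and reports how many were read; 0 bytes read means end of data.
static bool readChunk(ChunkSource& src, char* out, std::size_t capacity,
                      std::size_t* bytesRead) {
    std::size_t remaining = src.total - src.pos;
    std::size_t n = remaining < capacity ? remaining : capacity;
    if (n) std::memcpy(out, src.bytes + src.pos, n);
    src.pos += n;
    *bytesRead = n;
    return true;
}

// Reads everything from 'src' into one heap buffer.
// The caller owns result.data and must delete[] it.
static Buffer readAll(ChunkSource& src) {
    Buffer result = { nullptr, 0 };
    char chunk[4];  // deliberately tiny so the loop runs several times
    std::size_t bytesRead = 0;
    while (readChunk(src, chunk, sizeof chunk, &bytesRead) && bytesRead > 0) {
        assert(bytesRead <= sizeof chunk);
        char* grown = new char[result.size + bytesRead];
        // Copy what was collected so far, then the new chunk right behind it.
        if (result.size) std::memcpy(grown, result.data, result.size);
        std::memcpy(grown + result.size, chunk, bytesRead);
        delete[] result.data;
        result.data = grown;
        result.size += bytesRead;
    }
    return result;
}
```

The caller then works with `result.size` rather than strlen(), and releases the memory with `delete[] result.data` when done.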
I'm getting only the first 3 bytes.

tempData gets the information correctly, I checked it. The problem is when I assign it to data*.

The problem is here:
        //Copy the already-fetched data into the new buffer.
        memcpy(tempData, data, dataSize);
        //Now copy the new chunk of data.
        memcpy(tempData + dataSize, buffer, dwBytesRead);
        //Now update the permanent variables
        delete[] data;
The problem is here:

When I check tempData here

 
char *tempData = new char[dataSize + dwBytesRead];


I get it right


When I check tempData here it is always the same value (three bytes)
 
memcpy(tempData, data, dataSize);
do {
        // Create a buffer for 2000 bytes
        char buffer[2000];

        // Copy 2000 bytes read into the buffer, dwBytesRead will contain the number of bytes read
        InternetReadFile(hFile, (LPVOID) buffer, strlen(buffer), &dwBytesRead); 

        // Allocate more space for a temporary char array to store the bytes read till now
        // It has the size of the data read till now + the size of the new bytes to add
        // Anyway, this variable is deleted when the loop starts again
        // Copy the already-fetched data into the new buffer. 
        // At this point tempData is O.K. It contains the correct bytes
        // dataSize is 0, dwBytesRead is the count of read bytes. tempData contains the byte read
        char *tempData = new char[dataSize + dwBytesRead];
        
        // Here is where things get weird
        // data (which is empty at the beginning of the loop) is copied into tempData
        // The number of bytes to be copied is dataSize. At the beginning of the loop it is 0
        // so it won't copy anything (copy 0 bytes from an empty char* into tempData)
        // Anyway, at this point I don't get it
        memcpy(tempData, data, dataSize);

        // Now copy the new chunk of data. 
        // Now, i really don't understand anything.
        // This is copying buffer into what??? Into tempData + dataSize ???
        memcpy(tempData + dataSize, buffer, dwBytesRead); 

        // Now update the permanent variables 
        // Why am I deallocating data? For the moment it is empty?
        delete[] data; 

        // Well, this is supposed to assign the tempData to data
        // Anyway, with each loop, anything is reset.
        data = tempData; 

        // O.K. I understand this
        dataSize += dwBytesRead; 
} while (dwBytesRead);


What am I getting wrong?
tempData + dataSize is pointer math. Look it up.

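If you suspect that things go wrong in the first iteration of the loop, protect the first memcpy() call (line 21 of your listing, memcpy(tempData, data, dataSize)) with an if (dataSize), so that nothing is copied while data is still empty.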

And except for that little piece it should work just fine. I'll try it out.
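
For what it's worth, `tempData + dataSize` simply means "the address dataSize bytes past the start of tempData", so the second memcpy() writes the new chunk right behind the bytes that are already there. A small sketch with made-up data (the function name `appendChunk` is only for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies 'chunk' into 'dest' starting at byte offset 'used' and returns the
// new number of bytes in use. 'dest' must have room for used + chunkSize bytes.
static std::size_t appendChunk(char* dest, std::size_t used,
                               const char* chunk, std::size_t chunkSize) {
    // dest + used is pointer arithmetic: it is the address of dest[used].
    assert(dest + used == &dest[used]);
    std::memcpy(dest + used, chunk, chunkSize);
    return used + chunkSize;
}
```

Calling it twice, first with offset 0 and then with the returned offset, places the second chunk directly after the first, which is what the loop does on every pass.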
Why not use the std::string append() method?

It works with embedded nulls, and you do not have to manage the memory allocation yourself.
That wouldn't work for binary data, like a downloaded picture, modoran. Right?
My test works just fine:

#define _WIN32_WINNT _WIN32_WINNT_WINXP
#define NOMINMAX
#include <Windows.h>
#include <WinInet.h>
#include <string>
#include "resource.h"

#pragma comment(lib, "wininet.lib")

HINSTANCE g_inst;

char* DownloadBytes(LPCWSTR szUrl) {
	HINTERNET hOpen = NULL;
	HINTERNET hFile = NULL;
	HANDLE hOut = NULL;
	char* data = NULL;
	DWORD dataSize = 0;
	DWORD dwBytesRead = 0;
	DWORD dwBytesWritten = 0;

	hOpen = InternetOpenW(L"MyAgent", NULL, NULL, NULL, NULL);
	if(!hOpen) return NULL;

	hFile = InternetOpenUrlW(hOpen, szUrl, NULL, NULL, INTERNET_FLAG_RELOAD | INTERNET_FLAG_DONT_CACHE, NULL);
	if(!hFile) {
		InternetCloseHandle(hOpen);
		return NULL;
	}
	do {
		char buffer[2000];
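		// Stop if the read fails; otherwise dwBytesRead may hold a stale value.
		if (!InternetReadFile(hFile, (LPVOID)buffer, _countof(buffer), &dwBytesRead)) break;
		// One extra byte keeps the result null-terminated, since it is displayed as text below.
		char *tempData = new char[dataSize + dwBytesRead + 1];
		if (dataSize) memcpy(tempData, data, dataSize);
		memcpy(tempData + dataSize, buffer, dwBytesRead);
		tempData[dataSize + dwBytesRead] = '\0';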
		delete[] data;
		data = tempData;
		dataSize += dwBytesRead;
	} while (dwBytesRead);
	InternetCloseHandle(hFile);
	InternetCloseHandle(hOpen);
	return data;
}

#define SETRESULT(x) result = (LRESULT)(x); resultSet = true

INT_PTR CALLBACK DlgProc(HWND hDlg, UINT msg, WPARAM wParam, LPARAM lParam)
{
	bool handled = true;
	LRESULT result = 0;
	bool resultSet = false;
	switch (msg)
	{
	case WM_CLOSE:
		EndDialog(hDlg, 0);
		break;
	case WM_COMMAND:
		{
			switch (LOWORD(wParam))
			{
			case IDOK:
				{
					HWND txt = GetDlgItem(hDlg, IDC_URL);
					int size = GetWindowTextLengthW(txt);
					LPWSTR url = new WCHAR[++size];
					GetWindowTextW(txt, url, size);
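					char *data = DownloadBytes(url);
					// DownloadBytes() returns NULL on failure; show an empty string in that case.
					SetDlgItemTextA(hDlg, IDC_DATA, data ? data : "");
					delete[] data;
					delete[] url;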
				}
				break;
			case IDCANCEL:
				SendMessage(hDlg, WM_CLOSE, 0, 0);
				break;
			}
		}
		break;
	default:
		handled = false;
	}
	if (resultSet)
	{
		SetWindowLongPtrW(hDlg, DWLP_MSGRESULT, result);
	}
	return handled;
}

int WINAPI wWinMain(HINSTANCE hInst, HINSTANCE hPrevInst, LPWSTR szCmdLine, int nCmdShow)
{
	UNREFERENCED_PARAMETER(hPrevInst);
	g_inst = hInst;
	DialogBoxW(hInst, MAKEINTRESOURCE(IDD_DOWNLOAD), NULL, &DlgProc);
	return 0;
}


I get the URL contents displayed fine in a multi-line textbox in my dialog box.
_countof? Can I use strlen() instead? My compiler (GNU GCC Compiler) does not find the _countof macro.
If you use C, _countof can be defined like this:

#define _countof(x) (sizeof(x) / sizeof(x[0]))

If you use C++, you can define it like this:

template<class T, size_t N>
inline size_t _countof(T (&)[N]) { return N; }


I think that works.
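
A sketch of the array-size idea in C++, using the hypothetical name `countOf` so that it cannot clash with a compiler-provided `_countof`. Note that it only accepts real arrays: for a pointer obtained from `new char[n]` the element count cannot be recovered this way, and `sizeof` on such a pointer merely gives the size of the pointer itself.

```cpp
#include <cassert>
#include <cstddef>

// Deduces N from the array type; a plain pointer will not match this signature.
template <class T, std::size_t N>
inline std::size_t countOf(T (&)[N]) { return N; }

static void demoCountOf() {
    char buffer[2000];
    int three[3] = { 1, 2, 3 };
    assert(countOf(buffer) == 2000);  // element count of the local array
    assert(countOf(three) == 3);
    (void)buffer;
    (void)three;
}
```

With `char* tempData = new char[2000];`, the call `countOf(tempData)` does not compile, which is the safer outcome: the size of a heap buffer has to be tracked in a separate variable such as dataSize.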
I had to adapt it to this, because I am using CodeBlocks and MinGW and this is the only way I don't get errors:
char* DownloadBytes(char* szUrl) {
    HINTERNET hOpen = NULL;
	HINTERNET hFile = NULL;
	char* data = NULL;
	DWORD dataSize = 0;
	DWORD dwBytesRead = 0;

	hOpen = InternetOpen("MyAgent", NULL, NULL, NULL, NULL);
	if(!hOpen) return (char*)"";

	hFile = InternetOpenUrl(hOpen, szUrl, NULL, NULL, INTERNET_FLAG_RELOAD | INTERNET_FLAG_DONT_CACHE, NULL);
	if(!hFile) {
		InternetCloseHandle(hOpen);
		return (char*)"";
	}

	do {
		char buffer[2000];
		InternetReadFile(hFile, (LPVOID)buffer, _countof(buffer), &dwBytesRead);
		char *tempData = new char[dataSize + dwBytesRead];
		handle_error(tempData, false);
		memcpy(tempData, data, dataSize);
		memcpy(tempData + dataSize, buffer, dwBytesRead);
		delete[] data;
		data = tempData;
		dataSize += dwBytesRead;
	} while (dwBytesRead);

	InternetCloseHandle(hFile);
	InternetCloseHandle(hOpen);
	return data;
}


I only get 3 bytes (BM6), the first three bytes of the file (a BMP image) when I use strlen. If I use sizeof() I get only one byte
What I see is that the information arrives correctly into buffer.
What I did was insert a function that for each loop stores the buffer into a file.
.
.
.
InternetReadFile(hFile, (LPVOID)buffer, _countof(buffer), &dwBytesRead);
store_into_file(buffer);
char *tempData = new char[dataSize + dwBytesRead];
.
.
.

In the first loop, buffer contains "BM6", which are the only bytes returned by the function. It is as if tempData only gets the buffer from the first loop.

Now, if I do this
.
.
.
InternetReadFile(hFile, (LPVOID)buffer, _countof(buffer), &dwBytesRead);
char *tempData = new char[dataSize + dwBytesRead];
memcpy(tempData, data, dataSize);
store_into_file(tempData);
.
.
.


I get the first tempData as empty, and after that it only contains the first bytes, "BM6". It's as if tempData always gets the first three bytes.

I checked dataSize and it does increase with each loop, so that is not the problem

Wait a moment
My image is 198 Kb (196608 bytes). After dataSize=196662 I get dataSize=BM6
O.K.
Now, when I check _countof(buffer) for each loop I get 2000 (no surprise there).
When I try to get _countof(tempData) I get 4. Even when I try to get strlen(tempData), I get 4.

The problem could lie here. Don't know. I am completely lost. The only thing I know is that the chunks of 2000 bytes are stored correctly in buffer. The situation gets out of control with memcpy.
Unless you are pulling text data, strlen() won't work. And sizeof() returns the size of the data type. Your data type is char, hence the result of 1. This is why you need a separate variable to keep track of the data size.

Like I said above, I have no trouble with the code I posted. Try a different compiler. Mine is Visual Studio 2010 Ultimate.
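
This would also explain the "BM6" result. Assuming a standard BMP file header, the two signature bytes 'B' 'M' are followed by the file size as a 4-byte little-endian integer. For the 196662 bytes mentioned earlier (0x00030036), those bytes are 0x36, 0x00, 0x03, 0x00, and 0x36 is the character '6'. Anything that treats the buffer as a C string stops at that first zero byte. A small sketch with a hand-built header (the array below is illustrative, not the real file):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// First six bytes of a BMP file whose total size is 196662 (0x00030036) bytes.
static const char kBmpStart[6] = { 'B', 'M', 0x36, 0x00, 0x03, 0x00 };

// What a string function believes the length is: it stops at the first zero byte.
static std::size_t lengthAsCString() { return std::strlen(kBmpStart); }

// What the length really is: it has to be tracked separately.
static std::size_t lengthTracked() { return sizeof kBmpStart; }
```

Here lengthAsCString() yields 3 ("BM6") while lengthTracked() yields 6, so the downloaded data can be complete even though a string view of it shows only three characters.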
O.K. I think I have a problem with memcpy()
When I try to do this

char* mem1 = (char*)"Text1";
char* mem2 = (char*)"Text2";
memcpy(mem1, mem2, 5);


My application crashes without warning.
Could my O.S. (Windows 7) disallow memory operations?

Anyway, in this case
char* str1 = (char*)"Sample string";
char* str2 = new char[50];
memcpy (str2, str1, strlen(str1));
memcpy (str2 + 13, " copy successful", 17);


It does work

Maybe I should use memmove()
You cannot memcpy() into a read-only piece of memory. Your sample is incorrect because both mem1 and mem2 point to read-only memory.

const char *mem1 = "Text1"; //No need for casting.  Why do you do casting here???  Is it the const thing?
char mem2[] = "Text2"; //Now this is writable memory.
memcpy(mem2, mem1, strlen(mem1));
That wouldn't work for binary data, like a downloaded picture, modoran. Right?


Why would it not work? InternetReadFile() reports how many bytes it read into the chunk buffer; just pass that count to std::string::append().

Example:
std::string output;
do {
    char buffer[2000];
    InternetReadFile(hFile, (LPVOID)buffer, _countof(buffer), &dwBytesRead);
    output.append(buffer, dwBytesRead);
} while (dwBytesRead);

// then access it with output.data() and get the size with output.length() 



Embedded NULL characters are not a problem here.
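
A quick way to convince yourself, sketched without WinINet: append chunks that contain zero bytes and check that nothing is lost. The chunk contents are made up and `collect` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends two binary chunks, passing explicit lengths so zero bytes are kept.
static std::string collect() {
    const char first[4]  = { 'B', 'M', '6', '\0' };
    const char second[3] = { '\0', '\x7f', 'Z' };
    std::string output;
    output.append(first, sizeof first);
    output.append(second, sizeof second);
    return output;
}
```

The result holds all 7 bytes, including both zero bytes. The bytes survive as long as you read them through output.data() together with output.size(); building a new string from output.c_str() alone, or calling strlen() on it, would again stop at the first zero byte.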
@modoran. I am trying to download a BMP image, not a text document. Anyway, I will try that one too. At this point, anything might or might not work.

Question: If I save binary data to a string, and then convert the string to char*, do I get the original binary data back, or do string operations remove non-alphanumeric characters from it?
No, it doesn't. Wonder if Visual Basic is still mad at me for abandoning it. I hope it will accept me back.

You have my project uploaded here. It is compiled and you have the Release executable if you want to test it. http://www.devimperium.info/DownloadFile.zip

Please, tell me if this works on your computers. The project type is CodeBlocks but I suppose that you can get those files into any IDE.