Spider Testing

#include <Windows.h>
#include <stdio.h>
#include <chilkat/CkSpider.h>

int main()
{
    //  The Chilkat Spider component/library is free.
    CkSpider spider;
    const char *url = "http://www.chilkatsoft.com/crawlStart.html";
    const char *domain = "www.chilkatsoft.com";
    
    printf("Started");//point 1

    //  The spider object crawls a single web site at a time.  As you'll see
    //  in later examples, you can collect outbound links and use them to
    //  crawl the web.  For now, we'll simply spider 10 pages of chilkatsoft.com
    spider.Initialize(domain);

    printf("Started");//point 2

    //  Add the 1st URL:
    spider.AddUnspidered(url);


    //  Begin crawling the site by calling CrawlNext repeatedly.
    long i;
    for (i = 0; i <= 9; i++) {
        bool success;
        success = spider.CrawlNext();
        if (success == true) {
            //  Show the URL of the page just spidered.
            printf("%s\n",spider.lastUrl());
            //  The HTML is available in the LastHtml property
        }
        else {
            //  Did we get an error or are there no more URLs to crawl?
            if (spider.get_NumUnspidered() == 0) {
                printf("No more URLs to spider\n");
            }
            else {
                printf("%s\n",spider.lastErrorText());
            }

        }

        //  Sleep 1 second before spidering the next URL.
        spider.SleepMs(1000);
        printf("End");
    }


    return 0;
}


Hi, I'm still new to C++, but I do have some basic programming experience in Java. I'm trying to get this example to work, but I keep getting the "xxx.exe has encountered a problem and needs to close..." error from Windows. So there is obviously a problem in there, but I can't see it ^^ I don't get any errors when compiling. Any clue where the problem is?

Extra info: the printf("Started"); calls are my way of seeing where the code starts having problems. It still shows "Started" at point 1 but not at point 2. The only call between the two is spider.Initialize(domain), so my best guess is that the problem starts there.

Using MinGW to compile and run it, on Windows XP SP3 (for testing purposes ^^).

Edit:
I have a bad feeling that this could be a Windows-related problem. I will try the code on another Windows platform to see if it works. If it is, hopefully somebody here can suggest a way to fix it.
Try using, instead of printf,

std::cout <<

and see if that makes a difference. Mind you, if you use that method, drop the parentheses and put the quoted string after the << operator.
Ispil - just tried it, and it still gives the same error as before. Any more ideas? The original code is here: http://www.example-code.com/vcpp/spider_begin.asp
Does this work?
    long i;
    for (i = 0; i <= 9; i++) {
        bool success;
        success = spider.CrawlNext();
        if (success == true) {
            //  Show the URL of the page just spidered.
            printf("%s\n",spider.lastUrl());
            //  The HTML is available in the LastHtml property
        }
        else {
            //  Did we get an error or are there no more URLs to crawl?
            if (spider.get_NumUnspidered() == 0) {
                printf("No more URLs to spider\n");
            }
            else {
                printf("%s\n",spider.lastErrorText());
            }
            break; // break the for() loop
        }

        //  Sleep 1 second before spidering the next URL.
        spider.SleepMs(1000);
    }
    printf("End");

@Catfish3

Done with the code and Windows XP. There must be something I've missed, or a Windows issue that caused it, because I've tried the code on Linux with the same compiler and it works fine. I will look into it more in the not-so-near future ^^ But if anyone finds the solution first, please share.

Here is the code (thanks to modoran): http://www.cplusplus.com/forum/windows/96559/ if you want to test it.