Adding nodes to linked lists breaks after 10 insertions

Hi guys,

So I've been trying to do a program to read a file and print each repeated word along with the number of its repetitions.

For this, I use a linked list:
1
2
3
4
5
6
typedef struct words {
    char *word;
    size_t repetitions;

    struct words *next;
} Words;



Thing is, when I read the file and start forming words and and saving them in new dynamically allocated nodes, the program breaks. Usually around the 10 - 15 new node. It varies.
This only if I run it directly. If I run it in debug mode works great.

Do you have any input about my code and what could probably cause this?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
Words *count_words(char * input_file_name)
{
    // We'll use a dynamic array to hold the word read. More fun.
    size_t word_capacity, current_capacity;
    current_capacity = word_capacity = 8;       // fits best with my text

    // Construct a new Words linked list. To be returned at end.
    Words *head = malloc(sizeof(Words));
    assert(head != NULL);
    Words *tail = malloc(sizeof(Words));
    assert(tail != NULL);
    head = tail = NULL;

    char ch;               // buffer for every char read from the input file
    size_t char_count;     // will count the characters for each word read

    FILE *input_file = fopen(input_file_name, "r");
    assert(input_file != NULL);
    while ((ch = fgetc(input_file)) != EOF)
    {
        // create a new dynamically allocated array to hold the word
        char *word_read = malloc(word_capacity);
        assert(word_read != NULL);
        char_count = 0;
        bool bad_ch = false;

        if (isspace(ch) || ispunct(ch))
            continue;

        while (isalnum(ch))                 /* read the word */
        {
            if (char_count == current_capacity - 1) // -1 for the \0
            {
                current_capacity *= 1.5;
                assert(realloc(word_read, current_capacity) != NULL);
            }
            word_read[char_count] = ch;    /* store it */
            char_count++;
            ch = fgetc(input_file);
        }
        word_read[char_count] = '\0';      /* cap word with NULL */

        // construct our linked list, ahem map of words
        Words *current = head;
        bool duplicate = false;
        while (current != NULL)
        {

After some print logs seems like the program is crashing on this following line (strcasemp).
It doesn't enter into the if body and it doesnt exit the while loop.
And upon further inspection, my strcasemp is returning all kinds of "random"? values ranging between -22 to 19. WTF?
1
2
3
4
5
6
7
8
9
              ...
            puts("Sunt in while");
            int result = 0;
            printf("current->word is \"%s\"\n", current->word);
            printf("word_read is \"%s\"\n", word_read);
            result = strcasecmp(current->word, word_read);
            printf("strcasecmp de ele e \"%d\"", result);
            if (result == 0)
                ...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
            // if (strcasecmp(current->word, word_read) == 0)
            {
                current->repetitions++;
                duplicate = true;
                break;
            }
            current = current->next;
        }
        // if we have a new word, create a new node and append it to the list
        if (!duplicate)
        {
            Words *new_node = (Words*)malloc(sizeof(Words*));
            assert(new_node != NULL);
            new_node->word = word_read;
            new_node->repetitions = 1;
            new_node->next = NULL;

            // construct the first node of our list
            if (head == NULL) {
                head = tail = new_node;
            }
            else    // if we have at least one constructed node in our list
            {
                tail->next = new_node;
                tail = new_node;
            }
        }
    }

    fclose(input_file);

    return head;
}
Last edited on
You are not allocating enough memory for the nodes.

(The type passed to sizeof is wrong)
Last edited on
12
13
Words *new_node = (Words*)malloc(sizeof(Words*));
assert(new_node != NULL);
How many bytes you are allocating again?

assert(realloc(word_read, current_capacity) != NULL); NEVER do that. assert statements are removed in release builds. Additionally realloc does not guaranteed to expand in-place and word_read can be invalidated. Correct approach:
1
2
word_read = realloc(word_read, current_capacity);
assert(word_read != NULL);


ch should be int, or you will have problems, as EOF does not fit into char and comparison with it can fail.

If you have problems where program works under debugger, you can solve them by launching program first then attaching debugger later.

Last edited on
A few issues:

1) You never reset current_capacity. This makes it go out of date with the actual capacity of word_read. You should set current_capacity to word_capacity around where you allocate word_read.

2) You need to use the result of realloc. i.e. change

assert(realloc(word_read, current_capacity) != NULL);

to

1
2
word_read = realloc(word_read, current_capacity);
assert(word_read != NULL);


Although to avoid a memory leak in the case of realloc returning NULL, you should assign the result of realloc to a new variable and free the original one. See the example here: http://www.cplusplus.com/reference/cstdlib/realloc/

3) There's a memory leak with word_read. You check isspace(ch) || ispunct(ch) and then continue the loop but without also freeing word_read. I would allocate word_read after this check.
Last edited on
Wow!
Didn't expect this much quality feedback! And so quick!
Thanks guys!

Will try to incorporate changes and see if it works.



//// Thanks a lot guys!

Indeed the problem was at
Words *new_node = (Words*)malloc(sizeof(Words*));
Probably in the hurry I just repeated the cast declaration...

Also implemented your suggested changes:
- to avoid any possible memory leak with realloc changed it to
1
2
3
char *more_chars = realloc(word_read, current_capacity);
if (more_chars != NULL)
    word_read = more_chars;

- declared ch as int as to accommodate EOF. Forgot about this..
- made sure to reset current_capacity before reading a new word
- organized the code a little bit better as to handle any memory leak in the "check isspace(ch) || ispunct(ch) and then continue the loop but without also freeing word_read. I would allocate word_read after this check." situation.
1
2
3
4
5
6
7
8
9
10
11
12
while ((ch = fgetc(input_file)) != EOF)      // ch is int
    {
        // ignore any non-valid characters
        if (!isalnum(ch))
            continue;       // bad char, start again

        // reset current_capacity in the event that it was increased below
        current_capacity = word_capacity;

        // create a new dynamically allocated array to hold the word
        char *word_read = malloc(word_capacity);
        ....

Last edited on
Topic archived. No new replies allowed.