Should dynamically allocated arrays be null terminated

I have just had a nightmare debugging a call to Postgresql's PQexecparams() which requires a char** array. I wrote a little class to handle updates which would iterate through a vector<homebrew_struct> putting any values that needed updating in a char** array and building an appropriate string to pass to PQexecparams.

For the most part this class has worked flawlessly and is used all over my code, but a couple of days ago I ended up in a situation where after one call where it worked, a second call to PQexec() was throwing a seg fault.

gdb backtraced it to resetpqexpbuffer (which is part of libpq) but it turned out to be a char** array that was not null terminated.

Once I did something similar to this, it worked fine


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
#include <string>
#include <vector>
#include <cstring>
#include <iostream>

std::vector<std::string> create_array(int nRecords) {

	std::vector<std::string> myStrings;
	
	for(int i =0; i <= nRecords; ++i)
		myStrings.push_back("String " + std::to_string(i +1));

	return(myStrings);
	
}

int main() {

	char 		**myCharArray 	        = nullptr;
	const int 	nRecords 		= 10;
	
	std::vector<std::string> vecStrings = create_array(nRecords);
	
	//+1 FOR NULL TERMINATOR
	myCharArray = new char *[nRecords +1];
	
	//BUILD CHAR ARRAY
	int count = 0;
	for(std::string &it : vecStrings) {
		myCharArray[count] = new char[it.size() +1];
		strncpy(myCharArray[count], it.c_str(), it.size() +1);
		++count;
	}
	
	myCharArray[nRecords] = nullptr;
	
	//OUTPUT
	for(int i =0; i <= nRecords; ++i) {
		std::cout << myCharArray[i] << std::endl;
	}
	
	//CLEAN UP
	for(int i =0; i <= nRecords; ++i) {
		delete[] myCharArray[i];
		myCharArray = nullptr;
	}
	
	delete[] myCharArray;
	myCharArray = nullptr;
	
}


My question is: Should all ** arrays be terminated in this manner? I do a lot of work with Ncurses and note that in the tutorials its ITEM** array and FIELD** array are terminated in a similar manner.
Should all ** arrays be terminated in this manner?


No. If a function requires an array to end in a zero, that should be documented in the function documentation.

It's common for char arrays to use a zero to mark the end of data; these are commonly referred to as C strings, or C style strings. Nonetheless, that should be documented. Often, documentation isn't very good. I see that https://www.postgresql.org/docs/9.1/static/libpq-exec.html doesn't specify that the parameter is to be zero terminated; it is describing a C API and using the word "string" a lot, so maybe they thought that didn't need mentioning. Your own experience is clear evidence that is was worth mentioning.

If you see something accepting a char pointer, be suspicious. Maybe the documentation isn't very good. Check how it's supposed to know how long the array of data is. If the parameter is called "string" or something similar, be very very suspicious.
No, char** arrays don't need to be null terminated.
It is usually done so (especially for char**) so you can iterate over the array until NULL.
E.g. char** argv is null terminated by the standard specifically so you could do something like this:
1
2
3
4
while (*argv) {
    //foo
    ++argv
}

This isn't considered good practice at all, hence why argv is usually declared as const.
However the bottom line is sure: you don't have to null terminate arrays!
______________________________________________________

As for your code there are two immediate problems that can cause a seg fault.
The first (which is actually causing the seg fault in this code) is this bit:
1
2
3
4
for(int i =0; i <= nRecords; ++i) {
	delete[] myCharArray[i];
	myCharArray = nullptr;
}

You're setting myCharArray to null but in next iteration you're using myCharArray[i].
Of course that line should be myCharArray[i] = nullptr;

I'll assume that was just a typo, because the second one (which can potentially cause problems and which may have actually been the culprit in your original code) is much more elusive.

First, your create_array() is actually creating eleven strings because you're using <= in
 
for(int i =0; i <= nRecords; ++i)
.
Not ten. Hence myCharArray will also hold 11 chars. When you do:
 
myCharArray[nRecords] = nullptr;

You're actually erasing a valid 11th pointer to a string, thus causing a memory leak and potentially opening a segfault opportunity in all subsequent for's which use <=!
The reason you did not get a segfault this time is because std::cout and delete[] are leient toward accepting a NULL pointer.
If you had decided to do something with myCharArray[i] e.g.:
 
myCharArray[i][0] = 'A';

You would have gotten a segfault.

Recommendation: Stop using <= in for loops which start from 0.

Edit: I had to reformat this post a bit so it's easier to read.

Last edited on
It was the last element that needed terminating i.e. declaring the array with [nRecords +1] elements and then setting the last element to '\0'

 
          myCharArray[nRecords] = nullptr;


I had already left space to terminate the individual C-strings and left it to strncpy() to do the terminating (which it will if you leave it enough space) but, for want of a better way of explaining it, not having an empty record at the end of the list was what was causing the problem.

Is there anything inherently evil in erring on the the side of caution and terminating all such arrays in this manner?
Many things are done by convention; but as Repeater said, it should be in the documentation.

That said, char* and arrays of char* do tend toward the convention that:

  • Each char* is always a string unless explicitly documented otherwise:
    therefore it should terminate with a null character

  • An array of char* will often follow the same convention: the final element is a nullptr.

Hope this helps.
@Duthomhas, thanks for the clarification, this is pretty much the information that I am seeking.

@Ihatov, Yes, the "delete" was a typo (as was the "<=", in the production code I am using "!="). I am pretty sure that in this case libpq is using

1
2
3
4
while (*argv) {
    //foo
    ++argv
}


Thanks for the advice, all.
Topic archived. No new replies allowed.