Weird problem with matrix structure of points of pointers

So I have written this method making use of double** structures, being used for storing values in a matrix like structure.

Like this:

1
2
3
4
5
6
7
double **allocDouble(size_t x,size_t y){
  double **ret = new double*[x];
  for(size_t i=0;i<x;i++){
    ret[i] = new double[y];
  }
  return ret;
}


Then I run this method calculating a lot of stuff through many iterations. At every iteration storing the values in my matrix M. And it works fine, however when I start doing something called bootstrapping, where I sample rows with replacement from my input data, which is in the shape of a matrix also, I sometimes get this error, where the entries of my matrix M, are claimed not to be allocated, even though I have used these entries through many iterations.

So does anyone have any idea why this can be? Should one not overwrite these entries too many times? Or can a double be too large for an entry?

It also has to be said that with other input data the method runs fine. The method might run fine with one seed and not with another seed, for the same data. Implying that it somehow depends on which rows are sampled with the bootstrap.

Or I am sorry if this is far too vague? Do you have any idea how I could look further into this issue?

The error I get is:

==83081== Invalid write of size 8
==83081== at 0x403147: main
==83081== Address 0x48d012a8 is 16 bytes after a block of size 40 alloc'd
==83081== at 0x4C2E80F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==83081== by 0x402CD0: main
Last edited on
You can read or write as often as you want as long as you do it at the right location.
Yo have allocated memory for double so you can write doubles into it.
It's possible that you write at the wrong location but to know we need to see the code.
Is it an assignment where you have to use ?
Hi thanks for the answer!

When you say that you write at the wrong location, what exactly do you mean?
Like going out of bounds with the indexes?

I can show you the boostraping function:

1
2
3
4
5
6
7
8
9
10
11
12
void bootstrap(type1 dOrg, type1 &d, double** M_orgOrg, double** M_org, double** M, int n) {
  for(int j=0;j<dOrg.rows;j++){
    int row = std::rand() % dOrg.rows;
    for(int k = 0; k < n; k++) {
      M[j][k] = M_orgOrg[row][k];
      M_org[j][k] = M_orgOrg[row][k];
      if(k<=2){
	d.values[j][k] = dOrg.values[row][k];
      }
    }
  }
}


d and dOrg are structs with a "values" field which is a double**.
I call this function multiple times, throughout the method.

I might be able to show more of the code later.
...like going out of bounds with the indexes?

That's what Valgrind's diagnostic means - you've written 8 bytes of memory at an offset of 16 bytes past-the-end of a 40-byte block (i.e., offset 56 from the array's first element).

==83081== by 0x402CD0: main
Check your main function for uses of operator new[] allocating an array of one or two unsigned long.

allocDouble is quite likely to result in a memory leak.

What's wrong with just a single array, preferably in the form of a single std::vector<double>?
It would be more cache-friendly, more compatible with the standard-library, more exception-safe, and won't require you to manage memory by hand.
Last edited on
You can try my array class if you want.
debug.h
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
#ifndef DEBUG_H
#define DEBUG_H

#include <iostream>
#include <string>

struct Error
{
  const char* mMsg;
  const char* mFile;
  unsigned mLine;

  Error(const char *msg, const char *file, unsigned line) :
    mMsg(msg), mFile(file), mLine(line)
  {
  }
};

std::ostream& operator<< (std::ostream& oss, const Error& err)
{
  std::string line(75, '-');
  oss << line
      << "\nERROR: " << err.mMsg
      << "\nat File: " << err.mFile
      << "\nat Line: " << err.mLine
      << "\n" << line;

  return oss;
}

#define CHECK(cnd) {if (!(cnd)) throw Error(#cnd, __FILE__, __LINE__);}
#define ASSERT(cnd) CHECK(cnd)
#define REQUIRE(cnd) CHECK(cnd)
#define ENSURE(cnd) CHECK(cnd)

#define EEROR(msg) {throw Error(msg, __FILE__, __LINE__);}

#endif 

Array2D.h
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
#ifndef ARRAY2D_H
#define ARRAY2D_H

#include "debug.h"
#include <vector>

template<class T>
class Array2D
{
public:
  Array2D(const int rows, const int cols)
  {
    REQUIRE(rows > 0);
    REQUIRE(cols > 0);

    m_ColCount = cols;
    m_RowCount = rows;
    m_Data.resize(rows * cols);
  }
  const T& get(int row, int col) const
  {
    REQUIRE((row >= 0) && (row < m_RowCount));
    REQUIRE((col >= 0) && (col < m_ColCount));    

    return m_Data[row * m_ColCount + col];
  }
  void set(const int row, const int col, const T val)
  {
    REQUIRE((col >= 0) && (col < m_ColCount));
    REQUIRE((row >= 0) && (row < m_RowCount));

    m_Data[row * m_ColCount + col] = val;
  }
  int rows() const
  {
    return m_RowCount;
  }
  int cols() const
  {
    return m_ColCount;
  }
private:
  std::vector<T> m_Data;
  int m_ColCount, m_RowCount;
};
#endif 


main.cpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
#include <iostream>
#include <cstdio>
#include <ctime>
#include <cstdlib>
#include "Array2D.h"

using namespace std;

template<class T>
void Print(const Array2D<T> &array)
{
  for (int row = 0; row < array.rows(); row++)
  {
    for (int col = 0; col < array.cols(); col++)
    {
      cout << array.get(row, col) << "\t";
    }
    cout << "\n";
  }
}

void FillArray(Array2D<int> &array)
{
  for (int row = 0; row < array.rows(); row++)
  {
    for (int col = 0; col < array.cols(); col++)
    {
      array.set(row, col, rand() % 10);
    }
  }
}

void Run()
{
  Array2D<int> array(10, 10);

  srand(time(nullptr));
  FillArray(array);
  array.set(1, 1, -1);
  Print(array);
  // Test exception
  cout << "\n\n** Going to use invalid index for row **\n\n";
  array.set(10, 1, 999);
}

int main()
{
  try
  {
    Run();
    system("pause");
    return 0;
  }
  catch (Error& err)
  {
    cerr << err << "\n\n";
  }
  catch (const exception& ex)
  {
    cerr << ex.what() << "\n\n";
  }
  catch (...)
  {
    cerr << "Unknown error." << "\n\n";
  }

  system("pause");
  return -1;
}


*** CAUTION ***
This is just a quick hack and not fully tested.
Hi thanks a lot for the input! And many thanks for the code

Two questions:

1.
But would that be as fast as doing this pointer approach to matrices? I mean this method is potentially going to run many many iterations.

2.
I have discovered that I have some call to strdup in my method (I plan on removing these), some of which I first deallocate, at the end of this method. Might it be that by accident these memory addresses somehow conflicts, so that the memory malloced clashes with that of the other pointers by accident?
1.
But would that be as fast as doing this pointer approach to matrices? I mean this method is potentially going to run many many iterations.

That's a good question. The best way to find out is to do a simple test using clock()

You can use the string or unique_ptr when dealing with strdup. No more worries about memory leaks or other problems.
1) compilers have gotten better but Ive tried it all at one point or another (some of it quite a few years back). Ive tried valarray. Ive tried vector. Ive tried double (**) pointers.

The fastest matrix algebra I have been able to do was just a pure POD double* indexed manually as if a 2d array. Vectors come very close, but they have some overhead that rears up when the matrices get big or the iterations run long.

The above has a ton of advantages... memset clear to zero, reshape (change dimensions, or transpose dimensions, etc), efficient copies, and more. Vectors can do that too in 1d. The real hit is when you try to double down and make 2d constructs -- you pay for those coming and going if not careful, and even if careful you pay a little.




But would that be as fast as doing this pointer approach to matrices? I mean this method is potentially going to run many many iterations.


The double** approach suggested above requires extra indirections, which are likely to have detrimental effects on performance due to poor locality of reference. This is why jonnin's suggestion of a single-dimensional array is likely to be quicker.

Vectors have negligible overhead when bounds checking is disabled, and they should be preferred until they are shown to be insufficient. That basically means that you have measured it. There's no substitute for measuring.

Have you looked into Eigen?
http://eigen.tuxfamily.org/index.php?title=Main_Page

Last edited on
And vector implementations and compilers and computers are all 3 better than when I last did a big linear algebra project. I would start with the vectors. The notation to index is exactly the same as an array (pointer, of course) so swapping them to test the performance should be trivial if you go into it with that in mind. It would take, literally, tens of millions of accesses to notice the difference, if any.

On a side note, is valarray totally dead? Or did they ever get that to work efficiently?
Last edited on
So I tried purging all the strdup calls, and it seems that it has worked!

So can this really be true that the compiler somehow, does not have a perfect overview of the memory allocated, and thus might try to write to an already written to address?

I should say that I had included some helper functions written as a C file, could this be the cause of the mischief? I have now converted both files to C++.

And thanks a lot for the help everybody, you are great!
Last edited on
So can this really be true that the compiler somehow, does not have a perfect overview of the memory allocated, and thus might try to write to an already written to address?


It does what you tell it. If it's overwriting something, it's doing it because you told it to.
Ok well I am not exactly sure how it looks, when there is a memory conflict with for instance new and malloc?

Could someone give an example of how this might occur?
So I think I might have identified the source of this issue, so the reason seems to be this function that I call numerous time in my program:

 
int func(const dgd d, double* n, int p, double** f,double* q,double** &F_new,double* &Q_new, double** f_org)


However originally it was:

 
int func(const dgd &d, double* n, int p, double** f,double* q,double** &F_new,double* &Q_new, double** f_org)


And when it is like this, it does run faster, but occasionally crashes after x times of bootstrapping.

d is a struct and looks like this:

1
2
3
4
5
6
7
8
9
10
11
                                                                                  
typedef struct{                                                                                                                                           
  double **d;                                                                                                                                         
  char *k1;                                                                                                                                            
  char *l2;                                                                                                                                            
  std::vector<std::string> barcode;                                                                                                                            
  int n;                                                                                                                                             
  int p;                                                                                                                                               
  std::map <std::string,int> barcodeMap;                                                                                                                       
                                                                                                                                                          
}dgd;                                                                                                                                                     


In the bootstrapping function, I change d.d replacing the rows with the randomly sampled rows.
As d.d is almost the only values I use in this function.
Last edited on
Sorry not to be pushy, but I would be very interested if someone could enlighten/help me with regards to my last post. As I state I think I have found the source of this malfunction.
I would recommend that you learn to use a debugger and unit tests first before messing around with such complicated code. Also it's a good idea to provide enough code so that we can compile and run it.


Originally Posted by Bjarne Stroustrup(2000 - 10 - 14)
I get maybe two dozen requests for help with some sort of programming or design problem every day.Most have more sense than to send me hundreds of lines of code.If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
Last edited on
for your post 2 back..

you cannot mix new and malloc and free and delete. That is, you will corrupt something if you do

x = new int;
free x; //kablooie

and if you use objects, malloc and free are not so hot..

class cl
{
yada..
};

cl = malloc(..);
free(cl); ///oops! Your destructor was not called, and consequences of this can be major or minor.

I can't remember the other problems but those are 2 right off the bat.

There is really only 1 reason to use malloc in C++ ... and that is deep in the weeds if you need a resizeable array. Which one would normally use a vector for, but say for some reason, you could not use it. And this should be extremely unusual, and probably 100% obsolete in today's world. There was a time when this made sense, before vectors existed... this should only be in legacy code now... The other reason being mixed language C/C++ code base, which is not the same topic really.







Last edited on
Topic archived. No new replies allowed.