I am trying to write a code that requires a lot of matrix operations. So I decided to use the "gsl library". In different steps of my code, I need to add some columns to the allocated matrix. Meaning the size of my matrix must change dynamically. What is the best way to do this? What is the best approach to deal with my problem?
lots of the same operations on different data/matrices can be done efficiently using openCL / cuda
in case it has to be done with gsl:
i'd try to assign a new "bigger block" to en existing matrix and giving it ownership of the new block while deleting the old block. block represents data, matrices/vectors just point to that data
transpose in place, push-back enough entries for the new columns, and transpose back is one way.
copy to another matrix with the extra columns added is another.
both are expensive due to copying.
if you know it is coming, you can allocate space for the 'extra' columns up front and fill them in when needed. That means that any processing on the smaller sub-matrix needs to be able to ignore the extra columns, though. I don't know if your library can be fooled this way.
or, if you know it is coming, you MAY be able to just work in the transposed space until you can add rows, which is cheaper, and only transpose 1 time at the very end, which may be tolerable.
at the very least, you can allocate the underlying structure with data sufficient to hold the extra columns to avoid paying to allocate memory more than once, even if you must pay for a copy somewhere along the way.
what are you doing that needs an append, anyway? There may be a better algorithm.
@jonnin I am doing some non-parametric bayesian inference. Well, each column is representative of the latent feature which I am trying to find out the number of them for the given data (known number of rows) from sampling approach. There is no upper limit to the number of columns. It should be learnt from data. I can not work with the transposed matrix and I need to do a lot of matrix operations such as matrix multiplications and inversions. It should be a way in c/c++ to deal with this type of analysis, shouldn't be?
multiply can do in transpose. At*Bt = (B*A)T ... a pretty simple issue to deal with, you just multiply in the reverse order that you originally wanted (I think). Regardless there is a doable relationship in there somewhere.
I forget but I am fairly certain you can deal with inversion in transpose space also. Its been a long time. You can also invert without appending rows, the identity append & RREF inverse is easy but not the only way.
granted doing it this way is annoying. It may or may not be useful to try it, but lets table that until we run out of options.
There are ways to deal with it. But the language knows NOTHING about a matrix or matrix math. THe libraries do, and some libraries are smarter than others .. I don't know anything about yours. The *language* is row major and THAT means it is inefficient to try to work on columns directly esp if allocating more of them in the middle of a process. You can DO it of course, it just costs more than adding a row does. Just copying into a larger matrix with extra columns flat out works, its just expensive for big matrices. If they are smallish, its no big deal either way.
assuming you asked for larger matrices because copying into a new one is fairly trivial to do? If copying works, just do that... for anything smaller than 100x 100 you won't even notice the difference...
what is the underlying memory structure anyway? That affects what is possible. Hopefully its not ** or something frustrating like that?
I can not work with the transposed matrix and I need to do a lot of matrix operations such as matrix multiplications and inversions. It should be a way in c/c++ to deal with this type of analysis, shouldn't be?
take a look at std::vector for example: the current memory size is NOT equal to its actual allocated memory (vector::capacity() != vector::size()), it "reserves" memory beforehand, in case you "pushback" a new item into a vector and the new size surpassses the allocated space, the vector will reserve a little more, sometimes 1.5x .. 2.0x the current size. that way, is doesnt mave to allocate and free memory that often (performance).
in case of your "gsl library", it gives you that option by separating vectors from memory (so-called "block structure"). so what i'd do is preallocate a 5000x5000 block, each time you resize a "vector" pointing to that block check if the new size is bigger than what you've reserved, if it surpasses the size, re-allocate a bigger block, delete the old one after you've transfered its content (row-wise).
@John87Conner, would you please give an example how I should do matrix operations and inversion by defining a (2d) vector instead of a 2d gsl_matrix? Could you please provide an example on how to use librsb library for my problem? thanks
@Jonnin how is it much easier to add rows instead of columns to the gsl_matrix, because you still need to allocate another matrix and write one or two for loops to write the content of one into another which will take time for a matrix with an order of 10000 rows and a 10-50 columns and deallocate the memory of the first matrix?!! Would you elaborate it more?
2d is 'bad'. It is much easier to do a great many things with 1d mapped as 2d manually. I don't know if your library can handle that, though. You have to feed it what it expects...
2d vector is just
vector<vector<double> > x; //you can add preallocation syntax or use .reserve for the memory
but better to just do
vector<double > x(maxrows*(maxcols+growthroom));
access it with
x[desired_row*maxrows+desiredcolumn]; //2d emulation.
it is MUCH easier to add rows.
to add rows, you just add them... say you had a nxm above and want to add a row...
for(all the columns )
x.push_back[columnvalue]; //appends a row. or if you preallocated the space, you just fill it in...
for(all the columns)
x[desired_row*maxrows+desiredcolumn] = new_values[i];
presumably, here, you pre-allocated the extra space so its just a copy of the extra row.
My old matrix library was crude, but the key to its speed was I did not do any internal memory allocation (I allocated a LOT of memory up front and borrowed and returned from that pool as I needed temporary space). Of all the things in linear algebra, cool algorithms, inner loop tweaking, all the stuff you can do... management of the memory was the #1 thing to making it fast. Everything else pales in comparison to doing excessive copies and new/deletes.
also page faults fall right in there on the memory dealings. Take basic matrix multiply... if you transpose first, you can iterate both matrices along rows instead of columns, preventing a page fault on the column side matrix on EVERY access nearly on a large matrix. Once the # of columns is big enough to fill more than 10% or so of a page in memory, and the # of rows spans many pages, this becomes huge.
again, though, your library should be doing most of this under the hood for you.
the bottom line is that there are going to be a few things you have to pay big for.
adding columns is one of them. Here, a vector of vector might be able to DO it FOR you with clean simple syntax but inside, it has a high chance of having to do something slow. The best way I know of to work in column space is to mentally reverse the columns and rows or work in the transpose space. That makes row operations become the trouble spot. But all this is when rolling your own. Using the library... you have to do what you can within its framework. Ill ask again, what is the underlying data structure of the matrix data?
@Jonnin then which library should I use to do matrix operation and inversion on an allegedly "2d vector"? well so far I used gsl_matrix. To be honest I am now lost what is the best and fastest data structure to handle the operations which I need and I am not a professional c programmer. So I would appreciate for any suggestion or examples.
I have no idea on that. I wrote my own, and for one or two very difficult problems I used the intel math kernal library which is I believe an optimal version of the old C lapack engine. Ancient stuff, really. I had to take my own stuff and format it for the library and take the result and format that back to my own design. That part was a little sloppy and costly, but everything else more than made up for it.
its not the data structure, its the memory allocations and copies that are the key. code to minimize those. almost any quality linear algebra package should be doing this internally, so its your own code that strings operations together that need to work within the structure you library uses to try to minimize those impacts. GSL may be just fine, but whatever it uses, you may need to play around with it to get an efficient solution in your glue code that ties library calls together.
basically my boss said "solve this @ 50 hz"
XA + BX = C (all matrices)
and after about a week trying to use lapack and blas and all that old stuff I was ready to /wrists so I wrote my own solver for the blasted thing and to do it I ended up needing about 90% of all the standard matrix operations ... eigen vals & vecs, + - * invert transpose RREF and all that stuff. It was in controls area, so the matrices were stable, so I skipped all that condition number / normalize /factor / etc numerical stuff, because I could get the right answer with simple routines. Fun stuff.
I don't know how they coded it so I can't comment on whether it will do this efficiently.
if you do it yourself you know whether it does it efficiently, but its a LOT of reinventing the wheel which is almost always a red flag. I am not recommending doing it yourself. I did it myself a long, long time ago and when I did it, I was converting code from m format to c++ and my library was designed to make that conversion as painless as possible while gaining real time performance. Nothing at the time existed to do those 2 things in one go.
There are only 2 real options here that I could seriously recommend at this point.
1) write this one section yourself and convert to and from the library of choice format internally, efficiently as possible, and eat a (hopefully smallish) performance penalty to do the work. This may mean an aggravatingly large copy operation that simply can't be avoided. If it comes to that, you may be able to do a parallel mem-move attack or something on it if VERY careful about it.
2) try another library or stick to only using the library code, and eat a performance penalty (may be heavier but the coding effort is lower). You can try the reshape, see how it performs, or google and ask if its been checked.