Storing constant, frequently accessed data.

Hey guys,

I'm redesigning some code and I'm wondering what the best way to store and access certain data is. The characteristics are as follows:

1) Based on data from a file, a distance matrix (N x N) is calculated.
2) Once the matrix is complete, the data will never change (unless a new file is read, but I can work around that by iteratively calling the program with a new data file as a command-line parameter).
3) The data from the matrix is accessed billions of times from pretty much every other line of code.

In my old version, I had a class "Data" with a sub-class "Data::Distance", and I would put a reference in every other class that needed it. Now, my class hierarchy will be much flatter (basically all logic will be in one class; other classes will be POD structs).

Given the characteristics of the Distance table, is there a way to store it so that access is as efficient as possible? Does it matter whether it's stored in the main class where all the action happens, as opposed to a separate class? Does making it static improve performance? Making it const? Anything else?

Any and all tips are welcome. Again, the data is accessed billions of times so even minor differences can save a lot of time.
Additional question that may be a bit more fruitful:

The (N x N) distance matrix will also have an (N x N) penalty matrix, where the x'th element of each relates to the same arc, so the full "biased" cost would be cost[x] = distance[x] + penalty[x].

Regular ("pure") distances are used much more often, but when penalties are required (I'm estimating between 10% and 50% of the time, but I can't be sure) they're nearly always needed together.

I'm torn between having two separate matrices (vectors/arrays) and building a single matrix of pairs (dist, pen). Is there a surefire way to know which will be better? I can test both relatively easily, but only once the entire code rework is done, and I'd like to start with the "most likely best" one in case it influences other decisions. (A rough stand-alone benchmark, sketched after the code below, might give an early signal.)

Code example (conceptual names: COST = double, COSTS = std::vector<COST>, NODE = unsigned):
Separate:
COSTS distances;
inline COST d(NODE i, NODE j) const { return distances[i*n+j]; }
COSTS penalties;
inline COST p(NODE i, NODE j) const { return penalties[i*n+j]; }


Paired:
struct ARCCOST { COST distance; COST penalty; };
typedef std::vector<ARCCOST> ARCCOSTS;
ARCCOSTS distancepenalties;
inline COST d(NODE i, NODE j) const { return distancepenalties[i*n+j].distance; }
inline COST p(NODE i, NODE j) const { return distancepenalties[i*n+j].penalty; }
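
For when the rework is far enough along, something like the following stand-alone microbenchmark might give an early signal. Everything in it is made up for illustration: the size n, the fill values, and the random index pattern (my attempt to mimic the erratic access), so treat the timings as a rough indication only.

#include <chrono>
#include <cstddef>
#include <iostream>
#include <random>
#include <vector>

typedef double COST;
typedef std::vector<COST> COSTS;
struct ARCCOST { COST distance; COST penalty; };

int main()
{
    const std::size_t n = 3000;                    // assumed problem size
    COSTS distances(n * n, 1.0);
    COSTS penalties(n * n, 0.5);
    std::vector<ARCCOST> paired(n * n);

    // Erratic access: pre-generated random indices, reused for both layouts.
    std::mt19937 gen(42);
    std::uniform_int_distribution<std::size_t> pick(0, n * n - 1);
    std::vector<std::size_t> idx(20000000);
    for (std::size_t k = 0; k < idx.size(); ++k) idx[k] = pick(gen);

    COST sum = 0;
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t k = 0; k < idx.size(); ++k)   // distance-only reads
        sum += distances[idx[k]];
    auto t1 = std::chrono::steady_clock::now();
    for (std::size_t k = 0; k < idx.size(); ++k)   // combined reads, separate layout
        sum += distances[idx[k]] + penalties[idx[k]];
    auto t2 = std::chrono::steady_clock::now();
    for (std::size_t k = 0; k < idx.size(); ++k)   // distance-only reads, paired layout
        sum += paired[idx[k]].distance;
    auto t3 = std::chrono::steady_clock::now();
    for (std::size_t k = 0; k < idx.size(); ++k)   // combined reads, paired layout
        sum += paired[idx[k]].distance + paired[idx[k]].penalty;
    auto t4 = std::chrono::steady_clock::now();

    std::cout << "separate, d only:  " << std::chrono::duration<double>(t1 - t0).count() << " s\n"
              << "separate, d and p: " << std::chrono::duration<double>(t2 - t1).count() << " s\n"
              << "paired,   d only:  " << std::chrono::duration<double>(t3 - t2).count() << " s\n"
              << "paired,   d and p: " << std::chrono::duration<double>(t4 - t3).count() << " s\n"
              << "checksum: " << sum << "\n";      // keeps the loops from being optimized away
}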
Hi Gaminic,

I would probably make a class that represents the matrix. The constructor would open the file and store the contents. We can use dynamic memory allocation here because, even though it's slow, we only do it once. I would then store the data in a 1D array; dereferencing a 2D array takes a little longer because of the extra indirection.

Because it's constant data, I would add get methods and no set methods.

If I wanted to open a new matrix, I'd just make a new instance of the same object.

#include <string>
#include <boost/noncopyable.hpp>

class ReadOnlyMatrix : boost::noncopyable
{
    int* data;
    int xsize;
    int ysize;
public:
    ReadOnlyMatrix(const std::string& filename)
    {
        // Open file, read xsize and ysize
        data = new int[xsize * ysize];
        // read the matrix contents from the file
        // close the file
    }

    ~ReadOnlyMatrix() { delete[] data; }

    int size_x() const { return xsize; }

    int size_y() const { return ysize; }

    // Returns the value at the desired location (row-major, xsize columns per row)
    int at(const int x, const int y) const { return data[y * xsize + x]; }
};
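
Using it would then look something like this (the file name is just an example):

ReadOnlyMatrix m("distances.dat");  // example file name

long long total = 0;
for (int y = 0; y < m.size_y(); ++y)
    for (int x = 0; x < m.size_x(); ++x)
        total += m.at(x, y);        // read-only: there's no way to modify an element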


By observing const-correctness, you can sometimes get better performance out of the compiler's optimizer. Note that I didn't throw a std::out_of_range exception in the at() function, to speed things up a bit. I'm not sure how much that really helps, but std::vector's at() does a bounds check while operator[]() doesn't, which is where the speed difference the documentation implies comes from.
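
To make that concrete, here's a tiny sketch of the difference: at() pays for a range check on every access, operator[]() does none.

#include <iostream>
#include <stdexcept>
#include <vector>

int main()
{
    std::vector<int> v(10, 0);

    std::cout << v[5] << "\n";           // operator[](): unchecked, fastest
    // v[20];                            // unchecked and out of range: undefined behaviour

    try {
        std::cout << v.at(20) << "\n";   // at(): range-checked
    }
    catch (const std::out_of_range& e) {
        std::cout << e.what() << "\n";   // thrown instead of silently misbehaving
    }
}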
Hey Stewbond,

This is pretty much what I had in the early version, except that it was public (because it's global data and needs to be accessed by everything).

Does making it private and having the getters return const values offer optimization chances? The getters themselves were already const, but the return values weren't.

Not sure what difference that return value could make, though. As I understand it, const on an int returned by value (and it is an int) is simply ignored, so there shouldn't even be a cast involved?
Probably doesn't make too much of a difference.

I'd avoid making the data public just because we want to keep it read-only. This abides by the OOP principle of encapsulation. The at() method gives a fast and easy way to access each member.

You mentioned this resembled an early class you wrote. What was insufficient about it?
There's no real "sufficient" or "insufficient" in my work. I'm working on NP-hard problems and every second we shave off is a victory.

I felt the previous class might slow things down because it was an object passed as a reference to the logic classes, which ultimately is unnecessary (though I'm not sure there's any overhead involved).

I've noticed that how and where I store data matters more than I'd like. For example, "tagging" certain elements so they're ignored (= less work) actually slowed the algorithm down, until I switched to an intuitively less efficient "tagging" structure.

Especially the second part of the question (two matrices vs one matrix of pairs) is one I think may have a big influence on the speed, though I fear I don't have enough low-level data to decide which form is better.
Passing by reference is actually quite fast.

If you are looking for speed, the low-level details of how parameters are passed are less important (just avoid copying data). The important part is to reduce the number of iterations or loops required to achieve a result. In my experience, small changes in syntax (e.g. i++ versus ++i) make only a marginal difference; it's the primary algorithm that really determines how efficient your program is.
> The (N x N) distance matrix will also have a (N x N) penalty matrix

What is the typical order of magnitude of N? Will 2*N*N*sizeof(double) fit into the processor cache? If it does, it doesn't matter much what you do.
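
To put rough numbers on that (the cache figures below are a ballpark for current processors, not any specific model):

#include <cstddef>
#include <iostream>

int main()
{
    const std::size_t sizes[] = { 1000, 3000, 5000, 10000 };
    for (std::size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); ++i)
    {
        const std::size_t n = sizes[i];
        const double mib = 2.0 * n * n * sizeof(double) / (1024 * 1024);
        std::cout << "N = " << n << ": " << mib << " MiB\n";
    }
    // ~15 MiB at N = 1000, ~381 MiB at N = 5000, ~1526 MiB at N = 10000;
    // typical L3 caches are a few MiB to a few tens of MiB.
}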

Otherwise, is the typical usage pattern
a. calculate distances for a large number of items, calculate penalties for a number of items (keep two separate matrices)
or is it
b. calculate distance and penalty for one item, distance and penalty for another item and so on? (keep a single matrix of pairs)

In other words, either by data organization, or by processing logic, (typically both), strive to achieve locality of reference.
http://en.wikipedia.org/wiki/Locality_of_reference

PS: std::valarray<> instead of std::vector<>?
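
For instance, valarray gives you element-wise arithmetic over the whole matrix, which maps directly onto cost[x] = distance[x] + penalty[x]. A toy sketch (sizes and values are arbitrary):

#include <cstddef>
#include <iostream>
#include <valarray>

typedef double COST;

int main()
{
    const std::size_t n = 4;                     // toy size
    std::valarray<COST> distances(1.0, n * n);   // every element 1.0
    std::valarray<COST> penalties(0.5, n * n);   // every element 0.5

    // One expression computes the whole biased-cost matrix element-wise.
    std::valarray<COST> biased = distances + penalties;

    std::cout << biased[0] << "\n";              // 1.5
}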
@Stewbond:
Passing by reference is actually quite fast.

It's not *passed* by reference. Every using class holds a reference to a Data object (even though there's only one in existence). I worry that the data being stored in another object means it's further away from the data of the "accessing" class.

@JLBorges
N varies from set to set, but realistically we can say <10000 (and probably even <5000). Obviously the performance for those larger sets is what matters. I fear cache will not be an option.

Accessing is highly erratic, so caching behavior is just generally poor.

The usage pattern is:
* When a "penalty" is accessed (or updated), its respective "distance" is accessed (but never changed). This makes me want to pair them up.
* Distances are read much more frequently than penalties. My best estimate is that 1/3rd of the "distance" reads come in combination with their "penalty" and 2/3rds come without it. This makes me want to split them up, so the 1/3rd doesn't slow down the 2/3rds.

Since distances never change, use redundancy, perhaps?

Keep a stand-alone matrix of distances and another one containing penalty-distance pairs?

Assuming that you have adequate memory to allow this, of course.
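
A sketch of that layout, reusing the conceptual names from the first post: d() serves the frequent distance-only reads from the stand-alone copy, dp() serves the combined reads from a single location, and penalty updates leave the pair's distance copy intact.

#include <vector>

typedef double COST;
typedef unsigned NODE;

struct ARCCOST { COST distance; COST penalty; };

struct CostTables
{
    std::vector<COST>    distances;          // stand-alone copy for distance-only reads
    std::vector<ARCCOST> distancepenalties;  // pairs for combined reads
    NODE n;

    COST d(NODE i, NODE j) const { return distances[i * n + j]; }

    // Distance and penalty together, fetched from one place:
    const ARCCOST& dp(NODE i, NODE j) const { return distancepenalties[i * n + j]; }

    // Penalties change; the distance stored alongside never does.
    void set_penalty(NODE i, NODE j, COST p) { distancepenalties[i * n + j].penalty = p; }
};

int main()
{
    CostTables t;
    t.n = 2;
    t.distances.assign(t.n * t.n, 1.0);
    ARCCOST init = { 1.0, 0.0 };
    t.distancepenalties.assign(t.n * t.n, init);

    t.set_penalty(0, 1, 0.25);
    const ARCCOST& a = t.dp(0, 1);
    const COST biased = a.distance + a.penalty;  // 1.25
    (void)biased;
}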
That may be crazy enough to work. The program isn't exactly memory bound (the old version takes about 400 MB on sets of 3k; I think 70 MB of that is the distance matrix, so doubling up with a penalty/distance pair matrix would take another 140 MB).

Also it's a rather easy change to make so testing it will be easier than usual. I'll post back (but it may take a few weeks to finish the new version).