Howdy everybody,
As I've been contributing to this website since 2008, I thought I'd post the kind of bug I work with.
My work is primarily in high-performance, multi-platform scientific development. I design and develop population/abundance/biomass models.
With our preference movement method, each cell NxM in the model has a preference for its population moving to cell N'xM'. In this particular model the number of preference calculations was over 300 million, so I decided to cache them whenever the model could determine that all of the supporting values would be static for the iteration.
After caching them I saw a minor change in the results coming from the model. So I put in a bunch of checks to compare the real-time values to the cached values with an acceptable difference of 1e-12. All 300 million+ checks passed with no issues, but the results were still different... strange.
So I put in a straight double == double comparison and got about 1,000 mismatches out of the 300 million+ values. Not too bad, but the printed values were identical, so I knew the variance was tiny (less than 1e-12). Unfortunately, even with this tiny variance the model output changed quite a bit.
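For anyone following along, the two kinds of check described above can be sketched roughly like this. It is only a minimal illustration, not the model's actual code: the names withinTolerance, exactlyEqual, MismatchCounts and compareValues are made up for this example, and the values are assumed to sit in two flat vectors.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when a and b differ by no more than the
// absolute tolerance (1e-12 is the threshold mentioned in the post).
bool withinTolerance(double a, double b, double tolerance = 1e-12) {
    return std::fabs(a - b) <= tolerance;
}

// Hypothetical helper: the "straight double == double" check.
bool exactlyEqual(double a, double b) {
    return a == b;
}

// Counts how many pairs fail each kind of check.
struct MismatchCounts {
    std::size_t outsideTolerance = 0;
    std::size_t notExactlyEqual = 0;
};

// Compares real-time values against cached values pair by pair.
// Only the common prefix is compared if the sizes differ.
MismatchCounts compareValues(const std::vector<double>& live,
                             const std::vector<double>& cached) {
    MismatchCounts counts;
    const std::size_t n = std::min(live.size(), cached.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (!withinTolerance(live[i], cached[i])) {
            ++counts.outsideTolerance;
        }
        if (!exactlyEqual(live[i], cached[i])) {
            ++counts.notExactlyEqual;
        }
    }
    return counts;
}
```

A pair such as 1.0 and std::nextafter(1.0, 2.0) passes the tolerance check but fails the exact one, which is the situation described above: zero tolerance failures, yet a non-zero count of exact mismatches.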
After generating txt debug files of ~500 MB each that were not comparable, and some of around 50 MB each in which ~5-10 values differed between them, I was very confused.
I went through my calculations line by line and they were perfect. After a night's sleep and a cup of coffee I figured it out; some quick code changes to verify, and it was solved.
The problem.
// Bad Code
void buildCache() {
    // Sized up front so that runningTotals[i][j] is a valid element
    vector<vector<double>> runningTotals(world_height, vector<double>(world_width));
    for (int i = 0; i < world_height; ++i) {
        for (int j = 0; j < world_width; ++j) {
            double runningTotal = 0.0;
            for (int k = 0; k < cell_height; ++k) {
                for (int l = 0; l < cell_width; ++l) {
                    cache[i][j][k][l] = cell_value;
                    runningTotal += cell_value;
                }
            }
            runningTotals[i][j] = runningTotal; // This value was off by <1e-12
        }
    }
}

// Good Code
void buildCache() {
    // Sized up front so that runningTotals[i][j] is a valid element
    vector<vector<double>> runningTotals(world_height, vector<double>(world_width));
    for (int i = world_height - 1; i >= 0; --i) {
        for (int j = world_width - 1; j >= 0; --j) {
            double runningTotal = 0.0;
            for (int k = cell_height - 1; k >= 0; --k) {
                for (int l = cell_width - 1; l >= 0; --l) {
                    cache[i][j][k][l] = cell_value;
                    runningTotal += cell_value;
                }
            }
            runningTotals[i][j] = runningTotal; // This value now matches the real-time value
        }
    }
}
Now, considering that
- The code works perfectly
- This code is only illustrative, but it does show the problem
- Everything works, so it's not syntax (e.g. I haven't missed an allocation)
Who thinks they know what the cause was? I'll give you guys a chance to figure out what it was before I post the actual cause.
FYI: Using the code I've posted you should be able to replicate the issue.