An interesting optimization that really does make a difference

I know for some that this will seem obvious, but I wanted to share something that I found quite surprising.

I often read advice on "optimizing" your program, and a lot of the time it's bogus: either false or hardly relevant (whether it ever was, or just isn't with today's machines). One thing that really did make a significant difference for me, though, was trying out something I read in an article called "Data Locality".

In a nutshell, processors have gotten much faster, but RAM still lags behind. Because of this, performance-critical programs should organize their data so the CPU doesn't waste cycles waiting for data to arrive from memory.

Basically, I changed a tile map editor I was working on from iterating through a vector of tile objects that each wrap an sf::Sprite to iterating through a renderable component that is kept in sync with the tiles.

Sort of in short

for(...)
    m_tiles[i].draw(window); // window.draw(m_sprite)


to

// Have a contiguous array
m_sprites = new sf::Sprite[someSize];

void MapRenderableComponent::draw(Map * map)
{
    for(...)
        window.draw(m_sprites[i]);
}



It's by no means complicated. I just thought I'd share because this brought me from a maximum of 150 FPS to 220 FPS.

Intel i5 dual-core processor
6 GB of DDR3 RAM
Intel(R) HD Graphics Family 1696 MB <- I hate this integrated card
What's the value of someSize? I doubt it'd make much of a difference unless either sizeof(m_tiles[i]) or someSize was really large.
I think the issue is the difference between blah.draw(meh) and meh.draw(blah)
@helios

You're correct, but it actually doesn't really matter whether I'm allocating 100x100 tiles, or 500x500 tiles, the frame rate isn't much different because I don't draw anything outside of the view.
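To illustrate why the map size stops mattering once off-view tiles are skipped, here is a minimal sketch of working out the visible tile range. Everything in it (the TileRange struct, visibleTiles, and its parameters) is hypothetical and not taken from the editor's actual code; it only assumes a square tile size in pixels and a view rectangle.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical half-open range of tile indices:
// columns [firstCol, endCol) and rows [firstRow, endRow).
struct TileRange {
    int firstCol, endCol;
    int firstRow, endRow;
    int count() const { return (endCol - firstCol) * (endRow - firstRow); }
};

// Work out which tiles overlap the view rectangle, clamped to the map bounds.
// All names are illustrative; positions and sizes are in pixels.
TileRange visibleTiles(float viewLeft, float viewTop,
                       float viewWidth, float viewHeight,
                       float tileSize, int mapCols, int mapRows)
{
    assert(tileSize > 0.0f);

    // Clamp an index into [0, upper].
    auto clampTo = [](int value, int upper) {
        return std::max(0, std::min(value, upper));
    };

    TileRange r;
    r.firstCol = clampTo(static_cast<int>(std::floor(viewLeft / tileSize)), mapCols);
    r.endCol   = clampTo(static_cast<int>(std::ceil((viewLeft + viewWidth) / tileSize)), mapCols);
    r.firstRow = clampTo(static_cast<int>(std::floor(viewTop / tileSize)), mapRows);
    r.endRow   = clampTo(static_cast<int>(std::ceil((viewTop + viewHeight) / tileSize)), mapRows);
    return r;
}
```

With an 800x600 view and 32-pixel tiles, a 100x100 map and a 500x500 map yield the same number of tiles to draw, which is consistent with the frame rate barely changing between them.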

@LB
Do you think just passing a reference of sf::RenderWindow would do that?
I think that something might be going on with the inlining of the function call:
Austin J wrote:
for(...)
    m_tiles[i].draw(window); // window.draw(m_sprite)
What is decltype(m_tiles[i])? Is it a polymorphic type? What is decltype(m_tiles)? Is it a standard library container or your own concoction?
Yes, that's what I was doing because I didn't inherit from sf::Drawable, but I didn't think it'd make a difference of 70 fps. I suppose I'd have to do more testing to be sure.
@LB

No polymorphism; any differences between tiles are data driven.

m_tiles is my own thing. I do use STL containers, I just don't use one there.

To be honest, I don't have a good knowledge of compilers, so I wouldn't have any clue how one determines when it wants to inline (I believe it can even decline an inline request from the programmer?)
Your first post isn't exactly clear on the changes you made - do you only change the body of the loop, or do you change how the sprite list is stored too?
The correct way to look at it would be to say that it made a difference of 2.12 ms, or of 46%. An increase of 70 fps is not the same when the base framerate is 10 fps as when it's 3000 fps.
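The arithmetic behind those figures can be sketched as follows. The function names are made up for illustration; the only inputs are the 150 and 220 FPS figures from the first post.

```cpp
#include <cassert>

// Milliseconds spent on one frame at a given frame rate.
inline double frameTimeMs(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Milliseconds saved per frame when going from oldFps to newFps.
inline double savedMs(double oldFps, double newFps)
{
    return frameTimeMs(oldFps) - frameTimeMs(newFps);
}

// Relative gain in frames per second, in percent.
inline double speedupPercent(double oldFps, double newFps)
{
    assert(oldFps > 0.0);
    return (newFps / oldFps - 1.0) * 100.0;
}
```

savedMs(150, 220) comes out to roughly 2.12 ms and speedupPercent(150, 220) to a little under 47%. The same +70 fps starting from 10 fps saves 87.5 ms per frame, while starting from 3000 fps it saves less than 0.01 ms.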
@LB

Sorry, I could have worded it a bit better.

MapRenderableComponent holds a contiguous array of sf::Sprites on the heap. Previously each tile just held an sf::Sprite as a member.

The change is actually mostly focused on the storage, not the loop.
The only time that should make a difference is if your tiles hold more than just the sprite. I guess it's reasonable to assume they do, though.
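A rough sketch of that point, assuming the tiles carry extra data alongside the sprite. Sprite and Tile here are hypothetical stand-ins (not sf::Sprite or the actual tile class), and the extra members are invented purely to show the effect on stride.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for sf::Sprite: only the data a draw loop would read.
struct Sprite {
    float x;
    float y;
    int textureId;
};

// Hypothetical "fat" tile: the sprite plus editor data the draw loop never touches.
struct Tile {
    Sprite sprite;
    char name[64];
    int collisionFlags;
    int layer;
};

// Walking an array of Tile advances sizeof(Tile) bytes per step, so much of
// each cache line fetched is data the loop ignores. Walking an array of
// Sprite advances only sizeof(Sprite) bytes per step.
static_assert(sizeof(Sprite) < sizeof(Tile),
              "a packed sprite array has the smaller stride");

// Copy each tile's sprite into a separate contiguous array.
std::vector<Sprite> packSprites(const std::vector<Tile>& tiles)
{
    std::vector<Sprite> sprites;
    sprites.reserve(tiles.size());
    for (const Tile& tile : tiles)
        sprites.push_back(tile.sprite);
    assert(sprites.size() == tiles.size());
    return sprites;
}
```

If Tile held nothing but the sprite, the two layouts would have the same stride and the packed array would buy nothing, which is the caveat above.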