An interesting optimization that really does make a difference

Oct 27, 2014 at 6:34pm
I know this will seem obvious to some, but I wanted to share something that I found quite surprising.

I often read advice on "optimizing" your program, and a lot of the time it's bogus: either false or hardly relevant (whether it ever was relevant, or just isn't nowadays with today's machines). One thing that really did make a significant difference for me, though, was trying out something I read in an article called data locality.

In a nutshell, it's about how processors have gotten much faster while RAM still lags behind. Because of this, performance-critical programs should organize their data so the CPU doesn't waste cycles waiting for data to arrive from memory.

Basically, I changed a tile map editor I was working on from iterating through a vector of tile objects that wrap sf::Sprite, to iterating through a renderable component that is kept in sync with the tiles.

In short, from

for(...)
    m_tiles[i].draw(window); // calls window.draw(m_sprite) internally


to

// Have a contiguous array
m_sprites = new sf::Sprite[someSize];


void MapRenderableComponent::draw(Map * map)
{
    for(...)
        window.draw(m_sprites[i]);
}
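
For anyone trying to picture the layout, here is a minimal sketch of one way it could look. It is not my exact code: `Sprite` is a stand-in struct so it compiles without SFML, and `Tile`, `spriteIndex`, `sync` and the `drawFn` callback are hypothetical names made up for the example. `std::vector` is used in place of `new[]`, which gives the same contiguous storage.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for sf::Sprite so the sketch builds without SFML (assumption).
struct Sprite {
    float x = 0.f;
    float y = 0.f;
    int textureId = 0;
};

// Hypothetical tile: gameplay data only, plus an index into the sprite array.
struct Tile {
    int type = 0;
    bool solid = false;
    std::size_t spriteIndex = 0;
};

// Hypothetical component that owns every sprite in one contiguous block.
class MapRenderableComponent {
public:
    explicit MapRenderableComponent(std::size_t count) : m_sprites(count) {}

    // Copy the render-relevant state of a tile into its sprite.
    void sync(const Tile& tile, float x, float y) {
        Sprite& s = m_sprites[tile.spriteIndex];
        s.x = x;
        s.y = y;
        s.textureId = tile.type;
    }

    // Visit sprites in memory order; drawFn stands in for window.draw(sprite).
    template <typename DrawFn>
    void draw(DrawFn drawFn) const {
        for (const Sprite& s : m_sprites)
            drawFn(s);
    }

    std::size_t size() const { return m_sprites.size(); }

private:
    std::vector<Sprite> m_sprites; // contiguous, like new Sprite[someSize]
};
```

The draw loop only ever walks `m_sprites`, so whatever else a tile carries never gets pulled through the cache while rendering.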



It's by no means complicated. I just thought I'd share because this brought me from a maximum of 150 FPS to 220 FPS.

Intel i5 dual-core processor
6 GB of DDR3 RAM
Intel(R) HD Graphics Family 1696 MB <- I hate this integrated card
Oct 27, 2014 at 6:51pm
What's the value of someSize? I doubt it'd make much of a difference unless either sizeof(m_tiles[i]) or someSize was really large.
Oct 27, 2014 at 7:26pm
I think the issue is the difference between blah.draw(meh) and meh.draw(blah)
Oct 27, 2014 at 7:46pm
@helios

You're correct, but it doesn't really matter whether I allocate 100x100 tiles or 500x500 tiles; the frame rate isn't much different because I don't draw anything outside of the view.
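
To illustrate what skipping everything outside the view can look like, here is a rough sketch. It assumes square tiles on a grid that starts at the world origin and an axis-aligned, unrotated view; `TileRange` and `visibleTiles` are hypothetical names, not SFML functions.

```cpp
#include <algorithm>
#include <cmath>

// Half-open range of tile indices: [firstCol, lastCol) x [firstRow, lastRow).
struct TileRange {
    int firstCol, lastCol, firstRow, lastRow;
    int count() const { return (lastCol - firstCol) * (lastRow - firstRow); }
};

// left/top/width/height describe the view rectangle in world pixels.
TileRange visibleTiles(float left, float top, float width, float height,
                       float tileSize, int mapCols, int mapRows)
{
    // Clamp an index into [0, hi] so the range never leaves the map.
    auto clampTo = [](int v, int hi) { return std::max(0, std::min(v, hi)); };

    TileRange r;
    r.firstCol = clampTo(static_cast<int>(std::floor(left / tileSize)), mapCols);
    r.firstRow = clampTo(static_cast<int>(std::floor(top / tileSize)), mapRows);
    r.lastCol  = clampTo(static_cast<int>(std::ceil((left + width) / tileSize)), mapCols);
    r.lastRow  = clampTo(static_cast<int>(std::ceil((top + height) / tileSize)), mapRows);
    return r;
}
```

With an 800x600 view and 32-pixel tiles this covers 25 x 19 tiles whether the map is 100x100 or 500x500, which is why the map size barely shows up in the frame rate.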

@LB
Do you think just passing a reference to sf::RenderWindow would do that?
Oct 27, 2014 at 7:56pm
I think that something might be going on with the inlining of the function call:
Austin J wrote:
for(...)
    m_tiles[i].draw(window); // window.draw(m_sprite)
What is decltype(m_tiles[i])? Is it a polymorphic type? What is decltype(m_tiles)? Is it a standard library container or your own concoction?
Last edited on Oct 27, 2014 at 7:57pm
Oct 27, 2014 at 7:57pm
Yes, that's what I was doing because I didn't inherit from sf::Drawable, but I didn't think it'd make a difference of 70 fps. I suppose I'd have to do more testing to be sure.
Oct 27, 2014 at 8:09pm
@LB

No polymorphism; any differences between tiles are data-driven.

m_tiles is my own thing. I do use STL containers, I just don't use one there.

To be honest, I don't have a good knowledge of compilers, so I wouldn't have any clue how one determines when to inline (I believe they can even decline an inline request from the programmer?).
Oct 27, 2014 at 8:22pm
Your first post isn't exactly clear on the changes you made - do you only change the body of the loop, or do you change how the sprite list is stored too?
Oct 27, 2014 at 8:27pm
The correct way to look at it would be to say that it made a difference of 2.12 ms, or of 46%. An increase of 70 fps is not the same when the base framerate is 10 fps as when it's 3000 fps.
Oct 27, 2014 at 8:28pm
@LB

Sorry, I could have worded it a bit better.

MapRenderableComponent holds a contiguous array of sf::Sprites on the heap. Previously, each tile just held an sf::Sprite as a member.

The change is actually mostly focused on the storage, not the loop.
Oct 28, 2014 at 12:14am
The only time that should make a difference is if your tiles hold more than just the sprite. I guess it's reasonable to assume they do, though.
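
A small sketch of why the extra members matter. The types below are made up for illustration: `Sprite` is a stand-in rather than the real sf::Sprite, and the extra fields in `FatTile` are hypothetical.

```cpp
#include <cstddef>

// Stand-in for the data the renderer actually needs (assumption).
struct Sprite {
    float transform[16];
};

// A tile that holds nothing but its sprite: same size as the sprite itself.
struct SpriteOnlyTile {
    Sprite sprite;
};

// A tile that also carries data the draw loop never reads.
struct FatTile {
    Sprite sprite;
    int type;
    int flags;
    char scriptName[56];
};

// Bytes of memory the loop strides across when visiting `count` elements.
constexpr std::size_t bytesSpanned(std::size_t count, std::size_t stride)
{
    return count * stride;
}
```

Iterating an array of `SpriteOnlyTile` covers exactly as much memory as iterating an array of `Sprite`, so splitting the sprites out gains nothing there. With `FatTile`, every sprite is separated from the next by the unused members, so the same draw loop has to pull more cache lines for the same number of sprites.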