"Improve your programs"

closed account (28poGNh0)
hey everyone

These days I have a fever called "Improve your programs". I have many questions and I hope to get some answers.

I: If I have a for loop like this one:

for(unsigned int i=0;i<(strlen(szBuffer)+2);i++)
and let's say that szBuffer's length is 10.

We all know the loop runs i from 0 to 11, but my head says that (strlen(szBuffer)+2) is calculated every time.

Is that right, or is (strlen(szBuffer)+2) calculated only once?

Thanks for reading.
closed account (S6k9GNh0)
Yeah, store the result of strlen and iterate using that. strlen, in most implementations, just iterates through the string until it hits a null character, then returns the length of the string, not counting the null character. Also, strlen isn't guaranteed to work if the string has no null terminator. Next.
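To see what the language itself asks for, here is a minimal sketch. It assumes a hypothetical wrapper, counted_strlen, that is not part of the original question; it forwards to strlen and counts how often it is called. Semantically, the loop condition is evaluated before every iteration, including the final check that fails:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: behaves like strlen but counts its calls.
static std::size_t g_length_calls = 0;

std::size_t counted_strlen(const char* s)
{
    ++g_length_calls;
    return std::strlen(s);
}

// Length is recomputed in the loop condition; returns the number of calls made.
std::size_t calls_when_in_condition(const char* szBuffer)
{
    g_length_calls = 0;
    for (std::size_t i = 0; i < counted_strlen(szBuffer) + 2; ++i) {
        // loop body intentionally empty
    }
    return g_length_calls;
}

// Length is computed once before the loop; returns the number of calls made.
std::size_t calls_when_cached(const char* szBuffer)
{
    g_length_calls = 0;
    const std::size_t limit = counted_strlen(szBuffer) + 2;
    for (std::size_t i = 0; i < limit; ++i) {
        // loop body intentionally empty
    }
    return g_length_calls;
}
```

For a 10-character buffer, calls_when_in_condition reports 13 calls (12 checks that pass plus the one that fails), while calls_when_cached reports 1. A plain strlen has no visible side effect like this counter, so an optimizer may drop the repeated calls when it can prove the string never changes.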
Most compilers will optimize it to only be evaluated once, though.
closed account (28poGNh0)
II: Which of these two versions is better?

#include <cstring>   // strlen
#include <iostream>  // cout, endl
using namespace std;

int main()
{
    const char *szBuffer = "string";

    cout << "First expressions" << endl;
    for(int round = 1;round < 6;round++)
    {
        for(unsigned int i=0;i<strlen(szBuffer);i++)
            cout << szBuffer[i];
        cout << endl;
    }

    cout << endl << "Second expressions" << endl;

    unsigned int var = strlen(szBuffer);

    for(int round = 1;round < 6;round++)
    {
        for(unsigned int i=0;i<var;i++)
            cout << szBuffer[i];
        cout << endl;
    }

    return 0;
}


And thanks for the reply, BTW.
Most compilers will optimize it to only be evaluated once, though.


How can they? Unless it's marked as constexpr, how can they know the function call can be safely omitted? Is it because it's a std lib function?

EDIT: I guess if it's inlined....
Most compilers will optimize it to only be evaluated once, though.

No, they can't. It is an unsafe optimization because it could potentially change the behavior of the program. The only time it can be done is when the compiler can prove that nothing in the loop body, including any function it calls, modifies the string, which is surprisingly difficult to prove.

Inlining is a different issue.
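A rough sketch of why hoisting the call can change behavior, assuming a loop body that truncates the buffer. The function names and the truncation index are made up for illustration:

```cpp
#include <cstddef>
#include <cstring>

// strlen is re-evaluated on every check, so the loop notices the truncation.
std::size_t iterations_recomputed(char* buf)
{
    std::size_t iterations = 0;
    for (std::size_t i = 0; i < std::strlen(buf); ++i) {
        if (i == 1 && std::strlen(buf) > 3)
            buf[3] = '\0';  // shorten the string to 3 characters mid-loop
        ++iterations;
    }
    return iterations;
}

// The length is captured once, so the loop keeps using the old value.
std::size_t iterations_cached(char* buf)
{
    std::size_t iterations = 0;
    const std::size_t len = std::strlen(buf);
    for (std::size_t i = 0; i < len; ++i) {
        if (i == 1 && len > 3)
            buf[3] = '\0';  // same truncation, but len is not updated
        ++iterations;
    }
    return iterations;
}
```

Starting from a writable copy of "string", the first version runs 3 iterations and the second runs 6. A compiler that silently turned the first into the second would change what the program does, which is why it has to prove the string is untouched before hoisting.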
I stand corrected - I was assuming strlen was inline.
@Techno01: Since you posted actual loops, let's look at actual compilers.

I'll modify your test this way, so that we can examine what each function compiles to separately:
#include <cstring>
#include <iostream>
using namespace std;
void f1()
{
    const char *szBuffer = "string";

    cout << "First expressions" << endl;
    for(int round = 1;round < 6;round++)
    {
        for(unsigned int i=0;i<strlen(szBuffer);i++)
            cout << szBuffer[i];
        cout << endl;
    }
}

void f2()
{
    const char *szBuffer = "string";

    cout << endl << "Second expressions" << endl;

    unsigned int var = strlen(szBuffer);

    for(int round = 1;round < 6;round++)
    {
        for(unsigned int i=0;i<var;i++)
            cout << szBuffer[i];
        cout << endl;
    }

}

int main()
{
        f1();
        f2();
}


Here's what the main loop of each function became:
gcc 4.7.2 on linux, optimized for size (-Os), because with -O3 it just unrolled both loops completely, and that's too much code to post
f1()
.L3:
        movsbl  (%rbx), %esi
        movl    $cout, %edi
        incq    %rbx
        call    operator<<(std::ostream&, char)
        cmpq    $.LC0+6, %rbx
        jne     .L3
        movl    $cout, %edi
        call    endl(ostream&)
        decl    %ebp
        je      .L1
.L2:
        movl    $.LC0, %ebx
        jmp     .L3
.L1:
f2()
.L9:
        movsbl  (%rbx), %esi
        movl    $cout, %edi
        incq    %rbx
        call   operator<<(std::ostream&, char)
        cmpq    $.LC0+6, %rbx
        jne     .L9
        movl    $cout, %edi
        call    endl(ostream&)
        decl    %ebp
        je      .L7
.L8:
        movl    $.LC0, %ebx
        jmp     .L9
.L7:


Intel 13.1, linux, regular -Ofast

f1()
..B2.4:
        xorl      %ebp, %ebp
..B2.5:
        movl      $cout, %edi
        movsbl    .L_2__STRING.0(%rbp), %esi
        call      operator<<(std::ostream&, char)
..B2.6:
        incl      %ebp
        cmpq      $6, %rbp
        jb        ..B2.5 
..B2.7:
        movl      $cout, %edi
        movl      $endl(ostream&), %esi
        call      operator<<(ostream& (*)(ostream&))
..B2.8:
        incb      %bl
        cmpb      $6, %bl
        jl        ..B2.4
f2()
..B3.5:
        movb      $0, %bpl
..B3.6:
        movzbl    %bpl, %edx
        movl      $cout, %edi
        movsbl    .L_2__STRING.0(%rdx), %esi
        call      operator<<(std::ostream&, char)
..B3.7:
        incb      %bpl
        cmpb      $6, %bpl
        jb        ..B3.6
..B3.8:
        movl      $cout, %edi
        movl      $endl(ostream&), %esi
        call      operator<<(ostream& (*)(ostream&))
..B3.9:
        incb      %bl
        cmpb      $6, %bl
        jl        ..B3.5


I could run it on another platform, for variety, but I don't expect any differences between these two loops. If you care about performance, start by getting rid of all these calls to std::endl, each of which flushes the stream.
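As a sketch of the std::endl advice, here is one way the printing could look with '\n' and a single flush at the end. The function names and the std::ostream parameter are my own additions, chosen so the output can be captured for checking:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Writes szBuffer once per round. '\n' ends the line without flushing,
// whereas std::endl would also flush the stream every time.
void print_rounds(std::ostream& out, const char* szBuffer, int rounds)
{
    for (int round = 0; round < rounds; ++round)
        out << szBuffer << '\n';
    out.flush();  // a single explicit flush once all rounds are written
}

// Hypothetical convenience wrapper that captures the output in a string.
std::string rounds_as_text(const char* szBuffer, int rounds)
{
    std::ostringstream out;
    print_rounds(out, szBuffer, rounds);
    return out.str();
}
```

Passing std::cout as the stream, with "string" and 5 rounds, should print the same five lines as the loops above, just without a flush after each line.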
#include <algorithm>  // copy
#include <iterator>   // ostream_iterator

void f3()
{
    const char *szBuffer = "string";

    cout << endl << "Third expressions" << endl;
    for(int round = 1;round < 6;round++)
    {
        copy( szBuffer, szBuffer+strlen(szBuffer),
              ostream_iterator<const char>(cout) );
        cout << endl;
    }
}

I don't know how to dump the instructions in the same format as Cubbi, but f1, f2, and f3 do compile to rather similar code nevertheless.

I recall attending "program refinement" course last millennium. Purely theoretical, mostly lambda-expressions and syntactic sugar.
closed account (28poGNh0)
Thanks @cubbi and @keskiverto for the replies.

@cubbi, you're close to understanding what I need to know.

In my head, I think that f2() is better than f1(), because in f1()

every round the second for loop must calculate strlen(szBuffer),

but in f2() it has already been calculated and assigned to a local variable.

Am I right?

Also, is it better to write std::cout instead of using namespace std;?

Thanks for reading.
In my head, I think that f2() is better than f1(), because in f1()
every round the second for loop must calculate strlen(szBuffer),
but in f2() it has already been calculated and assigned to a local variable.
Am I right?

It is exactly the same in this case on the platforms I tested, but you should be able to construct a test case in which it is exactly as you say. This test case is simply too easy: today's compilers can analyze it fully.
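One candidate for such a test case, as a sketch only: the buffer arrives through a non-const pointer and is written inside the loop, so the compiler cannot simply substitute a constant the way it did for the string literal. The function names are hypothetical, and I have not inspected the code any particular compiler generates for this:

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// strlen sits in the condition, and each store to s[i] could, as far as the
// compiler can easily tell, write a '\0' and change the length.
void upper_recompute(char* s)
{
    for (std::size_t i = 0; i < std::strlen(s); ++i)
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
}

// The length is computed once, so the loop bound is fixed regardless of
// what the optimizer manages to prove.
void upper_cached(char* s)
{
    const std::size_t len = std::strlen(s);
    for (std::size_t i = 0; i < len; ++i)
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
}
```

Both functions produce the same result here, but if the call stays in the loop, the first does work proportional to the square of the string length, which is the situation the f1()/f2() question was worried about.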

Since I'm now at work, here's some non-intel goodness:
XL C++ 11.1, on AIX, with -Os (otherwise it unrolls everything like gcc)
f1()
__L58:
    addi       r28,r0,0
    addi       r27,r31,-1
__L60:
    addi       r0,r0,6
    lbz        r4,1(r27)
    addi       r27,r27,1
    lwz        r3,T.238.std::cout(RTOC)
    ori        r0,r0,0x0000
    ori        r0,r0,0x0000
    bl         .std::operator(std::ostream&,char)
    ori        r0,r0,0x0000
    addi       r28,r28,1
    addi       r0,r0,6
    cmpl       0,0,r28,r0
    bc         BO_IF,CR0_LT,__L60
    lwz        r3,T.238.std::cout(RTOC)
    lwz        r0,0(r30)
    lwz        r11,8(r30)
    mtspr      CTR,r0
    lwz        RTOC,4(r30)
    bcctrl     BO_ALWAYS,CR0_LT
    lwz        RTOC,20(SP)
    addi       r29,r29,1
    cmpi       0,0,r29,6
    bc         BO_IF,CR0_LT,__L58
f2()
__L1410:
    addi       r28,r0,0
    lwz        r4,T.122.NO_SYMBOL(RTOC)
    addi       r29,r4,-1
    ori        r0,r0,0x0000
__L1420:
    addi       r0,r0,6
    lbz        r4,1(r29)
    addi       r29,r29,1
    lwz        r3,T.238.std::cout(RTOC)
    ori        r0,r0,0x0000
    bl         .std::operator(std::ostream&,char)
    ori        r0,r0,0x0000
    addi       r28,r28,1
    addi       r0,r0,6
    cmpl       0,0,r28,r0
    bc         BO_IF,CR0_LT,__L1420
    lwz        r3,T.238.cout(RTOC)
    lwz        r0,0(r31)
    lwz        r11,8(r31)
    mtspr      CTR,r0
    lwz        RTOC,4(r31)
    bcctrl     BO_ALWAYS,CR0_LT
    lwz        RTOC,20(SP)
    addi       r30,r30,1
    cmpi       0,0,r30,6
    bc         BO_IF,CR0_LT,__L1410

(There's no explicit call to endl in the loop code because it sneakily loaded a pointer to it into a register, r30 in f1 and r31 in f2, at the top of each function. But like Intel and gcc, it simply uses the constant 6 instead of strlen(szBuffer).)
closed account (3qX21hU5)
Just a question for the OP: are you asking these questions from a C or a C++ perspective? If it's from a C++ perspective, why not use strings? Just curious.
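For what the std::string route might look like, here is a small sketch. The function name and the choice to return the text instead of printing it are assumptions made for illustration:

```cpp
#include <sstream>
#include <string>

// Builds the same output as the loops in this thread, using std::string.
// size() returns the stored length (constant time since C++11), so calling
// it in the loop condition does not rescan the characters like strlen does.
std::string repeat_lines(const std::string& text, int rounds)
{
    std::ostringstream out;
    for (int round = 0; round < rounds; ++round) {
        for (std::string::size_type i = 0; i < text.size(); ++i)
            out << text[i];
        out << '\n';
    }
    return out.str();
}
```

With std::string, the question of whether the length is recomputed on every check mostly goes away, because the length is kept alongside the characters.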
I prefer to write something like:
const size_t bfSize(strlen(szBuffer));
for(size_t i= 0; i < bfSize; ++i) {
    /*...*/
}

This way:
1) Anybody reading your code knows that bfSize is constant and will not change during loop execution.
2) I will not break optimization and/or change program behavior by adding something like strncat(szBuffer, /*...*/) inside the loop. (However, worrying too much about optimization in that case is probably an example of premature optimization.)