Does strtok() change a string passed by value?

I am passing a string by value to func(), which uses the strtok() function.
But when I print the string back in main(), it has been modified!
Here is the code:

bool func(string s2)
{   
    char *p = strtok(const_cast<char*>(s2.c_str()), " ");
    return true;
}

int main()
{   
    char buf[100];

    string s1 = "add 20";

    bool check = func(s1);
    cout << s1 << endl;
    
    return 0;
} 


I have two questions
1) Is s1 not passed by value to func()? Is a copy of s1 not created in this case?
2) If s1 is passed by value, how can the changes made by strtok() in func() affect s1 in main()?

Any help is highly appreciated. Thanks :)
strtok() does modify the array pointed to by the passed pointer, so line 3 (the strtok() call in func()) makes the behavior of the program undefined. Writing through a pointer whose constness has been cast away causes undefined behavior.
There's no point in reasoning about what's happening because the compiler is free to generate code that violates the guarantees normally provided by the language.
@helios

Thanks for replying :)

As you mentioned:
"strtok() does modify the array pointed to by the passed pointer" -- even if it modifies the array pointed to by the passed pointer, it should modify the array of s2 and not s1, as s1 is passed by value.

I think s2 is a copy of s1, and even if strtok() modifies the array, s2 should be modified and not s1.

Please let me know if I am wrong when I say that s2 is a "copy" of s1 and hence any change to s2 or its array representation should not affect s1.

Thanks
There is no undefined behaviour here; the original object being modified is not a const object.

The contents of the string s2 would be modified, but s2 is never accessed after that; its lifetime ends when the function exits. The code is perfectly fine.


> Is s1 not passed by value to func()? Is a copy of s1 not created in this case?

Yes, s1 is passed by value and a copy is created.


> 2) If s1 is passed by value, how can the changes made by strtok() in func() affect s1 in main()?

It can't. s1 in main() would be unchanged.

#include <iostream>
#include <string>
#include <cstring>
#include <iomanip>
#include <cassert>

bool func( std::string s2 )
{
    char *p = strtok( const_cast<char*>( s2.c_str() ), " " );
    return true;
}

int main()
{
    std::string s1 = "add 20";
    std::cout << std::quoted(s1) << '\n' ;

    bool check = func(s1);
    std::cout << std::quoted(s1) << '\n' ;
    assert( s1[3] == ' ' ) ;
}

http://coliru.stacked-crooked.com/a/20713b0a38e7ee82
> I think s2 is a copy of s1 and even if strtok() modifies array, s2 should be modified and not s1.

Like I said, the behavior of the program is undefined. It's not possible to reason about the behavior of a program with undefined behavior, at least not without understanding the internal workings of the compiler.

For example, suppose I'm an optimizing compiler and I'm compiling func(). Here's a possible line of reasoning that the language allows me to follow:
1. I see that func() takes a non-trivial object by value.
2. Can the copy be omitted?
3. What members of std::string are used?
4. Answer to 3: ["std::string::c_str() const"]
5. All members of std::string used in func() are const function members.
6. Answer to 2: Since the formal argument is never modified, the copy is redundant.
7. Replace the pass-by-value with a pass-by-reference-to-const.

The const_cast<char*>(s2.c_str()) bit is just wrong. If you want to pass a pointer to a std::string's internal array to a function that may modify what it points to, you should use &s2[0] instead. But note that s2.size() should be at least 1.
JLBorges wrote:
> It can't. s1 in main() would be unchanged.

output on current gcc https://wandbox.org/permlink/KOWwvbdE4uhWRNCi
"add 20"
"add 20"

output on gcc 4.8.5 (and earlier) https://wandbox.org/permlink/Yj0JtD9hlfUTLIUG
prog.exe: prog.cc:20: int main(): Assertion `s1[3] == ' '' failed.
add 20
add20
Aborted


Modifying any char through the pointer returned by c_str() was undefined behavior before C++11 (and formally still is). In practice, with pre-C++11 GNU libstdc++, it modified the shared copy-on-write representation of the string and therefore all of its copies.

To work around this, obtain a writable pointer using &s2[0], or just don't use strtok().
Thanks a lot, helios, for the detailed explanation.
Thanks Cubbi for adding further :)
Thanks, Cubbi!