If the API absolutely must not break, the implementation of account::name() can be changed behind a pimpl, perhaps with TLS (thread-local storage). It won't be pretty, but existing code need not break.
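To make that concrete, here is a rough sketch of what I have in mind. The original account class isn't shown here, so everything in it is an assumption: I'm guessing name() used to return a const std::string&, and the first/last split and the two-argument constructor are made up purely to give the getter something to assemble. Treat it as an illustration of the trick, not as the real class.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical public interface. Only name() is assumed to be legacy API.
class account {
public:
    account(std::string first, std::string last);
    ~account();
    account(account&&) noexcept;
    account& operator=(account&&) noexcept;

    // Assumed legacy signature: callers expect a reference to a std::string.
    const std::string& name() const;

private:
    struct impl;                  // defined out of line, free to change
    std::unique_ptr<impl> pimpl_;
};

// Hypothetical new representation: no single std::string member exists anymore.
struct account::impl {
    std::string first;
    std::string last;
};

account::account(std::string first, std::string last)
    : pimpl_(new impl{std::move(first), std::move(last)}) {}
account::~account() = default;
account::account(account&&) noexcept = default;
account& account::operator=(account&&) noexcept = default;

const std::string& account::name() const {
    assert(pimpl_ != nullptr);    // a moved-from account has no state
    // One cache per thread, shared by every account object on that thread.
    // The returned reference is only good until the next call to name()
    // on the same thread, which is the "not pretty" part.
    thread_local std::string cache;
    cache = pimpl_->first + " " + pimpl_->last;
    return cache;
}
```

Old callers that read the name right away keep working. Anything that holds on to the reference across another name() call on the same thread (for example a.name() == b.name()) silently ends up looking at the same cache, so this only buys source compatibility, not identical semantics.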
I found the code for that COW article:
http://www.gotw.ca/gotw/045code.zip
It just needed to be tweaked a little bit to build on VS 2015.
I find it interesting that I can't make the results budge no matter how high I crank the string size, even though I can see the program taking longer. I'm going to play around with it to see what exactly it's measuring.
EDIT: Here are my results. I tested in x86-64 using MSVC 14.0, with optimizations turned on, on an Ivy Bridge Core i5 3570.
1. With one exception, at small string sizes my results match those in
http://www.gotw.ca/gotw/045.htm
2. COW_AtomicInt is, overall, the fastest of the COW implementations. When it loses to COW_AtomicInt2, it loses by a small margin (10-20%).
3. At large string sizes (10k), COW is much faster than Plain for some operations, and somewhat slower for others. I could not test Plain_FastAlloc because it crashes for large string sizes. However, in my opinion at large string sizes the problem is not allocation, but copying.
4. Here's how COW compares with Plain:
4.a. Simple const copy: COW is between 4 and 28 times faster. Oddly, when I multiplied the iterations by 10, Plain's time went up by 10x, but COW's only went up by 2.62x. I'm guessing it's playing better with the cache. Tipping point (the string size from which COW wins): 1 character.
4.b. Append: Plain is 2.77 times faster.
4.c. const operator[](): COW is approximately infinitely slower (COW's time scales with iteration count, but Plain's time doesn't). I imagine this is caused by atomic operations being much slower.
4.d. Mutating copy (66%): COW is about 4 times faster. Tipping point: ~500 characters.
4.e. Mutating copy (50%): COW is 4.15 times faster. Tipping point: ~500 characters.
In conclusion: COW is worthwhile if you manipulate large buffers (bitmaps are the example that comes to mind from experience), especially if you need to pass them around and don't want to manually keep track of who might need write access. I imagine there's a simple way to fix the performance issue of const operator[](), but I haven't looked into it.
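For reference, this is roughly the kind of COW buffer I mean. It is a minimal sketch built on std::shared_ptr, not the code from the GotW archive; cow_buffer, set() and shares_storage_with() are names I made up for the example, and I haven't audited it for buffers that are touched from several threads at once.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical copy-on-write byte buffer (think: bitmap pixels).
class cow_buffer {
public:
    explicit cow_buffer(std::size_t size, unsigned char fill = 0)
        : data_(std::make_shared<std::vector<unsigned char>>(size, fill)) {}

    // The implicit copy constructor and assignment only copy the shared_ptr,
    // so copying a buffer bumps a reference count instead of copying bytes.

    std::size_t size() const { return data_->size(); }

    // Reads never copy the storage and never touch the reference count.
    unsigned char operator[](std::size_t i) const {
        assert(i < size());
        return (*data_)[i];
    }

    // Writes go through detach(), which copies the bytes only if shared.
    void set(std::size_t i, unsigned char value) {
        assert(i < size());
        detach();
        (*data_)[i] = value;
    }

    // Helper for the example: true while both objects use the same storage.
    bool shares_storage_with(const cow_buffer& other) const {
        return data_ == other.data_;
    }

private:
    void detach() {
        // Assumes this object is not being copied by another thread right now.
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<unsigned char>>(*data_);
        }
    }

    std::shared_ptr<std::vector<unsigned char>> data_;
};
```

Passing a cow_buffer by value then costs one atomic increment regardless of size, and the full copy only happens on the first set() of a buffer that is still shared. Nobody has to decide up front who gets write access.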