Using strings in switch statement

Hi,

I recently had this idea of how to use strings directly in a switch statement without too much clutter. See below.

My question is: is it a bad idea to use it in production code?
It somehow feels wrong to do that, yet I can't really see a big no-go for that concept. There's of course the performance issue with the hash algorithm in the background. Though, I can't find a good place for caching.

Any opinions?

#include <cstddef>
#include <iostream>
#include <string>

constexpr int KISShash(const char* str, size_t len)
{
    
    // put some super-simple hash algorithm here
    
    // dummy algorithm
    int sum = 0;
    for(size_t i=0; i<len; ++i) sum += str[i];

    return sum;    
}

// string literal suffix to create hash at compile time
constexpr int operator""_do (const char* str, size_t len)
{
    return KISShash(str, len);
}

// function to create hash at runtime
int sw(const std::string& str)
{
    return KISShash(str.c_str(), str.length());
}


int main()
{
    
    std::string str = "foo";
    
    switch(sw(str)){
        
        case "foo"_do:
            std::cout << "hello world\n";
            break;
            
        case "bar"_do:
            // do something cool
            break;
            
        case 42:
            // do whatever
            break;
    }

    return 0;
}
I think everyone has done something like that at some point.

Because it’s cool, that’s why!

But in actual production code you are adding complexity and, importantly, potential bugs to code to do something that Shouldn’t Happen™. In other words, letting this be needed (wanted?) is a design flaw.

For example, with the code you have, what if the code-user decides he could use a hug:

    std::string str = "hug";
    
    switch(sw(str)){
        
        case "foo"_do:
            std::cout << "hello world\n";
            break;
            
        case "bar"_do:
            // do something cool
            break;
            
        case 42:
            // do whatever
            break;
    }

You would expect the code to refuse him, but it feels bad for him and gives him a hello world instead. (Try it!)

Or, what if we instead decide to make it possible to enjoy a good barbeque?

    std::string str = "bbq";
    
    switch(sw(str)){
        
        case "foo"_do:
            std::cout << "hello world\n";
            break;
            
        case "bar"_do:
            // do something cool
            break;
            
        case "bbq"_do:
            // eat something hot
            break;
            
        case 42:
            // do whatever
            break;
    }

He won't get either one, because this version does not compile: "bar"_do and "bbq"_do both hash to 309, and two case labels with the same value in one switch are an error. (Try it!)
That is actually the lucky case, since the compiler catches collisions between the labels themselves. It cannot catch a collision between a label and whatever string turns up at runtime, which is what happened with the hug.


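The best way is to use an actual lookup table instead, one that compares the whole key and so knows how to deal with collisions, because all hashes collide. (std::map, used here, is a sorted tree; std::unordered_map is the hash table version. Either works.)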

std::map <std::string, some_function> string_to_function
{
  { "foo", say_hello },
  { "bar", do_something_cool },
  { "hug", get_a_hug }
};

if (string_to_function.count( "hug" )) string_to_function["hug"]();


Hope this helps.
Haha -- Grey Wolf -- I was thinking of that same exact thread before I saw your link, because I remember making it!
I think everyone has done something like that at some point.

Yep. I've seen a few. I just liked the idea to use string suffix overloading to make it pretty neat.

Don't get carried away by my "sum up all chars" example. It's of course just placeholder. I didn't want to put in an actual hash function.

There's an interesting one in Grey Wolf's version, I just noticed.

I like - and used - the function pointer in std::map idea, but colleagues of mine aren't too hot on associative containers and function pointers, so I was looking for something more intuitive. (Hence the "no clutter" approach)
It's of course just placeholder.

Yes, clearly. The problem is that you simply cannot compress any random string to an integer without losing information, meaning you will have collisions.

If you have a specific use case where there are very strict restrictions on the strings allowed, then you can design a hash which will not collide. (Google around “perfect hash”.)

The problem is guaranteeing that the strings are part of the perfect hash.
This might be worth some play time. :^J


but colleagues of mine aren't too hot on associative containers and function pointers
Your colleagues are not good programmers.

Programmers with religious hobbies or always/never rules or gut feelings are the kinds that screw your code base over.

Associative arrays are one of — if not the — most important and influential data structures in all of computer programming history. Eschewing them for <unknown reasons> is foolish.

An array of function pointers is no more dangerous or weird than dynamic dispatch, one of the central characteristics of OOP.


Hence the "no clutter" approach

IMNSHO that is bullcrap language your colleagues have brainwashed into you. It is a logically-dubious way of hand-waving away “don’t want, don’t care”.

In programming parlance, “clutter” has a very specific meaning: http://wiki.c2.com/?LanguageIdiomClutter. For less technical purposes, it is an antonym of “clean”: http://catb.org/jargon/html/C/clean.html

In both cases, the idea is that unnecessary things are being introduced into a design, making it not simple/elegant/whatever inexact metric one wants to use.

However, I suspect that what your colleagues mean is: “don’t introduce extra functions I have to look up when a local block will do.”

I personally think the jump table looks cleaner, and matches a lot of my switch statement use anyway:

switch (x)
{
  case A: do_one(); break;
  case B: do_two(); break;
  case C: do_three();
}

std::map <X, std::function <void()> > action
{
  { A, do_one },
  { B, do_two },
  { C, do_three },
};
if (action.count( x )) action[x]();

Those both look very nice and clean to me.

─────────────────────────────────────────────────────────────────


I was about to hit “Submit” but then figured I could whip up something interesting, showing that it is entirely possible and cleanly efficient to make a “non-clutter” (however you want to define that) string switch. So... here it is:

#include <functional>
#include <string>
#include <unordered_map>

#define sswitch(s)    { std::string _s = s; bool _continue = false; std::unordered_map <std::string, std::function <void()> > ss {
#define scase(s)      }, { s,    [&]()
#define sdefault      }, { "\1", [&]()
#define sbreak        return
#define scontinue     _continue = true; return
#define sgoto(s)      if (ss.count( s )) {ss[s](); _s = ""; sbreak; }
#define sgoto_default sgoto("\1")
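// The braces after sswitch(s) create an empty placeholder entry keyed "", so skip it when _s is empty.
#define end_sswitch   }; if (ss.count( _s ) && ss[_s]) ss[_s](); else if (ss.count( "\1" )) ss["\1"](); if (_continue) continue; }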

This looks and behaves very much like a standard switch statement, but does a proper switch(string) without mis-matches.

There are only three significant differences:

  • There is no automatic fall-through.
     Use sbreak only to escape early.
     Use sgoto to perform what fall-through would accomplish.

  • To continue on the next iteration of a loop, you must use scontinue.

  • You must use end_sswitch to terminate the switch statement.
     This is a consequence of RLM (radical language modification).

Since the case strings are all known at compile time, you even have the option to build a constexpr perfect hash from them. (As written, the std::unordered_map is constructed at run time each time the switch is entered.) Replace the std::unordered_map with something appropriate and verify in the end_sswitch test that the matched key actually equals your string. (Google around “C++ perfect hash map” for things people have written.)

Finally, an example. Notice how closely it mirrors an integer switch.

#include <algorithm>
#include <cctype>
#include <chrono>
#include <ciso646>
#include <iostream>
#include <limits>
#include <string>
#include <thread>

#include "string-switch.hpp"

int main()
{
  bool done = false;

  std::cout << 
    "Welcome to the MENU program.\n"
    "First-letter abbreviations are OK. Whitespace and letter case are ignored.\n";

  while (!done)
  {
    std::cout <<
      "\nMENU\n"
      "eat\n"
      "sleep\n"
      "derp\n"
      "quit\n"
      "? ";
    std::string s;
    if (!getline( std::cin, s )) break;  // stop on EOF instead of looping forever
    std::cout << "\n";

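    // Remove whitespace (unsigned char: std::isspace is undefined for negative values)
    s.erase( std::remove_if( s.begin(), s.end(), []( unsigned char c ) { return std::isspace( c ) != 0; } ), s.end() );
    // tolower( s )
    for (char& c : s) c = static_cast <char> ( std::tolower( static_cast <unsigned char> ( c ) ) );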


    sswitch (s)
    {
      scase("eat")
      {
        std::cout << 
         "The Elvis:\n"
         "Peel and slice a banana into 1/4\" pieces.\n"
         "Apply liberally to a Peanut Butter sandwich.\n"
         "Enjoy!\n";
      }

      scase("sleep")
      {
        std::cout << "how long (seconds)? ";
        int n;
        (std::cin >> n).ignore( std::numeric_limits <std::streamsize> ::max(), '\n' );
        if ((n < 0) or (n > 5))
        {
          std::cout << "No sleep for you!\n";
          sbreak;
        }
        std::cout << "sleeping..." << std::flush;
        std::this_thread::sleep_for( std::chrono::seconds( n ) );
        std::cout << "...awake!\n";
      }

      scase("derp") { sgoto_default; }

      scase("quit") { done = true; scontinue; }

      scase("e")    { sgoto("eat");   }
      scase("s")    { sgoto("sleep"); }
      scase("d")    { sgoto("derp");  }
      scase("q")    { sgoto("quit");  }

      sdefault { std::cout << "Um, not an option."; }
    }
    end_sswitch


    std::cout << "\nTry another one!\n";
  }

  std::cout << "Good bye!\n";
}

Enjoy!
It is not unusual at all to have a constant set of data, and learning to make a perfect hash of it is worth a few hours if it's for high-performance code.

For strings or substrings up to 8 bytes long, you can copy the bytes directly into a 64-bit integer and derive an index from that. Random generators are sweet for this stuff too: seed your generator with some integer built from the data, take the first generated value, and mash that into an index. That works extremely well in my experience, and it's fast.

As far as production code goes, it depends on your shop, your needs, etc. If you need high-performance code, then you have to relax your bans on some code blocks and approaches. If you just want clean code that any new guy straight out of school can read, edit and modify safely, no matter how slowly it runs, banning such things may be sensible.

Whatever happens, it's not worth your job to do it the right way if the right way is banned by dogmatic know-it-alls, but this may not be a great place to work forever either...

You can use Boolean expressions to emulate the switch and hash etc. without all the extras as well.
functionpointer fp[] = {defaultfx, foofx, barfx};
functionpointer f = fp[(int)(s=="foo") + (int)(s=="bar")*2];
which says that if s is foo, pick index 1, if s is bar, pick index 2, else pick index 0.
Just exploit that Boolean expressions are 0 or 1 and math out the mapping you need from that.
The above is sort of playing compiler (it's doing by hand what a switch does). But it directly avoids the switch problem (has to be integer type) and the contortions needed to make non-integers usable in a switch. It can do switch fall-through too, but the logic gets nasty for complex needs.




@Duthomhas

... don’t introduce extra functions I have to look up...

That's exactly the situation. I have to succumb to the laziness of people who refuse to learn.

I actually like your sswitch() idea. Probably not for the problem at hand because I guess someone will hit me with a keyboard when I introduce those #define's.
Personally, I kinda like the preprocessor, though.

@jonnin
... if the right way is banned by dogmatic know-it-alls ...

If there were any know-it-alls. They're don't-wanna-learn-mores.
To solve your specific problem, take the easy way: an if..else if sequence. It will work, it will not likely have any significant disadvantage in speed, and it will pass code review.

Sort your strings in order from most likely to least likely needed. Then just chain them. It mirrors a switch visually and Just Works™.

  if (s == "foo")
  {
    std::cout << "Hello world!\n";
  }
  else if (s == "bar")
  {
    std::cout << "Awesomeness quotient + 1\n";
  }
  else // default
  {
    std::cout << "Whatever, dude.\n";
  }

There honestly is nothing wrong with code like this. And, importantly, it should get past the moron detection too.

Good luck!