c++ goto pattern

Pages: 12
That totally opaque code comes from an old parser. A bit of formal languages theory would be enough to know how shift and reduce operate. The point of that opaque code is achieving a flow which would be hardly doable in other manners.

What if the maintenance of the parser is transferred to an acolyte that doesn't have that bit of theory?

You do vouch that goto can make code easier to read. Then you give an example that apparently isn't easy. Does that support your case?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
int parse()
{
    Token   tok;

reading:
    tok = gettoken();
    if (tok == END)
        return ACCEPT;
shifting:
    if (shift(tok))
        goto reading;
reducing:
    if (reduce(tok))
        goto shifting;
    return ERROR;
}

//====================
// without gotos and without adding a dozen more variables or functions:
int parse()
{
  Token   tok;
  while ( true ) {
    tok = gettoken();
    if (tok == END) return ACCEPT;

    while ( ! shift(tok) ) {
      if ( ! reduce(tok) ) return ERROR;
    }
}
// disclaimer: unverified correctness 
In my (not so humble) opinion, the version with goto is cleaner, far easier to understand
and way more elegant that the grotesque abomination of an if with a negated condition nested inside a while with a negated condition nested inside a while with a condition that will never be false.
If you need this level of minor performance gains provided by gotos, you probably should be writing that block of code in assembly yourself. There is no contest between very carefully written assembly and c++ ... it is often more than double the speed. It is also costly to write and maintain, and good assembly programmers are few with not a lot of youngsters taking up the flag to replace the older folks. And its a royal pain to keep up with the changes in the hardware every iteration. Goto is a hard jump, and hard jumps have a bad habit of messing up cpu pipelines, which can cost a fair number of clock cycles. It depends on the code whether this is a hit or not, and whether the alternative does the same or not, but you can't 'optimize' with them blindly.

VERY few programs need this level of optimization. When they do, the cure is to get out of the high level language entirely. I used to write exclusively embedded real time code and I think I had to go to assembly a total of 4-5 times over 20 years. Once was 1 line to endian flip some integers. Once was to tap the clock register for a high performance timer. I can't recall the others, but they were tight inner loop optimizations for something or other that wasn't cutting it on the hardware at the time (which was around 1/100th of what we have today in raw flops). Today the c++ would have done it without the assembly and had time to check the web in between.

As someone already said, don't be an assembler.

Last edited on
I can't agree with that. It's not like the goto version doesn't have those constructs. It's just disguised them to make the code appear more linear. If you think the structured version is uglier it's because it accurately represents the ugly structure of the control flow.
In my (not so humble) opinion, the version with goto is cleaner, far easier to understand and way more elegant that the grotesque abomination of an if with a negated condition nested inside a while with a negated condition nested inside a while with a condition that will never be false.


I disagree with the "condition that will never be false" statement. There are correct infinite loops throughout computer code, and declaring a while (true) makes it pleasantly obvious where one is.

As to the other points, the code code be written as follows (still adding no additional variables or functions)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
int parse()
{
  Token tok;
  while ( true ) {
    tok = gettoken();
    if (tok == END) return ACCEPT;

    while ( true ) {
      if ( shift(tok) ) break;
      if ( reduce(tok) ) continue;
      return ERROR;
    }
  }
}


The fact remains that the structure of the logic in this function (as written) is simply difficult to capture elegantly, even in the goto form. With the gotos, the looping behavior is not as obvious as similar code with while loops. The more obvious the structure of the code, the easier it is to maintain.

If I were writing this code from scratch, I would probably turn the inner loop into a for loop to make the code more readable (but adding a variable). Additionally (or alternately), I might consider pulling the inner shift loop into its own function for readability purposes (not sure--that would take a little more analysis). Readability trumps the cost of an extra variable or function call any day (unless extreme size limitations dictate otherwise).

I was going to respond. Everyone in this thread knows how to parse an LR grammar, and what a shift-reduce parser is. I said it was opaque, not unreadable. Code snippets with little context are always used as a distraction to try to prove some contested point instead of dealing with criticism properly.

I was also going to respond further, but then I thought this: https://xkcd.com/386/

I don't have the time or interest to debate stupid crap with a real programmer using magic Bible-thumping powers. Instead, I'm going to ignore it and let it flow into all the karmic garbage that the internet regularly produces.

That is the sound of a one-handed clap.
"shift-reduce parser"? No, no recollection, nor intent to look it up. So count one that does not know.
IMO the version with gotos clearly reflects the state machine that it models. The other alternatives break down quickly as the complexity of the state machine increases. The goto field just gets longer.

I think @barnack's snippet is the best choice presented.
Last edited on
A state machine can generally be more closely modeled by a switch inside an infinite loop, though.
> The other alternatives break down quickly as the complexity of the state machine increases.

In C++, an alternative is to use the state design pattern: https://sourcemaking.com/design_patterns/state

C code often simulates his pattern by using action tables (what should we do first) and goto tables (where should we go to next).


In languages without built in support for RAII, goto is often the cleanest way to correctly manage resources in the presence of errors. I think that this is what both JessCPP and barnack were trying to illustrate in their original code samples. goto is frequently used in the Linux, FreeBSD and Microsoft kernel/driver code bases for this purpose.

Here's is an (extreme) example from the linux kernel:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/nfs/inode.c#n2130

init_nfs_fs() has ten gotos to ten different labels in 30 lines of code. In fact, one in every sixty lines (including blank lines and comment only lines) in nfs/inode.c is a goto

(init_nfs_fs() violates the Linux kernel coding guidelines: "Choose label names which say what the goto does or why the goto exists. An example of a good name could be out_free_buffer: if the goto frees buffer. Avoid using GW-BASIC names like err1: and err2:, as you would have to renumber them if you ever add or remove exit paths, and they make correctness difficult to verify anyway."
https://www.kernel.org/doc/html/v4.10/process/coding-style.html )
Last edited on
Topic archived. No new replies allowed.
Pages: 12