At Swimm, we perform static code analysis to understand large and complex codebases. While working on parsing COBOL code, I discovered some weird and surprising behaviors. As I started sharing them with friends, I found out they were interesting to others (geeks 😇) as well.

Let’s review one such surprising behavior.

COBOL’s PERFORM Statement for Non-COBOL Programmers

To understand this behavior, let’s quickly review how COBOL’s PERFORM statement works.

COBOL programs are organized into sections and paragraphs (similar to functions in modern languages). Each paragraph has a name followed by a period:

PARAGRAPH-NAME.

    STATEMENT-1.

    STATEMENT-2.

    ...

The PERFORM statement allows you to execute a sequence of paragraphs:

PERFORM paragraph-name THRU exit-paragraph.

This command tells the computer to:

  1. Jump to “paragraph-name”
  2. Execute all paragraphs from there until “exit-paragraph” is completed
  3. Return to the statement immediately following the PERFORM

It is similar to a function call in other languages, but without arguments: within a COBOL program, state is kept in global variables.
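For readers who think in modern languages, here is a minimal Python sketch of the three steps above. It models a program as an ordered list of paragraphs; the `perform_thru` helper and the paragraph names are hypothetical and only illustrate the well-behaved case.

```python
# Hypothetical model: a program is an ordered list of (name, body) paragraphs.
def perform_thru(paragraphs, start, end):
    """Run paragraphs from `start` through `end` in source order, then return."""
    names = [name for name, _ in paragraphs]
    first, last = names.index(start), names.index(end)
    for _, body in paragraphs[first:last + 1]:
        body()
    # Returning here plays the role of step 3: resume after the PERFORM.

log = []  # shared "global" state, as in a COBOL program
program = [
    ("INIT", lambda: log.append("init")),
    ("READ-INPUT", lambda: log.append("read")),
    ("READ-EXIT", lambda: log.append("exit")),
    ("CLEANUP", lambda: log.append("cleanup")),
]

perform_thru(program, "READ-INPUT", "READ-EXIT")
print(log)  # ['read', 'exit']
```

Note that the paragraphs communicate only through the shared `log` list, mirroring how COBOL paragraphs share global variables instead of taking arguments.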

A COBOL Control Flow Puzzle

Take a look at this COBOL code segment and see if you can predict what would happen:

A. DISPLAY "1. Starting program".
   PERFORM D THRU E.
B. DISPLAY "5. Second stage".
   PERFORM C THRU F.
   DISPLAY "9. Program complete".
   STOP RUN.
C. DISPLAY "6. In paragraph C".
D. DISPLAY "2. In paragraph D".
   IF FIRST-RUN
      DISPLAY "3. Error on first run - going to B"
      SET FIRST-RUN TO FALSE
      GO TO B.
   DISPLAY "X. This line only executes on second run".
E. DISPLAY "4. In paragraph E".
F. DISPLAY "8. In paragraph F".

Assume FIRST-RUN is a condition that’s true on the first execution of paragraph D, but false on subsequent executions. What output would you expect to see?

Think about it for a moment before reading on.

The Surprising Answer

Most people would expect something like this:

1. Starting program

2. In paragraph D

3. Error on first run - going to B

5. Second stage

6. In paragraph C

2. In paragraph D

X. This line only executes on second run

4. In paragraph E

8. In paragraph F

9. Program complete

But the actual output is:

1. Starting program

2. In paragraph D

3. Error on first run - going to B

5. Second stage

6. In paragraph C

2. In paragraph D

X. This line only executes on second run

4. In paragraph E

5. Second stage       <- Wait, what? Why are we back at B?

6. In paragraph C

2. In paragraph D

X. This line only executes on second run

4. In paragraph E

8. In paragraph F

9. Program complete

Did you notice what happened? After executing paragraph E the first time, the program unexpectedly jumped back to B instead of continuing to F!

Technical Implementation Details: How PERFORM Actually Works

To understand this “armed mine” issue, we need to look at how a COBOL runtime conceptually handles the PERFORM statement:

  1. Control Block Structure: When COBOL encounters a PERFORM statement, it creates a control block containing:
  • The entry point (beginning paragraph address)
  • The exit point (ending paragraph address)
  • The continuation address (where to return after completion)
  2. PERFORM Stack: These control blocks are managed in a stack-like structure:
  • When a PERFORM executes, its control block is pushed onto the stack
  • When the end of a paragraph is reached, the runtime checks whether that paragraph is the exit point of an active control block
  • If it is, that control block is removed and execution returns to the saved continuation address
  3. Exit Paragraph Detection: When execution reaches the end of any paragraph, COBOL:
  • Checks whether this paragraph matches an exit point recorded on the stack
  • If it matches, removes that control block and jumps to the saved continuation address
  • If not, it simply falls through to the next paragraph
  4. Normal Cleanup: In proper structured programming:
  • Each PERFORM statement gets a corresponding exit paragraph execution
  • Control blocks are properly pushed and popped in a balanced way
  • The stack empties naturally as execution proceeds
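The bookkeeping described above can be sketched in a few lines of Python. This is an illustrative model, not the code of any particular COBOL runtime: `ControlBlock`, `on_perform`, and `on_paragraph_end` are hypothetical names, and the choice to scan from the most recent block downward is an assumption.

```python
from dataclasses import dataclass

@dataclass
class ControlBlock:
    entry: str          # first paragraph of the PERFORM range
    exit_point: str     # last paragraph of the PERFORM range
    continuation: str   # where to resume once the range completes

perform_stack = []

def on_perform(entry, exit_point, continuation):
    """Push a control block and return the paragraph to jump to."""
    perform_stack.append(ControlBlock(entry, exit_point, continuation))
    return entry

def on_paragraph_end(name):
    """Return the continuation if `name` closes an active PERFORM, else None."""
    # Assumption: scan from the most recent block down; real runtimes may differ.
    for i in reversed(range(len(perform_stack))):
        if perform_stack[i].exit_point == name:
            return perform_stack.pop(i).continuation
    return None

# Balanced case: PERFORM D THRU E with no GOTO out of the range.
on_perform("D", "E", "statement after PERFORM D THRU E")
assert on_paragraph_end("D") is None       # D is not an exit point
resumed = on_paragraph_end("E")            # E closes the PERFORM
print(resumed, perform_stack)              # statement after PERFORM D THRU E []
```

In the balanced case every push is matched by a removal, so the stack ends up empty.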

Now we can understand what happens when a GOTO disrupts this delicate stack management!

The “Armed Mine” Explanation

Now let’s walk through our example step by step with this technical understanding:

  1. Execution begins in paragraph A:
  • “1. Starting program” is displayed
  • PERFORM D THRU E is executed, and COBOL pushes a control block onto the stack:
    • Exit point: paragraph E
    • Continuation address: the statement after the PERFORM, which is the start of paragraph B
  • Execution jumps to paragraph D
  2. In paragraph D, the condition FIRST-RUN is true:
  • “2. In paragraph D” is displayed
  • “3. Error on first run - going to B” is displayed
  • FIRST-RUN is set to FALSE
  • GOTO B jumps directly to paragraph B
  • Critical point: The control block for the PERFORM remains on the stack!
  • This is the “armed mine”: paragraph E is still associated with a return to B
  3. Execution continues at B:
  • “5. Second stage” is displayed
  • PERFORM C THRU F is executed
  • COBOL pushes another control block onto the stack:
    • Exit point: paragraph F
    • Continuation address: DISPLAY "9. Program complete"
  • The stack now has two control blocks: [E→B, F→“9. Program complete”]
  4. Paragraphs C and D execute:
  • “6. In paragraph C” is displayed
  • “2. In paragraph D” is displayed again
  • This time FIRST-RUN is false, so execution continues
  • “X. This line only executes on second run” is displayed
  5. When execution reaches paragraph E:
  • “4. In paragraph E” is displayed
  • At the end of E, COBOL checks if E matches any exit point on the stack
  • It matches the first PERFORM’s exit point!
  • The “mine” detonates:
    • COBOL removes the first PERFORM’s control block
    • Execution jumps to the saved continuation address (B)
  • Note that the second PERFORM’s control block remains on the stack
  6. Execution repeats from B:
  • “5. Second stage” is displayed again
  • PERFORM C THRU F executes a second time, pushing another control block with exit point F
  • The stack now has two control blocks again, both with exit point F
  • Paragraphs C through F execute fully this time, because no control block with exit point E is left
  • When the end of F is reached, a matching control block is removed and execution returns to DISPLAY "9. Program complete"
  • “9. Program complete” is displayed and STOP RUN ends the program

This surprising behavior occurs because GOTO bypassed normal control flow, but didn’t clean up the PERFORM stack. The “armed mine” remains dormant until paragraph E is reached through a different path.
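To check the walkthrough, here is a small Python simulation of the puzzle under the model described in this post. It is a sketch, not a COBOL implementation: the statement encoding is hypothetical, and `bail_on_first_run` is an invented stand-in for the IF FIRST-RUN block in paragraph D.

```python
# Toy encoding of the puzzle. "bail_on_first_run" stands in for the IF block.
PROGRAM = [
    ("A", [("display", "1. Starting program"), ("perform", "D", "E")]),
    ("B", [("display", "5. Second stage"), ("perform", "C", "F"),
           ("display", "9. Program complete"), ("stop",)]),
    ("C", [("display", "6. In paragraph C")]),
    ("D", [("display", "2. In paragraph D"),
           ("bail_on_first_run", "3. Error on first run - going to B", "B"),
           ("display", "X. This line only executes on second run")]),
    ("E", [("display", "4. In paragraph E")]),
    ("F", [("display", "8. In paragraph F")]),
]

def run(program):
    """Interpret the toy program and return the lines it displays."""
    names = [name for name, _ in program]
    out, stack = [], []      # stack entries: (exit_paragraph, return_position)
    first_run = True
    para, stmt = 0, 0        # position: paragraph index, statement index
    while para < len(program):
        name, body = program[para]
        if stmt == len(body):
            # End of paragraph: is it the exit point of an active PERFORM?
            idx = next((i for i in reversed(range(len(stack)))
                        if stack[i][0] == name), None)
            if idx is None:
                para, stmt = para + 1, 0          # fall through
            else:
                _, (para, stmt) = stack.pop(idx)  # the mine detonates
            continue
        op, *args = body[stmt]
        stmt += 1
        if op == "display":
            out.append(args[0])
        elif op == "perform":
            stack.append((args[1], (para, stmt)))  # resume after this PERFORM
            para, stmt = names.index(args[0]), 0
        elif op == "bail_on_first_run" and first_run:
            out.append(args[0])
            first_run = False
            para, stmt = names.index(args[1]), 0   # GOTO: stack left untouched
        elif op == "stop":
            break
    return out

print("\n".join(run(PROGRAM)))
```

Under these assumptions, the printed lines follow the order 1, 2, 3, 5, 6, 2, X, 4, 5, 6, 2, X, 4, 8, 9, matching the actual output shown earlier.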

Technical Implementation Details

Behind the scenes, COBOL maintains a control stack structure that manages these execution paths:

  1. Control Block Structure: When a PERFORM statement is encountered, COBOL creates a control block containing:
  • The entry point (beginning paragraph address)
  • The exit point (ending paragraph address)
  • The continuation address (where to return after completion)
  2. PERFORM Stack: These control blocks are pushed onto a conceptual stack, with multiple active PERFORMs creating nested layers.
  3. The GOTO Problem: When a GOTO jumps out of a PERFORM range:
  • The control block remains active on the stack
  • The exit point (E in our example) is still associated with its continuation address (B)
  • When execution later reaches that exit point through a different path, it “detonates” the mine
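One way to surface candidate “mines” statically is to flag every GOTO whose source lies inside a PERFORM range and whose target lies outside it. The sketch below assumes a parser has already extracted the paragraph order, the PERFORM ranges, and the GOTO edges; the function name and input shapes are hypothetical, and the check is deliberately conservative (it reports syntactic escapes, not whether the exit paragraph is ever reached again).

```python
def gotos_escaping_perform(paragraph_order, performs, gotos):
    """Return (source, target, perform_range) for each GOTO leaving a range.

    paragraph_order: paragraph names in source order
    performs: (start, end) pairs, one per PERFORM ... THRU ...
    gotos: (source_paragraph, target_paragraph) pairs
    """
    pos = {name: i for i, name in enumerate(paragraph_order)}
    findings = []
    for start, end in performs:
        lo, hi = pos[start], pos[end]
        for src, dst in gotos:
            inside_src = lo <= pos[src] <= hi
            inside_dst = lo <= pos[dst] <= hi
            if inside_src and not inside_dst:
                findings.append((src, dst, (start, end)))
    return findings

# The puzzle from this post: GOTO B sits in D, which is inside both ranges.
findings = gotos_escaping_perform(
    ["A", "B", "C", "D", "E", "F"],
    [("D", "E"), ("C", "F")],
    [("D", "B")],
)
print(findings)  # [('D', 'B', ('D', 'E')), ('D', 'B', ('C', 'F'))]
```

For the puzzle, the single GOTO is reported against both PERFORM ranges, since paragraph D lies inside each of them while B lies outside.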

Accurate Analysis of COBOL Code Is Complex

In this post, we saw a specific example where the PERFORM/GOTO interaction creates a form of implicit state that persists beyond the visible control flow of the program. It’s almost like a hidden variable that gets set during execution but isn’t visible in the source code.

When we parse and analyze legacy COBOL code, we’re not just dealing with the visible structure of the program but also with these invisible continuation addresses that can dramatically alter control flow in ways not obvious from reading the source.

To analyze COBOL code correctly, we need to deeply understand its underlying mechanisms.