So, you need to make sense of a massive COBOL codebase. Perhaps you’re modernizing it, debugging an issue, or just trying to untangle decades of business logic.
If you’re here, you already know that COBOL still powers mission-critical systems in banking, retail, insurance, and beyond. By commonly cited industry estimates, COBOL supports 80% of in-person credit card transactions and 95% of all ATM transactions. For those who know COBOL, the real challenge isn’t understanding the language; it’s deciphering layers of updates, patches, and undocumented logic built up over decades. Manually mapping out a large COBOL system can take months. But what if AI could do the heavy lifting for you?
Most technical posts focus on how we solved a problem. This one is about you—what you need to efficiently understand a COBOL system and how AI can help transform a black box into a blueprint.
Understanding the essentials: Gaining observability
To effectively navigate a COBOL mainframe application, you need a clear view of its core components. We found that most systems can be broken down into three main entities:
- Online operations: Screens and the workflows they interact with
- Batch operations: Jobs, often written in JCL, that handle high-volume, repetitive tasks
- Utilities: Copybooks and complex shared logic
If you want to understand what a mainframe application actually does, you can get a very good idea by focusing on these key areas:
- User screens: What screens are available to the user?
- Screen interactions: What actions can be performed on each screen?
- Batch operations: What batch jobs exist, and when do they run?
- Batch job logic: What does each batch job do?
- Shared logic: Are there complex routines used across multiple operations? If so, what do they handle?
Being able to answer these questions gives you a solid, high-level understanding of the application. But understanding isn’t just about collecting details—it’s about filtering the noise.
A system with thousands of screens can quickly become overwhelming. The key is to identify patterns and group similar components together. Too much raw information isn’t helpful; the real goal is to extract the essence of how the system functions.
Consider this example:
When looking at a COBOL codebase, it is common to get a flat folder with all (or most of) the source files.
If you’re lucky, the files have extensions – making it easier to separate programs (.cbl) from copybooks (.cpy) and perhaps other files (e.g., .jcl). Even if that is the case, the filenames are usually cryptic.
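As a rough illustration of that first sorting step, here is a minimal Python sketch that groups file names by extension. The extension-to-category mapping is an assumption (including the `.cob` and `.bms` entries), and the file names other than EMPENT are hypothetical; real codebases often use different extensions or none at all.

```python
from collections import defaultdict
from pathlib import PurePath

# Assumed mapping from extension to category; adjust for your codebase.
CATEGORY_BY_EXTENSION = {
    ".cbl": "program",
    ".cob": "program",
    ".cpy": "copybook",
    ".jcl": "job",
    ".bms": "screen",
}


def group_by_type(filenames):
    """Group file names into categories based on their extension."""
    groups = defaultdict(list)
    for name in filenames:
        extension = PurePath(name).suffix.lower()
        category = CATEGORY_BY_EXTENSION.get(extension, "unknown")
        groups[category].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical flat folder listing.
    listing = ["EMPENT.cbl", "EMPREC.CPY", "PAYRUN.jcl", "README"]
    for category, names in group_by_type(listing).items():
        print(category, names)
```

Files that land in the "unknown" group are the ones that need content-based classification, which is where cryptic names make manual sorting slow.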
Swimm’s approach helps you navigate the codebase in a few ways. The first is to group jobs, screens, copybooks, and programs together.
Notice that Swimm doesn’t just group the files by type; it also provides meaningful names, so you see Employee Data Entry alongside the file’s name (EMPENT).
Then, when you select an entry point (say, a screen or a program), you can see the other files that relate to it. So if you want to understand the payroll calculation process, you might start with the program that actually performs the calculation, and then find that it can be triggered by a recurring JCL job or by a screen. You will also see the other programs it uses, which you may want to review as well.
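To give a feel for how such relationships can be recovered from source text, here is a minimal Python sketch that looks for static CALL statements, COPY statements, and JCL EXEC PGM= steps. It is not a description of how Swimm works. The regular expressions are deliberate simplifications (they ignore dynamic calls, COPY ... REPLACING, and cataloged procedures), and every program, copybook, and step name in the sample is hypothetical.

```python
import re

# Simplified patterns; treat matches as hints rather than a complete analysis.
CALL_RE = re.compile(r"""\bCALL\s+['"]([A-Z0-9#@$-]+)['"]""", re.IGNORECASE)
COPY_RE = re.compile(r"\bCOPY\s+([A-Z0-9#@$-]+)", re.IGNORECASE)
EXEC_PGM_RE = re.compile(r"\bEXEC\s+PGM=([A-Z0-9#@$]+)", re.IGNORECASE)


def cobol_references(source):
    """Return the static CALL targets and COPY members found in COBOL source."""
    return {
        "calls": sorted({name.upper() for name in CALL_RE.findall(source)}),
        "copybooks": sorted({name.upper() for name in COPY_RE.findall(source)}),
    }


def jcl_programs(jcl):
    """Return the program names executed by EXEC PGM= steps in a JCL member."""
    return sorted({name.upper() for name in EXEC_PGM_RE.findall(jcl)})


if __name__ == "__main__":
    # Hypothetical payroll program and job.
    cobol_source = (
        "       COPY EMPREC.\n"
        "           CALL 'TAXCALC' USING EMP-RECORD.\n"
    )
    jcl_source = "//PAYSTEP  EXEC PGM=PAYCALC\n"
    print(cobol_references(cobol_source))
    print(jcl_programs(jcl_source))
```

Running this over every file and inverting the results (which jobs execute a program, which programs call it) gives a first approximation of the "related files" view for any entry point.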
Drilling down: Understanding screen behavior
So, now you’ve identified a specific screen and know what it does at a high level. But what if you need to go deeper? Maybe you’re rewriting this screen’s logic on a modern platform, or modifying its behavior to meet new business requirements.
To fully understand a screen’s functionality, you need answers to key questions:
- What is the screen’s goal?
- What are the input limitations and validations?
- What happens when a user interacts with the screen?
- How is the underlying logic implemented? (e.g., how is the user’s account ID validated? How is the current balance retrieved?)
- What business rules apply? (we’ll get to this in part 2 of the series)
A truly useful document should provide all this information in a clear, structured way. Enter Swimm.
How Swimm brings a screen to life
1. Understanding the screen’s goal
Sometimes, just seeing a screen already gives you significant insight. Swimm reconstructs screens from the underlying code, presenting them in a way that mirrors how users interact with them.
Beyond visuals, Swimm also provides concise textual descriptions, like: “The Bill Payment screen (COBIL00) is a CICS COBOL program that allows users to pay their credit card balance in full through an online interface.”
This instantly clarifies the screen’s purpose, making it easier to grasp its function within the system.
2. Input limitations and validations
If you need to replicate the screen’s behavior or modify input rules, you’ll want to understand exactly how inputs are validated. Swimm automatically extracts this information, detailing field constraints like:
Account ID Input Field (ACTIDIN):
- Length: 11 characters
- Required field
- Becomes read-only after initial entry
Current Balance Display (CURBAL):
- Format: Signed numeric
- Read-only
Payment Confirmation (CONFIRM):
- Single-character input (Y or N)
- Case-insensitive validation
Swimm doesn’t just list validations—it connects them to their source. For example, the Payment Confirmation input field is defined in the screen’s BMS file:
CONFIRM DFHMDF ATTRB=(FSET,NORM,UNPROT), -
COLOR=GREEN, -
HILIGHT=UNDERLINE, -
LENGTH=1, -
POS=(15,60)
But the case-insensitive validation logic is implemented separately in the COBOL program:
EVALUATE CONFIRMI OF COBIL0AI
    WHEN 'Y' WHEN 'y'
        SET CONF-PAY-YES TO TRUE
        PERFORM READ-ACCTDAT-FILE
    WHEN 'N' WHEN 'n'
        PERFORM CLEAR-CURRENT-SCREEN
        MOVE 'Y' TO WS-ERR-FLG
    WHEN SPACES
Swimm automatically links both layers of validation, giving you a full picture of input constraints and processing.
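To make the screen-definition side concrete, here is a minimal Python sketch that parses the DFHMDF definition shown above into structured constraints. It is an illustration under assumptions, not Swimm’s implementation: it handles a single macro with simple KEY=VALUE operands, and the function names `parse_dfhmdf` and `summarize` are hypothetical.

```python
import re

CONFIRM_FIELD = """\
CONFIRM DFHMDF ATTRB=(FSET,NORM,UNPROT), -
COLOR=GREEN, -
HILIGHT=UNDERLINE, -
LENGTH=1, -
POS=(15,60)
"""


def parse_dfhmdf(macro_text):
    """Parse a single DFHMDF macro into its field name and operands."""
    # Remove the trailing continuation marker ("-") from each line, then join.
    parts = [re.sub(r"\s+-\s*$", "", line).strip() for line in macro_text.splitlines()]
    joined = "".join(part for part in parts if part)
    name, macro, operand_text = joined.split(None, 2)
    if macro != "DFHMDF":
        raise ValueError(f"expected a DFHMDF macro, got {macro!r}")
    operands = {}
    # Split on commas that are not inside parentheses.
    for operand in re.split(r",(?![^()]*\))", operand_text):
        key, _, value = operand.partition("=")
        if value.startswith("(") and value.endswith(")"):
            value = value[1:-1].split(",")
        operands[key] = value
    return {"name": name, "operands": operands}


def summarize(field):
    """Derive reader-friendly constraints from the parsed operands."""
    operands = field["operands"]
    return {
        "name": field["name"],
        "length": int(operands["LENGTH"]),
        # UNPROT marks the field as unprotected, so the user can type into it.
        "editable": "UNPROT" in operands.get("ATTRB", []),
        "row_col": tuple(int(n) for n in operands["POS"]),
    }


if __name__ == "__main__":
    print(summarize(parse_dfhmdf(CONFIRM_FIELD)))
```

Note that nothing in the BMS definition says which characters are accepted. That rule lives only in the COBOL EVALUATE, which is why both layers have to be read together.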
3. Understanding user interactions
What actually happens when a user interacts with the screen? Answering this requires both clarity and accuracy—a challenge Swimm tackles through visual flowcharts that illustrate all possible interactions.
For example, Swimm’s generated diagrams might show:
- The sequence of operations triggered by different inputs
- How data flows between screens and backend systems
- Conditional branches based on user actions
If you need a different level of detail, Swimm’s visualization tools can adapt to your needs, letting you refine the diagram to highlight the most relevant flows.
4. Step-by-step code walkthroughs
For deeper technical insights, a high-level diagram isn’t enough—you need to see the exact logic behind key operations. Swimm provides step-by-step walkthroughs, aligning code snippets with clear explanations.
For instance, if you’re analyzing how the user account ID is validated, Swimm will:
- Extract the relevant COBOL routines
- Provide inline comments explaining each step
- Highlight dependencies between different parts of the code
This approach allows you to trace a function’s execution path without manually searching through thousands of lines of legacy code.
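For a sense of what tracing an execution path involves, here is a minimal Python sketch that builds a paragraph-level PERFORM graph and walks it from a starting paragraph. It assumes fixed-format source in which paragraph names start in column 8 and sit alone on their line, and it ignores sections, PERFORM ... THRU, and GO TO. The sample program is hypothetical, although it reuses the paragraph names READ-ACCTDAT-FILE and CLEAR-CURRENT-SCREEN from the snippet above.

```python
import re

# Columns 1-6 are the sequence area, column 7 the indicator, column 8 starts Area A.
HEADER_RE = re.compile(r"^.{6} ([A-Z0-9][A-Z0-9-]*)\.\s*$")
PERFORM_RE = re.compile(r"\bPERFORM\s+([A-Z0-9][A-Z0-9-]*)")

SAMPLE = """\
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM PROCESS-ENTER-KEY
           GOBACK.
       PROCESS-ENTER-KEY.
           PERFORM READ-ACCTDAT-FILE
           PERFORM CLEAR-CURRENT-SCREEN.
       READ-ACCTDAT-FILE.
           CONTINUE.
       CLEAR-CURRENT-SCREEN.
           CONTINUE.
"""


def perform_graph(source):
    """Map each paragraph to the paragraphs it PERFORMs, in source order."""
    graph = {}
    current = None
    for line in source.upper().splitlines():
        if len(line) > 6 and line[6] in "*/":
            continue  # comment line
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            graph[current] = []
        elif current is not None:
            for target in PERFORM_RE.findall(line):
                if target not in graph[current]:
                    graph[current].append(target)
    # Drop targets that are not paragraphs here (e.g. PERFORM UNTIL ...).
    return {name: [t for t in targets if t in graph] for name, targets in graph.items()}


def reachable(graph, start):
    """List the paragraphs reachable from start, depth-first."""
    seen, stack = [], [start]
    while stack:
        name = stack.pop()
        if name in seen or name not in graph:
            continue
        seen.append(name)
        stack.extend(reversed(graph[name]))
    return seen


if __name__ == "__main__":
    print(reachable(perform_graph(SAMPLE), "MAIN-PARA"))
```

Even this simplified walk shows why manual tracing gets expensive: every PERFORM is another jump to follow, and real programs add fall-through and GO TO on top.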
Understanding batch operations
While individual screens provide critical insight into a mainframe application, much of the core business logic often resides in batch operations. These jobs process large volumes of data, execute key financial calculations, and generate essential system updates.
Let’s say you already have a high-level understanding of a batch operation. Now, you need to go deeper—whether to debug an issue, modify existing functionality, or rewrite the logic in a modern environment.
For example, consider Swimm’s auto-generated summary for CBACT04C: “The Interest Calculator (CBACT04C) is a COBOL batch program that calculates interest charges for credit card accounts based on transaction category balances. It processes account balances, applies appropriate interest rates, and generates interest transactions.”
While this gives you a broad, succinct understanding of the program’s purpose, many critical questions remain:
- Does the program process new accounts differently?
- How is the interest rate calculated?
- What is the result of the generated interest transaction?
- What are the input files and formats?
- How are edge cases handled?
How Swimm helps
To answer the question “Does the program process new accounts differently?”, Swimm generates a flowchart that visually maps the decision logic. Instead of manually tracing COBOL conditionals and nested IF statements, you can instantly see:
- What changes when processing a new account vs. an existing one
- Which processing steps remain the same regardless of account type
- Where business rules apply to different categories of accounts
This lets you quickly identify critical paths in the batch job’s execution without sifting through thousands of lines of COBOL.
Breaking down the interest calculation logic
Understanding how interest is calculated requires more than just an overview—you need to see:
- The formula used to apply interest rates
- Any conditional adjustments based on account status, balance type, or transaction history
- Where rounding, thresholds, or caps are applied
Swimm extracts and annotates the relevant COBOL logic, mapping it to a step-by-step explanation. This eliminates the need for manual code spelunking, letting you focus on what truly matters.
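To show why the formula, conditions, and rounding all matter when you reimplement such logic, here is a minimal Python sketch of a monthly interest calculation. The formula (balance times annual percentage rate, divided by 1200), the zero-balance rule, and the optional cap are assumptions made for illustration; they are not taken from CBACT04C.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def monthly_interest(balance, annual_rate_percent, cap=None):
    """Hypothetical rule: balance * annual rate (%) / 12 months / 100."""
    # Assumed conditional adjustment: no interest on zero or credit balances.
    if balance <= 0:
        return Decimal("0.00")
    interest = balance * annual_rate_percent / Decimal(1200)
    # Assumed rounding: half-up to the nearest cent.
    interest = interest.quantize(CENT, rounding=ROUND_HALF_UP)
    # Assumed optional cap on the monthly charge.
    if cap is not None:
        interest = min(interest, cap)
    return interest


if __name__ == "__main__":
    print(monthly_interest(Decimal("1000.00"), Decimal("18.00")))
```

The rounding mode is the detail most easily lost in a rewrite: a COBOL COMPUTE truncates extra decimal places unless ROUNDED is specified, so a port that rounds half-up where the original truncated can differ by a cent per account.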
Connecting input data to processing logic
Batch jobs don’t operate in isolation—they rely on structured input files. To fully understand a job, you need answers to:
- What files are used as input?
- What is the format of these files? (e.g., fixed-width, delimited, VSAM datasets)
- How are records parsed and processed?
- Which fields impact downstream calculations?
Swimm automatically links the batch program’s file-handling routines to their definitions, helping you trace:
- Where data comes from
- How records are validated and transformed
- Where processed data is written
This ensures that if you modify the job, you don’t accidentally disrupt upstream or downstream processes.
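As an illustration of what "format" means for a fixed-width input file, here is a minimal Python sketch that slices a record according to a layout table. The layout is hypothetical: an 11-character account ID (matching the screen field length above), a 4-character category code, and a 9-digit unsigned balance with two implied decimal places, as a PIC 9(7)V99 field would hold. Signed and packed-decimal fields are not handled.

```python
from decimal import Decimal

# Hypothetical layout: (field name, start offset, length).
LAYOUT = [
    ("account_id", 0, 11),
    ("category", 11, 4),
    ("balance", 15, 9),
]
RECORD_LENGTH = 24


def parse_record(line):
    """Slice one fixed-width record into named fields and validate it."""
    if len(line) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} characters, got {len(line)}")
    fields = {name: line[start:start + length] for name, start, length in LAYOUT}
    if not fields["balance"].isdigit():
        raise ValueError(f"non-numeric balance: {fields['balance']!r}")
    # The decimal point is implied, so the last two digits are cents.
    fields["balance"] = Decimal(fields["balance"]) / 100
    return fields


if __name__ == "__main__":
    print(parse_record("000000001230001000012550"))
```

In a real system the layout table comes from the copybook that defines the record, which is why changing a field’s length in one place can silently shift every field after it for each job that reads the file.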
What about edge cases and exceptions?
Every batch process must deal with unexpected conditions, such as:
- Accounts with zero balances
- Transactions missing required fields
- Invalid or outdated interest rate tables
- System failures or partial job executions
Swimm highlights exception-handling routines, making it easy to see:
- How the program detects errors
- Whether errors trigger retries, logging, or alerts
- Whether certain failures halt processing or allow partial execution
Next up: Business logic
In part 2, we’ll dive into how you can use AI to extract decades of missing business logic. Stay tuned.