3 Quick COBOL Tutorials: From Basic to Advanced

What Is COBOL?

COBOL stands for Common Business-Oriented Language and has been a mainstay of business computing for decades. Developed in the late 1950s, it was designed to be readable by non-technical staff while handling large-scale data processing. Its syntax is intended to resemble plain English, which makes programs easier to understand and maintain.

Despite its age, the language maintains a significant presence, especially in industries like finance and government where legacy systems prevail. COBOL's explicit, fixed-size data declarations and structured programming constructs help make program behavior predictable. Many financial transactions and business reports still rely on COBOL programs.

This is part of a series of articles about legacy code.

Is COBOL Still Relevant Today?

COBOL remains relevant primarily due to its extensive use in legacy systems that power critical operations in sectors like banking, insurance, and government. The cost and risk of migrating these applications to newer languages can be substantial, leading many organizations to maintain and update existing COBOL systems instead of replacing them.

This results in a continuing demand for COBOL programmers capable of integrating modern data processes with longstanding COBOL applications. Additionally, the language has undergone updates to integrate with modern systems, allowing it to interact with contemporary software and databases. Thus, COBOL can still contribute to new developments in software architecture.


COBOL Tutorial #1: Learning the Basics

COBOL Program Structure

A COBOL program is structured into several divisions, each serving a purpose in the program’s operation. The standard divisions in a COBOL program are:

Identification division: Defines the program’s name and other metadata, such as the author and the date. It serves as a descriptive section and has no direct impact on the program’s logic.
Environment division: Specifies the environment in which the program will run. It includes sections like the CONFIGURATION SECTION, where users define the hardware and software configuration, and the INPUT-OUTPUT SECTION, which details the files and devices the program will interact with.
Data division: Defines all the data items that the program will use. This division is further divided into several sections:
- File section: Describes the layout of files that the program will read from or write to.
- Working-storage section: Declares variables and constants used in the program’s logic.
- Local-storage section: Similar to working-storage, but the variables here are re-initialized each time a program is invoked.
- Linkage section: Used for defining variables passed between programs.
Procedure division: This division contains the program’s executable code. Here, users write the instructions that process the data defined in the Data Division. The procedure division is structured into paragraphs and sections, making it easier to organize and manage large programs.

Writing Your First COBOL Program

Writing a COBOL program begins with setting up the basic structure, followed by coding the logic in the procedure division. Here’s a simple example to illustrate how to create a basic COBOL program:

IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO-WORLD.

ENVIRONMENT DIVISION.

DATA DIVISION.

PROCEDURE DIVISION.
     DISPLAY 'Hello, World!'.  
     STOP RUN.

Explanation:

The IDENTIFICATION DIVISION declares the program name as HELLO-WORLD.
The ENVIRONMENT DIVISION is left empty in this basic example, as no special environment settings are required.
The DATA DIVISION is also empty since no data items are needed for this simple program.
The PROCEDURE DIVISION contains the executable instructions. Here, DISPLAY 'Hello, World!' outputs the text to the screen, and STOP RUN ends the program.

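To compile and run this program, use a COBOL compiler such as GnuCOBOL or Micro Focus COBOL. Because the code starts in column 1, it must be compiled as free-format source (for example with the -free option of GnuCOBOL's cobc command, which the later tutorials use).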

COBOL Data Types

COBOL’s data types are designed to handle the structured data commonly found in business applications. The primary data types in COBOL include:

Numeric: Used to store numbers. The PICTURE (PIC) clause describes the digits a field can hold:
- PIC 9(n): Defines an unsigned integer with n digits.
- PIC 9(n)V9(m): Defines a fixed-point decimal number where n is the number of digits before the decimal point and m is the number of digits after it. The V marks an implied decimal point that takes up no storage.
- PIC S9(n): Represents a signed integer.
Alphanumeric: Used for text and strings.
- PIC X(n): Defines a string of n characters of any kind.
Alphabetic: Stores only letters and spaces.
- PIC A(n): Defines a field of n alphabetic characters.
Comp-3: Also known as packed decimal, this is a storage usage applied to a numeric picture rather than a separate picture type. It stores two digits per byte, with the sign in the final half-byte, so a field takes roughly half the space of its display form.
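The declarations above can be sketched in a small free-format program. This is an illustrative sketch only: the field names and values are hypothetical, and the way numeric fields are rendered by DISPLAY varies between compilers.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. DataTypesDemo.

DATA DIVISION.
WORKING-STORAGE SECTION.
*> Unsigned integer with 5 digits.
01  WS-QUANTITY        PIC 9(5)      VALUE 42.
*> Signed integer with 5 digits.
01  WS-BALANCE         PIC S9(5)     VALUE -150.
*> Fixed-point number: 5 integer digits, 2 decimals (V = implied point).
01  WS-PRICE           PIC 9(5)V99   VALUE 19.99.
*> The same signed amount stored as display digits and as packed decimal.
01  WS-AMOUNT-DISPLAY  PIC S9(7)V99  VALUE 1234.56.
01  WS-AMOUNT-PACKED   PIC S9(7)V99  COMP-3 VALUE 1234.56.
*> Alphanumeric and alphabetic fields.
01  WS-PRODUCT-CODE    PIC X(8)      VALUE "AB-1234".
01  WS-INITIALS        PIC A(3)      VALUE "JD".

PROCEDURE DIVISION.
    DISPLAY "Quantity    : " WS-QUANTITY
    DISPLAY "Balance     : " WS-BALANCE
    DISPLAY "Price       : " WS-PRICE
    DISPLAY "Product code: " WS-PRODUCT-CODE
    DISPLAY "Initials    : " WS-INITIALS
    *> Nine digits occupy 9 bytes in display form but 5 bytes when packed.
    DISPLAY "Display size: " FUNCTION LENGTH(WS-AMOUNT-DISPLAY)
    DISPLAY "Packed size : " FUNCTION LENGTH(WS-AMOUNT-PACKED)
    STOP RUN.
```

With GnuCOBOL, a file such as datatypes.cbl (a hypothetical name) could be compiled with cobc -free -x, like the other examples in this article.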

Basic COBOL Statements

COBOL provides a range of basic statements for performing operations in the program. Some of the most commonly used statements include:

MOVE: Assigns a value to a variable.

MOVE 'John' TO CUSTOMER-NAME.

DISPLAY: Outputs data to the console or terminal.

DISPLAY 'Processing Complete'.

COMPUTE: Performs arithmetic operations and assigns the result to a variable.

COMPUTE TOTAL-COST = PRICE * QUANTITY.

IF… ELSE: Conditional statement used for decision-making.

IF AGE > 18
    DISPLAY 'Adult'
ELSE
    DISPLAY 'Minor'
END-IF.

PERFORM: Executes a specified section or paragraph of code. It can be used with loops and iterative processes.

PERFORM CALCULATE-TOTALS.

These statements form the core of COBOL programming and are used to build the logic and control flow in COBOL applications.
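To see the statements working together, here is a minimal free-format sketch that combines them in one program. The data items and their initial values are assumptions chosen for illustration; the one-line snippets above do not define them.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. BasicStatements.

DATA DIVISION.
WORKING-STORAGE SECTION.
*> Hypothetical data items used by the statements below.
01  CUSTOMER-NAME   PIC X(20).
01  AGE             PIC 9(3)     VALUE 30.
01  PRICE           PIC 9(5)V99  VALUE 12.50.
01  QUANTITY        PIC 9(3)     VALUE 4.
01  TOTAL-COST      PIC 9(7)V99  VALUE ZERO.

PROCEDURE DIVISION.
MAIN-LOGIC.
    *> MOVE assigns a value; DISPLAY prints it.
    MOVE 'John' TO CUSTOMER-NAME
    DISPLAY 'Customer: ' CUSTOMER-NAME

    *> IF ... ELSE ... END-IF selects one of two branches.
    IF AGE > 18
        DISPLAY 'Adult'
    ELSE
        DISPLAY 'Minor'
    END-IF

    *> PERFORM runs the named paragraph, then returns here.
    PERFORM CALCULATE-TOTALS
    DISPLAY 'Processing Complete'
    STOP RUN.

CALCULATE-TOTALS.
    *> COMPUTE evaluates the expression and stores the result.
    COMPUTE TOTAL-COST = PRICE * QUANTITY
    DISPLAY 'Total cost: ' TOTAL-COST.
```

STOP RUN sits at the end of MAIN-LOGIC so that execution does not fall through into CALCULATE-TOTALS a second time.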

Tips from the expert

Omer Rosenbaum

CTO & Co-founder at Swimm

In my experience, here are tips that can help you better master COBOL and leverage its capabilities in modern software development:

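Optimize file handling with blocking and buffering: When working with large files, improve I/O performance by letting the runtime transfer several records per physical read or write, for example with the BLOCK CONTAINS clause in the file's FD entry and the RESERVE clause in its SELECT entry. The program still issues READ and WRITE for one record at a time, but fewer physical I/O operations are needed. How much this helps depends on the compiler and platform, and some environments control blocking outside the program.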

Use COPY and REPLACE statements for reusable code: To reduce redundancy and improve maintainability, use the COPY statement to pull shared source text (copybooks) into multiple programs at compile time, and use its REPLACING phrase or the REPLACE statement to adapt names where needed. This is particularly useful for shared record layouts, standardized validation routines, or error handling procedures.

Use the EVALUATE statement for complex decision-making: The EVALUATE statement, akin to a switch-case in other languages, simplifies handling multiple conditions and complex branching. This reduces nested IF statements and makes the decision paths clearer and more maintainable.
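As a small illustration of the EVALUATE tip, the sketch below grades a score using EVALUATE TRUE, where the first WHEN condition that holds is the one executed. The field names and thresholds are hypothetical.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. EvaluateDemo.

DATA DIVISION.
WORKING-STORAGE SECTION.
*> Hypothetical input and result fields.
01  WS-SCORE    PIC 9(3) VALUE 72.
01  WS-GRADE    PIC X    VALUE SPACE.

PROCEDURE DIVISION.
    *> Conditions are tested top to bottom; only the first match runs.
    EVALUATE TRUE
        WHEN WS-SCORE >= 90
            MOVE 'A' TO WS-GRADE
        WHEN WS-SCORE >= 75
            MOVE 'B' TO WS-GRADE
        WHEN WS-SCORE >= 50
            MOVE 'C' TO WS-GRADE
        WHEN OTHER
            MOVE 'F' TO WS-GRADE
    END-EVALUATE
    DISPLAY 'Grade: ' WS-GRADE
    STOP RUN.
```

The equivalent nested IF would need three levels of ELSE, which is the clutter the tip aims to avoid.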

COBOL Tutorial #2: Working with Files

File Handling Basics

COBOL’s file handling capabilities are central to its use in business applications, where processing large volumes of data stored in files is common. COBOL supports various file organizations, including sequential, indexed, and relative files, which are crucial for different types of data processing tasks.

In COBOL, files are described in the File section of the Data division. Here, users define the structure and attributes of the files the program will interact with. Each file is associated with an FD (File Description) entry, where developers specify the file's logical name, record structure, and any other relevant details.

The actual file operations—such as opening, reading, writing, and closing files—are handled in the Procedure division. The OPEN statement is used to make a file available for processing, while the CLOSE statement is used to release the file after processing is complete. Between these two operations, use statements like READ, WRITE, and REWRITE to manipulate the file’s data.

File Operations

In COBOL, file operations are performed through a series of well-defined steps, each involving specific COBOL statements. Here are the key operations:

Open: Before you can perform any operation on a file, you must first open it. The OPEN statement can open a file in various modes: INPUT for reading, OUTPUT for writing, I-O for both reading and writing, and EXTEND for appending data to an existing file.

OPEN INPUT CUSTOMER-FILE.

Read: The READ statement is used to retrieve a record from a file. For sequential files, records are typically read one at a time in the order they appear in the file. For indexed or relative files, you can specify the key or record number.

READ CUSTOMER-FILE INTO WS-CUSTOMER-RECORD.

Write: The WRITE statement adds a new record to a file. In the case of sequential files, records are added at the end. For indexed files, records are placed according to the key values.

WRITE CUSTOMER-RECORD.

Rewrite: The REWRITE statement modifies an existing record in a file. This operation is only applicable for files opened in I-O mode and is commonly used in indexed or relative files.

REWRITE CUSTOMER-RECORD.

Close: Once all file operations are completed, the CLOSE statement is used to close the file, ensuring that all data is properly saved and resources are released.

CLOSE CUSTOMER-FILE.

These operations form the backbone of file processing in COBOL, allowing developers to manage data in various types of business applications.
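As one example of these operations in sequence, the following sketch appends a record to an existing file using EXTEND mode. It assumes a line-sequential file named customer.dat with a 5-digit ID and a 20-character name per record; the FILE STATUS field is an addition for illustration, used to check whether the OPEN succeeded.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. AppendCustomer.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT CUSTOMER-FILE ASSIGN TO 'customer.dat'
        ORGANIZATION IS LINE SEQUENTIAL
        FILE STATUS IS WS-FILE-STATUS.

DATA DIVISION.
FILE SECTION.
FD  CUSTOMER-FILE.
01  CUSTOMER-RECORD.
    05  CUSTOMER-ID     PIC 9(5).
    05  CUSTOMER-NAME   PIC X(20).

WORKING-STORAGE SECTION.
*> Two-character status code set by every file operation.
01  WS-FILE-STATUS      PIC XX.

PROCEDURE DIVISION.
    *> EXTEND positions the file after its last existing record.
    OPEN EXTEND CUSTOMER-FILE
    IF WS-FILE-STATUS NOT = '00'
        DISPLAY 'Could not open customer.dat, status ' WS-FILE-STATUS
        STOP RUN
    END-IF

    MOVE 10003 TO CUSTOMER-ID
    MOVE 'Alice Johnson' TO CUSTOMER-NAME
    WRITE CUSTOMER-RECORD

    CLOSE CUSTOMER-FILE
    STOP RUN.
```

A status of '00' means the operation succeeded. If the file does not exist yet, the OPEN typically fails with a non-zero status, and the program reports it instead of writing.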

Creating and Reading a File

To illustrate file handling in COBOL, let’s walk through a simple example where we create a file, write records to it, and then read those records back.

Step 1: Define the Identification and the Environment Divisions

First, define the program ID and the Environment division, which maps the logical file name to a physical file:

IDENTIFICATION DIVISION.
PROGRAM-ID. WriteCustomerFile.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT CUSTOMER-FILE ASSIGN TO 'customer.dat'
        ORGANIZATION IS LINE SEQUENTIAL.

Step 2: Define the File in the Data Division

Next, define the file structure in the File section of the Data division:

DATA DIVISION.
FILE SECTION.
FD  CUSTOMER-FILE.
01  CUSTOMER-RECORD.
    05  CUSTOMER-ID     PIC 9(5).
    05  CUSTOMER-NAME   PIC X(20).

Step 3: Write Records to the File

Next, in the Procedure division, open the file for output, write several records, and then close the file:

PROCEDURE DIVISION.
    OPEN OUTPUT CUSTOMER-FILE.
    MOVE 10001 TO CUSTOMER-ID.
    MOVE 'John Doe' TO CUSTOMER-NAME.
    WRITE CUSTOMER-RECORD.
    MOVE 10002 TO CUSTOMER-ID.
    MOVE 'Jane Smith' TO CUSTOMER-NAME.
    WRITE CUSTOMER-RECORD.
    CLOSE CUSTOMER-FILE.
    STOP RUN.

Assuming we store the above code in a file called writefile.cbl, we can compile it using the following command:

cobc -free -x -o writefile writefile.cbl

Now execute it using the following command, which creates customer.dat:

./writefile

Step 4: Read Records from the File

Finally, to read the records back, create a separate program called readfile.cbl:

IDENTIFICATION DIVISION.
PROGRAM-ID. ReadCustomerFile.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT CUSTOMER-FILE ASSIGN TO 'customer.dat'
        ORGANIZATION IS LINE SEQUENTIAL.

DATA DIVISION.
FILE SECTION.
FD  CUSTOMER-FILE.
01  CUSTOMER-RECORD.
    05  CUSTOMER-ID     PIC 9(5).
    05  CUSTOMER-NAME   PIC X(20).

WORKING-STORAGE SECTION.
01  EOF-FLAG           PIC X(3) VALUE 'NO'.
01  WS-CUSTOMER-RECORD PIC X(25).

PROCEDURE DIVISION.
    OPEN INPUT CUSTOMER-FILE.
    *> Loop until the READ reports end of file.
    PERFORM UNTIL EOF-FLAG = 'YES'
        READ CUSTOMER-FILE INTO WS-CUSTOMER-RECORD
            AT END
                MOVE 'YES' TO EOF-FLAG
            NOT AT END
                DISPLAY WS-CUSTOMER-RECORD
        END-READ
    END-PERFORM.
    CLOSE CUSTOMER-FILE.
    STOP RUN.

Assuming we store the above code in the file readfile.cbl, we can compile it using the following command:

cobc -free -x -o readfile readfile.cbl

Execute it using the following command:

./readfile

COBOL Tutorial #3: Advanced Data Handling

Working with Arrays (Tables) in COBOL

In COBOL, arrays are referred to as “tables,” and they are used to store multiple occurrences of the same type of data. Tables are particularly useful for handling repetitive data structures, such as lists of customers, items, or transactions.

To define a table in COBOL, use the OCCURS clause in the Data division. Here’s a basic example:

01  CUSTOMER-TABLE.
    05  CUSTOMER-RECORD OCCURS 100 TIMES.
        10  CUSTOMER-ID     PIC 9(5).
        10  CUSTOMER-NAME   PIC X(20).

In this example, CUSTOMER-TABLE holds 100 CUSTOMER-RECORD entries, each containing a CUSTOMER-ID and a CUSTOMER-NAME. You access an individual element by giving its position in parentheses, as in CUSTOMER-NAME(3). Positions start at 1, not 0.

To process elements in a table, you typically use the PERFORM statement in combination with an index variable:

PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX > 100
	DISPLAY CUSTOMER-ID(IDX) CUSTOMER-NAME(IDX)
END-PERFORM.

The following code performs initialization, data manipulation and display.

IDENTIFICATION DIVISION.
PROGRAM-ID. CustomerTableExample.

DATA DIVISION.
WORKING-STORAGE SECTION.

01  CUSTOMER-TABLE.
   05  CUSTOMER-RECORD OCCURS 100 TIMES.
       10  CUSTOMER-ID     PIC 9(5).
       10  CUSTOMER-NAME   PIC X(20).

01  IDX          PIC 9(3) VALUE 1.

PROCEDURE DIVISION.

INITIALIZE-CUSTOMERS.
   MOVE 1 TO CUSTOMER-ID(1)
   MOVE 'John Doe' TO CUSTOMER-NAME(1)

   MOVE 2 TO CUSTOMER-ID(2)
   MOVE 'Jane Smith' TO CUSTOMER-NAME(2)

   MOVE 3 TO CUSTOMER-ID(3)
   MOVE 'Alice Johnson' TO CUSTOMER-NAME(3)

   MOVE 4 TO CUSTOMER-ID(4)
   MOVE 'Bob Brown' TO CUSTOMER-NAME(4).

DISPLAY-CUSTOMERS.
   PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX > 4
       DISPLAY "Customer ID: " CUSTOMER-ID(IDX)
       DISPLAY "Customer Name: " CUSTOMER-NAME(IDX)
       DISPLAY "-------------------------------"
   END-PERFORM.

STOP RUN.

Assuming we save this code as CustomerTableExample.cbl, we can compile it using the following command:

cobc -free -x -o customertable CustomerTableExample.cbl

Execute it using this command:

./customertable

Tables in COBOL allow for efficient data handling, especially when dealing with large datasets that require repetitive operations. Mastery of tables is essential for developers working on systems that involve complex data processing.

Using COBOL’s STRING and UNSTRING Operations

COBOL provides STRING and UNSTRING operations for manipulating text data. These operations are useful for tasks such as parsing input, formatting output, and handling complex string manipulations.

String operation: The STRING statement concatenates multiple strings or variables into a single destination variable. It’s useful when you need to build a complex string from multiple sources.

Example:

IDENTIFICATION DIVISION.
PROGRAM-ID. StringExample.

DATA DIVISION.
WORKING-STORAGE SECTION.

01 FIRST-NAME   PIC X(20) VALUE "John".
01 LAST-NAME    PIC X(20) VALUE "Doe".
01 FULL-NAME    PIC X(40) VALUE SPACES.

PROCEDURE DIVISION.
MAIN-LOGIC.
   STRING FIRST-NAME DELIMITED BY SPACE
          " "        DELIMITED BY SIZE
          LAST-NAME  DELIMITED BY SPACE
          INTO FULL-NAME
   END-STRING

   DISPLAY "First Name: " FIRST-NAME
   DISPLAY "Last Name : " LAST-NAME
   DISPLAY "Full Name : " FULL-NAME

   STOP RUN.

Let’s store this code as stringops.cbl. We can compile it using the following command:

cobc -free -x -o stringops stringops.cbl

Execute it using this command:

./stringops

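In this example, STRING copies FIRST-NAME and LAST-NAME into FULL-NAME. DELIMITED BY SPACE stops copying each source field at its first space, so the trailing padding of the 20-character fields is dropped. STRING does not insert a separator on its own; any space between the two names has to be supplied as a separate source item.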

Unstring operation: The UNSTRING statement breaks a string into multiple fields based on specified delimiters. This operation is useful for parsing input strings into individual components.

Example:

IDENTIFICATION DIVISION.
PROGRAM-ID. UnstringExample.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 FULL-NAME          PIC X(30) VALUE "John Doe".
01 FIRST-NAME         PIC X(15).
01 LAST-NAME          PIC X(15).
01 DELIMITER-INDEX    PIC 9(2) VALUE 1.

PROCEDURE DIVISION.
   DISPLAY "Full Name: " FULL-NAME

   UNSTRING FULL-NAME DELIMITED BY SPACE
       INTO FIRST-NAME LAST-NAME
       WITH POINTER DELIMITER-INDEX
   END-UNSTRING.

   DISPLAY "First Name: " FIRST-NAME
   DISPLAY "Last Name: " LAST-NAME

   STOP RUN.

Let’s store this code in a file called unstring.cbl. We can compile it using the following command:

cobc -free -x -o unstring unstring.cbl

We can execute the code using the following command:

./unstring

Here, FULL-NAME is split into FIRST-NAME and LAST-NAME based on spaces. This operation is crucial when dealing with data input that comes in a single string format but needs to be processed as separate fields.
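The same approach extends to other delimiters. As a hedged sketch, the program below splits a comma-separated line into three fields and uses the TALLYING phrase to count how many receiving fields were filled. The input line and all field names are hypothetical.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. UnstringCsv.

DATA DIVISION.
WORKING-STORAGE SECTION.
*> Hypothetical comma-separated input line.
01  WS-INPUT-LINE   PIC X(40) VALUE "10001,John Doe,Boston".
01  WS-ID           PIC X(5).
01  WS-NAME         PIC X(20).
01  WS-CITY         PIC X(15).
*> Must be initialized: TALLYING adds to the existing value.
01  WS-FIELD-COUNT  PIC 9(2)  VALUE 0.

PROCEDURE DIVISION.
    *> Each comma ends one field; the spaces inside "John Doe" are kept.
    UNSTRING WS-INPUT-LINE DELIMITED BY ","
        INTO WS-ID WS-NAME WS-CITY
        TALLYING IN WS-FIELD-COUNT
    END-UNSTRING

    DISPLAY "ID    : " WS-ID
    DISPLAY "Name  : " WS-NAME
    DISPLAY "City  : " WS-CITY
    DISPLAY "Fields: " WS-FIELD-COUNT
    STOP RUN.
```

Checking the tally against the expected number of fields is one simple way to detect malformed input lines.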

Integrating COBOL with Modern Architectures

Here are some of the ways that organizations can integrate their COBOL systems with a modern IT architecture.

Modularization and Service-Oriented Architecture (SOA)

COBOL programs, traditionally monolithic, can be modularized to integrate with Service-Oriented Architecture (SOA), a design pattern that promotes the development of services with well-defined interfaces. By breaking down COBOL applications into smaller, self-contained modules, organizations can expose core business functionalities as services that other applications can consume.

This modular approach allows COBOL systems to interact with modern applications, including those built on newer languages and frameworks. To implement SOA with COBOL, developers typically encapsulate COBOL logic within web services, often using middleware tools that enable communication between COBOL applications and other services via standard protocols like SOAP or REST.
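A first step toward this kind of modularization is isolating a piece of business logic in a subprogram with a well-defined interface. The sketch below is only an illustration under stated assumptions: the program names, the parameters, and the 10% rate are hypothetical, and both programs are kept in one source file for brevity. Exposing such a module over SOAP or REST requires middleware that is not shown here.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. TAXCLIENT.

DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-AMOUNT   PIC 9(7)V99 VALUE 200.00.
01  WS-TAX      PIC 9(7)V99 VALUE ZERO.

PROCEDURE DIVISION.
    *> The caller knows only the module name and its two parameters.
    CALL 'CALCTAX' USING WS-AMOUNT WS-TAX
    DISPLAY 'Tax: ' WS-TAX
    STOP RUN.
END PROGRAM TAXCLIENT.

IDENTIFICATION DIVISION.
PROGRAM-ID. CALCTAX.

DATA DIVISION.
LINKAGE SECTION.
*> Parameters passed by the caller; no storage is allocated here.
01  LK-AMOUNT   PIC 9(7)V99.
01  LK-TAX      PIC 9(7)V99.

PROCEDURE DIVISION USING LK-AMOUNT LK-TAX.
    *> Hypothetical business rule: a flat 10% rate.
    COMPUTE LK-TAX ROUNDED = LK-AMOUNT * 0.10
    *> GOBACK returns control to the calling program.
    GOBACK.
END PROGRAM CALCTAX.
```

Because the interface is just the parameter list in the Linkage section, the calculation can later be wrapped by a service layer without changing the calling convention.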

Modernize the Data Layer

Traditionally, COBOL applications relied heavily on flat files or legacy databases like IBM’s VSAM or hierarchical databases such as IMS. However, to improve performance, scalability, and integration capabilities, organizations are increasingly moving towards modern relational databases (RDBMS) and NoSQL databases.

This modernization involves migrating data from legacy formats to modern databases, often using ETL (Extract, Transform, Load) processes. COBOL programs can then be updated to query these databases, typically through embedded SQL (EXEC SQL ... END-EXEC blocks handled by a precompiler) or through database connectivity layers such as ODBC or JDBC where the COBOL vendor supports them.

Microservices and Containerization

Microservices architecture involves decomposing large COBOL applications into smaller, independent services that can be developed, deployed, and scaled individually. This approach is beneficial for large enterprises that need to enhance agility and maintainability in their IT infrastructure.

Containerization, using technologies like Docker, allows COBOL applications to be packaged with their dependencies into containers that can run consistently across various environments. This approach simplifies the deployment process, enhances scalability, and allows COBOL applications to be part of cloud-native environments, interacting with other services in a microservices architecture.

Code Refactoring

Over the years, COBOL programs can accumulate technical debt due to outdated coding practices, lack of documentation, or the incorporation of patches and quick fixes. Refactoring involves restructuring existing COBOL code to improve its readability, maintainability, and performance without changing its external behavior.

Refactoring might include eliminating redundant code, breaking down large routines into smaller, more manageable procedures, and updating code to conform to modern COBOL standards. This process improves the code quality and makes it easier to integrate the COBOL application with other systems.

Documenting Legacy Code with Swimm

Legacy code represents a significant challenge as well as an opportunity for software organizations. Managing legacy code entails more than just dealing with untidy or outdated code; it involves transforming it into a reliable and efficient foundation that supports future development initiatives. Handling legacy code effectively can lead to considerable long-term savings and heightened productivity, marking it as a strategic priority for any R&D organization.

Swimm is a tool for enterprise developers that will help your team document legacy code in an easy, manageable, and effective way. Utilize AI to create docs about legacy code and then discover those docs directly in your IDE. Those docs then stay up to date with ongoing code changes, boosting your team’s productivity and enhancing the quality of your codebase.