4.2. Quick Tour

4.2.1. Hello World

Here is the Cobol version of "Hello World".

       IDENTIFICATION DIVISION.
       PROGRAM-ID. hello.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       PROCEDURE DIVISION.
           DISPLAY "Hello World!"
           STOP RUN.

It cannot beat the "scripting" languages in terms of conciseness, but at least the program looks well organized and readable. These are indeed two characteristics of Cobol programs. A Cobol program is organized in a hierarchical structure consisting of divisions, sections, paragraphs, and sentences. At the top of the hierarchy are the four divisions, which we can already see in our first program. The identification division gives some general information about the program (in our case just its name). The next two divisions, environment and data, deal with files and data definitions. Since our "Hello World" program just writes to the screen, these divisions are empty. The actual program is contained in the procedure division. In our example, it contains the two statements that print the message and stop the program. In the following examples we often omit divisions which are not required or are clear from the preceding examples.

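The format of a Cobol program is line-oriented (showing Cobol's punchcard origins). A line is divided into five areas as described in Table 4-1. In the traditional fixed format these are the line number area (columns 1 to 6), the indicator area (column 7), area A (columns 8 to 11), area B (columns 12 to 72), and the program identification area (columns 73 to 80).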

The line number and program identification areas were useful in case two piles of punchcards got mixed up. For areas A and B, what matters is where the code starts. As we have seen in the "Hello World" program, the organizational identifiers start in area A, and the code statements in area B.
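As a small illustration of the areas, here is a sketch of "Hello World" with the line number area filled in. The sequence numbers and the comment lines are our own additions, and the column positions given in the comments assume the traditional fixed format.

```cobol
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. hello.
000300* An asterisk in column 7 turns the line into a comment.
000400 PROCEDURE DIVISION.
000500* Division headers start in area A (column 8),
000600* statements start in area B (column 12).
000700     DISPLAY "Hello World!"
000800     STOP RUN.
```

The compiler ignores the six leading digits, so the listings in this chapter simply leave these columns blank.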

4.2.2. Variables and Arithmetic

Cobol is tailored for a particular kind of application: processing files consisting of character data organized in fixed length records. This approach leads to a different way of defining data structures when compared to "modern" languages. To describe a field in a record of characters we need to know how many characters are used and how these characters are interpreted (e.g., as integers, decimal numbers, or strings). In Cobol terms, we have to define the picture of a field.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.    addition.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  A  PICTURE 99V999.
       PROCEDURE DIVISION.
           MOVE 1.5 TO A
           ADD  1.5 TO A
           DISPLAY "A=", A
           STOP RUN.

Running the program results in the output A=03.000. The most interesting line is the definition of the variable A in the working storage section. It starts with the elementary level 77 (more on levels below) followed by the name of the variable, the PICTURE keyword (most often abbreviated PIC), and the picture format 99V999. The format strings use a syntax which makes it easy to literally picture the field. Each 9 stands for a decimal digit and the V indicates the position of the decimal point. Hence, our variable A represents a decimal number which occupies five characters, each of which must be a decimal digit, and the last three digits are interpreted as the decimals. The string 12345 is interpreted as the decimal number 12.345 (which explains the way the result of our program is displayed). Besides the 9 representing a decimal digit, the most frequently used format symbol is the letter X, which represents an arbitrary character. We will see plenty of examples in the following sections.
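To make the X symbol concrete, here is a minimal sketch with two character fields. The program name and the field names LANG-NAME and GREETING are made up for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. xfields.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Ten arbitrary characters.
       77  LANG-NAME  PICTURE XXXXXXXXXX.
      * Five arbitrary characters, too short for the greeting.
       77  GREETING   PICTURE XXXXX.
       PROCEDURE DIVISION.
           MOVE "Cobol" TO LANG-NAME
           MOVE "Hello World" TO GREETING
           DISPLAY "[", LANG-NAME, "]"
           DISPLAY "[", GREETING, "]"
           STOP RUN.
```

The program should print [Cobol     ] and [Hello]: a character field is filled from the left, padded with spaces if the value is shorter, and cut off on the right if it is longer. A compiler may warn about the second MOVE because the literal does not fit.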

Apart from the picture clause, the program demonstrates the rather wordy (but readable) way to perform calculations with Cobol. Assignment uses the MOVE TO statement, and similarly the ADD TO statement is used to add a value to a variable. There are equally expressive ways to subtract, multiply, and divide.

       WORKING-STORAGE SECTION.
       77  A  PICTURE 99V999.
       77  B  PICTURE 99V999.
       PROCEDURE DIVISION.
           MOVE 3.5 TO A
           MULTIPLY 2 BY A
           DISPLAY "A=", A
           MULTIPLY A BY 3 GIVING B
           DISPLAY "B=", B
           STOP RUN.

result:
A=07.000
B=21.000

Without the GIVING clause, the result of the multiplication is stored in the second operand (which therefore must be a variable). The division statement DIVIDE has even more variations using either BY (mathematical order of arguments, dividend first) or INTO (opposite order).

       WORKING-STORAGE SECTION.
       77  A  PICTURE 99V999.
       77  B  PICTURE 99V999.
       PROCEDURE DIVISION.
           MOVE 50 TO A
           DIVIDE 10 INTO A
           DISPLAY "A=", A
           DIVIDE A BY 2 GIVING B
           DISPLAY "B=", B
           STOP RUN.

result:
A=05.000
B=02.500

The DIVIDE INTO version stores the result by default (without a GIVING clause) in the second operand.
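One more variation worth a quick sketch is the REMAINDER clause, which can be combined with GIVING for integer division. The program name and the fields Q and R are chosen only for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. divrem.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Q receives the quotient, R the remainder.
       77  Q  PICTURE 99.
       77  R  PICTURE 99.
       PROCEDURE DIVISION.
           DIVIDE 17 BY 5 GIVING Q REMAINDER R
           DISPLAY "Q=", Q
           DISPLAY "R=", R
           STOP RUN.
```

Since Q has no decimal places, the quotient is cut off at 3 and the remainder is 17 - 3 * 5 = 2, so the program should print Q=03 and R=02.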

ADD and SUBTRACT can also take multiple arguments. Also note the special constant ZEROS used to reset the variable A.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  A  PICTURE 99V999.
       PROCEDURE DIVISION.
           MOVE ZEROS TO A
           ADD  1.5 2.5 10 TO A
           DISPLAY "A=", A
           SUBTRACT 3 1 FROM A
           DISPLAY "A=", A
           STOP RUN.

result:
A=14.000
A=10.000

For more involved computations, Cobol offers the COMPUTE statement which allows us to use arithmetic formulas as in the following example. And, yes, Cobol uses the correct precedence rules.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  A  PICTURE 99V999 VALUE 10.0 .
       PROCEDURE DIVISION.
           COMPUTE A = A + 4 + 1.5 * 3
           DISPLAY "A=", A
           STOP RUN.

result:
A=18.500

The example also demonstrates an initializer for the variable A using a VALUE clause.
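As in other languages, parentheses override the precedence rules. The following sketch uses two integer fields to compare both forms; the program name is made up for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. parens.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  A  PICTURE 99.
       77  B  PICTURE 99.
       PROCEDURE DIVISION.
      * Multiplication binds tighter than addition: 4 + 6.
           COMPUTE A = 4 + 2 * 3
      * Parentheses force the addition first: 6 * 3.
           COMPUTE B = (4 + 2) * 3
           DISPLAY "A=", A
           DISPLAY "B=", B
           STOP RUN.
```

The program should print A=10 and B=18.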

All these computations were within the limits of our variable A. But what happens if the result does not fit into the assigned field?

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  A  PICTURE 99V999 VALUE 10.
       PROCEDURE DIVISION.
           MULTIPLY 10 BY A
           DISPLAY "A=", A
           STOP RUN.

result:
A=00.000

The result vanishes! We will see later on how to detect overflows in the program and act accordingly. For now, we must assume that the fields are defined large enough.

4.2.3. Subroutines and Control Statements

After some basic expressions, we have usually tackled functions as the main means to structure a program. Cobol does not support functions with arguments and return values, but uses subroutines to organize a program into smaller units. Subroutines are simply paragraphs of the procedure division. Each paragraph is introduced with a paragraph name starting in area A and consists of a sequence of statements (all beginning in area B). Without arguments and return values, the communication between these units relies on shared access to the objects defined in the data division. The statements in a paragraph can be called (making it a subroutine) using the PERFORM statement.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. hello.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       PROCEDURE DIVISION.
       MAIN.
           PERFORM DISPLAY-HELLO
           PERFORM DISPLAY-HELLO
           PERFORM DISPLAY-BYE
           PERFORM DISPLAY-BYE.
           STOP RUN.
       DISPLAY-HELLO.
           DISPLAY "Hello World!".
       DISPLAY-BYE.
           DISPLAY "Bye!".

result:
Hello World!
Hello World!
Bye!
Bye!

The first paragraph is executed when starting the program (it does not have to be called MAIN). The PERFORM statement executes the named paragraph. At the end of the paragraph, control is returned to the calling paragraph. Note that the STOP RUN statement is required at the end of the main routine. Otherwise, the program will continue and execute the two subroutines (resulting in another "Hello World!" and "Bye!" message).
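Since paragraphs take no arguments and return no values, "input" and "output" travel through fields of the data division. Here is a minimal sketch of that style; the program name, the paragraph COMPUTE-SQUARE, and the fields N and N-SQUARED are invented for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. sqdemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * N plays the role of the argument,
      * N-SQUARED the role of the return value.
       77  N          PICTURE 99 VALUE 4.
       77  N-SQUARED  PICTURE 9999.
       PROCEDURE DIVISION.
       MAIN.
           PERFORM COMPUTE-SQUARE
           DISPLAY "N-SQUARED=", N-SQUARED
           STOP RUN.
       COMPUTE-SQUARE.
           MULTIPLY N BY N GIVING N-SQUARED.
```

The program should print N-SQUARED=0016. The caller has to know which fields the subroutine reads and writes, because nothing in the PERFORM statement says so.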

If we want to repeat an action multiple times, we can add the TIMES clause to the PERFORM statement.

       PROCEDURE DIVISION.
       MAIN.
           PERFORM DISPLAY-HELLO 3 TIMES
           STOP RUN.
       DISPLAY-HELLO.
           DISPLAY "Hello World!".

result:
Hello World!
Hello World!
Hello World!

To repeat an action until a certain condition becomes true, we combine the PERFORM statement with the UNTIL clause followed by the condition. By default, the condition is tested before each iteration.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. count.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  I  PICTURE 99 VALUE 0.
       PROCEDURE DIVISION.
       MAIN.
           PERFORM LOOP UNTIL I = 5
           STOP RUN.
       LOOP.
           DISPLAY "I=", I
           ADD 1 TO I.

result:
I=00
I=01
I=02
I=03
I=04

Cobol 85 also has the equivalent of the repeat-until or do-while loop found in other languages: we just add WITH TEST AFTER to the PERFORM statement.

       WORKING-STORAGE SECTION.
       77  I  PICTURE 99 VALUE 0.
       PROCEDURE DIVISION.
       MAIN.
           PERFORM LOOP WITH TEST AFTER UNTIL I = 5
           STOP RUN.
       LOOP.
           DISPLAY "I=", I
           ADD 1 TO I.

For this example it does not make any difference whether we check the condition before or after the loop body, but in some cases we want the body to be executed at least once, or the condition only makes sense at the end of the loop.
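A case where the two forms do differ is a condition that already holds when the loop is reached. The following sketch (the program name is made up) runs both forms on a counter that starts at 5.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. testafter.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  I  PICTURE 99 VALUE 5.
       PROCEDURE DIVISION.
       MAIN.
      * The condition already holds, so LOOP is skipped.
           PERFORM LOOP UNTIL I >= 5
           DISPLAY "test before: I=", I
      * The condition is checked only after LOOP has run once.
           PERFORM LOOP WITH TEST AFTER UNTIL I >= 5
           DISPLAY "test after:  I=", I
           STOP RUN.
       LOOP.
           ADD 1 TO I.
```

The first DISPLAY should show I=05 because the loop body never ran, the second I=06 because the body ran once before the test.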

We can achieve the same thing without a subroutine using the second form of PERFORM UNTIL which takes a sequence of statements (the body of the loop) instead of the subroutine.

       WORKING-STORAGE SECTION.
       77  I  PICTURE 99 VALUE 0.
       PROCEDURE DIVISION.
       MAIN.
           PERFORM WITH TEST AFTER UNTIL I = 5
               DISPLAY "I=", I
               ADD 1 TO I
           END-PERFORM
           STOP RUN.

result:
I=00
I=01
I=02
I=03
I=04

In fact, Cobol also offers an integer loop using the VARYING clause of the PERFORM statement.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. hello.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  I  PICTURE 99 VALUE 0.
       PROCEDURE DIVISION.
       MAIN.
           PERFORM VARYING I FROM 1 BY 2 UNTIL I > 10
               DISPLAY "I=", I
           END-PERFORM
           STOP RUN.

result:
I=01
I=03
I=05
I=07
I=09

It is also possible to nest multiple integer loops, for example, when indexing a multidimensional array.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. hello.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  I  PICTURE 99.
       77  J  PICTURE 99.
       PROCEDURE DIVISION.
       MAIN.
           PERFORM
               VARYING I FROM 1 BY 1 UNTIL I > 3
               AFTER   J FROM I BY 1 UNTIL J > 3
               DISPLAY "I=", I, " J=", J
           END-PERFORM
           STOP RUN.

result:
I=01 J=01
I=01 J=02
I=01 J=03
I=02 J=02
I=02 J=03
I=03 J=03

Continuing with control statements, Cobol of course supports the basic if-then-else.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. hello.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       PROCEDURE DIVISION.
       MAIN.
           IF 10 IS LESS THAN 100
           THEN
               DISPLAY "Yes, that's right"
           ELSE
               DISPLAY "No, that's wrong"
           END-IF
           STOP RUN.

Note that the lengthy (but readable) IS LESS THAN can be replaced by a < sign. Multiple if statements can be combined into a case statement, which is called EVALUATE in Cobol.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. hello.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  INPUT-VALUE  PICTURE 99 VALUE 0.
       PROCEDURE DIVISION.
       MAIN.
           DISPLAY "value: "
           ACCEPT INPUT-VALUE
           EVALUATE INPUT-VALUE
               WHEN 1
                   DISPLAY "ONE"
               WHEN 2
                   DISPLAY "TWO"
               WHEN 3
                   DISPLAY "THREE"
               WHEN OTHER
                   DISPLAY "MORE"
           END-EVALUATE
           STOP RUN.

We use the opportunity to write our first interactive program: the ACCEPT statement asks the user for the value to be used in the case statement.
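For comparison, here is a sketch of the same decision written as nested if statements. To keep it non-interactive, the value is fixed to 2 with a VALUE clause instead of being read with ACCEPT; the program name is made up.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ifchain.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  INPUT-VALUE  PICTURE 99 VALUE 2.
       PROCEDURE DIVISION.
       MAIN.
      * Each ELSE branch holds the next test of the chain.
           IF INPUT-VALUE = 1
               DISPLAY "ONE"
           ELSE
               IF INPUT-VALUE = 2
                   DISPLAY "TWO"
               ELSE
                   IF INPUT-VALUE = 3
                       DISPLAY "THREE"
                   ELSE
                       DISPLAY "MORE"
                   END-IF
               END-IF
           END-IF
           STOP RUN.
```

The program should print TWO. Every additional case adds another level of nesting, which is exactly what EVALUATE avoids.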

4.2.4. Data Structures

In the section on variables and arithmetic we defined a single variable in the working storage section of the data division. We have used the level number 77 to indicate that the variable is an elementary field. The level numbers will become much clearer when looking at the following definition of a nested structure modelling a person.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.    addition.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  PERSON.
           05  FIRST-NAME          PIC X(20).
           05  LAST-NAME           PIC X(20).
           05  ADDR.
               10  POSTAL-CODE     PIC X(5).
               10  CITY            PIC X(20).
               10  STREET          PIC X(20).
               10  STREET-NO       PIC X(5).
       PROCEDURE DIVISION.
           MOVE SPACES TO PERSON
           MOVE '40547DUESSELDORF' TO ADDR
           DISPLAY 'CITY=', CITY
           STOP RUN.

A person consists of a first name, a last name, and an address. An address consists of postal code, city, street, and street number. The nested structure is defined in Cobol using level numbers. Fields with the same level number belong to the same level of the nested structure. In our example we start with level 01, which has to be in area A. Since this field does not have a picture definition, it must be a structure. On the next level (we have chosen the level number 05) are first name, last name, and address. The first two fields are elementary fields and therefore must have a picture clause defining their size and format. The address field is again a structure which needs to be further decomposed into elementary fields. Note the repetition counts in parentheses, which shorten the field formats: X(20) stands for twenty X symbols.

We can access the structure on all levels. The first statement of the procedure division resets the whole person structure. Next, we set the address. And finally, we display the city as an individual field. If the field names are unique (as in our example), the fields do not have to be qualified with their surrounding structure (in C we would need to write person.addr.city to access the city). If the same field name occurs in multiple places, it has to be qualified using the OF (or synonymously IN) clause.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.   persons.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  PERSON-1.
           05  FIRST-NAME          PIC X(20).
           05  LAST-NAME           PIC X(20).
       01  PERSON-2.
           05  FIRST-NAME          PIC X(20).
           05  LAST-NAME           PIC X(20).
       PROCEDURE DIVISION.
           MOVE 'Homer' TO FIRST-NAME OF PERSON-2
           STOP RUN.

Besides composing fields into structures, Cobol supports arrays. In the simplest case, we can define an array of fixed size of some elementary field by adding an OCCURS clause to the field definition.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.    addition.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  MY-ARRAY.
           05 A OCCURS 10 TIMES    PIC 99.
       77  I                       PIC 99 VALUE 0.
       PROCEDURE DIVISION.
           PERFORM VARYING I FROM 1 BY 1 UNTIL I > 10
               COMPUTE A (I) = 5 + 2 * I
           END-PERFORM
           DISPLAY "A(5)=", A (5)
           STOP RUN.

result:
A(5)=15

The individual elements of the array are accessed using the index in parentheses. Like Fortran and Smalltalk (and unlike Lisp and the C family), indexing starts at one. There should be white space before the open parenthesis (although Tiny Cobol does not complain if we omit the space), and we must not put any white space inside of the parentheses.

Similarly, we can define fixed-length arrays of structures by adding the OCCURS clause to the structure level.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.    addition.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
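      * OCCURS is not allowed on level 01 in standard Cobol,
      * so the repeated structure is nested in a wrapper group.
       01  PERSONS.
           05  PERSON OCCURS 10 TIMES.
               10  FIRST-NAME      PIC X(20).
               10  LAST-NAME       PIC X(20).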
       PROCEDURE DIVISION.
           MOVE 'Homer' TO FIRST-NAME OF PERSON(3)
           DISPLAY 'Third person:', FIRST-NAME OF PERSON(3)
           STOP RUN.

4.2.5. Files and Records

Since file (batch) processing is such a dominant area for Cobol programs, most Cobol books start with what we will cover next: the definition of files and records. Here is a small program which reads a file containing the items of an invoice and computes the total of the invoice.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. invoice.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL. SELECT INVOICE ASSIGN TO 'invoice.dat'
                     ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  INVOICE     LABEL RECORDS ARE STANDARD.
       01  ITEM.
           05  NAME                PIC X(20).
           05  AMOUNT              PIC 9(3)V.
           05  PRICE               PIC 9999V99.
       WORKING-STORAGE SECTION.
       77  MORE-RECORDS            PIC XXX     VALUE 'YES'.
       77  TOTAL                   PIC 9(5)V99  VALUE ZEROS.
       PROCEDURE DIVISION.
       MAIN.
           OPEN INPUT INVOICE
           PERFORM UNTIL MORE-RECORDS = 'NO '
               READ INVOICE
                    AT END
                       MOVE 'NO ' TO MORE-RECORDS
                    NOT AT END
                       COMPUTE TOTAL = TOTAL + AMOUNT * PRICE
               END-READ
           END-PERFORM
           DISPLAY 'TOTAL=', TOTAL
           CLOSE INVOICE
           STOP RUN.

The file contains records of fixed length. Each record has three fields: the name of the item, the amount of items bought, and the price per item. We recognize the structure definition in the file section, which looks just like the structure definitions in the working storage section we have used before. The new part is the mapping to a file. Cobol separates the logical file from the physical implementation. The logical view is defined in the file section of the data division. Each logical file is defined by a file description (FD) entry in the file section. The environment division contains the mappings of the logical files defined in the data division to the physical files controlled by the operating system. The FILE-CONTROL paragraph holds one SELECT entry per file, which assigns the name of the logical file to a physical file name. The environment division is the only part of the Cobol program which depends on the operating system and has to be adapted when migrating to a new environment.

Since we are running on a PC, we have added the ORGANIZATION IS LINE SEQUENTIAL clause, which causes Cobol to interpret each line in the file as a record. Without this clause, the records follow each other in the file without any separator (newline) between them.

Once we have defined the record structure of our file, a simple READ statement reads a new record into the structure so that we can access the individual fields. The READ statement takes two blocks: one for the normal case when a new record has been read and one for reaching the end of the file. In our case we either update the total or set the MORE-RECORDS flag to NO.
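To try the invoice program we need a matching input file. One way to get it, sketched here under the assumption that the compiler supports LINE SEQUENTIAL files for output as well, is a small companion program that writes two records with the same record layout. The program name mkinvoice and the two sample items are made up for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. mkinvoice.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL. SELECT INVOICE ASSIGN TO 'invoice.dat'
                     ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  INVOICE.
      * Same record layout as in the invoice program.
       01  ITEM.
           05  NAME                PIC X(20).
           05  AMOUNT              PIC 9(3)V.
           05  PRICE               PIC 9999V99.
       PROCEDURE DIVISION.
       MAIN.
           OPEN OUTPUT INVOICE
      * Fill the record fields, then write the record.
           MOVE 'Widget' TO NAME
           MOVE 2 TO AMOUNT
           MOVE 10.50 TO PRICE
           WRITE ITEM
           MOVE 'Gadget' TO NAME
           MOVE 1 TO AMOUNT
           MOVE 99.99 TO PRICE
           WRITE ITEM
           CLOSE INVOICE
           STOP RUN.
```

Note that WRITE names the record (ITEM) while READ names the file (INVOICE). Each line of the resulting file holds 29 characters: the name padded to 20 characters, three digits for the amount, and six digits for the price without a decimal point. Running the invoice program on this file should give a total of 2 * 10.50 + 1 * 99.99 = 120.99.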