4.3. More Features

4.3.1. Table Search

Since batch programs often deal with reference data kept in tables, e.g., a list of prices, Cobol makes it easy to look up data in tables. One solution is the use of indexed arrays in combination with the SEARCH statement.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.    price-table.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ITEM-TABLE.
           05 ITEMS OCCURS 3 TIMES INDEXED BY ITEM-INDEX.
               10  ITEM-NAME       PIC X(20).
               10  ITEM-PRICE      PIC 999V99.
       77  ITEM-INPUT  PIC X(20).
       PROCEDURE DIVISION.
       MAIN.
           PERFORM INIT-PRICE-TABLE.
           DISPLAY 'ENTER NAME OF ITEM:'
           ACCEPT ITEM-INPUT
           SET ITEM-INDEX TO 1
           SEARCH ITEMS
               AT END DISPLAY 'UNKNOWN ITEM'
               WHEN ITEM-INPUT = ITEM-NAME (ITEM-INDEX)
                   DISPLAY 'PRICE=', ITEM-PRICE (ITEM-INDEX)
           STOP RUN.
       INIT-PRICE-TABLE.
           MOVE 'APPLE'    TO ITEM-NAME  (1)
           MOVE 0.50       TO ITEM-PRICE (1)
           MOVE 'ORANGE'   TO ITEM-NAME  (2)
           MOVE 1.50       TO ITEM-PRICE (2)
           MOVE 'PEAR'     TO ITEM-NAME  (3)
           MOVE 0.75       TO ITEM-PRICE (4).

The working storage section defines a table which contains the prices for some items. To keep the example simple, we fill our price table with hard-coded values in the program itself. Normally, the table would be read from reference data stored in a file or database.

The first new construct is the index attached to the array of ITEMS. We can view it as a loop variable for the array. We can set the index using the SET statement, but we do not have to define the index variable explicitly in the working storage section as we have done earlier. The compiler takes care of the correct size of the variable.

The index is used implicitly in the SEARCH statement. The SEARCH statement is comparable to the for-in loops in languages like Python or Perl, only that the loop variable is not part of the loop statement, but defined as part of the array. The interesting part is the body of the SEARCH statement. It consists of the (optional) AT END clause telling what to do if the end of the array is reached and any number of WHEN clauses defining the search conditions and actions. For each item in the table, the search loop checks the conditions of the WHEN clauses. If the condition is true, the associated action is performed and the search loop left.

While we could easily achieve the same with a PERFORM statement, the SEARCH command turns the focus away from the (imperative) loop to what we are looking for. Because of this, it is easy to exchange the algorithm used to find the entry. Adding the keyword ALL to the SEARCH statement, we switch from linear to binary search.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.    price-table.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ITEM-TABLE.
           05 ITEMS OCCURS 3 TIMES
                   ASCENDING KEY IS ITEM-NAME INDEXED BY ITEM-INDEX.
               10  ITEM-NAME       PIC X(20).
               10  ITEM-PRICE      PIC 999V99.
       77  ITEM-INPUT  PIC X(20).
       PROCEDURE DIVISION.
       MAIN.
           PERFORM INIT-PRICE-TABLE.
           DISPLAY 'ENTER NAME OF ITEM:'
           ACCEPT ITEM-INPUT
           SEARCH ALL ITEMS
               AT END DISPLAY 'UNKNOWN ITEM'
               WHEN ITEM-INPUT = ITEM-NAME (ITEM-INDEX)
                   DISPLAY 'PRICE=', ITEM-PRICE (ITEM-INDEX)
           STOP RUN.
       INIT-PRICE-TABLE.
           MOVE 'APPLE'    TO ITEM-NAME  (1)
           MOVE 0.50       TO ITEM-PRICE (1)
           MOVE 'ORANGE'   TO ITEM-NAME  (2)
           MOVE 1.50       TO ITEM-PRICE (2)
           MOVE 'PEAR'     TO ITEM-NAME  (3)
           MOVE 0.75       TO ITEM-PRICE (4).

Of course, the binary search requires the table to be sorted (which we ensured in the initialization routine), and we have to specify the sort order of the table in the working storage section.

4.3.2. Macros (Copy Books)

Cobol's macro mechanism uses copy books and the COPY statement. Suppose we need the same structure or subroutine over and over again in multiple Cobol programs. Instead of copying the source code and performing some modifications manually, we can store the reused code in a separate file (a copy book) and call the COPY statement to let the Cobol compiler copy the code for us including some adaptations we might need. Here is the "Hello World!" of copy books:

01 HELLO-CONSTANTS.
    05 HELLO-TEXT PIC X(20) VALUE 'Hello World!'.

The text is stored in a file called hello.cpy (this is for Tine Cobol; different environments might have different ways to handle copy books). And here is the main program using it.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.    copy-sample.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       COPY hello REPLACING
           HELLO-TEXT BY HELLO-MESSAGE
           ==World== BY ==You==.
       77  A  PICTURE 99V999.
       PROCEDURE DIVISION.
           DISPLAY HELLO-MESSAGE.
           STOP RUN.

We reference the copy by its base name (without the .cpy suffix) and perform two replacements. Identifiers such as HELLO-TEXT can be replaced directly. The textual replacement inside of the message string uses == to delimit the original string and its replacement.

Copy books can not only be used for the data division, but for the environment and procedure division as well. As you can imagine, copy books are a very powerful mechanism to avoid code duplication. They are used heavily in large Cobol applications.

4.3.3. Subprograms

We have said in the beginning that Cobol does not have the notion of functions with arguments and return values. However, a similar effect can be achived by calling programs.

To turn a program into a subprogram which can be called by other programs, we have to declare the arguments of the subprogram in the LINKAGE section of the data division (right before the procedure devision so that it looks like a signature for the program).

       IDENTIFICATION DIVISION.
       PROGRAM-ID.  SUBPROG.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       LINKAGE SECTION.
       01  X   PIC X(20).
       PROCEDURE DIVISION USING X.
           DISPLAY 'X=', X
           MOVE 'We were here!' TO X
           EXIT PROGRAM.

The header of the procedure division contains the "parameter list" in the USING clause. Parameters to subprograms are always in-out parameters (they are passed by reference). The subprogram can read them and set them to new values. The calling program now calls the subprogram by name specifying which fields to pass as the arguments to the subprogram.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.    call-sample.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  A   PIC X(20) VALUE 'Hello World!'.
       PROCEDURE DIVISION.
           CALL 'SUBPROG' USING A
           DISPLAY 'A=', A
           STOP RUN.

result:
X=Hello World!
A=We were here

First, the subprogram prints the original message as passed from the main program. After returning from the subprogram, the main program prints the message which has been changed by the subprogram.

4.3.4. Sort and Merge

A common task for a batch application is to sort a file consisting of fixed length records with respect to some key fields. The following example sorts a file with respect to the first two characters of each record (of 40 characters).

       IDENTIFICATION DIVISION.
       PROGRAM-ID. sort-sample.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT UNSORTED-FILE    ASSIGN TO 'unsorted.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT SORT-FILE        ASSIGN TO 'sort.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT SORTED-FILE      ASSIGN TO 'sorted.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  UNSORTED-FILE LABEL RECORDS ARE STANDARD.
       01  UNSORTED-REC            PIC X(40).
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-NO             PIC XX.
           05  FILLER              PIC X(38).
       FD  SORTED-FILE  LABEL RECORDS ARE STANDARD.
       01  SORTED-REC              PIC X(40).
       PROCEDURE DIVISION.
       MAIN.
           SORT  SORT-FILE
               ASCENDING KEY SORT-NO
               USING UNSORTED-FILE
               GIVING SORTED-FILE
           STOP RUN.                 

Three files have to be provided. The original unsorted file, the file for the sorted result, and the sort file used for intermediate storage. The sort key (here: SORT-NO) has to be defined for the sort file only. For the unsorted input file and the sorted result, we only specify the length of the record.

In the SORT statement, the list of sort keys is defined by the KEY clauses specifying either ASCENDING or DESCENDING sort order. The correct syntax requires the keyword ON before the first key clause, but Tiny Cobol does not accept this. If the unsorted input file unsorted.dat contains the following four records

20xxxxxxxxxxxfirst recordxxxxxxxxxxxxxxx
10xxxxxxxxxxxsecond recordxxxxxxxxxxxxxx
15xxxxxxxxxxxthird recordxxxxxxxxxxxxxxx
05xxxxxxxxxxxforth recordxxxxxxxxxxxxxxx

the result sorted.dat will be a nicely sorted file containing the records sorted with respect to the first two characters.

05xxxxxxxxxxxforth recordxxxxxxxxxxxxxxx
10xxxxxxxxxxxsecond recordxxxxxxxxxxxxxx
15xxxxxxxxxxxthird recordxxxxxxxxxxxxxxx
20xxxxxxxxxxxfirst recordxxxxxxxxxxxxxxx

The sort command provides hooks for input and output. Instead of reading from a file, a subroutine can be called which fills the sort file. In the following example, the input routine filters out records whose amount field is zero.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. sort-sample.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUT-FILE    ASSIGN TO 'input.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT SORT-FILE        ASSIGN TO 'sort.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT SORTED-FILE      ASSIGN TO 'sorted.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE LABEL RECORDS ARE STANDARD.
       01  INPUT-REC.
           05  INPUT-TYPE       PIC XX.
           05  INPUT-AMOUNT     PIC 9(8).
           05  FILLER              PIC X(30).
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-NO             PIC XX.
           05  FILLER              PIC X(38).
       FD  SORTED-FILE  LABEL RECORDS ARE STANDARD.
       01  SORTED-REC              PIC X(40).
       WORKING-STORAGE SECTION.
       77  MORE-RECORDS PIC X(3) VALUE 'YES'.
       PROCEDURE DIVISION.
       MAIN.
           SORT SORT-FILE
               ASCENDING KEY SORT-NO
               INPUT PROCEDURE READ-INPUT
               GIVING SORTED-FILE
           STOP RUN.
       READ-INPUT.
           OPEN INPUT INPUT-FILE
           PERFORM UNTIL MORE-RECORDS = 'NO '
               READ INPUT-FILE
                   AT END
                       MOVE 'NO ' TO MORE-RECORDS
                   NOT AT END
                       PERFORM HANDLE-INPUT-RECORD
           END-PERFORM
           CLOSE INPUT-FILE.
       HANDLE-INPUT-RECORD.
           IF INPUT-AMOUNT = ZEROS
           THEN
               CONTINUE
           ELSE
               MOVE INPUT-REC TO SORT-REC
               RELEASE SORT-REC
           END-IF.

We have replaced the USING clause specifying the input file by the input procedure READ-INPUT. This subroutine contains the standard loop reading the input file and calls HANDLE-INPUT-RECORD for each record in the input file. This subroutine only copies records with a non-empty amount field to the sort record and tell the sort input procedure to advance using the RELEASE statement. The RELEASE statement is equivalent to the WRITE statement used for writing to normal files.

Similarly, we can use the output procedure hook to use our own routine to handle the sorted data instead of the standard output to a file using the GIVING clause.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. sort-sample.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT UNSORTED-FILE    ASSIGN TO 'unsorted.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT SORT-FILE        ASSIGN TO 'sort.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  UNSORTED-FILE LABEL RECORDS ARE STANDARD.
       01  UNSORTED-REC            PIC X(40).
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-NO             PIC XX.
           05  FILLER              PIC X(38).
       WORKING-STORAGE SECTION.
       77  MORE-RECORDS PIC X(3) VALUE 'YES'.
       PROCEDURE DIVISION.
       MAIN.
           SORT  SORT-FILE
               ASCENDING KEY SORT-NO
               USING UNSORTED-FILE
               OUTPUT PROCEDURE WRITE-OUTPUT
           STOP RUN.
       WRITE-OUTPUT.
           PERFORM UNTIL MORE-RECORDS = 'NO '
               RETURN SORT-FILE
                   AT END
                       MOVE 'NO ' TO MORE-RECORDS
                   NOT AT END
                       PERFORM HANDLE-OUTPUT-RECORD
           END-PERFORM.
       HANDLE-OUTPUT-RECORD.
           DISPLAY 'NO=', SORT-NO.

result:
NO=05
NO=10
NO=15
NO=20

Similar to the input routine which uses RELEASE instead of WRITE to write to the special sort file, the output routine reads a record from the sort file using the RETURN statement instead of the normal READ. Otherwise, the output routine contains the usual read loop. In the example, we just display the sort number of each record.

Often, we would like to merge two sorted files into one, for example, when applying some updates to a large master file. Similar to the sort command, Cobol provides a special statement for this task. It works just like the sort command only that it takes multiple files in the USING clause.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. sort-sample.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUT-FILE-1     ASSIGN TO 'input1.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT INPUT-FILE-2     ASSIGN TO 'input2.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT SORT-FILE        ASSIGN TO 'sort.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT MERGED-FILE      ASSIGN TO 'merged.dat'
                  ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE-1 LABEL RECORDS ARE STANDARD.
       01  INPUT-REC            PIC X(40).
       FD  INPUT-FILE-2 LABEL RECORDS ARE STANDARD.
       01  INPUT-REC            PIC X(40).
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-NO             PIC XX.
           05  FILLER              PIC X(38).
       FD  MERGED-FILE  LABEL RECORDS ARE STANDARD.
       01  MERGED-REC              PIC X(40).
       PROCEDURE DIVISION.
       MAIN.
           SORT  SORT-FILE
               ASCENDING KEY SORT-NO
               USING INPUT-FILE-1, INPUT-FILE-2
               GIVING MERGED-FILE
           STOP RUN.                 

Merging the "master" file input1.dat containing the four records

10xxxxxx input1 record 1 xxxxxxxxxxxxxxx
20xxxxxx input1 record 2 xxxxxxxxxxxxxxx
30xxxxxx input1 record 3 xxxxxxxxxxxxxxx
40xxxxxx input1 record 4xxxxxxxxxxxxxxxx

with the "update" file input2.dat containing the two records

05xxxxxx input2 record 1 xxxxxxxxxxxxxxx
25xxxxxx input2 record 4 xxxxxxxxxxxxxxx

results in the new sorted file merged.dat.

05xxxxxx input2 record 1 xxxxxxxxxxxxxxx
10xxxxxx input1 record 1 xxxxxxxxxxxxxxx
20xxxxxx input1 record 2 xxxxxxxxxxxxxxx
25xxxxxx input2 record 4 xxxxxxxxxxxxxxx
30xxxxxx input1 record 3 xxxxxxxxxxxxxxx
40xxxxxx input1 record 4xxxxxxxxxxxxxxxx

4.3.5. Screen Definitions

The interactive input and output we have used so far is very limited: we display some message with the DISPLAY statement and ask for a single variable with ACCEPT. These statements can also be used to define much more sophisticated dialogs (not quite what we consider graphical user interfaces these days, but the kind of terminal screen you typically see in banks). Instead of a single variable, you provide the name of a complete screen defined in the screen section of the data division. Here is the interactive demonstration of the case statement using a screen definition.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. hello.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77  RESULT PICTURE X(10) VALUE SPACES.
       77  INPUT-VALUE  PICTURE 99 VALUE 0.
       SCREEN SECTION.
       01  INPUT-SCREEN.
           05  BLANK SCREEN.
           05  LINE 1 COLUMN 1 VALUE 'RESULT:'.
           05         COLUMN 12 PIC X(10) FROM RESULT.
           05  LINE 2 COLUMN 1 VALUE 'INPUT:'.
           05         COLUMN 8 PIC 99 TO INPUT-VALUE.
       PROCEDURE DIVISION.
       MAIN.
           PERFORM UNTIL INPUT-VALUE = 99
               DISPLAY INPUT-SCREEN
               ACCEPT INPUT-SCREEN
               EVALUATE INPUT-VALUE
                   WHEN 1
                       MOVE 'ONE' TO RESULT
                   WHEN 2
                       MOVE 'TWO' TO RESULT
                   WHEN 3
                       MOVE 'THREE' TO RESULT
                   WHEN OTHER
                       MOVE 'MORE' TO RESULT
               END-EVALUATE
           END-PERFORM
           STOP RUN.

When running the program, you will see an empty black screen showing the previous result (blank after startup) and asking for the new input value. The whole "look-and-feel" is defined in the screen section.