
Denormalizing COBOL Copybooks using Python

I work with Informatica PowerCenter on a day-to-day basis and, sadly, have to deal with incoming data whose format is described by COBOL Copybooks. When a Copybook is imported, PowerCenter converts every field and group that has an OCCURS clause into a separate table. A neat feature, but a big pain when unwanted.

The only way to stop Informatica from doing so is to denormalize the COBOL Copybook, removing all the OCCURS clauses. Googling how to do this sadly turns up only a handful of posts advising me to do it manually, yikes! If you came to this page looking for an answer to a similar question, look no more: I have created a neat COBOL parser in Python that does denormalization too, Python-COBOL.

Before:

00000 * Example COBOL Copybook file                                     AAAAAAAA
00000  01  PAULUS-EXAMPLE-GROUP.                                        AAAAAAAA
00000       05  PAULUS-ANOTHER-GROUP OCCURS 0003 TIMES.                 AAAAAAAA
00000           10  PAULUS-FIELD-1 PIC X(3).                            AAAAAAAA
00000           10  PAULUS-FIELD-2 REDEFINES PAULUS-FIELD-1 PIC 9(3).   AAAAAAAA
00000           10  PAULUS-FIELD-3 OCCURS 0002 TIMES                    AAAAAAAA
00000                           PIC S9(3)V99.                           AAAAAAAA
00000       05  PAULUS-THIS-IS-ANOTHER-GROUP.                           AAAAAAAA
00000           10  PAULUS-YES PIC X(5).                                AAAAAAAA

After:

         01  EXAMPLE-GROUP.                                                     
           05  FIELD-2-1 PIC 9(3).                                              
           05  FIELD-3-1-1 PIC S9(3)V99.                                        
           05  FIELD-3-1-2 PIC S9(3)V99.                                        
           05  FIELD-2-2 PIC 9(3).                                              
           05  FIELD-3-2-1 PIC S9(3)V99.                                        
           05  FIELD-3-2-2 PIC S9(3)V99.                                        
           05  FIELD-2-3 PIC 9(3).                                              
           05  FIELD-3-3-1 PIC S9(3)V99.                                        
           05  FIELD-3-3-2 PIC S9(3)V99.                                        
           05  THIS-IS-ANOTHER-GROUP.                                           
             10  YES PIC X(5).

It doesn’t support every feature found in Copybooks, just the ones I met on my path (REDEFINES, INDEXED BY, OCCURS), but it can easily be extended.
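To give a feel for what denormalizing means, here is a minimal, self-contained sketch of the OCCURS expansion. It is not the code from Python-COBOL and does not use its API: the `Field` class and the `denormalize` function are hypothetical names made up for this illustration. It assumes the Copybook has already been parsed into a tree, ignores REDEFINES and INDEXED BY, and returns only the elementary items, so group names are dropped.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Field:
    """Hypothetical, simplified stand-in for a parsed Copybook item."""
    name: str
    pic: Optional[str] = None          # None for a group item
    occurs: int = 1                    # 1 means "no OCCURS clause"
    children: List["Field"] = field(default_factory=list)


def denormalize(item: Field, suffix: str = "") -> List[Tuple[str, Optional[str]]]:
    """Expand OCCURS by repeating items and appending -1, -2, ... to names."""
    result = []
    for i in range(1, item.occurs + 1):
        # Only items that actually repeat get an extra index in their name.
        tag = f"{suffix}-{i}" if item.occurs > 1 else suffix
        if item.children:
            # Group item: recurse, handing the accumulated index down.
            for child in item.children:
                result.extend(denormalize(child, tag))
        else:
            # Elementary item: emit it once per occurrence.
            result.append((item.name + tag, item.pic))
    return result


# The repeating group from the Before listing, with the PAULUS- prefix
# already stripped and the redefined FIELD-1 left out by hand.
group = Field("ANOTHER-GROUP", occurs=3, children=[
    Field("FIELD-2", pic="9(3)"),
    Field("FIELD-3", pic="S9(3)V99", occurs=2),
])

flat = denormalize(group)
for name, pic in flat:
    print(f"05  {name} PIC {pic}.")
```

Running it prints the nine expanded fields, FIELD-2-1 through FIELD-3-3-2, in the same order as in the After listing. Recursing over a tree keeps nested OCCURS simple: each level just appends its own index to the suffix it was handed.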

Python-COBOL on GitHub
