DDS Physical Files to DDL Table Conversion – Why?

Have you ever thought about modernizing your database? A lot of IBM i companies have replaced the green screen with a graphical interface and converted their legacy monolithic programs into procedures and functions written in modern programming languages, but what about the database?

Data is often still stored in DDS (Data Description Specification) described physical files which grew over time and include a lot of redundancy, which requires additional programming effort. Clearly, some modernization is required. However, it is just as clear that the database cannot be converted and redesigned in a single step – everyday business must go on. Nevertheless, you have to start somewhere, and what you can do first is convert the DDS described physical files into DDL (Data Definition Language) defined tables and also convert DDS described keyed logical files into SQL indexes.

When we talk to customers or programmers about going from DDS to DDL, they quite often do not see any need. They reason that DDS described physical files and SQL defined tables can be accessed and used with SQL and native I/O in exactly the same way.

So why convert DDS described physical files into SQL tables?

DDS for defining database objects is only used on IBM i, while SQL is standard and all database manufacturers are committed to the standard. Even though there might be some SQL dialects, the differences between DDL commands for different databases are marginal.
Any programmer or database administrator coming out of school or from a different operating system or a different database first has to learn the DDS syntax, while SQL is widely taught and used.
DDS is stabilized – i.e., the last enhancement was in release V5R3M0 to allow packed and zoned fields to be defined with up to 63 digits.
All current enhancements are done in SQL, and there are even enhancements in older releases (for example Identity Columns introduced with release V5R1M0) that are not available in DDS. The following list is only a brief selection of important features that can only be used in SQL:
- Additional Datatypes: CLOB, DBCLOB, BLOB (Large Objects), DECFLOAT, XML, ROWID
- Identity Columns: automatically generates a unique key value
- Auditing Columns: for example, ROW CHANGE TIMESTAMP automatically stores the latest update timestamp

A lot of companies still use cryptic 6-to-10-character column names, because RPGIII could only handle 6 characters and RPGIV (prior to release 6.1) only 10-character column names. Even though longer names could be generated in DDS (by using the keyword ALIAS), long names were rarely defined because of the RPG restrictions.

Quite often, these column names are cryptic or consist only of a few characters followed by a running number, which makes it difficult for new programmers to determine what the field’s content is.

With SQL, long column and database object names can be created with up to 128 characters, and they can coexist with short (maximum 10-character) system names.

Working with longer descriptive names makes your life a lot easier, but if you want to add long column names, you should plan the long names based on predefined naming conventions.

Otherwise, you may end up with different names such as CustNo, CustomerNumber, Customer_No, or Customer#, all describing a customer number column.

Some differences between DDS physical files and SQL tables that you need to know about

1. When writing, updating or deleting records with native I/O, a format name that is different from the table name is needed. According to the SQL standard, a SQL table does not include a record format. But due to a non-standard enhancement in Db2 for i, it is possible to add a record format to a SQL table by specifying the RCDFMT keyword in the CREATE TABLE statement.

2. A DDS described physical file can include a key definition. SQL tables are always unkeyed. But unique keys can be converted into a primary key, and there is no way to include non-unique keys in a SQL table.

When converting from DDS to DDL, you should first check the programs with a keyed access on the physical file. Then, create a new logical file or (even better) a SQL index with the appropriate key fields and change your RPG programs to access the new index or logical file instead of the physical file. You can then convert your DDS described physical file into a SQL table, ignoring your key information …without wasting your primary key, which may be important for implementing referential integrities in the future based on identity columns.

By the way, if you have a unique key on your physical file and you do not create an additional index, your RPG program can handle the primary key constraint exactly in the same way as it handled the unique key in the physical file.

3. Minor architectural differences in database objects deliver significant advantages.

When inserting data into a DDS described physical file, the data is not checked when it is written into the file, but it is checked when it is read.

In this way, it is possible to enter invalid numeric values (e.g. *BLANKS) into numeric fields. This situation can occur by executing the CPYF CL command with the Format Option (FMTOPT) *NOCHK. In old programs, we quite often run into this situation, specifically when the record data structure within the RPG program was not properly initialized. Even if your programs work correctly today, you still may have records with invalid numeric values in your DDS described physical files. This invalid data must be detected and rectified before converting the physical file to a SQL Table.

Conversely, when inserting data into a DDL defined table, the data is checked when it is written, but not checked when it is read, because the table can only include valid data. In this way, it is not possible to write invalid numeric data into numeric fields, not even with a CPYF with Format Option *NOCHK. So a SQL defined table includes only valid data, one big advantage.

But wait—there is also another advantage to converting DDS described physical files into SQL tables:

What happens if the data is checked when it is written, but not checked when it is read compared with the opposite? How many read operations and how many write/update operations will be performed? Maybe 80% read and 20% write/update?

Even though we talk about nano seconds, checking data takes time. If 80% of the database operations are no longer checked (because data is read from a SQL table) and only 20% of the database operations (write/update) are checked, file access becomes faster—regardless of whether the access is performed with SQL or native I/O.

In short, converting DDS described physical files to SQL tables is already highly beneficial, even if you only go that far.

After the conversion, we can start to include the new features that were added in the last releases, such as Identity Columns, Auditing Columns, Temporal Tables etc. We can also start/continue to include program logic into the database, for example by adding check constraints, referential integrities and triggers. And with that, we have already laid the first cornerstone for a future redesign of the database.