
3-Tier 2/3: Change the Application-Data Relationship with Stored Procedures


The previous post discussed the evolution of the current architecture on the right of this diagram from what previously existed on the left:

[Diagram: left (closed, proprietary) - IBM 3270 Terminal → CICS/Cobol → DB2, with CRUD access; right (open, connected) - Web Browser → Internet → Firewall → Application Code → Database, with CRUD access]
3-Tier: Closed Versus Open Technology

and asked whether that evolution has been appropriate.

I do have views on that.

But I’m only going to focus on the area in which - after the best part of 40 years doing data analysis, modelling, and administration - I feel particularly qualified to have an opinion.

The Data Custodian Perspective

Before business records became computer-based, they were probably maintained on paper, often stored in filing cabinets and the like. In particular, properly maintaining those records was one of the responsibilities of the manager who oversaw the business transactions they reflected.

As business records moved onto computer storage, I always assumed that the business manager was still responsible for ensuring that proper record-keeping was occurring. And my role was as a data custodian looking after that manager’s record-keeping interests.

Acting on behalf of that manager, my responsibilities were in two areas:

  1. Data Confidentiality: ensuring the records were seen only by those authorised to see them; and

  2. Data Integrity: ensuring that only logically-consistent updates were applied to the records.

Various measures can be deployed to reduce the risks the open, connected world poses to both, such as network filtering of the nodes with access to the database, minimising the privileges given to database accounts, etc. But they are only “speed bumps.”

And, as data custodian, I had to struggle with some of these measures - such as network filtering - being opaque to me.

I even looked at mechanisms within the database for filtering, and possibly rejecting, logins themselves based on their source, etc. But the proprietary database in use didn’t provide much support for that at the time.

This started me down the path of a fundamental change in the relationship between the database and the application layers in this “new world” of everything connected.

Data Confidentiality

Arguably this is the harder of the custodial responsibilities.

I can work to limit SELECT privileges against TABLEs.

I can use VIEWs to limit visibility to subsets of the rows and columns.

Some database implementations support additional rules (e.g., PostgreSQL’s POLICY, Oracle’s “row level security”) to dynamically filter result sets further.
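As a sketch of those three measures in PostgreSQL terms - every table, view, role, and setting name here is illustrative, not taken from any real schema:

    -- Limit SELECT privileges at the TABLE level.
    GRANT SELECT ON customer TO reporting_role;

    -- Use a VIEW to limit visibility to a subset of rows and columns.
    CREATE VIEW active_customer_contact AS
        SELECT customer_id, name, email
        FROM customer
        WHERE status = 'active';
    GRANT SELECT ON active_customer_contact TO support_role;

    -- PostgreSQL POLICY: filter result sets dynamically, per session,
    -- based on a session setting the application establishes on connect.
    ALTER TABLE customer ENABLE ROW LEVEL SECURITY;
    CREATE POLICY region_only ON customer
        FOR SELECT
        USING (region = current_setting('app.region'));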

But, ultimately, there is little I can do after data leaves the database.

(Making the database, in whole or part, “write only” (i.e., with no SELECT privileges) is obviously not an option in general!)

Data Integrity

It was in thinking about this that things became particularly interesting.

Integrity involves ensuring only logically-consistent updates are applied. In SQL terms, this means managing DELETEs, INSERTs, and UPDATEs so that they are applied appropriately as a group. Specifically, I need to know that an update routine:

  1. Checks the input parameters for correctness;

  2. Validates the change requested as appropriate given the current state of data;

  3. Applies the appropriate sequence of commands (DELETE, INSERT, UPDATE) for recording a logically-consistent state-change (i.e., update) in the database; and

  4. Performs a COMMIT - or ROLLBACK - of the changes before returning control to the caller.

The way to ensure this is to be confident that an update routine has been tested and promoted into the production space via appropriate controls. Before promotion occurs, testing should certify that the routine performs a proper update as described by the above rules.
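As a sketch only - the bin table, the stock-transfer rule, and all the names here are invented for illustration - such a routine, whether it runs in the middle tier or (as argued below) in the database, might look like this as a PostgreSQL stored procedure:

    CREATE PROCEDURE transfer_stock(
        p_from_bin integer,
        p_to_bin   integer,
        p_quantity integer
    )
    LANGUAGE plpgsql
    AS $$
    BEGIN
        -- 1. Check the input parameters for correctness.
        IF p_quantity IS NULL OR p_quantity <= 0 THEN
            RAISE EXCEPTION 'quantity must be a positive number';
        END IF;

        -- 2. Validate the change as appropriate given the current state
        --    of the data.
        IF COALESCE((SELECT on_hand FROM bin
                     WHERE bin_id = p_from_bin), 0) < p_quantity THEN
            RAISE EXCEPTION 'insufficient stock in bin %', p_from_bin;
        END IF;

        -- 3. Apply the commands recording a logically-consistent
        --    state-change.
        UPDATE bin SET on_hand = on_hand - p_quantity
            WHERE bin_id = p_from_bin;
        UPDATE bin SET on_hand = on_hand + p_quantity
            WHERE bin_id = p_to_bin;

        -- 4. COMMIT before returning control to the caller. (Either
        --    RAISE above aborts the transaction, so the changes are
        --    rolled back instead - the ROLLBACK case.)
        COMMIT;
    END;
    $$;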

By way of contrast, I need to know that those privileges are not being exercised via - as examples - a command-line interface, or a spreadsheet with a database connection that allows someone to do bulk edits on a table.

Trusting Middle-Tier Update Routines … or Not

How can the database know that updates coming from the middle tier are from such a routine?

Such applications connect to the database via the same user/authentication mechanism as anyone/anything else.

But, with the current state of technology, no mechanism - such as, perhaps, a trustable codefile signature exchange - is included in that authentication. Therefore, we cannot guarantee that a routine is something we can trust.

To put it another way, there is no way for the database to authenticate that updates are coming from a properly-tested and promoted routine when it is running outside the database.

Updating via Database Stored Procedures

The obvious conclusion is: if there is no mechanism by which to trust such a routine running outside the database, then the update routine has to be in the database.

We can assume that, like anything else in the database, it must have passed through the appropriate controls to be there. And it can, therefore, be trusted.

And with all updating via database stored procedures, accounts used to connect to the database no longer need DELETE, INSERT, or UPDATE privileges.

Instead, accounts used to connect to the database will need EXECUTE privileges on the stored procedures.
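In privilege terms, that change might look like the following - the schema, role, and procedure names being, again, purely illustrative:

    -- Connecting accounts lose all direct DML privileges on the tables ...
    REVOKE DELETE, INSERT, UPDATE ON ALL TABLES IN SCHEMA app FROM app_role;

    -- ... keep whatever read access the confidentiality controls allow ...
    GRANT SELECT ON ALL TABLES IN SCHEMA app TO app_role;

    -- ... and may only EXECUTE the tested, promoted update routines.
    GRANT EXECUTE ON PROCEDURE transfer_stock(integer, integer, integer)
        TO app_role;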

This makes for the “Before” (left) and “After” (right) images:

[Diagram: “Before” (left) - Web Browser → Internet → Firewall → Application Code → Database, with CRUD privileges; “After” (right) - the same path, but the application code holds only SELECT privileges plus EXECUTE on the update procedures]
3-Tier: Update Routines Moved Into the Database

The Implications

The differences from a security/data integrity point of view are:

  1. Accounts used to connect to the database hold only SELECT and EXECUTE privileges - no DELETE, INSERT, or UPDATE;

  2. Arbitrary changes - whether from compromised application code, a command-line interface, or a spreadsheet with a database connection - are no longer possible; and

  3. Every update must pass through a routine that has been tested and promoted into the database under appropriate controls.

In Summary

The bottom line is that the “attack surface” related to data integrity within the database has been vastly reduced by moving logically-consistent update routines into the database.

This simple change obviously has profound consequences. Looking at some of those is the subject of my 3rd, and final, piece in this series.