The regulatory requirements that emerged from the firestorm that triggered the Great Recession put banks under near-relentless pressure to generate and report increasingly detailed layers of data. Two developments show that granular data requirements are here to stay: the Financial Accounting Standards Board’s new Current Expected Credit Loss (CECL) standard, which extends necessary loss calculations over the lifetime of loans, and signals from Federal Reserve chairman Powell that the stress testing and resolution planning components of Dodd-Frank will be maintained. While the current administration’s focus on deregulation might bring some relief, it is unlikely to turn the tide, let alone relieve the data burdens bankers already face.
The most challenging data management burden is rooted in duplication. The evolution of regulations has left banks with bespoke databases across five core functions: credit, treasury, profitability analytics, financial reporting and regulatory reporting, with the same data inevitably stored and processed in multiple places. This hodgepodge of bespoke marts leads both to duplicated data and processes and to the risk of inconsistencies -- which tend to rear their heads at inopportune moments (e.g. when consistent data needs to be presented to regulators). For example, credit extracts core loan, customer and credit data; treasury pulls core cash flow data from all instruments; profitability departments pull the same instrument data as credit and treasury and add ledger information for allocations; financial reporting pulls ledgers and some subledgers for GAAP reporting; and regulatory reporting pulls the same data yet again to submit reports to regulators per prescribed templates.
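To make the inconsistency risk concrete, here is a minimal sketch of a reconciliation check between two departmental extracts of the same loans. The mart contents, loan identifiers, field names and the `reconcile` helper are all hypothetical and chosen only for illustration; a real check would run against the bank's own schemas.

```python
# Hypothetical extracts: the same loans as held in two departmental marts.
credit_mart = {
    "LN-001": {"balance": 250_000.00, "rate": 0.045},
    "LN-002": {"balance": 120_000.00, "rate": 0.052},
    "LN-003": {"balance": 75_000.00, "rate": 0.061},
}
treasury_mart = {
    "LN-001": {"balance": 250_000.00, "rate": 0.045},
    "LN-002": {"balance": 118_500.00, "rate": 0.052},  # stale balance
    # LN-003 was never loaded into this mart
}


def reconcile(left, right, fields=("balance", "rate"), tolerance=1e-9):
    """Return mismatches as (loan_id, field, left_value, right_value) tuples.

    A loan present in only one mart is reported with the field "<missing>"
    and two booleans saying which side holds it.
    """
    issues = []
    for loan_id in sorted(set(left) | set(right)):
        a, b = left.get(loan_id), right.get(loan_id)
        if a is None or b is None:
            issues.append((loan_id, "<missing>", a is not None, b is not None))
            continue
        for field in fields:
            if abs(a[field] - b[field]) > tolerance:
                issues.append((loan_id, field, a[field], b[field]))
    return issues


mismatches = reconcile(credit_mart, treasury_mart)
for issue in mismatches:
    print(issue)
```

Every pair of marts that holds the same instruments needs a check of this kind, which is part of why duplicated storage is costly to keep consistent.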
Relearning the data alphabet
Data storage has evolved from simple flat files to databases to marts to warehouses, and now to ‘lakes’ housed in ‘big data’ environments, where all kinds of structured and unstructured data are tended by data scientists. Proprietary calculating and reporting solutions designed for different requirements complicate the data requirements picture even further.
Just as the complexity of housing data has evolved, so have the data management tools. Data management is typically thought of in three stages: Extract, Transform and Load (ETL). Given the multiple levels of staging tables between data sources and storage areas, however, most real-world data management processes consist of much more than three sequential steps (e.g. ELETL or ELELTL) -- and this is just to get the detailed data. Additional steps -- Calculation (C), Aggregation (A) and Presentation (P) -- are needed throughout to meet today’s analytical and reporting requirements. When considered end to end, multiple occurrences of E, T, L, C, A and P are embedded in today’s data management processes -- many of which are manual or semi-manual and still performed by senior management.
Banks should seize the moment
Fortunately for forward-thinking banks, the window before CECL kicks in provides a golden opportunity to transform the disparate data marts and processes underpinning key departments into a more integrated, future-proof approach benefiting not only compliance but also profitability and competitiveness. The fact that CECL will effectively force this integration makes it even more important to begin the transformation now. The question is, how?
Some advocate ELT as the solution, but this exacerbates duplication in rules and loses the value of clean, normalized, persistent data. Others advocate throwing everything into a new data lake and letting the data scientists fish out what’s needed. Still others, sadly, continue to build or buy point systems that involve separate databases and management processes, adding yet another island to those that already exist. But reality shows there is no ‘one size fits all’ approach, no single data ocean or program. Tactical solutions to problems that are both immediate and strategic are not transformational; neither are old processes under new names.
The path to effectively transforming data management is to combine tried and true processes and solutions with the selective deployment of new technologies that remove undesirable duplication of both rules and storage. A map of such an approach is below: ETL/ELT is applied to source data that is gathered into staging tables or a data lake. Required and/or relevant data from the staging tables or lake is then transferred to a permanent, defined data mart where analysis, calculation and presentation can be conducted before the results are transmitted to various business functions and external recipients.
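The flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference design: the source rows, field names and function names are hypothetical, and the "expected loss" figure (balance multiplied by a probability of default) is a placeholder calculation, not a CECL methodology.

```python
from collections import defaultdict

# Hypothetical source rows; real ones would come from core banking systems.
SOURCE_ROWS = [
    {"loan_id": "LN-001", "segment": "commercial", "balance": "250000", "pd": "0.02"},
    {"loan_id": "LN-002", "segment": "retail", "balance": "120000", "pd": "0.05"},
    {"loan_id": "LN-003", "segment": "retail", "balance": "", "pd": "0.04"},  # incomplete
]


def extract(rows):
    """E: copy raw rows into a staging area without changing them."""
    return [dict(row) for row in rows]


def transform(staged):
    """T: apply typing and validation rules once, in one place."""
    clean = []
    for row in staged:
        if not row["balance"]:
            continue  # reject rows that fail validation
        clean.append({
            "loan_id": row["loan_id"],
            "segment": row["segment"],
            "balance": float(row["balance"]),
            "pd": float(row["pd"]),
        })
    return clean


def load(clean):
    """L: persist validated rows in a single defined mart, keyed by loan."""
    return {row["loan_id"]: row for row in clean}


def calculate(mart):
    """C: add a placeholder expected-loss figure (assumption: balance * pd)."""
    for row in mart.values():
        row["expected_loss"] = row["balance"] * row["pd"]
    return mart


def aggregate(mart, key="segment"):
    """A: total the calculated figure by a reporting dimension."""
    totals = defaultdict(float)
    for row in mart.values():
        totals[row[key]] += row["expected_loss"]
    return dict(totals)


def present(totals):
    """P: format the aggregates for a business or regulatory recipient."""
    return [f"{name}: {amount:,.2f}" for name, amount in sorted(totals.items())]


mart = calculate(load(transform(extract(SOURCE_ROWS))))
by_segment = aggregate(mart)
report = present(by_segment)
print("\n".join(report))
```

The point of the arrangement is that validation rules live only in `transform` and the loan data only in `mart`, so credit, treasury, finance and regulatory reporting can each aggregate and present from the same source rather than maintaining their own copies.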