Regulator-Defensible Architecture
Eleven Years, Two Banks: An Empirical Post-Mortem on BCBS 239
What the industry's most important compliance statistic reveals about modern enterprise data architecture.
Answer summary
The Basel Committee's 2023 BCBS 239 progress report found that only two of 31 assessed G-SIBs were fully compliant with all Principles. That outcome is an architectural signal: over the last decade, enterprise data architecture optimized for scale, flexibility, ownership, metadata, and cloud convergence more than for continuous, regulator-defensible evidence production.
Key takeaways
- Only two of the 31 G-SIBs assessed in the Basel Committee's 2023 progress report were fully BCBS 239 compliant.
- Most architectural waves optimized for constraints other than regulator-defensible evidence production.
- Lineage and evidence remain operationally unresolved.
- Compliance increasingly depends on continuous evidence production.
- The next architectural generation must treat evidence as infrastructure.
The most important number in modern enterprise data architecture may be “2 of 31.”
TL;DR
- More than a decade after BCBS 239 was introduced, only two of the 31 G-SIBs assessed in the Basel Committee's 2023 progress report were fully compliant.
- This is not only a governance failure. It is also an architectural signal.
- Over the last decade, enterprise data architecture optimized for scale, flexibility, decentralization, metadata automation, and cloud convergence, while regulator-defensible evidence production remained underdeveloped.
- Warehouses, lakes, lakehouses, mesh, and fabric architectures all solved real problems, but each left important BCBS 239 requirements only partially addressed.
- The hardest unresolved areas remain attribute-level lineage, evidence continuity, and enforcement across jurisdictions and trust boundaries.
- Supervisory expectations are moving from periodic governance artifacts toward continuously reproducible evidence.
- The next architectural generation will likely be defined less by storage abstraction and more by evidence-producing capability.
Opening observation
A few years ago, during a banking transformation workshop, an executive asked a question that sounded simple at first:
“After all this investment, why is BCBS 239 still so difficult?”
The answers came from several directions. Some people pointed to organizational silos. Others mentioned legacy systems, acquisitions, unclear data ownership, changing regulatory expectations, and the practical difficulty of tracing risk data across many platforms. Each explanation was partly true.
What stayed with me after the meeting was that the discussion treated the problem mainly as a delivery or governance failure. That framing felt incomplete. The more uncomfortable possibility was that many of the architectures built after BCBS 239 were never primarily optimized for the kind of evidence the regulation required.
The empirical outcome makes that possibility difficult to ignore. The Basel Committee’s 2023 progress report, based on 31 G-SIBs assessed as of June 2022, stated that only two banks were fully compliant with all BCBS 239 Principles.
The exact number is less important than the pattern it reveals. After more than a decade of transformation spending, cloud migration, lineage tooling, metadata platforms, governance redesigns, and operating model changes, the industry still struggles to produce regulator-defensible risk data consistently.
That should trigger a deeper architectural discussion than it usually does.
What BCBS 239 actually asked for
BCBS 239 was never merely a reporting regulation. It asked banks to demonstrate that risk data could be aggregated accurately, traced reliably, adapted quickly, governed consistently, and distributed appropriately, including under stress conditions.
That sounds manageable in policy language. Operationally, it is extremely demanding.
To satisfy those expectations simultaneously, a bank needs lineage continuity, semantic consistency, governance accountability, operational traceability, and reproducible evidence generation across a highly fragmented enterprise estate. This estate usually includes mainframes, warehouses, risk engines, reporting marts, vendor systems, spreadsheets, data lakes, cloud platforms, regional data stores, and post-merger integration layers.
The architectural implication was deeper than many organizations initially recognized. BCBS 239 effectively required evidence-producing data architecture before the industry had a mature vocabulary for such a thing.
Many banks interpreted the problem as a data governance initiative, a reporting modernization program, or a metadata management challenge. Those interpretations were not wrong, but they were incomplete. The regulation also exposed a structural issue: most enterprise data architectures were better at producing data products than at producing defensible evidence about how those data products came into existence.
The 2013–2026 architectural timeline
One useful way to understand the current situation is to look at the architectural waves that followed BCBS 239.
The point is not that these architectures failed. Most of them solved important problems and were rational responses to the constraints of their time. The more useful observation is that each generation optimized for a different primary constraint, while regulator-defensible evidence production remained a secondary or downstream concern.
The warehouse era: consistency over flexibility
In the earlier post-crisis years, many banks still relied heavily on centralized warehouse models. These environments optimized for standardization, controlled reporting, reconciliation, and centralized governance.
Their strengths were real. Warehouses often provided relatively stable semantics, constrained transformation logic, clearer ownership structures, and more predictable reporting pipelines. In some respects, older warehouse environments were structurally closer to BCBS 239 expectations than some later architectures, because their constraints made lineage and reporting control more manageable.
The limitation was adaptability. As data volumes grew, reporting requirements changed, cloud adoption accelerated, and banks tried to support more diverse analytical and machine learning workloads, centralized warehouse models became increasingly difficult to evolve. Their governance discipline was valuable, but their operational model was too rigid for the scale and heterogeneity that followed.
The data lake era: scale becomes the priority
The data lake era shifted the optimization target toward scale and flexibility. The core idea was straightforward: store large volumes of raw or semi-structured data cheaply, separate storage from compute, and defer modeling decisions until later.
This solved several important bottlenecks. Banks could ingest more data, retain more history, support new analytical workloads, and experiment with machine learning more easily. The architecture opened possibilities that warehouse-centric environments struggled to support.
It also introduced a new failure mode. Flexibility expanded faster than governance maturity.
Across many organizations, lakes accumulated schema drift, duplicated transformations, inconsistent ownership, and lineage gaps. The issue was not that lakes were inherently ungovernable. The issue was that many implementations optimized for analytical possibility before they had the operating discipline required for evidentiary certainty.
The industry gained scale, but often weakened traceability.
The lakehouse era: analytical convergence
The lakehouse generation attempted to reconcile the flexibility of lakes with some of the reliability properties of warehouses. Open table formats, ACID transactions, improved metadata layers, unified analytics, and stronger governance integration were all meaningful improvements.
This was real architectural progress. Lakehouses made many large-scale data environments more reliable, more performant, and easier to operate than earlier lake implementations.
However, the dominant optimization target remained analytical convergence. That distinction matters under BCBS 239.
A lakehouse can process petabytes efficiently, support machine learning pipelines, expose near-real-time analytics, and provide stronger metadata management, while still struggling with attribute-level lineage, evidence continuity, lawful-basis reconstruction, historical policy reproducibility, and cross-jurisdiction enforcement.
The lakehouse generation improved data processing substantially. It did not, by itself, solve evidentiary reconstruction.
The mesh era: organizational reality arrives
Data mesh introduced an important correction to earlier architectural thinking: organizational structure is part of the data architecture.
Domain ownership, federated governance, and data-as-a-product thinking addressed real problems in large enterprises. Central governance and platform teams were already becoming bottlenecks, and many organizations needed clearer accountability closer to the domains that understood the data.
Mesh helped with ownership. It made product boundaries more explicit and encouraged teams to treat data consumers more seriously.
The difficulty is that BCBS 239 creates cross-domain evidence requirements that cannot be delegated entirely to local ownership models. In highly regulated multinational environments, sovereignty, residency, evidence continuity, and supervisory traceability become concerns that cut across domains.
Mesh improved organizational alignment. It did not fully solve evidentiary topology.
The fabric era: metadata and orchestration
Data fabric architectures recognized another important reality: enterprise data estates would remain heterogeneous. Rather than assuming full consolidation, fabric approaches emphasized active metadata, automated lineage, semantic discovery, governance automation, and cross-system orchestration.
This improved visibility. In many organizations, metadata-driven governance and orchestration made fragmented environments easier to understand and operate.
The limitation is that metadata visibility does not automatically create regulator-defensible evidence. Automated lineage can help reconstruct relationships between systems, but it does not necessarily provide signed, immutable, policy-bound, time-attested evidence that can be verified later.
Fabric architectures strengthened discovery and orchestration. They did not fully resolve provability.
What these waves optimized for instead
Looking back, the architectural sequence becomes easier to understand.
| Architectural wave | Primary optimization |
|---|---|
| Warehouse | Consistency and reporting control |
| Lake | Scale and flexibility |
| Lakehouse | Analytical convergence |
| Mesh | Organizational ownership |
| Fabric | Metadata visibility and orchestration |
Each optimization was valuable. The missing primary optimization target was regulator-defensible evidence production.
That distinction matters because BCBS 239 increasingly behaves like an evidence architecture problem rather than a storage architecture problem.
The three requirements that still break most architectures
Three BCBS 239-related requirements continue to expose structural weaknesses in modern data estates.
1. Attribute-level lineage
Table-level lineage is already difficult in large estates. Attribute-level lineage across heterogeneous compute environments is substantially harder.
A single risk attribute may pass through source systems, ETL jobs, Spark transformations, warehouse SQL, data quality rules, reporting layers, spreadsheets, and manual adjustments before it appears in a risk report. Each platform may produce partial lineage in a different format. Some steps may be inferred rather than captured. Some historical context may no longer exist when the regulator asks for it.
Most organizations still rely on a mixture of tooling, manual enrichment, and expert reconstruction. That may be sufficient for internal understanding, but supervisory expectations increasingly require reproducibility.
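To make the distinction between captured and reconstructed lineage concrete, the following is a minimal sketch, not a reference implementation. It assumes a hypothetical edge model (`LineageEdge`) in which every attribute-to-attribute step records whether it was emitted by the platform at run time or inferred afterwards. All attribute and step names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageEdge:
    source: str     # upstream attribute
    target: str     # downstream attribute
    step: str       # transformation that produced the target
    captured: bool  # True if emitted at run time, False if inferred or manual


# Hypothetical chain for one risk attribute; every name here is illustrative.
EDGES = [
    LineageEdge("core_banking.exposure_amt", "staging.exposure_amt", "etl_load", True),
    LineageEdge("staging.exposure_amt", "warehouse.exposure_eur", "spark_fx_convert", True),
    LineageEdge("ref.fx_rate", "warehouse.exposure_eur", "spark_fx_convert", True),
    LineageEdge("warehouse.exposure_eur", "adjustments.exposure_eur", "manual_spreadsheet", False),
    LineageEdge("adjustments.exposure_eur", "risk_report.total_exposure", "report_sql", True),
]


def upstream_edges(attribute, edges):
    """Return every edge reachable upstream of `attribute` (depth-first walk)."""
    found, stack, seen = [], [attribute], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for edge in edges:
            if edge.target == current:
                found.append(edge)
                stack.append(edge.source)
    return found


def reconstruction_gaps(attribute, edges):
    """Steps in the upstream chain that were not captured at run time."""
    return sorted({e.step for e in upstream_edges(attribute, edges) if not e.captured})


gaps = reconstruction_gaps("risk_report.total_exposure", EDGES)
print(gaps)
```

In this toy chain the only gap is the spreadsheet adjustment. In a real estate, the same question would have to be answered across lineage emitted in different formats by different platforms, which is where most of the difficulty lies.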
2. Signed, verifiable evidence
Operational logs and regulator-defensible evidence are not the same thing.
Logs are usually optimized for debugging, observability, and incident analysis. They may be mutable in practice, retained inconsistently, tied to source-system semantics, or difficult to interpret outside the operational context that produced them.
Regulatory evidence increasingly requires stronger properties: immutability, time attestation, policy traceability, identity context, data-version reference, and long-term verifiability.
Most enterprise architectures still treat evidence as documentation assembled after the fact. BCBS 239 pressure suggests that evidence needs to become infrastructure.
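One way to picture evidence as infrastructure is a hash-chained, signed record written at the moment an operation runs. The sketch below is an illustration under stated assumptions. It uses an HMAC with a demo key as a stand-in for a real signature scheme, and a plain timestamp string as a stand-in for trusted time attestation. The field names (`attested_at`, `policy_id`, `actor`, `data_version`) are hypothetical and simply mirror the properties listed above.

```python
import hashlib
import hmac
import json

# Assumption: a demo key. A real system would likely use asymmetric signatures
# with keys held in an HSM, plus a trusted timestamping service.
SIGNING_KEY = b"demo-key-not-for-production"
GENESIS = "0" * 64


def _canonical(body):
    """Stable byte representation so hashes and signatures are reproducible."""
    return json.dumps(body, sort_keys=True).encode()


def make_record(prev_hash, payload):
    """Create a record that is linked to its predecessor and signed."""
    body = {"prev_hash": prev_hash, **payload}
    data = _canonical(body)
    return {
        **body,
        "record_hash": hashlib.sha256(data).hexdigest(),
        "signature": hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest(),
    }


def verify_chain(records):
    """Check linkage, content hash, and signature for every record in order."""
    prev_hash = GENESIS
    for record in records:
        body = {k: v for k, v in record.items() if k not in ("record_hash", "signature")}
        data = _canonical(body)
        expected_sig = hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest()
        if body["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(data).hexdigest() != record["record_hash"]:
            return False
        if not hmac.compare_digest(expected_sig, record["signature"]):
            return False
        prev_hash = record["record_hash"]
    return True


first = make_record(GENESIS, {
    "event": "aggregation_run", "attested_at": "2024-03-31T21:00:00Z",
    "policy_id": "rdarr-policy-v7", "actor": "svc-risk-agg", "data_version": "snapshot-0001",
})
second = make_record(first["record_hash"], {
    "event": "report_publish", "attested_at": "2024-03-31T23:00:00Z",
    "policy_id": "rdarr-policy-v7", "actor": "svc-reporting", "data_version": "snapshot-0001",
})

chain_ok = verify_chain([first, second])
# Editing a field after the fact breaks the stored hash and signature.
tampered_ok = verify_chain([dict(first, data_version="snapshot-9999"), second])
print(chain_ok, tampered_ok)
```

The intact chain verifies and the edited one does not. An ordinary log line offers no comparable check, which is the practical difference between observability output and evidence.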
3. Multi-jurisdiction enforcement
Modern Tier-1 banks no longer operate inside a single legal or operational topology.
They operate across sovereign cloud requirements, residency obligations, AI governance restrictions, cross-border access constraints, and jurisdiction-specific supervisory expectations. A lineage chain may cross regions, vendors, control frameworks, and legal entities.
Many architectures were originally designed around relatively unified trust zones. That assumption is increasingly weak.
Governance now needs to become topology-aware, residency-aware, and jurisdiction-aware. This makes BCBS 239 harder, because the evidence chain must remain coherent even when the data estate itself is legally and operationally segmented.
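A topology-aware check can be sketched as a walk over a lineage chain in which every hop carries its jurisdiction and legal entity. The transfer policy below is entirely hypothetical and is not a statement about any actual legal regime. The dataset and entity names are also invented. The sketch only shows the shape of the problem, under those assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    dataset: str
    jurisdiction: str
    legal_entity: str


# Hypothetical policy: (from, to) jurisdiction pairs that are permitted.
ALLOWED_TRANSFERS = {("CH", "CH"), ("EU", "EU"), ("US", "US"), ("CH", "EU")}

# Hypothetical lineage chain for one risk dataset.
CHAIN = [
    Hop("ch_booking_ledger", "CH", "Bank AG"),
    Hop("eu_risk_aggregate", "EU", "Bank Europe SE"),
    Hop("us_analytics_copy", "US", "Bank Americas LLC"),
]


def boundary_crossings(chain):
    """Consecutive hops that cross a jurisdiction or legal-entity boundary."""
    return [
        (a.dataset, b.dataset)
        for a, b in zip(chain, chain[1:])
        if a.jurisdiction != b.jurisdiction or a.legal_entity != b.legal_entity
    ]


def transfer_violations(chain, allowed):
    """Crossings whose jurisdiction pair is not in the permitted set."""
    return [
        (a.dataset, b.dataset)
        for a, b in zip(chain, chain[1:])
        if (a.jurisdiction, b.jurisdiction) not in allowed
    ]


crossings = boundary_crossings(CHAIN)
violations = transfer_violations(CHAIN, ALLOWED_TRANSFERS)
print(crossings)
print(violations)
```

Both hops cross a boundary, but only the EU to US hop falls outside the assumed policy. The check is only possible because each hop records where it ran and under which entity, which is exactly the context that many lineage chains do not carry.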
What this implies about the next architectural pattern
The industry often asks what the next storage abstraction will be. That question is still useful, but it is no longer sufficient.
The more important question may be how an enterprise continuously produces defensible evidence while remaining portable and AI-capable.
That shifts the optimization target from storage, compute, and orchestration toward evidence, portability, and governed consumption.
The compliance project then begins to change shape. Instead of periodic reconstruction, manual evidence assembly, and retrospective governance review, the architecture starts producing evidence continuously as a natural output of operation.
In practical terms, the compliance project becomes a query over an evidence-producing architecture.
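As a rough illustration of what "a query over an evidence-producing architecture" could look like, the sketch below stores evidence records in an in-memory SQLite table and answers a supervisory-style question with one query. The schema, column names, and rows are assumptions made for this example and do not describe any particular product or the SDOP pattern itself.

```python
import sqlite3

# Hypothetical evidence rows:
# (report_attribute, step, policy_id, data_version, attested_at, signed)
ROWS = [
    ("total_exposure", "etl_load", "policy-v7", "snap-0001", "2024-03-31T20:00:00Z", 1),
    ("total_exposure", "fx_convert", "policy-v7", "snap-0001", "2024-03-31T21:00:00Z", 1),
    ("total_exposure", "manual_adjustment", "policy-v7", "snap-0001", "2024-03-31T22:30:00Z", 0),
    ("total_exposure", "report_sql", "policy-v8", "snap-0002", "2024-06-30T21:00:00Z", 1),
]


def build_store(rows):
    """Create an in-memory evidence table and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE evidence (report_attribute TEXT, step TEXT, policy_id TEXT, "
        "data_version TEXT, attested_at TEXT, signed INTEGER)"
    )
    conn.executemany("INSERT INTO evidence VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def evidence_for(conn, attribute, as_of):
    """Evidence for one attribute up to a point in time, oldest first.

    ISO 8601 timestamps in one fixed format compare correctly as strings.
    """
    cursor = conn.execute(
        "SELECT step, policy_id, data_version, signed FROM evidence "
        "WHERE report_attribute = ? AND attested_at <= ? ORDER BY attested_at",
        (attribute, as_of),
    )
    return cursor.fetchall()


store = build_store(ROWS)
march_evidence = evidence_for(store, "total_exposure", "2024-03-31T23:59:59Z")
unsigned_steps = [step for step, _policy, _version, signed in march_evidence if not signed]
print(march_evidence)
print(unsigned_steps)
```

The query returns the three steps recorded by the March cut-off and isolates the one without a signature. The point of the sketch is that the answer is retrieved rather than reconstructed, which presumes the records were written when the steps ran.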
Once framed this way, many current architectural assumptions begin to look incomplete. This framing also explains why the Sovereign Data Operating Plane (SDOP) pattern places the Evidence Plane, Portability Layer, and Agentic Plane at the center rather than treating them as secondary governance capabilities.
What I would do Monday morning
If I were leading a Tier-1 BCBS 239 remediation effort today, I would focus less on dashboards and more on evidence flows.
- Identify where lineage still depends on human reconstruction.
- Separate operational observability from regulatory evidence explicitly.
- Map which datasets and attributes cross residency or sovereign boundaries.
- Test whether portability exercises preserve lineage and policy context.
- Treat AI consumption as part of the risk-data evidence landscape rather than a separate future concern.
Most organizations already have many of the necessary tools. What they usually lack is an architecture optimized for the actual supervisory requirement.
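The portability item in the list above can be turned into a simple test. The sketch below is one possible approach, not an established method: it fingerprints only the lineage and policy context of a dataset and compares that fingerprint before and after a migration. The dataset descriptors and field names are hypothetical.

```python
import hashlib
import json


def context_fingerprint(dataset):
    """Hash the lineage and policy context, ignoring platform-specific fields."""
    context = {
        "lineage": sorted(dataset["lineage"]),
        "policy_id": dataset["policy_id"],
        "residency": dataset["residency"],
    }
    return hashlib.sha256(json.dumps(context, sort_keys=True).encode()).hexdigest()


def dropped_lineage(before, after):
    """Lineage edges present before the move but missing afterwards."""
    return sorted(set(before["lineage"]) - set(after["lineage"]))


# Hypothetical descriptor of a dataset on its original platform.
source = {
    "platform": "on_prem_warehouse",
    "lineage": [
        "core_banking.exposure_amt->staging.exposure_amt",
        "staging.exposure_amt->warehouse.exposure_eur",
    ],
    "policy_id": "rdarr-policy-v7",
    "residency": "CH",
}

# Two simulated migration outcomes: one keeps the context, one loses an edge.
clean_copy = dict(source, platform="cloud_lakehouse")
lossy_copy = dict(clean_copy, lineage=source["lineage"][:1])

clean_preserved = context_fingerprint(source) == context_fingerprint(clean_copy)
lossy_preserved = context_fingerprint(source) == context_fingerprint(lossy_copy)
dropped = dropped_lineage(source, lossy_copy)
print(clean_preserved, lossy_preserved, dropped)
```

A platform change alone leaves the fingerprint unchanged, while a migration that silently drops a lineage edge does not. Running a check of this kind during portability exercises makes context loss visible before a supervisor asks about it.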
The tradeoff most organizations underestimate
Architectures optimized for evidence production are operationally heavier.
Continuous evidence generation introduces cost, latency, governance friction, retention obligations, and platform engineering complexity. This is unavoidable if the evidence is expected to be durable, verifiable, and useful under supervisory scrutiny.
The relevant comparison, however, is not between a heavy architecture and a simple one. Large banks are already complex. The comparison is between explicit evidence infrastructure and endless remediation cycles, fragmented governance, inconsistent lineage reconstruction, and growing supervisory distrust.
The industry has spent more than a decade trying lighter-weight approaches. The results are visible.
Closing reflection
The most important signal in enterprise data architecture may not be the rise of AI, the spread of lakehouses, or the growth of sovereign cloud. It may be the persistence of BCBS 239 remediation after more than a decade of investment.
That persistence should force a reassessment of what enterprise data architecture is optimizing for.
If the next decade repeats the last, banks will continue accumulating more platforms, metadata tooling, governance workflows, and AI infrastructure while still reconstructing compliance manually at quarter-end.
At some point, the architecture itself has to change.
The reason is not that regulators require more documentation. The deeper reason is that the operational assumptions underneath the previous generation were not designed for evidentiary continuity.
Part of the SDOP series on regulator-defensible architecture, sovereign data systems, enterprise AI governance, and continuous modernization.
References
- BIS: Principles for effective risk data aggregation and risk reporting — Original BCBS 239 principles.
- BIS: Progress in adopting the BCBS 239 principles — Basel Committee 2023 progress report assessing 31 G-SIBs.
- ECB: Guide on effective risk data aggregation and risk reporting — ECB supervisory expectations for RDARR.
- FINMA Circular 2023/1: Operational risks and resilience - banks — Swiss operational risk and resilience circular.
- EUR-Lex: Regulation (EU) 2022/2554 on digital operational resilience — DORA regulation text.
- EUR-Lex: Regulation (EU) 2024/1689 Artificial Intelligence Act — EU AI Act official journal text.
- OpenLineage documentation — Open lineage framework referenced for lineage architecture.
- Apache Iceberg table specification — Open table format specification referenced in modern architecture discussion.
Author
Géza Kuti is a senior Data and AI executive based in Bülach (ZH), Switzerland, focused on data strategy, enterprise architecture, AI governance, hybrid cloud, and regulated delivery.