Bitvise
2026-05-20

Monzo's Governed Data Mesh: How a Neobank Slashed Warehouse Costs by 40% with 12,000 dbt Models

Monzo implemented a governed data mesh using dbt across 100 teams and 12,000 models, cutting warehouse costs by 40% and boosting data delivery speed by 25%.

Monzo, the UK-based neobank, faced a data challenge common to fast-growing tech companies: scaling analytics across more than 100 teams. Their solution? A governed data mesh built on the dbt framework, resulting in a 40% reduction in warehouse costs and a 25% improvement in data delivery speed. Below, we answer key questions about this approach, from its core principles to practical outcomes.

What is a governed data mesh, and why did Monzo choose this model?

A governed data mesh is a decentralized architectural paradigm that treats data as a product, owned by individual domain teams. Unlike a traditional centralized data warehouse, where a single team controls all transformations and access, a mesh distributes ownership while enforcing global standards. Monzo adopted this model to overcome bottlenecks that arose as their engineering and product teams grew. Previously, a central data team struggled to keep pace with the diverse analytical needs of over 100 squads. By shifting to a mesh, Monzo allowed each domain (e.g., payments, fraud, marketing) to define and maintain its own data products using dbt. At the same time, they implemented a governance layer to ensure consistency, discoverability, and compliance. This blend of autonomy and control, which Monzo calls a "meshy" approach, enabled them to scale data operations without sacrificing reliability or letting costs grow linearly with the number of teams.

Source: www.infoq.com

How did Monzo implement the data mesh with more than 12,000 dbt models?

Monzo's implementation centered on the dbt (data build tool) ecosystem, which allowed teams to define transformations as modular, version-controlled SQL models. With over 12,000 such models, the key was establishing clear conventions for naming, testing, and documentation. Each team owned a set of models relevant to their domain, stored in separate repositories or folders within a monorepo. Monzo used dbt's built-in schema tests and custom assertions to maintain data quality. They also introduced a data contract approach: as teams published data models, they specified schemas, freshness SLAs, and ownership. To manage discoverability, Monzo built a custom data catalog that indexed all models and their dependencies. This allowed any team to find and reuse data products, reducing duplication. The entire pipeline was orchestrated using a combination of Airflow and dbt's own execution framework, with automated CI/CD pipelines to test and deploy changes across the mesh.
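To make the data contract idea concrete, here is a minimal Python sketch of what checking a published model against its contract might look like. It is an illustration under stated assumptions, not Monzo's implementation: the `DataContract` class, the `contract_violations` function, and the model and column names are all hypothetical, and the contract covers only the three elements the article mentions (schema, freshness SLA, and ownership).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract a team publishes alongside a model."""
    model: str
    owner: str
    columns: dict[str, str]      # column name -> declared type
    freshness_sla: timedelta     # maximum allowed age of the latest load


def contract_violations(contract, actual_columns, last_loaded_at, now=None):
    """Return readable violations; an empty list means the contract holds."""
    now = now or datetime.now(timezone.utc)
    problems = []
    for name, declared in contract.columns.items():
        actual = actual_columns.get(name)
        if actual is None:
            problems.append(f"missing column: {name}")
        elif actual != declared:
            problems.append(f"type drift on {name}: {declared} -> {actual}")
    if now - last_loaded_at > contract.freshness_sla:
        problems.append("freshness SLA breached")
    return problems


# Illustrative data only: a payments model whose amount column changed type
# and whose latest load is eight hours old against a six-hour SLA.
contract = DataContract(
    model="payments_daily",
    owner="payments",
    columns={"payment_id": "STRING", "amount_pence": "INT64"},
    freshness_sla=timedelta(hours=6),
)
now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
violations = contract_violations(
    contract,
    actual_columns={"payment_id": "STRING", "amount_pence": "FLOAT64"},
    last_loaded_at=now - timedelta(hours=8),
    now=now,
)
print(violations)
```

A check like this could run in CI before a model is published, so that consumers learn about schema drift or stale data from the producer's pipeline rather than from a broken dashboard.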

What were the key technical components of Monzo's meshy architecture?

Monzo's architecture rested on three pillars: dbt for transformation logic, BigQuery as the underlying warehouse, and a lightweight governance layer. Each domain team maintained its own dbt project, which could include multiple models, tests, and documentation. The governance layer enforced cross-cutting policies through a shared manifest that aggregated metadata from all projects. For example, Monzo required every model to have a description, an owner tag, and at least one test. They also used dbt's exposures feature to track how models were used in downstream dashboards and reports. To prevent runaway costs, they implemented a budget-based system: each team was allocated a compute budget for BigQuery, with real-time monitoring through custom dashboards. If a team's queries exceeded their budget, they would receive alerts and could drill down into which models or users were consuming the most resources. This combination of autonomy and guardrails was critical to scaling to 100+ teams.
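The rule that every model needs a description, an owner, and at least one test lends itself to an automated check over aggregated metadata. The Python sketch below shows one way such a check could work. It assumes a simplified dictionary loosely modeled on the `nodes` section of dbt's `manifest.json`; storing the owner under `meta.owner` is a hypothetical convention chosen for illustration, and the function name and sample models are invented.

```python
def policy_violations(manifest):
    """Check each model for a description, an owner, and at least one test."""
    nodes = manifest["nodes"]

    # Collect every model that at least one test node depends on.
    tested = set()
    for node in nodes.values():
        if node.get("resource_type") == "test":
            tested.update(node.get("depends_on", {}).get("nodes", []))

    violations = {}
    for unique_id, node in nodes.items():
        if node.get("resource_type") != "model":
            continue
        problems = []
        if not node.get("description", "").strip():
            problems.append("missing description")
        if not node.get("meta", {}).get("owner"):  # assumed location of the owner tag
            problems.append("missing owner")
        if unique_id not in tested:
            problems.append("no tests")
        if problems:
            violations[unique_id] = problems
    return violations


# Illustrative manifest: one compliant model, one that breaks all three rules.
manifest = {
    "nodes": {
        "model.payments.daily_totals": {
            "resource_type": "model",
            "description": "Daily payment totals per customer.",
            "meta": {"owner": "payments"},
        },
        "model.fraud.risk_scores": {
            "resource_type": "model",
            "description": "",
            "meta": {},
        },
        "test.payments.unique_daily_totals_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.payments.daily_totals"]},
        },
    }
}
result = policy_violations(manifest)
print(result)
```

Running a check like this on every pull request turns the policy into a guardrail rather than a review comment: a non-empty result can fail the build.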

How did Monzo ensure governance and data quality across 100 teams?

Governance at Monzo was not a top-down mandate but a set of shared practices codified in CI/CD pipelines and a central data catalog. Every time a team modified a dbt model, the pipeline ran a suite of automated tests: schema checks, uniqueness tests, referential integrity, and freshness thresholds. If any test failed, the deployment was blocked. Monzo also maintained a data quality scorecard that tracked the percentage of passing tests per domain, making it visible to leadership. Beyond testing, they required all data products to be registered in the catalog with metadata such as owner, purpose, and refresh schedule. This catalog served as the single source of truth, and teams could see which products were certified (i.e., had passed a review) versus experimental. Additionally, Monzo conducted periodic cross-team reviews of the most critical data products, ensuring alignment with company-wide definitions (e.g., what constitutes an active user). This balance of automated checks and social processes was key to maintaining trust in the mesh.
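The deployment gate and the per-domain scorecard described above can be sketched in a few lines of Python. This is a simplified illustration: the record format (`domain`, `test`, `status`) and the function names are assumptions made for this example, not the output format of dbt or of Monzo's tooling.

```python
from collections import defaultdict


def quality_scorecard(test_results):
    """Return {domain: percentage of passing tests}, rounded to one decimal."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for result in test_results:
        totals[result["domain"]] += 1
        if result["status"] == "pass":
            passed[result["domain"]] += 1
    return {d: round(100 * passed[d] / totals[d], 1) for d in totals}


def deployment_allowed(test_results):
    """CI gate: block the deployment if any test did not pass."""
    return all(r["status"] == "pass" for r in test_results)


# Illustrative results for two domains.
results = [
    {"domain": "payments", "test": "unique_payment_id", "status": "pass"},
    {"domain": "payments", "test": "not_null_amount", "status": "pass"},
    {"domain": "fraud", "test": "relationships_customer", "status": "fail"},
    {"domain": "fraud", "test": "freshness_events", "status": "pass"},
]
scorecard = quality_scorecard(results)
allowed = deployment_allowed(results)
print(scorecard, allowed)
```

The two functions serve different audiences: the gate gives the team making the change an immediate yes or no, while the scorecard gives leadership a trend to watch across domains.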


What were the actual cost and speed benefits Monzo experienced?

Monzo reported two headline metrics after the transition: a 40% reduction in warehouse costs and a 25% improvement in data delivery speed. The cost savings came from eliminating redundant storage and compute. In the old centralized model, the data team often created generic tables that were used by few teams; in the mesh, each domain optimized its own data products for its specific use cases, reducing total storage. Additionally, teams became more mindful of their BigQuery consumption because they owned their budgets. The speed improvement was measured as the time from a business question being raised to a model being available in production. With decentralized ownership, teams no longer had to submit tickets to a central team; they could build and deploy transformations within hours or days instead of weeks. Monzo also noted that the mesh reduced the cycle time for making changes to existing models, as teams had full control over their own pipelines.

What challenges did Monzo face during the migration to a data mesh?

Migrating from a central data warehouse to a governed data mesh was not without hurdles. One of the biggest challenges was cultural change: teams had to take ownership of data products, which required them to learn dbt, define tests, and maintain SLAs. Some initial resistance came from teams that preferred to consume pre-built tables rather than build their own. Monzo addressed this by providing training and a central data team that acted as consultants and quality gatekeepers rather than sole producers. Another challenge was dependency management. With over 12,000 models, understanding the network of upstream and downstream dependencies was complex. Monzo invested in tooling to visualize these dependencies and detect breaking changes before deployment. They also faced cost allocation issues: since BigQuery charges are driven by the queries that run, attributing costs to the correct team required precise tagging of those queries, and early tagging was not always accurate. Over time, they improved their cost attribution models by integrating query logs with team identifiers from their dbt projects.
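The cost attribution step, joining query logs to team identifiers and comparing spend against budgets, can be sketched as follows. This is an illustration under several assumptions: the log entries are plain dictionaries with a `labels` field and a `bytes_billed` field, cost is estimated from bytes billed, and the price of 5.0 per TiB is a placeholder rather than a real BigQuery rate. Queries without a team label are grouped as `unattributed`, which makes the tagging gaps mentioned above visible instead of hiding them.

```python
from collections import defaultdict

BYTES_PER_TIB = 2 ** 40


def cost_by_team(query_log, price_per_tib):
    """Sum estimated cost per team; untagged queries go to 'unattributed'."""
    costs = defaultdict(float)
    for entry in query_log:
        team = entry.get("labels", {}).get("team", "unattributed")
        costs[team] += entry["bytes_billed"] / BYTES_PER_TIB * price_per_tib
    return dict(costs)


def over_budget(costs, budgets):
    """Return the teams whose spend exceeds their budget (default budget: 0)."""
    return sorted(t for t, spent in costs.items() if spent > budgets.get(t, 0.0))


# Illustrative log: two tagged queries and one with no team label.
query_log = [
    {"labels": {"team": "payments"}, "bytes_billed": 4 * BYTES_PER_TIB},
    {"labels": {"team": "fraud"}, "bytes_billed": 1 * BYTES_PER_TIB},
    {"labels": {}, "bytes_billed": 2 * BYTES_PER_TIB},
]
costs = cost_by_team(query_log, price_per_tib=5.0)  # placeholder price
alerts = over_budget(costs, budgets={"payments": 15.0, "fraud": 10.0})
print(costs, alerts)
```

Treating untagged spend as its own line item with a zero budget means it always raises an alert, which gives teams a reason to fix their tagging.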

What lessons can other organizations learn from Monzo's data mesh journey?

Monzo's experience offers several takeaways for companies considering a data mesh. First, start with strong conventions and automated governance: Monzo invested heavily in testing and metadata standards early, which prevented chaos as the mesh grew. Second, empower domain teams with the right tools (like dbt) and clear ownership boundaries, but also provide a support system, such as a centralized data engineering team that defines the how while teams decide the what. Third, measure success beyond just cost or speed: Monzo tracked developer satisfaction and the number of data products published per quarter. Fourth, be prepared for an iterative migration. Monzo did not move all teams at once but started with a few pilot domains, refined the process, and then scaled. Finally, invest in observability and cost monitoring. Without real-time dashboards for both data quality and cost, teams would have lacked the feedback loops needed to sustain a healthy mesh. In essence, a governed data mesh is not a one-time project but an ongoing evolution of culture, tools, and practices.