Analytics Engineering
12 min read

dbt in Production: What Nobody Tells You

Tags: dbt, SQL, Data Modeling, Best Practices

dbt has transformed how teams write and maintain SQL transformations. But the gap between a working dbt project and a production-ready one is wider than most teams realize. I've reviewed dozens of dbt projects across companies, and the same gaps appear over and over.

Testing Is Not Optional

The teams I see struggling with dbt in production are the ones who treat tests as optional. Not-null and unique tests on every primary key. Accepted-values tests on every enum column. Referential integrity tests across your fact-dimension joins. This isn't busywork; it's the early warning system that tells you when an upstream source silently changed. Without tests, you're flying blind.
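As a rough sketch of what that baseline looks like in a model properties file: the model names (fct_orders, dim_customers), the column names, and the list of status values are all made up for illustration. The four test names are dbt's built-in generic tests.

```yaml
version: 2

models:
  - name: fct_orders            # hypothetical fact model
    columns:
      - name: order_id          # primary key: must be present and unique
        tests:
          - not_null
          - unique
      - name: status            # enum column: only these values are allowed
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
      - name: customer_id       # foreign key into the customer dimension
        tests:
          - not_null
          - relationships:
              to: ref('dim_customers')
              field: customer_id
```

Recent dbt versions also accept data_tests: in place of tests:, so check which key your version prefers. Running dbt test executes all of these and fails on the first upstream surprise instead of letting it flow into a dashboard.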

Model Materialization Strategy

Choosing between view, table, incremental, and ephemeral isn't just a performance decision; it's an architectural one. Views cost nothing to build but pay the full compute cost every time they're queried. Tables are the opposite: you pay at build time and reads are cheap. Ephemeral models are never built at all; dbt inlines them as CTEs into the models that reference them. Incremental models are powerful but introduce complexity around late-arriving data and full refreshes. A rule of thumb: start everything as a view, promote to table when query time exceeds 30 seconds, and reach for incremental only when your table exceeds 10M rows.
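Here is one way the promotion path in that rule of thumb might look in a properties file. The three model names and the event_id key are hypothetical; materialized, unique_key, and on_schema_change are standard dbt model configs, though how unique_key is applied depends on your adapter's incremental strategy.

```yaml
version: 2

models:
  # Hypothetical staging model: stays a view (nothing to build, recomputed per query).
  - name: stg_events
    config:
      materialized: view

  # Hypothetical mart that became slow to query, so it is promoted to a table.
  - name: fct_daily_sessions
    config:
      materialized: table

  # Hypothetical large table: built incrementally, keyed so that reprocessed
  # rows update existing records instead of being appended as duplicates.
  - name: fct_events
    config:
      materialized: incremental
      unique_key: event_id
      on_schema_change: append_new_columns
```

The config alone is not enough for the incremental model: its SQL still needs an is_incremental() filter, ideally with a lookback window wide enough to pick up late-arriving rows. When the logic changes, dbt run --full-refresh rebuilds the table from scratch.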

Modular Layering Is Non-Negotiable

The projects that become unmaintainable are the ones with spaghetti SQL: models referencing other models two or three layers deep with no consistent structure. Adopt a clear layer system: staging (1:1 with source tables), intermediate (business logic), and mart (aggregated, consumer-ready). Never skip layers. Never let a mart model reference a raw source directly.
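A minimal sketch of how the three layers can be made explicit in dbt_project.yml, assuming a project called my_project (a placeholder) whose models directory is split into staging, intermediate, and marts folders. The schema names are illustrative.

```yaml
# dbt_project.yml (fragment). "my_project" is a placeholder project name.
# Assumed directory layout:
#   models/staging/       one model per source table, light renaming and casting
#   models/intermediate/  business logic built only on staging models
#   models/marts/         aggregated, consumer-ready models
models:
  my_project:
    staging:
      +schema: staging        # by default dbt appends this to the target schema
    intermediate:
      +schema: intermediate
    marts:
      +schema: marts
```

With this layout, source() calls live only under models/staging/, and everything downstream uses ref(). Giving each layer its own schema also makes it easy to grant BI tools access to marts alone.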

Environment Configuration Is Where Projects Break

I've seen production deployments fail because someone hard-coded a schema name or left dev credentials in profiles.yml. Use dbt's env_var function for everything environment-specific. Separate your dev, CI, and prod targets. Make your CI pipeline run dbt compile on every PR; it catches Jinja errors and broken refs before they reach production (it does not run your SQL against the warehouse).
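For illustration, a profiles.yml along these lines keeps every credential out of the file. It assumes a Postgres warehouse; the profile name, database name, schema names, and DBT_* variable names are all placeholders I picked for the example. env_var and its optional default argument are part of dbt itself.

```yaml
# profiles.yml (sketch). Assumes the Postgres adapter; all names are placeholders.
my_project:
  # The active target comes from the environment and falls back to dev.
  target: "{{ env_var('DBT_TARGET', 'dev') }}"
  outputs:
    dev:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      dbname: analytics
      schema: "dbt_{{ env_var('DBT_USER') }}"   # one sandbox schema per developer
      threads: 4
    ci:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      dbname: analytics
      schema: dbt_ci                            # throwaway schema for PR checks
      threads: 4
    prod:
      type: postgres
      host: "{{ env_var('DBT_HOST') }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      dbname: analytics
      schema: analytics
      threads: 8
```

In CI, something like dbt compile --target ci then runs against the same file with CI-only secrets injected by the pipeline, so nothing environment-specific is ever committed.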

Helana Nosratbakhsh
Senior Data Engineer & Advisor