
0060: Metabase Model Strategy

STATUS

Accepted

CONTEXT

Metabase is a core tool for the organization — used for analysis, reporting, data exploration, and prototyping. For it to remain responsive and useful, we need to ensure that heavy or recurring queries are handled efficiently and that the environment stays clean and performant.

Metabase models and dbt models both serve the purpose of creating reusable datasets, but they have different strengths and trade-offs. Without a clear standard, teams default to creating Metabase models for everything, leading to:

  • Database performance degradation when models query large granular tables (e.g., adgem_analytics.analytics.events, adaction_analysis.tune.event_stream) on every dashboard load.
  • Persistence mode adding workload to the database with models that are obsolete or rarely used.
  • Loss of code knowledge when models live only in Metabase with no version control.
  • Difficulty optimizing or refactoring complex queries that are embedded in Metabase.

This ADR establishes a clear standard for when to use each tool.

How does Metabase Persistence mode work?

Metabase creates a schema for each connected database (e.g., adgem_analysis.metabase_cache_9748c_3) where all models are materialized at a configurable refresh frequency. In the open-source edition, persistence mode can only be activated per database, not per model. Upgrading to Pro/Enterprise for per-model control would cost ~$13.3k/year.

See the Metabase documentation on model persistence for details.

Considered Options

  1. Upgrade Metabase to Pro — Enables per-model persistence control, usage analytics, and official source flagging. Pricing is ~$13.3k/year. Still requires data engineering intervention for model optimization.
  2. Establish a standard for Metabase vs dbt model usage — Define clear guidelines for when each tool should be used, migrate existing product models to dbt, and disable persistence mode once complete. No added cost.

DECISION

CHOSEN OPTION: Establish a standard for Metabase vs dbt model usage. This avoids added licensing cost and keeps the work in technologies already familiar to the teams that own the datasets.

Standard: When to use a Metabase model

A Metabase model should only be used for short-lived, exploratory work with an expected lifespan of one to two weeks. Use cases include:

  • Prototyping a feature — Exploring a dataset to validate a hypothesis or build a proof of concept before committing to a full implementation.
  • Incident investigation — Creating a temporary dataset to diagnose and resolve an active issue.
  • Ad-hoc analysis — Answering a one-time business question that does not need to be maintained.

Once the work is complete, the model should be either converted to a question (to preserve the code) or moved to trash.

Standard: When to use a dbt model

A dbt model should be used when any of the following apply:

  • The dataset will be used on a recurring basis — dashboards, scheduled reports, or any tool that is expected to be active beyond a couple of weeks.
  • The query touches large granular tables — tables like adgem_analytics.analytics.events or adaction_analysis.tune.event_stream are too expensive to query on every dashboard load. A dbt model pre-aggregates the data and materializes it on a schedule.
  • The dataset is used by multiple questions or dashboards — shared datasets should be version-controlled and managed through the data pipeline.
  • The query involves complex transformations — multiple joins, window functions, or multi-step aggregations are better maintained and tested in dbt.
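
As a sketch of what a product model looks like on the dbt side (the file path, source definition, and column names below are hypothetical illustrations, not taken from the actual pipeline), a model that pre-aggregates a large event table might be:

```sql
-- models/marts/daily_event_counts.sql
-- Hypothetical example: rolls the raw event stream up to one row per
-- day and event name, so dashboards scan a small table instead of the
-- full granular source on every load.
{{ config(materialized='table') }}

select
    event_date,                       -- assumed columns; adjust to the real schema
    event_name,
    count(*)                as event_count,
    count(distinct user_id) as unique_users
from {{ source('analytics', 'events') }}
group by 1, 2
```

Dashboards and questions then point at the materialized table, which the scheduled pipeline rebuilds, rather than querying the granular source directly.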

Lifecycle of existing Metabase models

  1. Track every model — Review models via the Metabase Data > Models section. Check usage by selecting a model, clicking More Info, and reviewing the Relationships tab.
  2. Classify as prototype or product — Apply the criteria above. If a model matches any dbt criteria, it is a product model.
  3. Retire prototypes — Convert obsolete models to questions (preserves the code in the collection) or move to trash.
  4. Migrate product models to dbt — Add to the team owner's backlog. Create the dbt model in the appropriate database and schema, managed by Airflow. Once deployed, update all dependent questions and dashboards to use the new table.
  5. Disable persistence mode — Once all product models have been migrated, persistence mode can be deactivated for the database.

CONSEQUENCES

Positive consequences

  1. Clear ownership: dbt models are version-controlled, code-reviewed, and tested.
  2. Better database performance: large table queries run on a schedule instead of on every dashboard load.
  3. Faster dashboards for end users.
  4. Cleaner Metabase environment with fewer stale models.

Negative consequences

  1. Increased development work for the team owner of the dataset when migrating existing models to dbt.

Risks

This is a low-risk decision. The identified risks and their mitigations are:

  1. Increased development work for dataset owners. Mitigated through sprint planning.
  2. Difficulty tracking model usage. Mitigated by a weekly review of the Metabase environment by the team owner of the dataset (already in effect).

NOTES

References

Original Author

Francisco Manjon

Approval date

Approved by

Appendix

Success Stories

Dashboards that have been migrated from Metabase models to dbt models: