I was never a big dbt Core user in the early days, but I have been building data pipelines in dbt Cloud from early on. Nowadays I’m more comfortable with the VS Code IDE integration, but your source() and ref() calls are still the same. However, that doesn’t mean I was ready to sit for the dbt Analytics Engineering Certification Exam. There are some gaps worth closing deliberately.


What the Exam Actually Tests

dbt publishes a deep study guide here: https://www.getdbt.com/dbt-assets/certifications/dbt-certificate-study-guide. The exam is 65 questions, 2 hours, online proctored, and requires a 65% passing score. It covers dbt Core 1.7. The exam costs $200, and the certification is valid for 2 years.

The eight topic domains are:

  1. Developing dbt models
  2. Understanding dbt models governance
  3. Debugging data modeling errors
  4. Managing data pipelines
  5. Implementing dbt tests
  6. Creating and maintaining dbt documentation
  7. Implementing and maintaining external dependencies
  8. Leveraging the dbt state

If you’ve been working in dbt Cloud since 2022, you likely have strong intuitions on topics 1, 3, 5, and 6. Topics 2, 4, 7, and 8 are where I’d invest the most focused study time. Even right after I passed, I still went back and reviewed topic 8, Leveraging the dbt state.


A Note on VS Code

dbt offers the dbt VS Code extension alongside dbt Cloud, and it’s genuinely useful for local development. When it came to studying, though, I opted to watch the dbt UI videos instead. I did that to change things up and focus my attention on something new rather than the same tooling I was already using every day.


Main Study Resource

My main study resource was the official study guide linked above, together with the dbt documentation pages it references.


Topic-by-Topic Study Priorities Based on the Study Guide

Topic 1: Developing dbt Models

Your experience level: HIGH — but verify the edges

You’ve built models, used ref(), configured sources, and written dbt_project.yml. The areas worth double-checking:

  • Python models: Unless you’ve shipped .py model files, this is a gap. Review how dbt handles Python models, how they differ from SQL materializations, and when you’d choose one over the other.
  • Grants configuration: The grants config key for database-level access control is easy to overlook if your team handles permissions separately. Know what it does and how to configure it in dbt_project.yml or model configs.
  • DRY principles and modularity: You probably practice this already, but be able to articulate why staging → intermediate → mart layering exists and how to apply it.
  • dbt Packages: Know how to add packages via packages.yml, run dbt deps, and use common packages like dbt_utils.
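
As a concrete sketch, a minimal packages.yml might look like this (the version pin shown is illustrative; check the package’s docs for the current release):

```yaml
# packages.yml — declare third-party dbt packages at the project root
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1   # illustrative pin; use the latest compatible version
```

After adding this file, `dbt deps` downloads the packages into the project before you can call their macros or tests.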

Key commands to be sharp on: dbt run, dbt compile, dbt build, dbt seed, dbt run-operation, dbt docs generate, dbt source freshness, dbt retry


Topic 2: dbt Models Governance

Your experience level: LOW–MEDIUM — prioritize this

This is a newer area of dbt that many practitioners haven’t implemented in production yet. Study all three sub-topics deliberately:

  • Model contracts: Understand how to define a contract in YAML (with contract: enforced: true) and what it means for column-level data types and constraints. Know which data platforms support what constraints.
  • Model versions: Know how to create v1, v2, etc. of a model, how to deprecate old versions, and how downstream consumers reference versioned models.
  • Model access: The access property (private, protected, public) controls which models can reference which others across dbt projects. Understand how this enforces project boundaries.
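
To make the three governance features concrete, here is a sketch of a model YAML combining a contract, an access level, and versions. The model and column names are hypothetical, and constraint support varies by data platform:

```yaml
# models/marts/_models.yml — hypothetical governance config
models:
  - name: dim_customers          # hypothetical model name
    access: public               # private | protected | public
    config:
      contract:
        enforced: true           # dbt checks column names/types against this spec at build time
    columns:
      - name: customer_id
        data_type: int
        constraints:
          - type: not_null       # platform support for constraints varies
    versions:
      - v: 1
      - v: 2                     # latest_version defaults to the highest version
```

Downstream consumers can then pin a specific version with `ref('dim_customers', v=2)`.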

Resource to study: The Model Governance documentation is essential reading here.


Topic 3: Debugging Data Modeling Errors

Your experience level: HIGH

Two years of production dbt work means you’ve debugged plenty of compilation errors, YAML issues, and SQL failures. The exam will test your ability to distinguish between:

  • A dbt compilation error (Jinja templating, bad ref(), missing config)
  • A SQL error that surfaces through dbt (bad join logic, type mismatches, platform-specific syntax)

Make sure you know how to use dbt compile to inspect generated SQL and compare it against what the error message is pointing to. Know how to read the target/compiled/ directory.
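
A typical debugging loop, sketched as shell commands (the model name and project path are hypothetical, and the compiled path depends on your project name and folder layout):

```shell
# Compile the project without running anything, then inspect the generated SQL
dbt compile --select stg_orders                                  # hypothetical model
cat target/compiled/my_project/models/staging/stg_orders.sql     # path varies by project
```

Comparing the compiled SQL to the error message usually makes it obvious whether the problem is in the Jinja layer or in the SQL the warehouse actually received.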


Topic 4: Managing Data Pipelines

Your experience level: MEDIUM — study dbt clone

Troubleshooting DAG failures is familiar territory. The item worth specific study is dbt clone — a command introduced in dbt Core 1.6 that creates zero-copy clones of models using your data platform’s native cloning capability. Understand when and why you’d use it (typically for CI environments or development environments that need production-like data without full rebuilds).

Also be solid on troubleshooting errors from integrated tools — know how dbt surfaces errors from the warehouse versus errors in its own layer.
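
A sketch of how dbt clone is typically invoked, assuming you have a saved production manifest on disk (the path is illustrative):

```shell
# Clone production relations into your current target schema using the
# warehouse's native zero-copy cloning, based on saved production artifacts
dbt clone --state path/to/prod_artifacts

# Clone only the nodes affected by your changes and their ancestors' state
dbt clone --select state:modified+ --state path/to/prod_artifacts
```

This gives a CI or dev environment production-like objects without paying for a full rebuild.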


Topic 5: Implementing dbt Tests

Your experience level: HIGH — but know all four test types cold

You’ve written tests. Make sure you can clearly distinguish and explain:

  • Generic tests: unique, not_null, accepted_values, relationships — applied in YAML
  • Singular tests: Custom .sql files in the tests/ directory that return failing rows
  • Custom generic tests: Reusable test definitions written in Jinja macros
  • Packages-based tests: e.g., dbt_utils.expression_is_true
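
For example, a custom generic test is just a Jinja test block in a .sql file that returns failing rows; the test name here is hypothetical:

```sql
-- tests/generic/test_is_positive.sql — hypothetical custom generic test
{% test is_positive(model, column_name) %}

select *
from {{ model }}
where {{ column_name }} <= 0   -- any rows returned count as failures

{% endtest %}
```

Once defined, it is applied in YAML exactly like the built-ins, e.g. `tests: [is_positive]` on a column.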

The sample exam question about incremental models and the where test parameter is a good example of the depth expected here. Know test configuration parameters: severity, warn_if, error_if, store_failures, where, and limit.
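
Those parameters attach directly to a test in YAML. A sketch, with hypothetical model and column names (the `where` expression uses Snowflake-flavored SQL, so adjust for your platform):

```yaml
# Hypothetical test config showing severity thresholds and a where filter
models:
  - name: fct_orders
    columns:
      - name: order_total
        tests:
          - not_null:
              config:
                where: "order_date >= dateadd('day', -3, current_date)"  # only test recent rows
                severity: warn          # default is error
                warn_if: ">10"          # warn once more than 10 rows fail
                error_if: ">100"        # escalate to error past 100 failing rows
                store_failures: true    # persist failing rows to a table for inspection
```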


Topic 6: Creating and Maintaining dbt Documentation

Your experience level: HIGH

If you’ve been writing descriptions in your YAML files and generating docs, you’re in good shape. Make sure you understand:

  • How to use doc() blocks for reusable descriptions stored in .md files
  • How exposures appear in the lineage DAG
  • The dbt docs generate + dbt docs serve workflow
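
A sketch of the doc() workflow, with hypothetical model and doc-block names:

```yaml
# models/schema.yml — pull a reusable description in from a doc block
models:
  - name: dim_customers              # hypothetical model
    description: '{{ doc("customers_overview") }}'
```

The block itself lives in any .md file in your models directory, wrapped in `{% docs customers_overview %} … {% enddocs %}`, so one description can be reused across many models and columns.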

Topic 7: External Dependencies

Your experience level: MEDIUM

  • Exposures: Know how to define an exposure in YAML (type, owner, depends_on, etc.) and why they’re useful for showing downstream consumers like dashboards in the DAG.
  • Source freshness: Know the loaded_at_field, warn_after, and error_after configuration, and how dbt source freshness is typically used in CI/CD pipelines to gate runs on fresh data.
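
Both of these are plain YAML. A combined sketch, with hypothetical source, table, and exposure names:

```yaml
# models/sources.yml — hypothetical source with freshness config
sources:
  - name: raw_shop
    loaded_at_field: _loaded_at          # timestamp column dbt compares to the current time
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders

# models/exposures.yml — hypothetical exposure
exposures:
  - name: weekly_revenue_dashboard
    type: dashboard                      # dashboard | notebook | analysis | ml | application
    owner:
      name: Data Team
      email: data@example.com
    depends_on:
      - ref('fct_orders')
```

Running `dbt source freshness` then checks the source against those thresholds, and the exposure shows up as a terminal node in the lineage graph.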

Topic 8: Leveraging dbt State

Your experience level: LOW — invest real time here

State and result selectors are powerful but often underused by practitioners who run full refreshes. This topic area is worth deliberate study:

  • Understanding state: dbt compares the current project against a previous run’s manifest.json. The state:modified selector rebuilds only what changed.
  • dbt retry: Reruns only the nodes that failed in the previous invocation. Know when to use this versus a full run.
  • Combining selectors: e.g., dbt run --select state:modified+ to run modified models and their downstream dependents. Understand result:error, result:warn, and how to chain these.
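
The selectors above can be sketched as a handful of invocations, assuming a saved production manifest (paths are illustrative):

```shell
# Rebuild only what changed since the saved manifest, plus everything downstream
dbt run --select state:modified+ --state path/to/prod_artifacts

# Same, but defer unbuilt upstream refs to the production objects
dbt run --select state:modified+ --defer --state path/to/prod_artifacts

# Rerun only the nodes that failed in the previous invocation
dbt retry

# Or select by previous results explicitly, pointing --state at the prior run's artifacts
dbt run --select result:error+ --state target/
```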

The “state” selector article linked in the study guide is essential reading.


The CI Recommendation You Need to Follow Up On

The official study guide doesn’t explicitly call out Continuous Integration by name, but it’s woven throughout Topics 4 and 8. I’d strongly recommend reading the Continuous Integration in dbt documentation as a supplement to your study.

Here’s why it matters for the exam:

dbt state is fundamentally a CI tool. The workflow of storing a production manifest.json artifact, then using --select state:modified+ in a CI job to run only affected models is the canonical use case for state selectors. Without understanding CI, the state topic feels abstract. With it, the pieces click together.
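
As a minimal sketch of that canonical slim-CI step, assuming a GitHub-Actions-style runner and a production manifest already downloaded to a prod-artifacts/ directory (all names here are hypothetical):

```yaml
# Hypothetical CI step: build only models affected by the pull request,
# deferring unchanged upstream models to their production objects
- name: Slim CI build
  run: |
    dbt build --select state:modified+ \
              --defer \
              --state prod-artifacts/   # directory holding production manifest.json
```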

Additionally, understanding how CI jobs differ from production deployment jobs — slim CI runs, deferral to production state, the role of pull request triggers — gives you the mental model to answer pipeline management and state questions with confidence.

If you’ve been manually triggering runs or using simple scheduled jobs in dbt Cloud, carving out a few hours to set up and understand a CI job configuration will pay dividends both on the exam and in your day-to-day work.
