Is Windmill a drop-in replacement for dbt?

Not feature-for-feature. dbt is a mature SQL transformation framework and the de-facto analytics-engineering standard, with docs-site generation, a semantic layer and seeds that Windmill has no direct equivalent for. Windmill pipelines (in alpha) cover the core transform-test-materialize-lineage loop with DuckDB and DuckLake, and add what dbt leaves out: the orchestration, scheduling, triggers and compute all live in the same platform, so you do not need a separate scheduler or a separate warehouse.

Can I migrate dbt models to Windmill automatically?

No, there is no automated dbt-to-Windmill converter. A dbt model is a SQL select plus config in a Jinja block and a schema.yml file; the Windmill equivalent is a DuckDB SELECT with the same config expressed as comment annotations (-- materialize, -- data_test, -- on) in a single file. The SQL body ports closely; ref()/source() references become asset URIs and Jinja macros become DuckDB CREATE MACRO libraries.

Does Windmill have a package ecosystem like dbt Hub?

No. dbt Hub has hundreds of community packages (dbt-utils, dbt_expectations, codegen, audit_helper) that are widely used and battle-tested. Windmill has a community Hub of reusable scripts and flows, plus macro libraries for shared DuckDB logic, but nothing the size or maturity of the dbt package ecosystem. If your project leans heavily on dbt packages, that is a real advantage for dbt today.

Windmill pipelines are in alpha. Is that a problem?

It depends on your risk tolerance. The pipelines feature (DuckDB + DuckLake + -- materialize) is in alpha and the annotation syntax may change, even though the surface is broad: managed materialization, SCD2, partitions and backfill, data tests, schema contracts, column lineage and freshness SLOs (with a stale-asset watchdog on Enterprise) all ship today. dbt is production-hardened and used by thousands of teams. Windmill's underlying platform (scripts, flows, scheduling, workers) is mature and used in production; the alpha label applies specifically to the asset-based pipelines layer.

What about the licenses, Apache 2.0 vs AGPLv3?

dbt Core is Apache 2.0, which is more permissive if you plan to modify and redistribute. Windmill's OSS core is AGPLv3, which requires source availability for network-exposed modifications. Note that dbt's newer Fusion engine ships under the Elastic License 2.0 (source-available, not OSI-approved), and dbt Cloud is fully proprietary. Both companies gate advanced features behind proprietary, paid tiers.

Is pricing public on both sides?

Partly on both sides. dbt publishes a free Developer tier (1 seat, 3,000 models/month) and a Starter tier ($100/developer/month, up to 5 seats), but Enterprise pricing requires a sales conversation. Windmill publishes per-seat and per-worker Enterprise pricing on its pricing page, and its open-source core is free and unlimited to self-host.

Windmill vs dbt

dbt is transform-only. What does that mean in practice?

dbt compiles and runs SQL against a warehouse you already pay for (Snowflake, BigQuery, Redshift, Databricks, Postgres). It does not schedule runs or provide compute of its own: dbt Core needs an external scheduler (cron, Airflow, Dagster, or dbt Cloud jobs) and always runs inside your warehouse. Windmill runs the transforms itself on DuckDB, stores results in managed DuckLake tables, and schedules and triggers them natively, on your own workers.

dbt is a SQL transformation framework that runs in your warehouse and needs an external scheduler. Windmill pipelines fold transform, storage, orchestration and compute into one platform. Eight questions to tell them apart, answered honestly.

The 8 questions

01 · What you can build: Which data work and software fits on each
02 · Target: Who the platform is built for
03 · Build experience: How you build on each platform
04 · Integrations & compute: Where the work runs and how it connects
05 · Migration & lock-in: How hard to get in, how hard to get out
06 · Enterprise requirements: Audit logs, observability, security, performance
07 · Licensing & pricing: Open source, pricing, self-hosting
08 · Verdict: The verdict

Windmill in one sentence

An open-source developer platform whose pipelines (alpha) fold transform (DuckDB), storage and versioning (DuckLake), orchestration and compute (your own workers) into one data-pipeline platform, alongside workflows, AI agents and apps. Any language.

dbt in one sentence

The de-facto standard for SQL transformation in the warehouse: models with a ref() / source() lineage DAG, tests, snapshots, macros, a semantic layer and a generated docs site. Transform-only by design, it runs against a warehouse you provide and needs an external scheduler (cron, Airflow, Dagster or dbt Cloud jobs) to run on a cadence.

01 · What you can build

Which data work and software can you do on each?

dbt is focused and deep: SQL modeling, tests, snapshots, macros, a semantic layer and a docs site, all pushed down to a warehouse you already run. Windmill covers the same core transform-and-lineage loop with DuckDB and DuckLake (in alpha), and adds the orchestration, compute and the wider software (workflows, apps, agents) around it.

Capability	Windmill	dbt
SQL data models with a lineage DAG: ref() / source() style dependencies inferred from the code (Windmill: alpha)
Data tests: unique, not_null, accepted_values, relationships that fail the run
Snapshots / SCD2 history: Keep every version of every row for effective-dated as-of joins
Reusable macros: dbt Jinja macros vs Windmill engine-native DuckDB macro libraries
Column-level lineage: Column-to-column lineage inferred from the SQL		dbt Cloud
Schema contracts: dbt enforces declared contracts at build time; Windmill checks consumers against captured schemas at save time (warn-only)
Freshness SLOs: Fresh/stale verdict per asset; automatic re-runs of stale assets are Enterprise on Windmill, source freshness is dbt Cloud		dbt Cloud
Seeds (CSV loading): Load small static CSVs into the warehouse as tables
Docs-site generation: A generated, browsable documentation website of the project
Semantic layer / metrics: Central metric definitions queried by BI tools		dbt Cloud
Built-in scheduling & orchestration: Run on a cron, react to events, retry (dbt Core needs an external scheduler)		dbt Cloud
Own compute engine: dbt pushes SQL to your warehouse; Windmill runs DuckDB on your own workers
Steps in any language: Python, TypeScript, Go, Bash next to SQL (dbt: Python models on some adapters)		Partial
Workflows, webhooks, APIs & apps: General flows, HTTP endpoints, AI agents and internal apps on the same runtime

Windmill pipeline features (materialize, DuckLake, data tests, macros, column lineage) are in alpha. Range backfill, the freshness watchdog and write-audit-publish (a failing data test rolls back the write instead of leaving the failed table live) are Enterprise Edition features; single-partition runs, the fresh/stale badge and commit-then-test data tests are available in all editions.
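The difference between commit-then-test and write-audit-publish is easier to see in code. The sketch below illustrates the write-audit-publish pattern only, with Python's standard-library sqlite3 standing in for DuckDB and DuckLake; it is not Windmill's implementation, and the function names and sample rows are hypothetical.

```python
import sqlite3


def not_null_failures(conn, table, column):
    """Count rows that violate a not_null test on `column`."""
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def write_audit_publish(conn, rows):
    """Write the rows, run the test, and publish only if the test passes."""
    try:
        conn.execute("BEGIN")
        conn.execute("DELETE FROM dim_customer")
        conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", rows)
        if not_null_failures(conn, "dim_customer", "customer_id") > 0:
            raise ValueError("not_null customer_id failed")
        conn.execute("COMMIT")  # publish: the new table becomes visible
        return True
    except ValueError:
        conn.execute("ROLLBACK")  # audit failed: the previous table stays live
        return False


# isolation_level=None makes BEGIN / COMMIT / ROLLBACK explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER, name TEXT)")
first = write_audit_publish(conn, [(1, "Ada"), (2, "Grace")])
second = write_audit_publish(conn, [(3, "Linus"), (None, "Unknown")])
live = conn.execute("SELECT * FROM dim_customer ORDER BY customer_id").fetchall()
print(first, second, live)
```

The second write fails its test and is rolled back, so the table still holds the two rows from the first run. Under commit-then-test the failed rows would already be committed by the time the test reports the failure.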

02 · Target

Who is each platform built for?

dbt is built for analytics engineers modeling SQL inside a warehouse, and it set the standard for that discipline. Windmill is built for developer-led teams that want the transform layer and the orchestration, compute and surrounding software in one place, in any language.

Windmill

dbt

Primary audience

Windmill

Developer-led teams building across data and software. Engineers own the platform end-to-end in any language, from DuckDB transforms to workflows, APIs, agents and apps, with Git, local dev and CI/CD.

dbt

Analytics engineers and data teams who model in SQL against a cloud warehouse. dbt brought software-engineering practices (version control, testing, CI, modularity) to the analytics layer, and that is squarely who it serves.

Where it fits in the stack

Windmill

A single runtime for the whole job: ingest, transform, materialize, orchestrate, serve and alert. The data pipeline is one part of a broader internal-software platform, not a separate tool bolted to a scheduler and a warehouse.

dbt

The transformation layer, sitting between ingestion (Fivetran, Airbyte) and BI (Looker, Tableau). It assumes you already run a warehouse and something to schedule it, and does the T of ELT extremely well within that boundary.

Language

Windmill

SQL (DuckDB) for transforms, plus Python, TypeScript, Go and Bash steps in the same pipeline. Any package is a first-class import.

dbt

SQL first, with Jinja templating for logic and macros. Python models are supported on some warehouse adapters (Snowflake, BigQuery, Databricks), but SQL is the center of gravity.

03 · Build experience

How do you build on each platform?

dbt splits a model across a SQL file and YAML config, with a mature --select grammar and best-in-class Git-native CI. Windmill keeps the write, tests and lineage as inline annotations in one DuckDB file, and runs it on its own compute with a local wmill pipeline dev loop.

Windmill

One DuckDB file is the whole step. Comment annotations declare the managed materialization, the inputs, the tests and the lineage. There is no separate config file, no connection profile and no external scheduler: Windmill owns the write, runs it on your workers and triggers it natively.

dim_customer.sql

-- pipeline
-- on ducklake://analytics/raw_customers
-- materialize ducklake://analytics/dim_customer key=customer_id
-- data_test unique customer_id
-- data_test not_null customer_id

ATTACH 'ducklake://analytics' AS dl;
SELECT customer_id, name, tier FROM dl.raw_customers;

SCD2 variant

-- pipeline
-- on ducklake://analytics/raw_customers
-- materialize ducklake://analytics/dim_customer key=customer_id history track=name,tier
-- data_test not_null customer_id

-- Keeps every version of every row (SCD2): the runtime
-- manages valid_from / valid_to / is_current for you.
ATTACH 'ducklake://analytics' AS dl;
SELECT customer_id, name, tier FROM dl.raw_customers;
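To make the SCD2 bookkeeping concrete, here is a minimal sketch of what maintaining valid_from / valid_to / is_current involves. It is plain Python over in-memory rows, not the Windmill runtime; the apply_scd2 helper and the sample data are hypothetical, and deleted source rows are deliberately not handled.

```python
from datetime import datetime, timezone

TRACKED = ("name", "tier")  # mirrors track=name,tier in the annotation


def apply_scd2(history, incoming, now):
    """Close changed current rows and append their new versions."""
    current = {r["customer_id"]: r for r in history if r["is_current"]}
    for row in incoming:
        old = current.get(row["customer_id"])
        if old is not None and all(old[c] == row[c] for c in TRACKED):
            continue  # unchanged: the open version stays as it is
        if old is not None:
            old["valid_to"] = now  # close the superseded version
            old["is_current"] = False
        history.append(
            {**row, "valid_from": now, "valid_to": None, "is_current": True}
        )
    return history


t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 2, 1, tzinfo=timezone.utc)
history = apply_scd2([], [{"customer_id": 1, "name": "Ada", "tier": "free"}], t1)
history = apply_scd2(history, [{"customer_id": 1, "name": "Ada", "tier": "pro"}], t2)
for version in history:
    print(version["tier"], version["valid_from"].date(), version["is_current"])
```

After the second run the history holds two versions of customer 1: the closed "free" row and the open "pro" row, which is what an effective-dated as-of join reads from.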

dbt

A model is a SQL select with a Jinja config() block. Tests, sources and column metadata live in a separate schema.yml. Project settings live in dbt_project.yml, the warehouse connection in profiles.yml, and an external scheduler (cron, Airflow, Dagster or dbt Cloud jobs) runs it against your warehouse.

dim_customer.sql

-- models/dim_customer.sql
{{ config(
    materialized = 'incremental',
    unique_key   = 'customer_id'
) }}

select
    customer_id,
    name,
    tier
from {{ source('raw', 'customers') }}

schema.yml

# models/schema.yml
version: 2

sources:
  - name: raw
    tables:
      - name: customers

models:
  - name: dim_customer
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null

# plus dbt_project.yml, profiles.yml (warehouse
# connection) and an external scheduler to run it.
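For readers new to incremental models, the effect of materialized = 'incremental' with a unique_key can be sketched as a keyed merge: rows whose key already exists are updated, new keys are inserted. The example uses Python's sqlite3 as a stand-in warehouse and assumes SQLite 3.24 or newer for the ON CONFLICT clause; the actual statement dbt issues depends on the adapter and the incremental strategy, so treat this as an illustration of the outcome rather than of dbt's generated SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT, tier TEXT)"
)

UPSERT = """
INSERT INTO dim_customer (customer_id, name, tier) VALUES (?, ?, ?)
ON CONFLICT (customer_id) DO UPDATE SET name = excluded.name, tier = excluded.tier
"""


def incremental_run(conn, batch):
    """Merge one batch into the target table, keyed on customer_id."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, batch)


incremental_run(conn, [(1, "Ada", "free"), (2, "Grace", "pro")])
incremental_run(conn, [(2, "Grace", "enterprise"), (3, "Linus", "free")])
rows = conn.execute("SELECT * FROM dim_customer ORDER BY customer_id").fetchall()
print(rows)
```

The second batch updates customer 2 in place and adds customer 3, leaving three rows instead of four.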

Authoring

Windmill

A pipeline step is a DuckDB script whose comment annotations declare everything: -- materialize for the managed write, -- on for inputs, -- data_test for checks, -- partitioned for partitions. Config lives inline in the one file. Non-SQL steps (Python, TypeScript, Bash) use the same annotations.

dbt

A model is a SQL select with a Jinja config() block. Tests, sources and descriptions live in schema.yml; project settings in dbt_project.yml; the warehouse connection in profiles.yml. The split across files is a convention thousands of teams already know well.
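Whether a test is declared as a -- data_test annotation or in schema.yml, the check itself amounts to a query that returns offending rows, with zero rows meaning pass. The sketch below shows the unique and not_null checks that way, using Python's sqlite3 and hypothetical sample data; the exact SQL each tool generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace"), (2, "Grace again"), (None, "Unknown")],
)

# Each test is a query that returns the offending rows; zero rows means pass.
TESTS = {
    "unique customer_id": """
        SELECT customer_id FROM dim_customer
        WHERE customer_id IS NOT NULL
        GROUP BY customer_id HAVING COUNT(*) > 1
    """,
    "not_null customer_id": """
        SELECT name FROM dim_customer WHERE customer_id IS NULL
    """,
}

results = {name: conn.execute(sql).fetchall() for name, sql in TESTS.items()}
for name, failures in results.items():
    print(name, "PASS" if not failures else f"FAIL {failures}")
```

With this sample data both tests fail: customer_id 2 appears twice, and one row has no customer_id.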

Lineage & selection

Windmill

Lineage is inferred from the assets each script reads and writes, including column-level lineage from the SQL. "Run up to here" bounds a run to a chosen node without a compile-time select grammar.

dbt

Lineage comes from ref() and source(). The --select / --exclude grammar (with graph operators like +model and tag/path selectors) is mature and expressive, and column-level lineage is available in dbt Explorer.
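Both selection styles rest on the same graph operation: take a node, collect everything upstream of it, and run that subgraph in dependency order. A minimal sketch, assuming a hypothetical lineage map and Python 3.9+ for graphlib; it shows the idea behind an ancestor selector such as +model and is not either tool's planner.

```python
from graphlib import TopologicalSorter

# Hypothetical lineage: each asset maps to the assets it reads from.
DEPENDS_ON = {
    "raw_customers": set(),
    "raw_orders": set(),
    "dim_customer": {"raw_customers"},
    "fct_orders": {"raw_orders", "dim_customer"},
    "daily_revenue": {"fct_orders"},
}


def upstream_closure(target):
    """Return `target` plus every asset it transitively depends on."""
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(DEPENDS_ON[node])
    return seen


def run_plan(target):
    """Order the upstream closure so each asset runs after its inputs."""
    selected = upstream_closure(target)
    subgraph = {node: DEPENDS_ON[node] & selected for node in selected}
    return list(TopologicalSorter(subgraph).static_order())


plan = run_plan("fct_orders")
print(plan)
```

The plan for fct_orders contains the two raw tables, dim_customer and fct_orders itself, and leaves out daily_revenue because it sits downstream of the chosen node.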

Local dev & IDE

Windmill

CLI loop: wmill pipeline dev watches the folder and live-previews the graph in the browser, wmill pipeline run --local runs it from working-tree files. VS Code and AI coding tools work against the real source.

dbt

dbt run / dbt build from the CLI, or the browser IDE and Studio in dbt Cloud. A large, well-documented local workflow with strong editor and adapter tooling.

Compute & warehouse

Windmill

Transforms run on DuckDB on your own workers, and results land in managed DuckLake tables. No external warehouse required, though you can still query Postgres, Snowflake or BigQuery from a step.

dbt

dbt compiles SQL and runs it in your warehouse: Snowflake, BigQuery, Redshift, Databricks, Postgres and others via adapters. Compute and cost are the warehouse's; dbt provides none of its own.

Resources & secrets

Windmill

Resources and variables are first-class, encrypted at rest and shared across scripts, flows and apps. External secret backends (Vault, AWS Secrets Manager) are Enterprise only.

dbt

Warehouse credentials live in profiles.yml or environment variables for dbt Core; dbt Cloud manages connections and secrets in the UI. Governed connections and fine-grained access are Cloud (paid) features.

Git & CI

Windmill

Scripts, pipelines, resources, schedules and permissions are all files deployed via the CLI and Git sync. Git sync is free for up to 2 users; beyond that is Enterprise only. Workspace forks add per-branch dev environments with forked DuckLake data, the analog of developing against dev schemas with --defer in dbt.

dbt

dbt is Git-native by design: a project is a repo, and dbt Cloud has built-in CI jobs that run and test models on pull requests. This is one of dbt's strongest, most mature areas.

04 · Integrations & compute

Where does the work run, and how does it connect?

dbt connects to your warehouse through adapters and draws on a large, mature package ecosystem, but brings no compute or scheduler of its own. Windmill runs transforms on its own workers and schedules and triggers them natively, so the pipeline does not depend on a separate warehouse or orchestrator.

Where transforms run

Windmill

On your own workers via DuckDB, writing to managed DuckLake tables on object storage. You bring the compute; there is no separate warehouse in the loop unless you want one.

dbt

Inside your cloud warehouse. dbt generates SQL and hands it to Snowflake, BigQuery, Redshift, Databricks or Postgres through an adapter. Warehouse choice, performance and bill are all yours.

Packages & ecosystem

Windmill

Any npm, PyPI, Go or Maven package is a first-class import with automatic dependency resolution, plus workspace DuckDB macro libraries for shared SQL logic and a community Hub of scripts.

dbt

dbt Hub has hundreds of community packages (dbt-utils, dbt_expectations, codegen, audit_helper) that are widely adopted and mature. This is a genuine advantage: the package ecosystem is large and battle-tested.

Scheduling & triggers

Windmill

Native. A step fires on a schedule, on an upstream asset write (the cascade), or from a trigger (HTTP, Kafka, Postgres CDC, SQS, MQTT and more). No external orchestrator to stand up.

dbt

None in dbt Core: you supply the scheduler (cron, Airflow, Dagster, or dbt Cloud jobs). dbt Cloud adds hosted job scheduling and event triggers, but that is the paid product, not the open-source engine.

05 · Migration & lock-in

How hard to get in, and how hard to get out?

dbt is easy to adopt if you already write warehouse SQL, and your data stays in your warehouse, though model code is dbt-flavored (Jinja, macros, packages). Windmill keeps transforms as standard SQL over open DuckLake Parquet you own, so leaving costs you the orchestration layer, not the data.

Getting in

Windmill

A dbt model's SQL body ports closely to a DuckDB materialize script: the config() and schema.yml become comment annotations, ref() / source() become asset URIs, and Jinja macros become DuckDB macro libraries. There is no automated converter, so it is a manual rewrite.
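Since the rewrite is manual, a useful first step is an inventory of the references that need replacing. The helper below is a hypothetical sketch, not a converter and not part of either tool: it finds simple one-argument ref() and two-argument source() calls in a model file with regular expressions, and does not cover package-qualified or versioned refs.

```python
import re

# Matches {{ ref('model') }} and {{ source('src', 'table') }} in their simple forms.
REF = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
SOURCE = re.compile(
    r"\{\{\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}"
)


def inventory(model_sql):
    """List the ref() and source() calls found in one dbt model file."""
    return {
        "refs": REF.findall(model_sql),
        "sources": SOURCE.findall(model_sql),
    }


model = """
select o.order_id, c.tier
from {{ ref('stg_orders') }} o
join {{ source('raw', 'customers') }} c using (customer_id)
"""
found = inventory(model)
print(found)
```

Each entry in the result is one reference to replace by hand with the asset URI of the corresponding table.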

dbt

Very low friction if you already write SQL against a supported warehouse. Init a project, point profiles.yml at your warehouse, and existing SQL becomes models with small edits. This ease of onboarding is a big part of dbt's adoption.

Getting out

Windmill

Transform logic is standard SQL and the DuckLake tables are open Parquet on object storage you own, readable by any DuckDB client. The CLI exports the workspace as plain files. What you lose leaving is the orchestration layer, not the data or the SQL.

dbt

Model SQL is portable, but it is dbt-flavored: Jinja templating, ref()/source(), macros and package dependencies need reworking on any other tool. Because dbt writes to your own warehouse, the data itself never leaves your control, which lowers lock-in on the storage side.

06 · Enterprise requirements

Audit logs, observability, security, performance

Both bring software-engineering rigor to data work, and both gate governance behind paid tiers. dbt's performance is your warehouse's, tuned by incremental, state-aware rebuilds. Windmill runs transforms on its own workers, so throughput scales with the compute you allocate rather than a warehouse bill.

Observability

Windmill

Real-time streaming logs, per-run inputs / outputs / duration, the pipeline graph with live run status, and a Prometheus exporter. Every materialized asset records its snapshot id and row count.

dbt

Run results and timings from the CLI and artifacts; dbt Cloud adds run history, alerting, dbt Explorer and dbt Insights. Rich run metadata, focused on the transformation graph.

Audit logs & security

Windmill

SOC 2 Type II. RBAC, SSO (up to 10 users), encrypted secrets and sandboxed execution in open source. Uncapped SSO, extended audit-log retention, SCIM and SAML are Enterprise only.

dbt

SOC 2. RBAC, SSO (SAML), audit logging, PrivateLink and IP restrictions are dbt Cloud Enterprise / Enterprise+ features. dbt Core self-hosted has no built-in RBAC or SSO.

Multi-tenancy

Windmill

Multiple isolated workspaces on one instance, each with its own users, resources and secrets. Free tier is capped at 3 workspaces; unlimited is Enterprise only.

dbt

Projects organize models; the free Developer tier allows 1 project, Starter 1, Enterprise up to 30 and Enterprise+ unlimited. dbt Mesh (cross-project references) is an Enterprise feature.

Performance model

Windmill

Transforms run on DuckDB on your workers, so throughput scales with the compute you allocate, independent of any warehouse. Single-node DuckDB and Polars cover the vast majority of ETL workloads without a warehouse or cluster (see ETL & data processing). Steps can be pinned to a worker tag (for example a high-memory or GPU pool) with the -- tag annotation.

dbt

Performance is the warehouse's: dbt's job is to generate efficient SQL and rebuild only what changed (state-aware, incremental models). Cost and speed are governed by your warehouse configuration and spend.

07 · Licensing & pricing

Open source, pricing and self-hosting

dbt Core's Apache 2.0 license is more permissive than Windmill's AGPLv3, though its Fusion engine (ELv2) and dbt Cloud are proprietary. Windmill publishes per-seat and per-worker pricing upfront, while dbt's Enterprise tiers require a sales call.

Open-source license

Windmill

AGPLv3 core, free and unlimited to self-host. Enterprise features (SSO, dedicated workers, audit logs, external secret backends, range backfill) ship in a separate proprietary codebase. Managed cloud available.

dbt

dbt Core is Apache 2.0, more permissive for modify-and-redistribute. The newer Fusion engine ships under the Elastic License 2.0 (source-available, not OSI-approved), and dbt Cloud is fully proprietary.

Pricing

Windmill

Public per-seat and per-worker pricing on the pricing page: developers around $20/month, operators $10/month, standard workers $50/month. The open-source core is free.

dbt

Free Developer tier (1 seat, 3,000 models/month). Starter is $100 per developer/month (up to 5 seats). Enterprise and Enterprise+ pricing is not public and requires a sales conversation.
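As a rough worked example of the list prices quoted above, the sketch below totals a monthly bill for a hypothetical team. The team shape is an assumption, the Windmill developer price is the approximate figure from the text, and the two totals are not like-for-like: the dbt figure excludes the warehouse and scheduler it runs on, and the Windmill figure excludes the infrastructure your workers run on.

```python
# List prices quoted in the text, in USD per month.
WINDMILL = {"developer": 20, "operator": 10, "worker": 50}
DBT_STARTER_PER_DEVELOPER = 100
DBT_STARTER_MAX_SEATS = 5


def windmill_monthly(developers, operators, workers):
    """Sum per-seat and per-worker list prices."""
    return (
        developers * WINDMILL["developer"]
        + operators * WINDMILL["operator"]
        + workers * WINDMILL["worker"]
    )


def dbt_starter_monthly(developers):
    """Starter seats only; larger teams need a non-public Enterprise quote."""
    if developers > DBT_STARTER_MAX_SEATS:
        raise ValueError("Starter is capped at 5 seats")
    return developers * DBT_STARTER_PER_DEVELOPER


# Hypothetical team: 4 developers, 2 operators, 2 standard workers.
windmill_total = windmill_monthly(developers=4, operators=2, workers=2)
dbt_total = dbt_starter_monthly(developers=4)
print(windmill_total, dbt_total)  # 4*20 + 2*10 + 2*50 = 200, and 4*100 = 400
```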

08 · Verdict

The verdict

dbt and Windmill are not the same category of tool. dbt is a transform-only SQL framework: it compiles models, tracks lineage through ref() and source(), and runs the SQL in a warehouse you provide, on a schedule something else supplies. Windmill pipelines cover that same transform-test-materialize-lineage loop with DuckDB and DuckLake, and also own the parts dbt deliberately leaves out: the compute, the storage, the scheduling and the triggers.

dbt is the right call if your data already lives in a cloud warehouse you are committed to, your team models in SQL, and you want the most mature analytics-engineering tool available: a large package ecosystem on dbt Hub, docs-site generation, a semantic layer, snapshots and Git-native CI, all production-hardened and widely understood. Windmill does not match dbt feature-for-feature here, and its pipelines are in alpha.

Windmill is the stronger fit if you would rather not run a separate warehouse and a separate orchestrator just to transform data. One platform runs the DuckDB transforms on your own workers, versions the results in DuckLake, and schedules and triggers everything natively, with data tests, SCD2 history, macros, freshness SLOs and column-level lineage as inline annotations. In one place it goes past dbt's defaults: on Enterprise Edition a failing data test rolls back the write (write-audit-publish) instead of leaving the failed model live. The same runtime also carries the surrounding software: workflows, APIs, AI agents and internal apps, in any language, with shared auth and observability.

If you are choosing between them, the honest framing is scope: dbt for deep, warehouse-native SQL modeling; Windmill for folding transform, storage, orchestration and compute into one platform. The fastest way to judge is to spend an afternoon in each.

Frequently asked questions

Data pipelines on Windmill

The product overview: the DuckDB and DuckLake data-pipeline feature set, integrations and benchmarks at a glance.

Pipelines documentation

Every annotation and option, with the full "how it compares to dbt and Dagster" section.

Windmill vs Dagster

The other side of the asset model: how Windmill compares to the Dagster orchestrator.
