
The $3k/Month Data Stack

A complete analytics infrastructure for growing companies. Everything you need without the enterprise price tag.


You can run a production-grade analytics setup for well under $3,000/month. In practice, most Series A to Series C companies land between $500 and $1,500 depending on whether they self-host or go managed. The $3k figure is the honest ceiling, not the target.

Here's the reference architecture I recommend, and the tradeoffs that decide where on that range you actually land.

The Stack

Infrastructure

  • PostgreSQL (RDS or managed)
  • S3 for data lake storage
  • EC2 or Lambda for compute

Tools

  • MotherDuck (or DuckDB) for the analytics warehouse
  • dbt Core for transformations
  • Metabase for BI
  • dlt or Airbyte for ingestion

Cost Breakdown

Here's what you're actually paying for:

| Component                                    | Monthly Cost  |
|----------------------------------------------|---------------|
| PostgreSQL (RDS db.r6g.large, Multi-AZ)      | $250-400      |
| S3 storage (500GB) for Parquet and backups   | $15           |
| MotherDuck (team warehouse, Standard plan)   | $100-300      |
| EC2 for dbt runs and ingestion (t3.large)    | $60           |
| Metabase (self-hosted on a small EC2)        | $70           |
| dlt or self-hosted Airbyte                   | $50-300       |
| GitHub Actions (CI/CD)                       | $20           |
| Total                                        | ~$600-1,200   |

Want everything managed? Swap Metabase self-hosted for Metabase Cloud (Starter at $100/month base plus $6 per additional user after the first 5) and Airbyte Cloud (capacity-based, roughly $200-500/month for typical usage). Even fully managed, the stack rarely clears $2,000/month at this stage.

How It Works Together

1. Data Ingestion

dlt or Airbyte pulls data from your sources: Salesforce, Stripe, your production database, marketing platforms. For a small team, dlt is usually the lighter choice. It's a Python library you run from wherever your dbt runs, with no separate service to operate. Raw data lands in PostgreSQL or directly in MotherDuck.

2. Transformation Layer

dbt Core is the transformation layer and the most important portability decision in the stack. Standard medallion architecture: staging, intermediate, marts. All version controlled, all tested. Today it runs against MotherDuck via the dbt-duckdb adapter. If you later move to Snowflake, Databricks, or BigQuery, you switch the adapter and keep the models. The warehouse becomes a detail, not a commitment.
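The adapter swap lives entirely in `profiles.yml`. A sketch of a dbt-duckdb profile pointed at MotherDuck might look like this (the `analytics` profile name and `analytics_db` database are placeholders; migrating later means replacing this output with a `type: snowflake` or `type: bigquery` block while the models stay untouched):

```yaml
# profiles.yml -- illustrative dbt-duckdb profile for MotherDuck
analytics:
  target: prod
  outputs:
    prod:
      type: duckdb
      path: "md:analytics_db"   # "md:" prefix targets MotherDuck
      threads: 4
```

Your staging, intermediate, and marts models reference sources and each other through `ref()` and `source()`, so none of them encode the warehouse.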

3. Analytical Queries

MotherDuck hosts your analytical tables and serves Metabase, notebooks, and ad-hoc SQL from analysts. You get the DuckDB engine without running the storage, auth, concurrency, and sharing plumbing yourself. PostgreSQL stays the operational source of truth and holds raw ingested data.

Why MotherDuck over self-hosted DuckDB

DuckDB on your laptop or on an EC2 box is fantastic for one analyst. Once you have a team, you need shared storage, concurrency, access control, a catalog, and backup. Building that yourself is a distributed-systems side project. MotherDuck solves it for roughly the cost of the EC2 instance you'd otherwise run, keeps your tables portable as Parquet plus DuckLake metadata, and lets heavy local workloads stay local through hybrid execution.

Self-hosted DuckDB still makes sense for embedded use inside applications, strictly single-user analyst workflows, or teams with a real compliance reason to run everything in their own account.

What This Stack Handles

  • Hundreds of gigabytes of data
  • 5-10 concurrent dashboard users
  • Complex dbt models with 100+ transformations
  • Ad-hoc analytical queries on large datasets
  • 15+ data source integrations

When to Upgrade

This stack has limits. Consider moving to Snowflake/Databricks when:

  • Data exceeds 1TB and query performance degrades
  • You have 20+ concurrent analysts running queries
  • You need enterprise governance and access controls
  • Real-time streaming becomes a requirement

But most Series A-C companies never hit these limits. And if you do, you'll have saved hundreds of thousands of dollars by the time you need to upgrade.

Getting Started

The hardest part isn't the technology; it's the data modeling: getting your business logic right, defining metrics consistently, and building trust in the numbers.

Start with one use case. Get it working end-to-end. Then expand. Don't try to boil the ocean on day one.

Want help building this stack?

Let's discuss what makes sense for your company.
