Data Platforms
The foundation for everything else. Warehouse, lake, or lakehouse, designed for your actual needs, not vendor benchmarks.
Context
Why this matters
A well-designed data platform is the foundation for analytics, ML, and operational reporting. Get it wrong, and every downstream use case becomes painful. Get it right, and you've built something that scales with your business.
The key is matching the platform to your stage. A Series A startup doesn't need the same infrastructure as a 500-person company. I design for where you are today with a path to where you're going.
Capabilities
What I build
Data Warehouse
Snowflake, BigQuery, Redshift, or PostgreSQL, depending on what fits. Schema design, modeling, and optimization.
Lakehouse Architecture
Delta Lake or Apache Iceberg, on Databricks or cloud-native services. Bronze/silver/gold layers with defined data contracts.
Infrastructure as Code
Everything in Terraform, CDK, or Pulumi. Reproducible, version-controlled, and auditable infrastructure.
Cloud Setup
AWS, Azure, or GCP: proper account structure, networking, security, and cost controls from day one.
Cost Optimization
Right-sizing compute, storage tiering, and reservation strategies. I've achieved 50% cost reductions this way.
Data Governance
Catalog implementation, lineage tracking, access control. Built-in compliance and audit capabilities.
Philosophy
Right-sized for your stage
Early Stage
< 50GB data
PostgreSQL is usually enough as both the operational database and a lightweight analytics target. dbt handles transformations from day one, so you can swap the warehouse later without rewriting your models. Metabase sits on top for reporting.
~$300-700/month infrastructure
Growth Stage
50GB - 1TB data
Time for a proper team warehouse. MotherDuck on top of Postgres gives you the DuckDB engine with managed storage, sharing, and a DuckLake-based open format. dbt stays the transformation layer, so the warehouse remains swappable and there is no lock-in if you outgrow it.
~$600-1,500/month self-hosted, up to $2,000/month fully managed
Scale Stage
> 1TB data or specific triggers
Snowflake, Databricks, or BigQuery start earning their cost when you hit 20+ concurrent analysts, heavy governance requirements, multi-engine access needs, or transforms that outgrow a single node.
$3,000+/month infrastructure
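As a rough sketch, the sizing rule above could be written as a small Python function. The names (`Workload`, `recommend_stage`), the decimal 1 TB = 1,000 GB conversion, and the handling of exact boundary values are assumptions made for illustration; in practice the decision weighs more than these few inputs.

```python
from dataclasses import dataclass

# Assumption: decimal units, 1 TB = 1,000 GB.
GB_PER_TB = 1000


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a team's data workload."""
    data_gb: float
    concurrent_analysts: int = 0
    heavy_governance: bool = False
    multi_engine_access: bool = False
    transforms_exceed_single_node: bool = False


def recommend_stage(w: Workload) -> str:
    """Map a workload to 'early', 'growth', or 'scale'."""
    # Any one of the specific triggers is enough to justify the scale stage,
    # regardless of data volume.
    has_trigger = (
        w.concurrent_analysts >= 20
        or w.heavy_governance
        or w.multi_engine_access
        or w.transforms_exceed_single_node
    )
    if w.data_gb > GB_PER_TB or has_trigger:
        return "scale"
    # Assumption: exactly 50 GB counts as growth, exactly 1 TB still as growth.
    if w.data_gb >= 50:
        return "growth"
    return "early"
```

For example, `recommend_stage(Workload(data_gb=200))` returns `"growth"`, while the same volume with 25 concurrent analysts returns `"scale"`.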
Stack
Technologies I work with
Platforms
Infrastructure
Cloud
Building a data platform?
Let's make sure you build the right one for your stage.