Active engagement

Data platform rebuild for a UAE payments fintech

Standing up the data and analytics function at a digital payments company serving emerging markets. Strategy, platform, financial data models, and the team to keep it all running.

ETL cost cut: 50%

Engagement: Since 2023

Role: Head of Data

Platform: Databricks

Context

A payments business outgrowing its data setup

The client is a digital payments fintech serving emerging markets, processing a constant flow of transactions across multiple processors and partners. I joined as a senior data engineer in late 2023 and grew into the Head of Data and Analytics role as the function took shape. The data setup at the time was a patchwork of pipelines, ad-hoc reports, and spreadsheets the business had simply outgrown.

The mandate was broader than building a single feature. It was to stand up the data and analytics function as a whole: pick the right platform, fix the models the business runs on, build the financial backbone for accurate reporting, and shape the team and practices to keep evolving it. I continue to work with the company as a contractor, focused on the highest-leverage work.

Problem

Three problems pretending to be one

The platform itself was the first issue. ETL costs were rising faster than volume justified, real-time visibility into transactions was effectively impossible, and the architecture made it expensive to try anything new. Cost and latency were both blockers.

The second issue was the models. Different teams quoted different numbers for the same KPI because the underlying definitions were inconsistent or undocumented. There was no shared financial backbone that reporting and compliance could lean on with confidence.

The third was reconciliation, which had been living in spreadsheets at month-end. Every transaction recorded internally needs to match what processors and partners report, and any gap is either a real loss or a sign that something upstream is wrong. Doing that work by hand was slow, stressful, and easy to question.

Approach

Strategy first, then platform, then models

The work started with strategy. Before writing code, we agreed on the target architecture, the platform direction, the data model principles for finance, and the sequencing that would deliver visible value early. Decisions were written down so leadership and engineering shared the same picture.

Next was the platform. We migrated the data platform onto Databricks and re-engineered ingestion around Spark Structured Streaming and CDC, which cut ETL processing costs by roughly half and replaced batch updates with real-time visibility into transactions. Infrastructure went into Terraform, and the lakehouse layering gave us a clean place to rebuild models on solid ground.
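To make the CDC idea concrete, here is a minimal plain-Python sketch of the apply step: change events are replayed in sequence order onto a keyed table, and stale or duplicate events are skipped. It is an illustration only, not the production pipeline, which ran on Spark Structured Streaming. The `ChangeEvent` shape, the field names, and the `apply_changes` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One change captured from a source table (hypothetical shape)."""
    key: str                    # primary key of the changed row
    op: str                     # "insert", "update", or "delete"
    seq: int                    # monotonically increasing sequence number
    row: Optional[dict] = None  # full row image; None for deletes


def apply_changes(
    table: Dict[str, dict],
    last_seq: Dict[str, int],
    events: List[ChangeEvent],
) -> None:
    """Apply a micro-batch of change events to `table` in place.

    Events are applied in sequence order. An event whose sequence number
    is not newer than the last one applied for its key is ignored, so
    replaying a batch leaves the table unchanged.
    """
    for event in sorted(events, key=lambda e: e.seq):
        if event.seq <= last_seq.get(event.key, -1):
            continue  # stale or duplicate event
        if event.op == "delete":
            table.pop(event.key, None)
        elif event.op in ("insert", "update"):
            table[event.key] = dict(event.row or {})
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
        last_seq[event.key] = event.seq
```

Tracking the last applied sequence number per key is what makes the apply step safe to retry, which matters when a streaming job restarts and reprocesses a batch.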

With the platform in place, we rebuilt the financial backbone. Proper financial data models for reporting and compliance. Reconciliation pipelines that match internal records against partner and processor data and surface exceptions automatically. BI built around the questions each team actually asks, with KPIs that hold up because the underlying models are documented and owned.
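As a rough illustration of what a reconciliation step does, the sketch below matches internal records against a processor's report by transaction ID and returns only the exceptions. It assumes each side has already been reduced to a mapping from transaction ID to amount. The exception kinds, the `ReconException` type, and the `reconcile` function are hypothetical names, not the client's actual pipeline.

```python
from decimal import Decimal
from typing import Dict, List, NamedTuple, Optional


class ReconException(NamedTuple):
    """A transaction that failed to reconcile (hypothetical shape)."""
    txn_id: str
    kind: str  # "missing_at_processor", "missing_internally", "amount_mismatch"
    internal_amount: Optional[Decimal]
    processor_amount: Optional[Decimal]


def reconcile(
    internal: Dict[str, Decimal],
    processor: Dict[str, Decimal],
) -> List[ReconException]:
    """Compare two ledgers keyed by transaction ID and list the gaps.

    Transactions present on both sides with equal amounts are treated as
    matched and omitted from the result.
    """
    exceptions: List[ReconException] = []
    for txn_id in sorted(internal.keys() | processor.keys()):
        ours = internal.get(txn_id)
        theirs = processor.get(txn_id)
        if theirs is None:
            kind = "missing_at_processor"
        elif ours is None:
            kind = "missing_internally"
        elif ours != theirs:
            kind = "amount_mismatch"
        else:
            continue  # matched, nothing to report
        exceptions.append(ReconException(txn_id, kind, ours, theirs))
    return exceptions
```

Amounts are held as `Decimal` rather than floats so that equality checks on money are exact. Real reconciliation usually also has to handle currency, fees, and settlement-date differences, which this sketch leaves out.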

Around all of this, the data and analytics function had to exist. That meant hiring, shaping the engineering practices the team runs on, and setting the rituals that make a small group productive inside a fast-moving fintech.

Deliverables

What was shipped

Data strategy and architecture

Target architecture, platform direction, data model principles for finance, and a sequenced roadmap leadership signed off on.

Databricks platform migration

Migrated the data platform onto Databricks, cut ETL processing cost by roughly 50%, and built room to scale.

Real-time transaction ingestion

Re-engineered ingestion using Spark Structured Streaming and CDC, replacing batch with real-time visibility into transactions.

Financial data models

A proper financial backbone for reporting and compliance, with documented definitions and clear ownership.

Reconciliation pipelines

Pipelines that match internal records against partner and processor data, with exception handling that finance can run themselves.

Lakehouse and infrastructure as code

Layered lakehouse on Delta Lake, infrastructure managed in Terraform, and environments that engineers and reviewers can reason about.

BI for the whole business

Dashboards for finance, ops, product, and leadership, built around the questions each team actually asks, with consistent KPI definitions across the org.

Data and analytics function

Stood up the data and analytics function, hired and led the team, and set the engineering practices the group runs on today.

Outcome

A platform the business can lean on

The platform now runs at roughly half the ETL processing cost while delivering real-time visibility the business did not have before. Financial reporting now runs on documented models rather than ad-hoc queries. Reconciliation moved from spreadsheets at month-end to pipelines finance owns and triages day to day.

Across the business, the same KPIs appear with the same definitions in the same dashboards. The data and analytics function is in place and the engagement continues, now in a contractor capacity, focused on the highest-leverage problems.

Stack

Technologies

Databricks, Spark Structured Streaming, CDC, dbt, Python, SQL, Delta Lake, Terraform, AWS

Building or rebuilding a fintech data platform?

Happy to walk through what a sensible first phase looks like.