Case study

Streaming IoT for a food and beverage equipment supplier

A real-time pipeline turning machine messages from production lines into operational KPIs that the equipment supplier could share back with its own customers.

Real-time streaming

8-month engagement

Principal role

Azure platform

Context

From shipping machines to shipping insight

The client was a leading supplier of food and beverage processing technology, with machinery installed at customer sites around the world. They wanted to move beyond selling equipment and start offering their customers real-time insight into how the machines were running.

I joined as the principal engineer on the engagement, leading the architecture and technology decisions and building the first production version of the streaming pipeline.

Problem

Messages were arriving, insight was not

Machines on factory floors were already sending telemetry into Azure IoT Hub. The hard part was turning that flow into something useful quickly enough to matter, and doing it on a platform the client's own team could operate going forward.

The engagement also had a clear advisory dimension. Decisions taken now (lakehouse versus warehouse, batch versus streaming, which monitoring tools to standardise on) would shape the next several years of work. Getting the architecture call right mattered as much as the first pipeline.

Approach

Databricks lakehouse, streamed end to end

We built the pipeline on Databricks with Spark Structured Streaming and Delta Lake, ingesting from Azure IoT Hub and landing into a layered model that separated raw messages, normalised events, and operational KPIs. Streaming from end to end kept the latency budget simple and avoided the usual seam between batch and real-time.
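To make the three layers concrete, here is a minimal sketch in plain Python rather than Spark, so it runs anywhere. It is an illustration of the layering idea, not the production code: the message fields (`machine_id`, `ts`, `state`, `units`), the one-minute window, and the two KPIs are all assumptions made for the example.

```python
import json
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """Normalised event. All field names here are hypothetical."""
    machine_id: str
    ts: datetime
    running: bool
    units: int


def normalise(raw_body):
    """Raw layer -> normalised layer. Returns None if the message cannot be parsed."""
    try:
        msg = json.loads(raw_body)
        return Event(
            machine_id=str(msg["machine_id"]),
            ts=datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc),
            running=msg["state"] == "running",
            units=int(msg.get("units", 0)),
        )
    except (ValueError, KeyError, TypeError):
        return None


def kpis_per_minute(events):
    """Normalised layer -> KPI layer: one row per machine per one-minute window."""
    buckets = defaultdict(list)
    for e in events:
        window_start = e.ts.replace(second=0, microsecond=0)
        buckets[(e.machine_id, window_start)].append(e)
    rows = []
    for machine_id, window_start in sorted(buckets):
        group = buckets[(machine_id, window_start)]
        rows.append({
            "machine_id": machine_id,
            "window_start": window_start,
            # Share of messages in the window that reported a running state.
            "availability": sum(e.running for e in group) / len(group),
            "units": sum(e.units for e in group),
        })
    return rows


# Raw layer: message bodies kept exactly as received, including bad ones.
raw_messages = [
    '{"machine_id": "filler-01", "ts": 1700000000, "state": "running", "units": 12}',
    '{"machine_id": "filler-01", "ts": 1700000020, "state": "stopped", "units": 0}',
    '{"machine_id": "filler-01", "ts": 1700000070, "state": "running", "units": 15}',
    'not json',
]
events = [e for e in map(normalise, raw_messages) if e is not None]
kpi_rows = kpis_per_minute(events)
```

The point of the split is that the raw layer keeps everything, the normalised layer is where parsing failures are filtered out, and the KPI layer only ever sees typed events. In a streaming engine the same grouping would typically be expressed as a windowed aggregation rather than an in-memory dictionary.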

Infrastructure was defined in Terraform, deployments ran through Azure DevOps, and monitoring was split across Azure Application Insights and Grafana, so the operations team had the same view of the system as the engineers.
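One kind of check that suits a shared operations and engineering view is data freshness: which machines have gone quiet. The sketch below is an illustration only. The function name, the five-minute threshold, and the machine IDs are assumptions, and it does not reflect the alert rules actually configured on the engagement.

```python
from datetime import datetime, timedelta, timezone


def stale_machines(last_seen, now, max_silence=timedelta(minutes=5)):
    """Return the sorted IDs of machines whose latest event is older than max_silence."""
    return sorted(m for m, ts in last_seen.items() if now - ts > max_silence)


# Hypothetical snapshot: latest event time per machine.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "filler-01": now - timedelta(seconds=30),
    "filler-02": now - timedelta(minutes=12),
    "capper-07": now - timedelta(minutes=6),
}
quiet = stale_machines(last_seen, now)
```

A check like this is easy for an operations team to reason about because it is phrased in terms of machines, not pipeline internals.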

Alongside the build, the engagement included formal architecture and technology guidance. Tradeoffs were written down rather than left in people's heads, so the client could make confident decisions on their own once the engagement closed.

Deliverables

What was shipped

Real-time streaming pipeline

Spark Structured Streaming jobs on Databricks, ingesting from Azure IoT Hub and producing operational KPIs continuously.

Layered Delta Lake model

Raw, normalised, and curated layers in Delta Lake, with an explicit contract and a named owner for each layer.

CI/CD and infrastructure as code

Azure DevOps pipelines and Terraform modules covering the platform, so environments stayed reproducible and changes were reviewable.

Monitoring strategy

Azure Application Insights and Grafana dashboards covering both the pipeline and the underlying platform, with alerting tuned for the operations team.

Architecture and technology guidance

Written architecture decisions, technology selection rationale, and a roadmap the client could carry forward without me.

Handover-ready engagement

Documentation and runbooks designed for the client team to operate, extend, and reason about the platform on their own.
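As an illustration of what an explicit contract on a layer can look like, here is a minimal sketch in plain Python. How the contracts were actually expressed on the platform is not described here, so the field names, the `NORMALISED_CONTRACT` mapping, and the `contract_violations` helper are all hypothetical.

```python
from datetime import datetime

# Hypothetical contract for the normalised layer: field name -> required type.
NORMALISED_CONTRACT = {
    "machine_id": str,
    "ts": datetime,
    "running": bool,
    "units": int,
}


def contract_violations(row, contract):
    """Return human-readable violations; an empty list means the row conforms."""
    problems = []
    for field, expected in contract.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(row[field]).__name__}"
            )
    # Fields the contract does not know about are reported too, in sorted order.
    for field in sorted(row.keys() - contract.keys()):
        problems.append(f"unexpected field: {field}")
    return problems


good_row = {"machine_id": "filler-01", "ts": datetime(2024, 1, 1), "running": True, "units": 12}
bad_row = {"machine_id": "filler-01", "ts": "2024-01-01", "running": True}

good_result = contract_violations(good_row, NORMALISED_CONTRACT)
bad_result = contract_violations(bad_row, NORMALISED_CONTRACT)
```

Writing the contract down as data, rather than burying it in transformation code, is what lets a team that inherits the platform see what each layer promises.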

Outcome

A platform the client owns

The client moved from raw machine messages to real-time operational KPIs running on a Databricks lakehouse, with the architecture and tooling decisions documented and defensible.

More importantly, this was set up as the foundation for a long-running platform, not a one-off pipeline. The team that took it over had the documentation, infrastructure, and monitoring to extend it without inheriting a black box.

Stack

Technologies

Databricks, Spark Structured Streaming, Delta Lake, Python, Azure IoT Hub, Azure DevOps, Terraform, Azure Application Insights, Grafana

Streaming machine data and not sure where to start?

Happy to help you scope a sensible first pipeline and the platform underneath it.