Case study
Streaming IoT for a food and beverage equipment supplier
A real-time pipeline turning machine messages from production lines into operational KPIs that the equipment supplier could share back with its own customers.
Streaming: real-time
Engagement: 8 months
Role: Principal
Platform: Azure
Context
From shipping machines to shipping insight
The client was a leading supplier of food and beverage processing technology, with machinery installed at customer sites around the world. They wanted to move beyond selling equipment and start offering their customers real-time insight into how the machines were running.
I joined as the principal engineer on the engagement, leading the architecture and technology decisions and building the first production version of the streaming pipeline.
Problem
Messages were arriving, insight was not
Machines on factory floors were already sending telemetry into Azure IoT Hub. The hard part was turning that flow into something useful quickly enough to matter, and doing it on a platform the client's own team could operate going forward.
The engagement also had a clear advisory dimension. Decisions taken now (lakehouse versus warehouse, batch versus streaming, which monitoring tools to standardise on) would shape the next several years of work. Getting the architecture call right mattered as much as the first pipeline.
Approach
Databricks lakehouse, streamed end to end
We built the pipeline on Databricks with Spark Structured Streaming and Delta Lake, ingesting from Azure IoT Hub and landing in a layered model that separated raw messages, normalised events, and operational KPIs. Streaming end to end meant a single latency budget to reason about and avoided the usual seam between batch and real-time.
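To make the layering concrete, here is a minimal sketch of the three steps in plain Python rather than Spark. It is an illustration under stated assumptions, not the production code: the message fields (`machineId`, `ts`, `state`), the `Event` type, and the "share of running readings per minute" KPI are all hypothetical, since the client's real schema and KPIs are not described here.

```python
import json
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, Optional, Tuple

# Raw layer: messages kept exactly as received (hypothetical payload shape).
RAW_MESSAGES = [
    '{"machineId": "filler-01", "ts": 0, "state": "RUNNING"}',
    '{"machineId": "filler-01", "ts": 20, "state": "stopped"}',
    '{"machineId": "filler-01", "ts": 70, "state": "running"}',
    "not json",  # malformed input stays in raw but never reaches normalised
]


@dataclass(frozen=True)
class Event:
    """Normalised layer: typed, validated events (hypothetical fields)."""
    machine_id: str
    ts: datetime
    state: str


def normalise(raw: str) -> Optional[Event]:
    """Parse one raw message; return None if it does not match the expected shape."""
    try:
        body = json.loads(raw)
        return Event(
            machine_id=str(body["machineId"]),
            ts=datetime.fromtimestamp(body["ts"], tz=timezone.utc),
            state=str(body["state"]).lower(),
        )
    except (ValueError, KeyError, TypeError):
        return None


def running_ratio(
    events: Iterable[Event], window_seconds: int = 60
) -> Dict[Tuple[str, int], float]:
    """KPI layer: share of 'running' readings per machine per tumbling window.

    Keys are (machine_id, window start as epoch seconds).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [running, total]
    for event in events:
        window_start = int(event.ts.timestamp()) // window_seconds * window_seconds
        key = (event.machine_id, window_start)
        counts[key][1] += 1
        if event.state == "running":
            counts[key][0] += 1
    return {key: running / total for key, (running, total) in counts.items()}


if __name__ == "__main__":
    events = [e for e in map(normalise, RAW_MESSAGES) if e is not None]
    for key, ratio in sorted(running_ratio(events).items()):
        print(key, ratio)
```

In a Structured Streaming job, each function would correspond to a streaming query writing to its own Delta table, with the windowing handled by Spark's event-time windows instead of integer division.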
Infrastructure was defined in Terraform, deployments ran through Azure DevOps, and monitoring was built on Azure Application Insights and Grafana, so the operations team had the same view of the system as the engineers.
Alongside the build, the engagement included formal architecture and technology guidance. Tradeoffs were written down rather than left in people's heads, so the client could make confident decisions on their own once the engagement closed.
Deliverables
What was shipped
Real-time streaming pipeline
Spark Structured Streaming jobs on Databricks, ingesting from Azure IoT Hub and producing operational KPIs continuously.
Layered Delta Lake model
Raw, normalised, and curated layers in Delta Lake, with explicit contracts and ownership at each layer.
CI/CD and infrastructure as code
Azure DevOps pipelines and Terraform modules covering the platform, so environments stayed reproducible and changes were reviewable.
Monitoring strategy
Azure Application Insights and Grafana dashboards covering both the pipeline and the underlying platform, with alerting tuned for the operations team.
Architecture and technology guidance
Written architecture decisions, technology selection rationale, and a roadmap the client could carry forward without me.
Handover-ready engagement
Documentation and runbooks designed for the client team to operate, extend, and reason about the platform on their own.
Outcome
A platform the client owns
The client moved from raw machine messages to real-time operational KPIs running on a Databricks lakehouse, with the architecture and tooling decisions documented and defensible.
More importantly, this was set up as the foundation for a long-running platform, not a one-off pipeline. The team that took it over had the documentation, infrastructure, and monitoring to extend it without inheriting a black box.
Offered today as
If this sounds like your problem
Engagements like this one usually start with an architecture and technology decision and grow into a first production pipeline.
Data Platforms
Lakehouse foundations and infrastructure as code, set up so the client team can run it.
Data Pipelines
Real-time pipelines designed for the actual latency you need, with the monitoring to back them up.
Data Strategy
Architecture reviews and technology selection for teams making decisions they will live with for years.
Streaming machine data and not sure where to start?
Happy to help you scope a sensible first pipeline and the platform underneath it.