customer story
Achieving demand forecasting at scale

INDUSTRY: Retail and consumer goods

SOLUTION: Demand forecasting

PLATFORM USE CASE: Delta Lake, data science, machine learning, ETL

CLOUD: Azure, Google Cloud


As a multinational consumer goods manufacturing company serving millions of retail customers, Reckitt struggled with the complexity of forecasting demand, with large volumes of different types of data across many disjointed pipelines. Today, Azure Databricks provides Reckitt with a Unified Data Analytics Platform that enables its data teams to deliver ML-powered insights to the business, improving the support of neighborhood grocery stores through predictive analytics, product placement, and business forecasting.


Reckitt distributes their products to consumers across 60+ countries. One of their key market segments is traditional trade, or neighborhood grocery stores. This market is highly fragmented and consists of millions of small mom-and-pop stores, mostly in emerging markets in Asia, Africa, and South America. To serve this market, they have a team of over 16,000 reps who visit these stores to help store owners select the best products for the unique needs of their markets.

Data is one of the most critical assets they have to improve demand forecasting. However, Reckitt struggled with large volumes of different types of data spread across many disjointed pipelines, which made it difficult to extract insights quickly enough to help sellers on the street work efficiently and drive more business.

  • Over 2TB of data processed every day across 250+ data pipelines running 24×7.
  • Internal business teams (finance, sales, operations) struggled to access and process external data sets such as point-of-sale, ecommerce, Nielsen, and consumer analytics data.
  • The Hadoop infrastructure proved complex, cumbersome, and costly to scale. The legacy system struggled with performance and with onboarding new data sets. As a result, the DevOps team was kept busy monitoring and fixing issues, which made it difficult to deliver timely insights.


Azure Databricks provides Reckitt with a Unified Data Analytics Platform that has fostered a scalable and collaborative environment across data science and engineering, allowing data teams to more quickly innovate and deliver ML-powered insights to the business.

  • Fully managed platform with automated cluster management simplifies the infrastructure and operations at any scale.
  • Collaborative notebook environment with support for multiple languages (SQL, Scala, Python, R) enables a diverse team of users to work together in their preferred language.
  • Native support for Delta Lake allowed them to compress their data sets, greatly reducing their storage footprint and cost.


With Databricks, Reckitt has seen significant performance gains and cost management improvements, which have allowed them to scale their business and uncover new opportunities faster.

  • Improved cost optimization: Delta Lake let them compress their data from 80TB to about 2TB, which greatly reduced storage costs while also accelerating pipelines for downstream analytics.
  • Faster time-to-insight: Databricks has helped improve pipeline performance, accelerating their 24×7 jobs by nearly 2x (from 24 hours to 13 hours to run all of their pipelines). This has allowed them to greatly reduce DevOps costs while freeing these resources to focus on additional use cases.
  • Increased market share: With the support of Databricks, Reckitt has increased its capacity to support its customers by over 10x. Before Databricks, their maximum capacity was around 45,000 stores. With Databricks, they are quickly scaling to nearly 500,000 stores.
  • 10x
    Increased capacity to support business volume
  • 98%
    Data compression from 80TB to 2TB, reducing operational costs
  • 2x
    Faster data pipeline performance for 24×7 jobs
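
As a quick sanity check on the headline figures above, the arithmetic can be worked through in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the inputs are the rounded figures quoted in this story (80TB to about 2TB, 24 hours to 13 hours, around 45,000 to nearly 500,000 stores), not measured values.

```python
def compression_pct(before_tb: float, after_tb: float) -> float:
    """Percentage of storage saved when data shrinks from before_tb to after_tb."""
    return (1 - after_tb / before_tb) * 100


def speedup(before_hours: float, after_hours: float) -> float:
    """How many times faster the full pipeline run completes."""
    return before_hours / after_hours


def capacity_multiple(before_stores: int, after_stores: int) -> float:
    """How many times more stores can be supported."""
    return after_stores / before_stores


if __name__ == "__main__":
    # Inputs are the rounded figures from the story (assumptions, not measurements).
    print(f"Storage reduction: {compression_pct(80, 2):.1f}%")
    print(f"Pipeline speedup:  {speedup(24, 13):.2f}x")
    print(f"Store capacity:    {capacity_multiple(45_000, 500_000):.1f}x")
```

With these rounded inputs the results come out to about 97.5% storage reduction, about 1.85x faster pipelines, and about 11.1x store capacity, which is consistent with the rounded 98%, 2x, and 10x callouts.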

“Databricks is the key enabler for us to experiment fast and then scale quickly. That’s how the platform is adding value to the business and helping us grow.”

– Atif Ahmed, Director of Advanced Analytics, Reckitt

Related Content

Technical Talk at Spark + AI Summit EU 2019