Disrupting Brick & Mortar Retail: Top 10 Data Science use cases – A research paper.

Retail Therapy: Designing a modern mall

First, to set the right expectation for the reader: this is not a blog! This is one of the ‘High Level Data Science Research & Use Case’ papers that I recently submitted as part of my ongoing data science postgraduate program (Aegis), hence the size & depth. It will be worth your time only if you really like data science, the retail industry, or both!

The traditional retail industry across the world, with its incredible range of offerings, experiences & customer reach, has fascinated me for more than three decades (since my childhood days). Especially in India (I was born in Kolkata and live in Mumbai), the range is mind-boggling (the highest retail density in the world): from the vegetable retailer across the road and multiple retail shops under every residential building to luxury single-brand retailers and gigantic shopping malls, we experience it all, almost every week! A fact we cannot ignore is that, like education, healthcare and utilities (water, gas, electricity, mobile), retail also falls into the recession-agnostic sector if we consider only the non-discretionary part of the spend (grocery & regular needs).

A brief context of the global retail industry before we get started on the big data, data science & machine learning discussion.

  • Revenue is expected to go up from $22.6 trillion (2015) to $29 trillion by 2018 (including e-commerce & unorganized retailers).
  • Only about 9% of it will be e-commerce by 2018; the rest is still brick & mortar!
  • This industry is about 31% of the world’s GDP, growing at 3.8% YoY.
  • Organized physical retail is a sunrise sector for many economies in the world.
  • Despite the growth of online retail, the therapeutic & entertainment value of ‘physical shopping’ will help sustain this industry for a long time!

Top 10 world retailers:

  • Walmart Stores Inc
  • Costco Wholesale Corporation
  • Kroger
  • Walgreens Boots Alliance
  • Tesco PLC
  • Carrefour SA
  • Metro Group AG
  • The Home Depot
  • Target Corporation
  • Aldi & Lidl (Germany)

I am going to list the top 10 data science use cases below, in no special order. I have tried to follow a problem-solution approach for each use case. Many of the ideas below are either already implemented by the giants listed above or are still completely fresh ideas waiting to be tapped globally. I am sure it will be a useful and engaging read for all readers.

Use Case 1 : Smart inventory, Forecasting & Supply chain management

Challenges :

  • Tesco, Walmart and the like carry approx. 90k products across more than 15,000 categories, with 260 million purchases per week, 12,000 stores and 28 countries.
  • Each category involves multiple vendors, life spans, demand cycles, warehouse vs made-to-order, etc.: a plethora of factors. Imagine the complexity, velocity, veracity & variety of the dataset.

Opportunities/Use cases :

  • End-to-end digital supply chain based on product RFID, auto-replenishment algorithms, and interconnected ERPs between retailers, vendors and manufacturers.
  • Technologies like R, Hadoop, Spark and MapReduce are being used @Walmart.
  • The Holt-Winters double exponential smoothing forecasting algorithm is leveraged for Walmart sales forecasting. There are truckloads of models/frameworks that could crack this one.
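To make the forecasting idea concrete, here is a minimal sketch of Holt's double exponential smoothing, which tracks a level and a trend (the full Holt-Winters method adds a third, seasonal component on top of this). It is only an illustration: the function name `holt_forecast`, the smoothing factors and the weekly sales numbers are all my assumptions, not Walmart's actual setup.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear (double) exponential smoothing.

    series  : past observations, oldest first (at least two values)
    alpha   : smoothing factor for the level, between 0 and 1
    beta    : smoothing factor for the trend, between 0 and 1
    horizon : number of future periods to forecast
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")

    # Simple initialisation: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]

    for y in series[1:]:
        prev_level = level
        # New level blends the observation with the previous projection.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # New trend blends the latest level change with the old trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend

    # Forecast h steps ahead by extending the last level along the trend.
    return [level + h * trend for h in range(1, horizon + 1)]


# Hypothetical weekly unit sales for one product (made-up numbers).
weekly_units = [120, 130, 140, 150, 160]
forecast = holt_forecast(weekly_units, alpha=0.5, beta=0.5, horizon=3)
print(forecast)
```

On this perfectly linear toy series the forecast simply continues the line. Real sales data is noisy and seasonal, so in practice alpha and beta would be tuned on history and a seasonal term added.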


Use Case 2 : In-store product storage & Hygiene

Challenges :

 “Half the world’s food produce does not make it to plates: up to 2 Bn tonnes are wasted every year, which could potentially feed an additional 3 Bn people! Solving this problem could actually solve the ‘World Hunger’ problem.” (2013).

Opportunities/ Use cases :

Imagine Tesco dealing with 90k products and 15,000 categories of products in a single store through IoT, where:

  • a product item automatically connects to the refrigerator to enhance cooling,
  • smart packaging alerts store managers about the condition of the product,
  • or asks them to move it onto the high-discount shelf,
  • alerts/beeps go off when a best-before date is about to expire,
  • odor tracking… etc. The opportunities are endless.
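The best-before alert and discount-shelf ideas above boil down to a simple rule on the number of days left. Below is a small sketch of such a rule; the function `shelf_action`, the thresholds (`alert_days`, `discount_days`) and the stock list are hypothetical, chosen only to illustrate the logic a smart shelf or package could run.

```python
from datetime import date


def shelf_action(best_before, today, alert_days=2, discount_days=5):
    """Return the action a smart shelf might trigger for one item.

    alert_days and discount_days are hypothetical thresholds.
    """
    days_left = (best_before - today).days
    if days_left < 0:
        return "remove"      # already past its best-before date
    if days_left <= alert_days:
        return "alert"       # beep / notify the store manager
    if days_left <= discount_days:
        return "discount"    # move to the high-discount shelf
    return "ok"


# Fixed date so the example is reproducible; items and dates are made up.
today = date(2024, 1, 10)
stock = {
    "milk": date(2024, 1, 11),
    "curd": date(2024, 1, 14),
    "rice": date(2024, 6, 1),
    "bread": date(2024, 1, 9),
}
actions = {item: shelf_action(best_before, today) for item, best_before in stock.items()}
print(actions)
```

In a real deployment the dates would come from RFID tags or smart packaging rather than a hard-coded dictionary, and the thresholds would differ per product category.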

Use Case 3 : Monetizing Customer Life Style & Life Cycle

“The second largest discount store in the US, TARGET Corporation, predicted a high school girl’s pregnancy before her father did! (2012).”

“My daughter got this in the mail! She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?” the father screamed at the store manager.

The manager didn’t have any idea what the man was talking about. He looked at the mailer. Sure enough, it was addressed to the man’s daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling infants. The manager apologized, treating it as a rare mistake, and called the father a few days later to apologize again. On the phone, though, the father was somewhat abashed.

“I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology!”

How did they do it? Through machine learning, the store knows the month-by-month buying patterns of pregnant women, from calcium tablets to hand sanitizers to ice creams to toys. Using data, the store actually predicted the lady’s delivery date and pushed her promotional offers on products that could be of interest to her. This is a classic case of the power of data science in retail.

Now imagine the same for every facet, profession and interest in our lives: students, parenting, the elderly, bachelors, kids, sports, birthdays & dates, reading, golf, professionals, cooking, etc. Don’t stop your imagination; yes, these are great ideas for start-ups!

Use Case 4 : Product Life Cycle Management

Challenges :

  • In the US, more than 1,000 foodborne outbreaks investigated by state and local health departments are reported each year. There are many legal cases open against retailers across the globe related to the quality of food items sold.
  • In the US, roughly 48 million people are afflicted annually, with 128,000 hospitalized and 3,000 dying. In India it is 10X.
  • Imagine the impact on the brand (Walmart etc.) when a customer blames a food item as the cause of his bad health/suffering.

Opportunities/Use case:

  • Blockchain is a shared digital ledger, or a continually updated list of all transactions, in which transactions made in bitcoin or another cryptocurrency are recorded chronologically and publicly. This decentralized ledger keeps a record of each transaction that occurs across a fully distributed or peer-to-peer network, either public or private.
  • Walmart is a pioneer in applying blockchain technology (with IBM) to food safety, outside the financial services world. This is already considered to be the largest single application of blockchain to date.
  • Customer complaint cycle time came down from days to minutes, with minimum data loss & maximum accuracy.
  • Below is a simplified demonstration of the life cycle of a food item sold in a retail store (EOL means ‘end of life’).
  • Imagine the number of products sold, their varied life cycles and the number of parties involved: it’s a whole new world of data storage and networks. Blockchain is on its way to transform this world.


A little more on this video.

Use Case 5 : A Walk or Drive in the park ?

Globally, the most significant sources of dissatisfaction for customers in hypermarkets or supermarkets are:

  1. Payment Queue
  2. Where is my product?
  3. Promise(price) breakers at the checkout point
  4. Returns/refunds (painful/long process)
  5. Waiting for serviceman.

Opportunities/Use case:

Amazon Go @ 2131, 7th Avenue, Seattle, Washington, check this incredible video!

Machine learning, deep learning, smart sensors and access to high-speed data make this ‘walk in the park’ shopping experience possible! Incredible, isn’t it?

Their next plan is to create a retail store with driveways, where customers will be able to shop while in the car!

Did you see? Three of the five sources of customer dissatisfaction are addressed by this one idea alone. Imagine our world with shops like these around the globe.

Use Case 6 : Ploughing the in-store real estate



This use case is applicable only to the FMCG segment of retail. Since 1970, large retailers across the globe have built a significant revenue stream from fees received from their suppliers. There are two types: slotting fees, charged for giving a brand/product prominent shelf space, and promotion fees, charged for running in-store campaigns on specific products. For example, the top four British retailers collectively receive more in fees from their suppliers than they make in operating profit; that’s the scale of money at stake here! Manufacturers and suppliers are literally at war! Slotting fees have become so lucrative that industry insiders joke that supermarket shelves are now the world’s most expensive property.

Opportunities/Use case :

Size of shelf space, height, location, distance from entry/exit/checkouts, customer lines of sight, time/season of the year, day of the week, banner location/size: every little thing is a variable to consider when it comes to maximizing sales through machine learning. Past data is available for millions of baskets. Imagine a supplier’s or retailer’s ability to predict sales with high accuracy; that alone can be a cash-making jackpot if you have built the right algorithm for it!

Use Case 7 : Predictive stocking.



So many things change daily, monthly and yearly for every store as far as the environment is concerned. Events happen in the real world and customers’ priorities change; it’s a complex combination of factors. How would a retail store know, before a week starts, what customers are going to look for?

Opportunities/Use case :

  • Walmart predicted the pre-hurricane buying pattern, on time & at the right scale, and made millions! This news came out many years ago; the image on the right is what they sold in truckloads. What is it?
  • It’s strawberry Pop-Tarts: demand goes up 7 times ahead of a hurricane in the US. The reason is that they are high in nutrition & calorie value, do not need refrigeration, can be stored for weeks, etc. Could you have solved this problem without data science?
  • All known events and festivals on the calendar for each region could be analysed using past trend data.
  • Petabytes of data are available, all the way down to the individual customer (family) basket level. The day before an English Premier League derby, Tesco surely knows what customers want.
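The “7 times” figure above is an event uplift: how much a product sells in event weeks compared with normal weeks. A rough sketch of how such an uplift could be estimated from past data and turned into a stocking suggestion follows. The function names, the sales numbers and the `safety_factor` buffer are all hypothetical; this is not Walmart’s actual method.

```python
from statistics import mean


def demand_uplift(event_week_sales, normal_week_sales):
    """Ratio of average event-week sales to average normal-week sales."""
    return mean(event_week_sales) / mean(normal_week_sales)


def suggested_stock(normal_week_sales, uplift, safety_factor=1.1):
    """Units to stock ahead of the next event.

    safety_factor is an assumed buffer on top of the expected demand.
    """
    return round(mean(normal_week_sales) * uplift * safety_factor)


# Hypothetical weekly unit sales of one product in one store.
normal_weeks = [100, 110, 90]
hurricane_weeks = [700, 720, 680]

uplift = demand_uplift(hurricane_weeks, normal_weeks)
order = suggested_stock(normal_weeks, uplift)
print(uplift, order)
```

The same calculation could be repeated per product, per store and per event type (festival, derby, storm warning) to build a calendar of expected uplifts.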

Use Case 8 : Multi channel conversion

TESCO is a classic success story on this subject that is out in the public news: how they managed to predict & lift in-store purchases.

  • TESCO: $48B revenue, 11 countries, market leader in 5 (UK, Malaysia, Ireland, Hungary, Thailand), 7,000 stores, 78m shopping trips per week, 90k products.
  • Deployed Hadoop & Spark technology many years ago.
  • The Tesco Clubcard allowed them to monitor each client’s buying patterns.
  • Digital to physical channel.
  • Identified the statistical significance of days between browse and purchase (14 days).
  • Conversion rate & revenue generated across product lines.
  • Relationship between number of browses and purchases for every household.
  • Targeted promotions when the probability is highest (days to purchase/number of browses).
  • Targeted selling through CCTVs at Tesco petrol stations.
  • Self-scanning technology was an industry first.



Tesco Multi Channel Big Data Analytics & Marketing Strategy, check this video.



Use Case 9 : Customer Experience Management


This is my favorite, as a Business Excellence professional. Let’s dig into the customer experience space for the retail business. From Use Case 5, we know the top 5 areas of customer dissatisfaction:

  • Payment Queue
  • Where is my product?
  • Promise(price) breakers
  • Returns/refunds
  • Waiting for serviceman

Opportunities/Use Case:

We have already discussed some of the solutions in this paper. The potential is endless:

  • High-value customer walk-in alerts/personal in-store escorts (IoT). This will provide an exclusive customer experience for the special ones only.
  • In-store GPS with product locations on a mobile app with a preloaded shopping list. Customers can pre-load their shopping list before they walk into the store, and a GPS map welcomes them, pointing to the exact location of the required items on the smartphone app and saving them tens of minutes.
  • Return without walking in: imagine a world where you don’t even walk in. You just place your return item at an unmanned return counter outside, the packaging knows the condition of the item inside (or a scanner scans the product), you are free to leave if the condition is found OK, and you get immediate credit back to your account.
  • Tiny self-scanning machines (one for each customer) can take care of promise breakers on price, expiry, etc.


Use Case 10 : In-store Market Basket Analysis (MBA)

MBA is a significant source of revenue for retailers across the globe today. Each checkout transaction by each customer every day is a ‘basket’ of goods; imagine, there are 260 million weekly purchases (baskets) @Walmart! A basket dataset looks something like the table below. Each row is an assortment of items as picked up by a customer.


Essentially there are three variables for each combination of two products. Let us take the example of Curd (Y) and Cereals (X).

  • Support (Y) = the probability of a Curd purchase among all baskets, say in a day.
  • Confidence (X → Y) = the probability that a customer will buy Curd (Y) given that he has already picked up Cereals (X).
  • Lift (X → Y) = the ratio of Confidence (X → Y) to Support (Y). A value well above 1 means X and Y sell together far more often than chance would suggest.
  • The Apriori algorithm is used to prioritize the strongest recommendations from thousands of combinations in market baskets.
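The three measures above can be computed directly from a list of baskets. Here is a minimal sketch in plain Python; the five baskets are made-up data and the helper names (`support`, `confidence`, `lift`) are just for illustration. A real pipeline would run an Apriori implementation over millions of baskets instead of checking one pair by hand.

```python
def support(baskets, items):
    """Share of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets, x, y):
    """Probability of y being in a basket, given that x is in it."""
    return support(baskets, {x, y}) / support(baskets, {x})


def lift(baskets, x, y):
    """Confidence(x -> y) divided by Support(y)."""
    return confidence(baskets, x, y) / support(baskets, {y})


# Hypothetical baskets; each set is one checkout transaction.
baskets = [
    {"cereals", "curd", "milk"},
    {"cereals", "curd"},
    {"cereals", "bread"},
    {"curd", "rice"},
    {"bread", "rice"},
]

print("Support(curd)            =", support(baskets, {"curd"}))
print("Confidence(cereals->curd) =", confidence(baskets, "cereals", "curd"))
print("Lift(cereals->curd)       =", lift(baskets, "cereals", "curd"))
```

In this toy data, curd appears in 3 of 5 baskets, and in 2 of the 3 baskets that contain cereals, so the lift is slightly above 1: a mild positive association.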

Opportunities/Use cases :

  • The items with the highest Support probability are kept deep inside the store, and vice versa.
  • High lift ratio items are always kept away from each other. This forces the customer to walk across to the other item, resulting in further impulse purchases.
  • Combo offers are rolled out (as frequently as daily and as specific as for an individual customer) for low lift ratio combinations of items, as there is already a very high chance that a high lift combination will be picked up by the same customer with or without an offer.
  • High support individual items (daily FMCG only) run discounts/offers on low volumes only (2 kg). This discourages customers from picking up a high volume (10 kg of rice) at one time and brings them back to the store soon, or even the next day (70% of shopping in a retail store is impulsive!).


Ah… that’s it… thanks for reading and being as passionate about data science as I am! Please share your feedback/comments; they will inspire me to keep writing and sharing.



