Retailers today must keep up with an ever-changing environment and rising customer expectations to stay competitive. Businesses face multiple challenges, including volatile sales and fierce competition. One way to meet these challenges is through retail analytics.
This is why data is essential for any retail business. Numbers like average transaction value or online visitors should be monitored to gauge performance effectively. Keep in mind, however, that numbers alone don't tell the story of a successful retail operation. To gain meaningful insights that directly affect financial results, businesses turn to retail analytics.
Retail analytics provides insights on sales, inventory, pricing, trends, procurement, and many other variables that directly impact decision-making. One of the most important is retail demand forecasting.
Retailers need to know what they will sell, and in what quantities, not just tomorrow but the day after, next week, and in some cases even next year. Such a forecast has a significant impact on a retail business's health: the more accurate the forecast, the better the results.
Demand forecasting is sophisticated predictive analytics tailor-made for retail businesses. Instead of predicting future sales from historical statistics alone, demand forecasting in retail relies on detailed data from various sources to calculate the demand for each product at each store within a specific time frame. Statistical models in MS Excel might be good, but they can't go beyond statistics.
Retail analytics, and demand forecasting in particular, is a complex discipline of applied mathematical modeling that solves simple yet crucial business problems: what, when, and in which quantities a buyer should stock. For a prominent retailer, the sheer number of SKUs turns this into a hefty problem.
This video from the Centre for Marketing Analytics and Forecasting at Lancaster University, UK, hosting Stephan Kolassa from SAP, gives in-depth information on how retail demand forecasting works, what you can expect from it, and how it copes with disruptive events like earthquakes and pandemics.
Stephan Kolassa's presentation is full of interesting use cases, but the Q&A session is even more interesting, since it addresses some popular misconceptions. We have written down Stephan Kolassa's answers, and we add another layer of insights from Alexander Efremov, PhD, associate professor at the Technical University of Sofia and chief scientist at A4Everyone.
How is cannibalization between products monitored?
Stephan Kolassa: Cannibalization between products is always extremely hard to monitor. If you have one product on promotion, it may have an impact on other products, but the cannibalization effect of the uplift on that one product gets distributed across multiple cannibalized products, so the signal gets weaker. It's the same for halo and complementary effects. I think there is no fully automated way of doing this, and it always comes down to a lot of manual work. If you just put everything into machine learning methods and ask them to find the cannibalization effect, you'll get some strange results. I did exactly that in an experiment and found that when spark plugs are on promotion, sales of honey go down. What actually happens is that if one brand of spark plugs goes on promotion, other brands of spark plugs go down, so you need to include the product hierarchy and the categories involved, which means you need very good data. I can't say that we have really solved this problem.
Alexander Efremov: I fully agree that many wrong conclusions can be drawn by relying solely on ML, and here business logic and the product hierarchy come to help. Unfortunately, our experience shows that many clients' data has no informative structure, and the product categories are not relevant to the problem of cannibalization. This makes it difficult to automate the modeling of the interactions between products. Even in this case, automating model development while accounting for cross-product relations is still feasible. One solution is to create an artificial product hierarchy appropriate for the problem at hand. A large part of this one-off task can be automated by combining the accumulated business logic with the available data structure and using Natural Language Processing and/or other approaches to identify an appropriate product hierarchy. Even partial automation is absolutely necessary when talking about tens or hundreds of thousands of SKUs. But once this data structure is derived, automated daily model development that accounts for the interactions between products becomes achievable.
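As a toy illustration of why cross-effects should be restricted to the product hierarchy, here is a minimal sketch with made-up data: product B's sales drop whenever product A, in the same category, is on promotion, and the cannibalization coefficient is estimated as a difference of conditional means (equivalent to least squares with a single binary regressor). All names and numbers are hypothetical.

```python
from statistics import mean

# Hypothetical synthetic data: product B sells a baseline of 10 units,
# dropping by 3 whenever same-category product A is on promotion.
promo_a = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1]
sales_b = [10 - 3 * p for p in promo_a]

# Restricting attention to promo flags inside B's own category avoids
# spurious "spark plugs vs. honey" effects. With one binary regressor,
# the least-squares promo coefficient is a difference of conditional means.
on_promo = mean(s for s, p in zip(sales_b, promo_a) if p == 1)
off_promo = mean(s for s, p in zip(sales_b, promo_a) if p == 0)
cannibalization = on_promo - off_promo
print(cannibalization)  # -3: A's promotion costs B about 3 units per period
```

With real data the effect is of course noisy and spread across many products, which is exactly why the signal weakens as Mr. Kolassa describes.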
Weather is a demand-influencing factor. It seems to do a very good job in intraday forecasting, but it doesn't do as well in the long term. So the question is whether weather can be relevant for long-term forecasting.
Stephan Kolassa: Everyone wants to include weather in their forecast, but there are multiple problems. One of them arises when you plan a promotion in the future: we can only forecast weather about 10 days ahead. So the question is whether our noisy weather forecast will really improve our demand forecasts, given that the link between weather and demand is not straightforward. If we are writing supermarket orders this evening that will arrive the day after tomorrow, we need a forecast at least two days out, or even more, and the weather forecast can no longer influence tomorrow's stock position in the store. Weather is helpful if you are a butcher and you see nice weather ahead, so you stock up on excellent barbecue beef.
Alexander Efremov: Here the question is what problem is to be solved. For short-term forecasting, weather data brings beneficial information about some products, and of course the weather should be taken into account. For problems requiring long-term forecasting, it is better to use the seasonal periodicity in the sales than the weather data.
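One common way to encode such seasonal periodicity for long horizons is with Fourier terms on the day of the year. A minimal sketch follows; the harmonic count and period are illustrative choices, not prescriptions from the talk.

```python
import math

def seasonal_features(day_of_year, n_harmonics=2, period=365.25):
    """Fourier features encoding annual seasonality for one observation."""
    feats = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * day_of_year / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats

# Two harmonics -> four smooth features per day. Unlike a weather forecast,
# these are known exactly for any future date, however far out.
print(seasonal_features(1))
```

These features feed a regression model the same way a weather variable would, but without the 10-day horizon limit.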
When talking about the weather, it is also essential what kind of market system is being investigated: retail store, restaurant, coffee shop, or pastry shop. For restaurants, for example, weather variables would be significant factors in the demand models. If a restaurant has a garden, nice weather will increase the restaurant's capacity and make it more attractive. On the other hand, a rainy day would increase demand for restaurants without gardens, as many people would decide to eat inside.
How similar is the impact of causal factors like weather across different store locations?
Stephan Kolassa: I had an interesting discussion with a customer who works for a chain running both large stores, called megastores, and smaller supermarkets. They create a promotion, and it is distributed locally across stores. Supermarkets don't have the end cap space that megastores do, so there is a limit to the number of promoted products that can stand out on the aisle. This is why exactly the same promotion performs differently depending on store size. It is similar if you have one supermarket next to a school and another on the town outskirts: school-related promotions will perform well at the first location and not so much at the other. So, different locations sometimes mean different promotion impacts.
Alexander Efremov: Yes, we definitely should take the location into account. I may add another example of a pastry shop near an office building versus a shop in a residential area. The first shop's sales would come mostly on working days, while the second shop would see higher demand during the weekend. Also, if the pastry shop is located in a mall, bad weather may increase sales, but if the shop is a local one, the customer flow will change differently.
Do you model different SKU groups differently?
Stephan Kolassa: We try to put all of our eggs in the same basket and then make sure it's a really good basket. Essentially, we build one really good model for all of the SKUs and then focus on making that model as good as possible. We treat perishables slightly differently from non-perishable products with a long shelf life. Fruits, vegetables, flowers, dairy, and newspapers have a short shelf life and strong seasonality. They also have much smaller promotional impacts because customers can't stockpile them: you can stockpile olive oil, but you can't stockpile strawberries, so the uplift won't be as strong. Another situation arises with DIY-type stores. If you shop for light switches, you will buy 20 or 30 of exactly the same switch. This is why you see a sudden sales peak of 30 units and then zero, and that is really challenging to forecast. Our approach might estimate that 5 units per week will be sold, which is why it is OK for a retailer to stock some quantities of different light switches.
Alexander Efremov: Usually, at the store level, there are many control variables that retailers use to affect the market: price, discount, promotion type, display type, etc. There are also tens or even hundreds of thousands of output variables, which are the products' sales. In addition, there are tens of variables that are measurable but act as disturbances, like the weather and social factors, and they affect each product's sales differently. So, keeping in mind the colossal input and output dimensions and the complexity of the market system, we need to decompose the entire model into submodels. Otherwise, especially when we automate the modeling process, the final model would incorporate many wrong interconnections and miss other important relations. As Mr. Kolassa mentioned, market behavior is entirely different for fast- and slow-moving items, and this is another reason to model different SKU groups differently.
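A widely used way to separate such SKU groups before modeling is the Syntetos-Boylan classification, which splits demand histories into smooth, erratic, intermittent, and lumpy classes using the average inter-demand interval (ADI) and the squared coefficient of variation of the nonzero demand sizes. A minimal sketch, assuming demand arrives as a plain list of per-period quantities:

```python
from statistics import mean, pstdev

def classify_demand(series, adi_cut=1.32, cv2_cut=0.49):
    """Syntetos-Boylan style classification of a demand history."""
    nonzero = [x for x in series if x > 0]
    if not nonzero:
        return "dead"
    adi = len(series) / len(nonzero)               # average inter-demand interval
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2   # squared coeff. of variation
    if adi < adi_cut:
        return "smooth" if cv2 < cv2_cut else "erratic"
    return "intermittent" if cv2 < cv2_cut else "lumpy"

print(classify_demand([0, 0, 4, 0, 0, 4, 0, 0, 4]))  # intermittent
```

The 1.32 and 0.49 cutoffs are the standard values from that literature; each class then gets a modeling approach suited to its behavior.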
What do you prefer as a metric for evaluating intermittent and lumpy time series?
Stephan Kolassa: We have a regression-based approach where you can feed in all the causals as predictors. We use this regression for all kinds of products; it works well enough for fast-moving and slow-moving products, and not quite so well for lumpy demand, so that is a challenge. There is always the problem that if you evaluate with the mean absolute error and forecast an intermittent product with more than half zeros, then the best forecast is a flat zero, which makes it very strongly biased.
Alexander Efremov: Definitely, the data mining task solved when modeling demand is regression. There are different model accuracy measures that can be used, depending on the model, and Mr. Kolassa covered this topic quite well in the video. In my experience, the main question is the goal of the forecasting, and from the answer to that question we can select appropriate measures for evaluating the impact of the model. For instance, if we need to forecast demand in order to optimize the next orders, then the best model could be the one that leads to the best long-term ordering strategy, and here the KPIs relate to the associated waste and stockouts, availability, turnover, monetary losses, etc. In this case, the KPIs strongly depend on the retailer's goal, and model accuracy measures like MSE, MAE, etc. are not the relevant ones.
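The flat-zero pitfall Mr. Kolassa mentions is easy to reproduce numerically; a minimal sketch with a made-up intermittent series:

```python
from statistics import mean

# Made-up intermittent series: more than half the periods are zero.
actuals = [0, 0, 3, 0, 0, 5, 0, 0, 0, 2]

def mae(forecast):
    """Mean absolute error of a single flat forecast against actuals."""
    return mean(abs(a - forecast) for a in actuals)

mae_zero = mae(0.0)            # useless but MAE-optimal flat-zero forecast
mae_mean = mae(mean(actuals))  # unbiased mean-rate forecast (1.0 per period)

print(mae_zero, mae_mean)  # 1.0 1.4 -- MAE rewards the flat-zero forecast
```

The zero forecast "wins" on MAE while systematically under-ordering every period, which is exactly why inventory-level KPIs, rather than raw accuracy measures, are the relevant yardstick here.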
How about focusing on lead time effects?
Stephan Kolassa: Things like trends should be in the forecast if you plan orders ahead of time. If you forecast for store replenishment, your forecast needs to go out about two weeks. If you are in a distribution center, it should go out a couple of months. Some SAP clients calculate long-term forecasts and start sharing Christmas demand forecasts with their suppliers in the summer.
Alexander Efremov: This question shows again how important it is to account for the specific forecasting goal. When talking about optimal ordering, the lead time can be completely different. For example, for perishables the period between deliveries is one or a few days, but for other products, e.g., from the clothing industry, the period is one year. Here, the measurement of model accuracy is again reduced to particular KPIs in terms of the final goal of the entire decision system; in other words, how much a specific model improves the final ordering decisions in terms of the retailer's KPIs.
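As an illustration of how lead time shapes the forecast horizon in ordering, here is a sketch of a simple order-up-to rule; the policy and all numbers are a textbook-style assumption, not the specific logic described by either speaker:

```python
def order_quantity(daily_forecast, lead_time_days, review_days,
                   safety_stock, on_hand, on_order):
    """Order-up-to sketch: cover demand until the next order can arrive."""
    # The order placed today must bridge the lead time plus one review
    # period, so the forecast must reach at least that far out.
    horizon = lead_time_days + review_days
    required = sum(daily_forecast[:horizon]) + safety_stock
    return max(0, round(required - on_hand - on_order))

# Hypothetical store: 2-day lead time, daily review, 4 units safety stock.
print(order_quantity([5, 5, 6, 7], 2, 1, 4, 10, 0))  # 5+5+6+4-10 = 10
```

A distribution center with a months-long lead time would need `daily_forecast` to extend correspondingly further, which is why trends matter there and barely matter for next-day store replenishment.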
Why do you even try to model an earthquake effect if such events are rare and unpredictable?
Stephan Kolassa: We do this to take out the earthquake effect so we don't mistakenly assume it is seasonal and project it into the next period. It's the same with COVID-19. Hurricanes, on the other hand, can actually be forecast well enough to project their impact on consumer demand, especially if you are a retailer in Florida; that means stocking up on dry ice and wooden boards. So it really depends on whether you want to forecast or to cleanse data.
Alexander Efremov: It is always worth investigating events that have a significant effect on market behavior. Sometimes it is possible to predict them. For instance, there are special days like Easter, which falls on a different date every year; here we don't even need a forecast, just the official calendar for the upcoming year in the particular country. On the other hand, if we cannot predict these events, as Mr. Kolassa mentioned, the best thing we can do is clean the data before modeling. To cope with such events and, more generally, to account for time-varying market behavior, it is very important for the models to always be up to date (e.g., updated on a daily basis) and to "forget" irrelevant old observations in an appropriate way. Here the automation of model development is the only solution. This makes our models and the entire decision system flexible enough to meet the market's current changes.
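Easter is indeed computable rather than forecastable; a sketch using the anonymous Gregorian computus (the Meeus/Jones/Butcher algorithm), which gives the Western Easter date for any year:

```python
from datetime import date

def easter_sunday(year: int) -> date:
    """Gregorian Easter via the anonymous (Meeus/Jones/Butcher) computus."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)

print(easter_sunday(2021))  # 2021-04-04
```

A calendar of such movable holidays, generated years ahead, is exactly the kind of deterministic event data that can be fed into demand models instead of being treated as noise.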