Retail Offerings — Using MBA (Market Basket Analysis)
Has it ever happened to you that you went to buy a Pepsi from a retail mart and ended up buying a Pepsi-Lays combo because it seemed a better deal? Or that, while purchasing from Amazon, you bought an additional item from the “Frequently bought together” section?
Have you ever wondered how these associative offerings are created? They are data-driven, and today we are going to learn about a technique that helps us build these offerings: Market Basket Analysis.
The fundamental idea of MBA is to analyze past transactions and look for product combinations that are often purchased together. MBA expresses this idea mathematically, and the algorithm commonly used to carry it out is called Apriori.
Let’s understand this concept with an example.
In our dummy store, we only sell 4 products —
Now, let’s look at the transaction table of all the purchases —
To understand the algorithm, we need to understand 3 main concepts:
- Support
- Confidence
- Lift
Before going into detail, remember that these 3 concepts pertain to the combinations that we are evaluating. You may ask how we select these combinations. The simple answer is that we check all the possible combinations: all pairs, triplets, and so on. The concepts we are about to learn help us filter these down to the final associations that are strong enough to act on. Let’s dive in.
Combination = Antecedent -> Consequent, for example Coke -> Lays
Read as: the probability of buying Lays if Coke is purchased
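To make "check all the possible combinations" concrete, here is a minimal sketch that enumerates candidate itemsets and ordered antecedent -> consequent pairs. Only Coke and Lays are named in this article, so `item_3` and `item_4` are hypothetical placeholders for the other two products, and `candidate_itemsets` is an illustrative helper name.

```python
from itertools import combinations, permutations

# Only "coke" and "lays" come from the article; "item_3" and "item_4"
# are hypothetical placeholders for the other two products in the store.
products = ["coke", "lays", "item_3", "item_4"]

def candidate_itemsets(products, max_size):
    """Yield every unordered combination of 2..max_size products."""
    for size in range(2, max_size + 1):
        yield from combinations(products, size)

itemsets = list(candidate_itemsets(products, max_size=3))
print(len(itemsets))  # 10 (6 pairs + 4 triplets)

# A rule has a direction, so coke -> lays and lays -> coke are
# two different candidates.
rules = list(permutations(products, 2))
print(len(rules))  # 12
print(rules[0])    # ('coke', 'lays')
```

This is the brute-force version. Apriori avoids enumerating everything by skipping any larger combination that contains a combination already found to be too rare.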
Support
This simply refers to the share of all transactions in which the combination is present.
Support = freq(A,B)/Total Transactions
In this case,
Support = #Transactions in which both Coke and Lays are present / #Total transactions
Support = 4/10 = 0.4
This is the metric on the basis of which we filter out a lot of combinations. The logic is simple: if there aren’t enough transactions with both products present, there is very little chance that any meaningful association exists between them.
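As a rough sketch, support can be computed directly from a list of baskets. The transaction list below is made up for illustration: it is not the article's transaction table, and it was only constructed to match the counts used here (10 transactions, Coke in 5, Lays in 7, both in 4). The products `bread` and `milk` are placeholders.

```python
# Hypothetical baskets, built only to match the article's counts.
# "bread" and "milk" are placeholder products.
transactions = [
    {"coke", "lays"},
    {"coke", "lays", "bread"},
    {"coke", "lays", "milk"},
    {"coke", "lays"},
    {"coke", "bread"},
    {"lays", "milk"},
    {"lays", "bread"},
    {"lays"},
    {"bread", "milk"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)

print(support({"coke", "lays"}, transactions))  # 0.4
print(support({"lays"}, transactions))          # 0.7
```

In practice, a minimum support threshold (a value you choose, for example 0.2) would be used to discard rare combinations before looking at confidence or lift.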
Confidence
Closely observe the formula for confidence —
Confidence(B/A) = freq(A,B)/freq(A)
With confidence, we try to determine, out of all the transactions in which product A is sold, what proportion also includes product B.
In our case,
Confidence(Lays/Coke) = #Transactions in which both Coke and Lays are present / #Transactions with Coke
Confidence(Lays/Coke) = 4/5 = 0.8
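The same calculation as a small sketch, using the counts from this example (4 transactions with both Coke and Lays, 5 with Coke). The function name `confidence` and its parameters are illustrative.

```python
def confidence(freq_ab, freq_a):
    """Confidence of A -> B: share of A's transactions that also contain B."""
    if freq_a == 0:
        raise ValueError("antecedent never appears, so confidence is undefined")
    return freq_ab / freq_a

# Counts from the example: 4 baskets with Coke and Lays, 5 with Coke.
print(confidence(freq_ab=4, freq_a=5))  # 0.8
```

Note that confidence depends on direction: Confidence(Coke/Lays) would divide by the 7 Lays transactions instead, giving 4/7.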
Lift
Formula First
Lift = Confidence(B/A) / Support(B)
Lift = (freq(A,B)/freq(A))/(freq(B)/N)
Lift = (freq(A,B) * N) / (freq(A) * freq(B))
Lift(Lays/Coke) = Confidence(Lays/Coke) / Support(Lays)
Lift(Lays/Coke) = 0.8 / (7/10) = 1.14
Lift(Lays/Coke) = Percentage of total coke transactions in which lays are present/Percentage of total transactions in which lays are present
Understand its meaning very carefully:
The denominator is the percentage of all transactions in which Lays are present.
The numerator is the same metric; the only difference is that the base is reduced to the transactions with Coke.
In our case, Lays are present in 70% of all transactions, whereas among Coke transactions, Lays are present in 80%. Hence, based on our data, there seems to be an association between Coke and Lays.
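A short sketch of the lift calculation with the example's counts, which also checks that the two forms of the formula agree. The function name `lift` and its parameters are illustrative.

```python
import math

def lift(freq_ab, freq_a, freq_b, n):
    """Lift of A -> B: confidence of the rule divided by the support of B."""
    confidence = freq_ab / freq_a   # share of A's baskets that contain B
    support_b = freq_b / n          # share of all baskets that contain B
    return confidence / support_b

# Counts from the example: both = 4, Coke = 5, Lays = 7, total = 10.
value = lift(freq_ab=4, freq_a=5, freq_b=7, n=10)
print(round(value, 2))  # 1.14

# The expanded form gives the same number: (4 * 10) / (5 * 7).
print(math.isclose(value, (4 * 10) / (5 * 7)))  # True
```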
Hence, we can say that —
If Lift > 1, the consequent is bought more often together with the antecedent than it is in general, so there is a positive association in our rule.
If Lift = 1, the product is present in the same proportion of transactions with and without the antecedent.
If Lift < 1, there is no positive association; in fact, the product is purchased more often, on average, without the antecedent than with it.
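These three cases can be sketched as a small helper. The name `interpret_lift` and the tolerance used to decide what counts as "equal to 1" are assumptions made for illustration; with real data, lift is rarely exactly 1, so some tolerance is usually needed.

```python
import math

def interpret_lift(lift, tol=1e-9):
    """Map a lift value to one of the three cases described above."""
    if math.isclose(lift, 1.0, abs_tol=tol):
        return "no association"
    if lift > 1:
        return "positive association"
    return "negative association"

print(interpret_lift(0.8 / 0.7))  # positive association (Coke -> Lays)
print(interpret_lift(1.0))        # no association
print(interpret_lift(0.6))        # negative association
```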
This analysis helps retailers improve overall sales and the average ticket size of each transaction. Beyond creating combo offers, offline retailers even use it to decide where products are placed: strongly associated products are kept near each other.
Hope you enjoyed reading it!