Retail CLV Optimization — Project Documentation

01 — Dataset Overview

The Raw Data

We used the Online Retail II dataset from the UCI Machine Learning Repository. It contains all transactions from a UK-based wholesale gift retailer between December 2009 and December 2011. Their customers are businesses — gift shops, boutiques, and resellers — who buy in bulk to stock their shelves.

1,067,371Total Rows

8Columns

5,942Unique Customers

53,628Unique Invoices

£476Avg Order Value

2 YearsDate Range

Column	Type	Example	What it means
Invoice	string	489434	Unique order ID. Invoices starting with 'C' are cancellations.
StockCode	string	85048	Unique product identifier.
Description	string	15CM CHRISTMAS GLASS BALL	Product name. Some rows have missing descriptions.
Quantity	int	12	Units purchased. Negative values indicate returns.
InvoiceDate	datetime	2009-12-01 07:45	Date and time of the transaction.
Price	float	6.95	Price per unit in GBP. Zero prices indicate errors.
Customer ID	float	13085.0	Unique customer identifier. 243,007 rows have no Customer ID.
Country	string	United Kingdom	Country of the customer. UK represents 83.8% of revenue.

Important context: Each row is one product line from one invoice — not one order. A single order can contain 20+ rows, one per product. Average order value of £476 confirms this is a wholesale business — not a typical consumer retailer.

02 — Data Cleaning

From Raw to Reliable

Raw transaction data is never production-ready. We applied five cleaning steps, each with a clear business reason. We always kept the original raw data untouched and stored clean data in a new variable.

Raw Data — First 5 Rows (before any cleaning)

This is exactly what we loaded from the Excel file. Notice: Customer ID is a float (13085.0), some rows have no Customer ID (NaN), invoices starting with 'C' are cancellations, and there is no TotalPrice column yet.

Invoice	StockCode	Description	Quantity	InvoiceDate	Price	Customer ID	Country
489434	85048	15CM CHRISTMAS GLASS BALL 20 LIGHTS	12	2009-12-01 07:45	6.95	13085.0	United Kingdom
489434	79323P	PINK CHERRY LIGHTS	12	2009-12-01 07:45	6.75	13085.0	United Kingdom
489434	79323W	WHITE CHERRY LIGHTS	12	2009-12-01 07:45	6.75	13085.0	United Kingdom
489434	22041	RECORD FRAME 7" SINGLE SIZE	48	2009-12-01 07:45	2.10	13085.0	United Kingdom
C489449	21258	VICTORIAN SEWING BOX LARGE	-5	2009-12-01 09:01	10.95	NaN	United Kingdom

Row 5 shows two problems at once — a cancellation (C489449) and a missing Customer ID (NaN). Both will be removed during cleaning.

1

Remove rows with no Customer ID

243,007 rows had no customer identifier. Without a Customer ID we cannot attribute the purchase to any individual — making the row useless for customer-level analysis. These are likely guest checkouts or system entries.

− 243,007 rows removed → 824,364 remaining

2

Remove cancelled invoices

Invoices starting with 'C' (e.g. C489434) represent cancellations or returns. These are not real purchases — including them would artificially inflate revenue and distort purchase frequency counts.

− 18,744 rows removed → 805,620 remaining

3

Remove invalid quantities and prices

71 rows had a price of zero or negative. A transaction with zero price is either a data entry error or a system glitch — it cannot represent a real sale.

− 71 rows removed → 805,549 remaining

4

Remove non-product stock codes

Some rows represent internal charges, not product sales: POST (postage — 1,838 rows), M (manual adjustments — 709), BANK CHARGES (32), PADS (17), DOT (16), D (discounts — 5). These are accounting entries that would corrupt any revenue analysis.

− 2,617 rows removed → 802,932 remaining

5

Fix data types and derive columns

Customer ID was stored as a float (13085.0) — converted to a clean string "13085". InvoiceDate was stored as text — parsed to proper datetime for date calculations. Three new columns derived: TotalPrice (Quantity × Price), invoice_dow (day of week 0–6), invoice_hour (0–23).

0 rows removed — type corrections and column additions only

Clean Data — First 5 Rows (after all 5 cleaning steps)

Now Customer ID is a clean string, InvoiceDate is a proper datetime, TotalPrice is calculated, cancellations are gone, and all rows have valid quantities and prices.

Invoice	StockCode	Description	Quantity	InvoiceDate	Price	Customer ID	Country	TotalPrice
489434	85048	15CM CHRISTMAS GLASS BALL 20 LIGHTS	12	2009-12-01 07:45	6.95	13085	United Kingdom	£83.40
489434	79323P	PINK CHERRY LIGHTS	12	2009-12-01 07:45	6.75	13085	United Kingdom	£81.00
489434	79323W	WHITE CHERRY LIGHTS	12	2009-12-01 07:45	6.75	13085	United Kingdom	£81.00
489434	22041	RECORD FRAME 7" SINGLE SIZE	48	2009-12-01 07:45	2.10	13085	United Kingdom	£100.80
489434	21232	STRAWBERRY CERAMIC TRINKET BOX	24	2009-12-01 07:45	1.25	13085	United Kingdom	£30.00

1,067,371Raw Rows

264,439Rows Removed

802,932Clean Rows

75.2%Data Retained

03 — Exploratory Data Analysis

Understanding the Business

Before building any models, we explored the data visually to understand revenue patterns, customer geography, and seasonal behaviour. This shapes all modelling decisions that follow.

What This Dataset Represents

This is a UK-based wholesale retailer. Their customers are businesses — gift shops, boutiques, and online resellers — who buy products in bulk and then resell them to end consumers at a higher price.

Customer 13085 bought 12 units of "15CM CHRISTMAS GLASS BALL" at £6.95 each → Total: £83.40

That is not an individual person buying one Christmas ornament for their home. That is a shop owner buying 12 of them to stock their shelves and sell individually to their retail customers.

This explains two things we observed in the data:

The average order value is £476 — shops buy in bulk, not single items. A typical consumer retailer averages £20–£50 per order.
The October/November revenue peak — shop owners are stocking up their shelves ahead of the Christmas retail season so their customers can buy gifts.

When we say "customer" in this project, we mean a business — not an individual consumer. CLV, churn, and segmentation all apply to these business buyers.

Monthly Revenue Trend

Total revenue grouped by month across the full 2-year observation period (Dec 2009 – Dec 2011). December 2011 appears low because the dataset only covers the first 9 days of that month.

Monthly Revenue (£)

Oct–Nov peaks clearly visible in both years — driven by Christmas stocking

Monthly Revenue Breakdown

Year	Month	Revenue (£)	% of Total	Notes
2009	December	£681,530	3.9%	Partial month (dataset starts Dec 1)
2010	January	£539,008	3.1%	Post-Christmas slowdown
2010	February	£499,950	2.9%	Slowest month in dataset
2010	March	£668,978	3.8%
2010	April	£588,126	3.4%
2010	May	£595,087	3.4%
2010	June	£632,686	3.6%
2010	July	£584,049	3.3%
2010	August	£597,130	3.4%
2010	September	£808,545	4.6%	Pre-Christmas ramp begins
2010	October	£1,015,989	5.8%	🔺 Peak month
2010	November	£1,163,154	6.7%	🔺 Highest month in dataset
2010	December	£877,814	5.0%
2011	January	£564,227	3.2%	Post-Christmas drop again
2011	February	£443,546	2.5%	Second lowest month
2011	March	£585,263	3.4%
2011	April	£455,667	2.6%
2011	May	£660,918	3.8%
2011	June	£654,933	3.8%
2011	July	£593,232	3.4%
2011	August	£637,157	3.7%
2011	September	£941,801	5.4%	Pre-Christmas ramp begins again
2011	October	£1,006,342	5.8%	🔺 Peak month
2011	November	£1,143,246	6.6%	🔺 Second highest month
2011	December	£513,380	2.9%	Partial — only 9 days of data

Key insight — seasonal pattern repeats perfectly: October and November peak in both 2010 and 2011, with almost identical revenue figures (£1.01M–£1.16M). January and February are consistently the weakest months. This predictable seasonality means the business should plan inventory and staffing around this cycle every year.

Revenue by Country

The UK dominates with 83.8% of all revenue. The remaining 16.2% is spread across 37 other countries, mostly in Europe.

Revenue by Country (Top 10)

United Kingdom accounts for £14.6M of the £17.4M total

Rank	Country	Revenue (£)	% of Total	Business Insight
1	United Kingdom	£14,627,419	83.8%	Home market — overwhelmingly dominant
2	Ireland (EIRE)	£602,058	3.4%	Close neighbour, easy logistics
3	Netherlands	£549,953	3.2%	Strong EU trading relationship with UK
4	Germany	£388,960	2.2%	Largest economy in Europe
5	France	£315,714	1.8%	Close neighbour
6	Australia	£168,485	1.0%	English-speaking market
7	Spain	£98,841	0.6%
8	Switzerland	£93,624	0.5%
9	Sweden	£86,079	0.5%
10	Denmark	£68,560	0.4%
…	27 other countries	£456,083	2.6%	Tiny individual contributions

Business risk — over-reliance on UK market: 83.8% of revenue from one country is a concentration risk. If the UK economy slows down, the entire business is exposed. The business already has footholds in Ireland (3.4%), Netherlands (3.2%), and Germany (2.2%) — expanding these markets would significantly reduce this risk.

04 — Feature Engineering

Building Customer-Level Features

Raw data has one row per product line per invoice. Machine learning models need one row per customer. We transformed 802,932 transaction rows into a clean feature table of 5,862 customers — each described by 12 behavioural features. Final RFM table shape: (5862, 12).

Snapshot Date: 10th December 2011. All customer metrics are calculated relative to this single reference point — the day after the last transaction in the dataset (9th December 2011). We add one day to ensure no customer has a recency of 0. Think of it as the "as of this morning" date for the analysis.

How Each Feature Is Calculated

TotalPrice (derived column)

TotalPrice = Quantity × Price

Example: 12 units × £6.95 = £83.40

Added during cleaning. Every product line now has a monetary value. This is the foundation for all spend-based calculations. Not a customer-level feature — exists at transaction level.

Recency

Recency = Snapshot Date − MAX(InvoiceDate)

Example: 10 Dec 2011 − 8 Dec 2011 = 2 days

How many days since the customer last placed an order. Measured from snapshot date. Low recency = recently active. High recency = has gone quiet and may be churning.

Frequency

Frequency = COUNT(DISTINCT Invoice)

Example: Customer placed 8 unique invoices = Frequency 8

Total number of distinct orders placed by the customer. Each unique invoice number counts as one order regardless of how many product lines it contains.

Monetary

Monetary = SUM(TotalPrice) ÷ Frequency

Example: £5,633 total ÷ 8 orders = £704.16 avg order

Average order value in £. This is not total lifetime spend — it is the average amount spent each time the customer placed an order. Captures spending intensity per visit.

TotalSpend

TotalSpend = SUM(Quantity × Price)

Example: All orders combined = £5,633.32

Cumulative lifetime revenue from this customer. Unlike Monetary which is an average, TotalSpend is the running total across all orders ever placed. A high-frequency low-spend customer can still have high TotalSpend.

UniqueProducts

UniqueProducts = COUNT(DISTINCT StockCode)

Example: Bought 126 different StockCodes = 126 unique products

Number of different products the customer has purchased across all their orders. High value = explores the catalogue widely and buys variety. Low value = repeatedly buys the same product(s).

AvgBasketSize

AvgBasketSize = MEAN(Quantity per transaction row)

Example: Average 13 units per product line

Average number of physical units ordered per product line. Different from Monetary which measures £ value. A customer with AvgBasketSize=2000 is a bulk buyer of cheap products. One with AvgBasketSize=2 buys small quantities of expensive items.

Tenure

Tenure = MAX(InvoiceDate) − MIN(InvoiceDate)

Example: First order Jan 2010, last order Jan 2011 = 365 days tenure

Number of days between the customer's first and last purchase. Measures how long the customer was actively buying from this retailer. Different from Recency — Tenure measures their active lifespan, Recency measures their current silence.

PurchaseVelocity

PurchaseVelocity = MEAN(days between consecutive orders)

Example: Orders on Jan 1, Feb 1, Apr 1 → gaps: 31, 59 days → avg 45 days

Average number of days between consecutive orders. Represents the customer's natural buying rhythm. A customer with velocity=7 days buying weekly is very different from one with velocity=90 days buying quarterly — even if their recency is the same.

PreferredDayOfWeek

PreferredDayOfWeek = MODE(InvoiceDate.dayofweek)

Example: 4 out of 8 orders placed on Tuesday → PreferredDayOfWeek = 1

The day of the week (0=Monday, 6=Sunday) on which the customer most commonly places orders. Used for personalising promotion send timing — reach them the day before they naturally buy.

PreferredHour

PreferredHour = MODE(InvoiceDate.hour)

Example: Most orders placed at 14:00 → PreferredHour = 14

The hour of day (0–23) at which the customer most commonly places orders. Complements PreferredDayOfWeek for precise timing of marketing outreach.

PreferredDayName

PreferredDayName = MAP(PreferredDayOfWeek → name)

Example: 1 → "Tuesday"

Human-readable version of PreferredDayOfWeek. Maps 0–6 to Monday–Sunday. Added purely for readability in reports and dashboards — the model uses the numeric version.

RFM Feature Table — First 8 Rows

Shape: (5862, 12). One row per customer. This table is the input for all subsequent modelling steps.

Customer ID	Recency	Frequency	Monetary	TotalSpend	UniqueProducts	AvgBasketSize	Tenure	PurchaseVelocity	PrefDOW	PrefHour	PrefDayName
12346	326	12	£6,463	£77,556	27	2,184.9	400	35.9	0	13	Monday
12347	2	8	£704	£5,633	126	13.0	402	57.0	1	14	Tuesday
12348	75	5	£332	£1,658	24	58.8	362	90.5	0	14	Monday
12349	19	3	£1,226	£3,679	137	9.4	570	285.0	3	9	Thursday
12350	310	1	£294	£294	16	12.3	0	0.0	2	16	Wednesday
12352	36	9	£192	£1,733	72	11.1	393	49.1	2	14	Wednesday
12353	204	2	£203	£407	13	14.5	144	144.0	3	10	Thursday
12356	36	5	£402	£2,015	49	13.3	425	106.3	2	14	Wednesday

Reading a customer row — Customer 12347:
Recency=2 → bought 2 days before snapshot (very active) · Frequency=8 → placed 8 orders · Monetary=£704 → average £704 per order · TotalSpend=£5,633 → spent £5,633 total · UniqueProducts=126 → bought 126 different products · Tenure=402 → was an active customer for 402 days · PurchaseVelocity=57 → orders every 57 days on average · Preferred day: Tuesday at 14:00.

05 — Probabilistic CLV Modelling

BG/NBD + Gamma-Gamma

Standard ML models see "no purchase in 60 days" and predict churn. BG/NBD is smarter — it asks whether this silence is normal for this specific customer's buying rhythm, or genuinely alarming. It models two hidden processes simultaneously for every customer.

🛒

Purchase Process (NBD)

While alive, each customer buys at their own personal rate λ. Some buy weekly, some monthly, some quarterly. These rates vary across the population following a Gamma distribution — that's the Negative Binomial (NBD) part.

👻

Dropout Process (BG)

After each purchase, a customer has some probability p of never returning — quietly churning. Each customer has their own p, and these vary across the population following a Beta distribution — that's the Beta-Geometric (BG) part.

What the model needs (inputs)

frequency (BG/NBD)

Number of repeat purchases after the first. First purchase excluded — it's the starting point. Minimum 1 required. One-time buyers (1,681 customers) are filtered out before fitting.

recency (BG/NBD)

Days between first and last purchase. Not the same as RFM recency. RFM recency measures from snapshot date. BG/NBD recency measures within the customer's own history.

T (observation window)

Days between first purchase and snapshot date. Always ≥ recency. The gap (T − recency) = days the customer has been silent. This silence is the key signal for P(Alive).

Model parameters learned from your data

r = 1.517

Shape of purchase rate distribution

Controls how spread out buying frequencies are across your 4,181 customers. r ≈ 1.5 means moderate variety — there's a mix of fast and slow buyers but not extremely polarised.

α = 103.875

Scale of purchase rate distribution

Controls average purchase rate. High alpha (103) means customers buy roughly once every 103 days on average — slow but consistent. Expected for a wholesale retailer where businesses reorder seasonally.

a = 0.097

Beta distribution — churn tendency

Very small value. Very few customers have a high dropout tendency after each purchase. Most customers who buy once tend to come back.

b = 0.976

Beta distribution — retention tendency

Larger than a — confirms most customers tend to stay active after each purchase. Your customer base is loyal and sticky. Low overall churn risk population.

What the model outputs (per customer)

P(Alive)

Probability that the customer is still an active buyer. Between 0 and 1. A score of 0.985 means 98.5% confident still active. A score of 0.31 means likely churned.

Why it's powerful: Two customers with the same 300-day silence get very different P(Alive) if one naturally buys every 6 months and the other used to buy weekly.

Expected Future Purchases

How many times the model expects the customer to buy in the next N days (we used 90 days). A value of 1.47 means the customer is expected to place about 1–2 orders in the next quarter.

Why it's useful: Multiplied by average order value and profit margin gives us CLV — a financial number the business can act on directly.

Gamma-Gamma Spend Model

What it does: Estimates how much each customer will spend per order in the future, based on their historical spending pattern.

Key assumption verified: How often a customer buys (frequency) must be independent of how much they spend per order (monetary). We verified this — Pearson correlation was 0.023, well below the 0.30 threshold. Safe to proceed.

Parameters learned: p = 11.75 (spending consistency within each customer — high p means predictable spenders), q = 0.884 (spread of spending across customers — less than 1 means high variety between customers), v = 11.72 (population average spend anchor).

06 — CLV Calculation

90-Day Customer Lifetime Value

We combined both models to produce a single financial number per customer — how much profit they are expected to generate in the next 90 days.

CLV (90d) = E[future purchases] × E[avg order value] × profit margin × discount factor Where: E[future purchases] → from BG/NBD model (predicted_purchases_90d) E[avg order value] → from Gamma-Gamma model Profit margin → 20% (assumed gross margin) Discount rate → 1% monthly (money today > money tomorrow) Forecast horizon → 3 months (90 days)

4,181Customers Scored

£104Mean 90d CLV

£42Median 90d CLV

£21,705Max 90d CLV

81%Avg P(Alive)

Mean (£104) vs Median (£42): The mean is more than double the median. A small number of very high-value Champions are pulling the average up significantly. The median (£42) is a more honest representation of what a typical customer is worth. This gap is the Pareto effect in action.

Sample CLV scores

Customer ID	Frequency	Recency (BG)	T	P(Alive)	Pred. Purchases 90d	CLV 90d
12347	7	402	405	98.5%	1.47	£207.63
12346	7	400	726	50.6%	0.46	£1,009.45
12348	4	363	439	94.7%	0.86	£61.07
12353	1	204	409	73.5%	0.32	£6.39

07 — Customer Segmentation

K-Means Clustering

We grouped all 5,862 customers into 4 segments using K-Means clustering. Both the Elbow Curve and Silhouette Score pointed to K=4 as optimal. We capped features at the 99th percentile before clustering to prevent extreme outliers from dominating the results.

Why we scaled features before clustering: K-Means measures distance between customers. Without scaling, TotalSpend (£77,000) would completely dominate UniqueProducts (27). StandardScaler converts every feature to zero mean and unit variance so all features contribute equally to the distance calculation.

Segment 3

Champions

Customers162 (2.8%)

Avg Recency37 days

Avg Frequency47 orders

Avg TotalSpend£46,443

Avg CLV 90d£1,148

Avg P(Alive)95%

✅ Retain & Reward. VIP treatment, exclusive access, dedicated account manager. Never send generic promotions.

Segment 1

Loyal Customers

Customers976 (16.6%)

Avg Recency43 days

Avg Frequency14 orders

Avg TotalSpend£5,659

Avg CLV 90d£139

Avg P(Alive)95%

🔵 Nurture & Grow. Personalised recommendations and moderate incentives. Some can become Champions.

Segment 2

At Risk

Customers2,382 (40.6%)

Avg Recency104 days

Avg Frequency4 orders

Avg TotalSpend£1,298

Avg CLV 90d£45

Avg P(Alive)88%

🟠 Win-Back Campaign. Time-limited offer within 2 weeks before they drift to Lost/Dormant.

Segment 0

Lost / Dormant

Customers2,342 (39.9%)

Avg Recency376 days

Avg Frequency1.7 orders

Avg TotalSpend£561

Avg CLV 90d£9

Avg P(Alive)30%

🔴 Low-cost reactivation only. 70% probability already churned. One bulk email max.

The Pareto insight: Champions are just 2.8% of the customer base but their average 90d CLV (£1,148) is 128× higher than Lost/Dormant (£9). Losing 10 Champions has the same revenue impact as losing 1,280 dormant customers.

08 — Churn Prediction

XGBoost Churn Classifier

We trained an XGBoost classifier to predict churn probability for every customer. XGBoost builds hundreds of decision trees sequentially, each correcting the mistakes of the previous one — a technique called gradient boosting.

Churn Label Definition

Churned = 1 IF Recency > 90 days AND P(Alive) < 0.5 Churned = 0 otherwise Churn rate: 10.5% (613 out of 5,862 customers)

Why a hybrid label? Using recency alone would mislabel slow buyers as churned. A customer who naturally orders every 4 months has a 120-day recency — but they're not churned, they're just slow. Adding the BG/NBD P(Alive) check ensures we only label a customer as churned if both the time signal AND the probabilistic model agree.

0.9994ROC-AUC

99%Overall Accuracy

90%Churn Precision

98%Churn Recall

SHAP Feature Importance

Rank	Feature	Direction	Business meaning
1	Recency	High → more churn	Single strongest signal. Customers who haven't bought recently are far more likely to churn.
2	Frequency	High → less churn	Customers who buy regularly are much less likely to leave.
3	PurchaseVelocity	Low → less churn	Short gaps between orders = less likely to churn.
4	Tenure	Long + silent → more churn	A long-standing customer who suddenly goes quiet is a strong churn signal.
5–10	TotalSpend, Monetary etc.	Minimal impact	Spending behaviour has almost no predictive power for churn.

Key insight: Churn is driven by time-based signals, not spending. Don't wait for a customer's order value to drop before acting. Watch recency and purchase velocity. When a customer breaks their natural buying rhythm — act immediately.

09 — Uplift Modelling

T-Learner Promotion Response

A regular conversion model tells you who will buy. Uplift modelling tells you who will buy because of the promotion — not just buy anyway. This distinction is worth millions in reduced promotional waste.

The 4 Customer Types

Persuadables

Won't buy

Will buy

High +

✅ Target these

Sure Things

Will buy

Near zero

❌ Wasted money

Lost Causes

Won't buy

Near zero

❌ Don't bother

Sleeping Dogs

Will buy

Won't buy

Negative −

❌ Never target

How T-Learner Works

Step 1: Split data into two groups Treatment group → customers who received the promotion Control group → customers who did not Step 2: Train two separate XGBoost models μ1(x) = P(convert | received promotion, features x) μ0(x) = P(convert | no promotion, features x) Step 3: For each new customer calculate uplift score Uplift = μ1(customer) − μ0(customer) Step 4: Rank customers by uplift score — target the top N%

The Promotion Experiment (Simulated): In a real business you would use actual A/B test logs from your CRM. In this project we simulated a randomised controlled trial (RCT). Treatment was randomly assigned using a 50/50 coin flip. Conversion probability was modelled as a function of CLV and recency, with uplift highest for mid-range customers (the inverted-U formula) — reflecting the real insight that Champions buy anyway and dormant customers don't respond regardless.

10 — Business Impact Simulation

ROI Comparison of Targeting Strategies

We simulated targeting the top 20% of customers using three different strategies and compared promotional ROI. This translates model outputs into the financial language that businesses care about.

Assumptions: Promotion cost per customer = £2.00 Average revenue per conversion = £35.00 Target = top 20% of customers ROI = (Revenue − Cost) / Cost × 100%

Strategy

Revenue

Cost

ROI

🎯 Uplift targeting (top 20% by uplift score)

£560

£468

+19.7%

📊 Random baseline (no targeting)

£350

£468

−25.2%

💰 CLV targeting (top 20% by CLV)

£175

£468

−62.6%

Key insight — high CLV ≠ persuadable: CLV targeting performed worst because Champions (high CLV) buy anyway — they don't need a promotion. The uplift model correctly identified mid-range customers who genuinely respond, achieving the only profitable ROI. This is the core value proposition of uplift modelling over simple conversion or CLV targeting.

🏆

Use uplift scores to target

Only send promotions to customers with high positive uplift scores. These are the Persuadables — the promotion genuinely moves the needle for them.

🛡️

Protect Champions differently

Champions don't need discounts — they buy anyway. Invest in loyalty programmes, exclusive access, and personalised service instead.

⚡

Act fast on At Risk

At Risk customers still have meaningful CLV (£45 average). A targeted win-back campaign now costs far less than losing them permanently.

Retail CustomerLifetime ValueOptimization Engine

The Raw Data

From Raw to Reliable

Raw Data — First 5 Rows (before any cleaning)

Remove rows with no Customer ID

Remove cancelled invoices

Remove invalid quantities and prices

Remove non-product stock codes

Fix data types and derive columns

Clean Data — First 5 Rows (after all 5 cleaning steps)

Understanding the Business

What This Dataset Represents

Monthly Revenue Trend

Monthly Revenue Breakdown

Revenue by Country

Building Customer-Level Features

How Each Feature Is Calculated

RFM Feature Table — First 8 Rows

BG/NBD + Gamma-Gamma

Purchase Process (NBD)

Dropout Process (BG)

What the model needs (inputs)

frequency (BG/NBD)

recency (BG/NBD)

T (observation window)

Model parameters learned from your data

What the model outputs (per customer)

P(Alive)

Expected Future Purchases

Gamma-Gamma Spend Model

90-Day Customer Lifetime Value

Sample CLV scores

K-Means Clustering

Champions

Loyal Customers

At Risk

Lost / Dormant

XGBoost Churn Classifier

Churn Label Definition

SHAP Feature Importance

T-Learner Promotion Response

The 4 Customer Types

How T-Learner Works

ROI Comparison of Targeting Strategies

Use uplift scores to target

Protect Champions differently

Act fast on At Risk

Retail Customer
Lifetime Value
Optimization Engine