The Challenge
NovaBuy had 500 customers and no way to predict how much any individual would spend in the next year. Marketing budgets were allocated equally across all customers — high-value long-term members received the same spend as brand-new signups. Without a spending model, the team couldn't prioritise retention campaigns, personalise offers, or identify which behavioural signals actually drove revenue.
The Solution
I built a full ML pipeline in scikit-learn: four linear regression models (OLS, Ridge, Lasso, ElasticNet), tuned with GridSearchCV under 5-fold cross-validation. A performance gate (R² ≥ 0.95, RMSE ≤ $15) ensures that only models meeting production-quality thresholds are deployed. Feature analysis showed that mobile app engagement and membership length are the dominant revenue drivers, while time on the website adds almost no predictive value: its coefficient is only $0.31, and Lasso shrinks it to zero. The model is served through a FastAPI endpoint and visualised in an interactive Streamlit dashboard with live sliders, batch CSV upload, and a confidence interval on every prediction.
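As a rough illustration of how a model comparison with a threshold gate might be wired together, here is a minimal sketch. It is not the project's actual code: the data is synthetic, the feature columns, hyperparameter grids, and the names `candidates`, `passes_gate`, and `results` are all assumptions made for the example. Only the four model types, the 5-fold cross-validation, and the two thresholds come from the description above.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the NovaBuy dataset): 500 rows, 3 features.
# Hypothetical column meaning: app time, membership length, website time.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 500 + 40 * X[:, 0] + 60 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(scale=10, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate models with illustrative (assumed) hyperparameter grids.
# make_pipeline names each step after its lowercased class name.
candidates = {
    "ols": (LinearRegression(), {"linearregression__fit_intercept": [True]}),
    "ridge": (Ridge(), {"ridge__alpha": [0.1, 1.0, 10.0]}),
    "lasso": (Lasso(max_iter=10_000), {"lasso__alpha": [0.01, 0.1, 1.0]}),
    "elasticnet": (
        ElasticNet(max_iter=10_000),
        {
            "elasticnet__alpha": [0.01, 0.1, 1.0],
            "elasticnet__l1_ratio": [0.2, 0.5, 0.8],
        },
    ),
}

# Thresholds quoted in the write-up.
R2_MIN, RMSE_MAX = 0.95, 15.0


def passes_gate(r2: float, rmse: float) -> bool:
    """Return True only if both deployment thresholds are met."""
    return r2 >= R2_MIN and rmse <= RMSE_MAX


results = {}
for name, (model, grid) in candidates.items():
    # Tune each model with 5-fold cross-validation on the training split.
    search = GridSearchCV(
        make_pipeline(StandardScaler(), model),
        grid,
        cv=5,
        scoring="neg_root_mean_squared_error",
    )
    search.fit(X_train, y_train)

    # Score the refitted best estimator on the held-out test split.
    pred = search.predict(X_test)
    r2 = float(r2_score(y_test, pred))
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    results[name] = {"r2": r2, "rmse": rmse, "deployable": passes_gate(r2, rmse)}

# Among the models that clear the gate, pick the one with the lowest test RMSE.
deployable = {n: r for n, r in results.items() if r["deployable"]}
best = min(deployable, key=lambda n: deployable[n]["rmse"]) if deployable else None
```

If no candidate clears the gate, `best` stays `None`, so nothing would be promoted to deployment. Choosing the winner by lowest test RMSE is one reasonable rule, not necessarily the one used in the project.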
Results
97.8%
R² score on held-out test set
$10.48
RMSE (root-mean-square prediction error)
1.79%
MAPE (mean absolute percentage error)
4
Models compared, best selected automatically
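For readers unfamiliar with the three metrics above, the sketch below shows how each is computed from actual and predicted values. The function name `regression_metrics` and the toy numbers are assumptions for illustration only; they are not NovaBuy data and will not reproduce the figures reported here.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R², RMSE, and MAPE (as a percentage) for paired values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        # Share of variance in y_true explained by the predictions.
        "r2": 1 - ss_res / ss_tot,
        # Square root of the mean squared error, in the same units as y.
        "rmse": math.sqrt(ss_res / n),
        # Mean of |error / actual|, scaled to percent (requires non-zero actuals).
        "mape_pct": 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n,
    }


# Toy values for illustration only.
actual = [400.0, 500.0, 600.0]
predicted = [410.0, 490.0, 600.0]
metrics = regression_metrics(actual, predicted)
```

Because RMSE squares each error before averaging, it penalises large misses more heavily than a plain mean absolute error would, while MAPE expresses the error relative to each customer's actual spend.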
Tech Stack
“The model immediately showed us what we suspected but couldn't prove — app engagement drives revenue, the website doesn't. We redirected our dev budget to mobile features within a week of seeing the coefficients.”
Growth Lead
NovaBuy E-Commerce