Aschauer, R. (2024). Predictive modeling of next product to buy in the banking sector using boosting techniques [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2024.121688
E105 - Institut für Stochastik und Wirtschaftsmathematik
Date (published): 2024
Number of Pages: 72
Keywords (de): Prädiktive Modellierung; Next Product to Buy; Bankensektor; Boosting-Techniken
Keywords (en): Predictive Modeling; Next Product to Buy; Banking Sector; Boosting Techniques
Abstract:
“Data is the new gold” is a saying one comes across every day. In today’s data-driven world, it is more essential than ever to collect customer-specific data properly, prepare it for later use, and then utilize it strategically. Customer data can provide a crucial advantage over competitors, who are, of course, eager to lure customers away with lucrative offers. For this reason, it is particularly important to establish long-term customer loyalty and to provide customers with a wide range of products; offering the appropriate product strengthens the customer-company relationship. Product recommendation is therefore crucial in today’s banking industry. Recommender systems are used to create individually tailored offers, because customers exhibit many characteristics that provide insights into which products they are likely to purchase next. The sales department could gain a significant edge by transforming millions of data points into personalized product recommendations for each customer, which could substantially increase sales numbers. This work presents a fully automated product recommendation system that supports both cross-selling and up-selling. The modeling is carried out using boosting techniques; the final model uses eXtreme Gradient Boosting (XGBoost), which gained popularity in the mid-2010s. Different sampling methods for dividing the dataset into training and test sets are presented to enhance performance, and hyperparameter optimization with Random Search, Grid Search, and Bayesian Optimization is employed to improve performance significantly. Datasets often exhibit class imbalance, and weighted boosting is one approach to improving performance on imbalanced classification. Variable reduction is a further method, used here to prevent overfitting. Various techniques, including classical methods such as Correlation Analysis as well as Feature Importance and Recursive Feature Elimination, offer numerous possibilities for variable reduction. The most efficient ones are analyzed and discussed. All of this is applied with one goal: offering the right product to the customer. Data engineering is performed using the data mining software IBM SPSS Modeler. Finally, the modeling and the results, which were obtained with Python, are presented to determine whether the theory behind them holds true.
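Two of the steps named in the abstract, class weighting for imbalanced classification and correlation-based variable reduction, can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the 0.9 threshold, and the toy data are assumptions made for illustration. The weight computed here follows the heuristic that the XGBoost documentation suggests for its `scale_pos_weight` parameter (negative count divided by positive count); the thesis may weight classes differently.

```python
from math import sqrt


def scale_pos_weight(labels):
    """Ratio of negative to positive examples for a binary 0/1 target."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive examples in labels")
    return negatives / positives


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Keep features in insertion order, skipping any feature whose absolute
    correlation with an already kept feature exceeds the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data, invented for illustration only (hypothetical feature names).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
features = {
    "balance": [1.0, 2.0, 3.0, 4.0, 5.0],
    "balance_eur": [2.0, 4.0, 6.0, 8.0, 10.0],  # exact multiple of "balance"
    "age": [40.0, 22.0, 35.0, 58.0, 30.0],
}
weight = scale_pos_weight(labels)
selected = drop_correlated(features)
print(weight, selected)
```

On this toy data the weight is 4.0 (eight negatives to two positives), and `balance_eur` is dropped because it is perfectly correlated with `balance`. In practice the weight would be passed to the boosting model and the filter applied before training; Feature Importance and Recursive Feature Elimination would replace the simple pairwise filter shown here.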