TY - JOUR
T1 - Deciphering Digital Social Dynamics
T2 - A Comparative Study of Logistic Regression and Random Forest in Predicting E-Commerce Customer Behavior
AU - Sunarya, Po Abas
AU - Rahardja, Untung
AU - Chen, Shih Chih
AU - Li, Yung Ming
AU - Hardini, Marviola
N1 - Publisher Copyright: © 2024, Bright Publisher. All rights reserved.
PY - 2024/1
Y1 - 2024/1
N2 - This study compares Logistic Regression and Random Forest in predicting e-commerce customer churn. Utilizing the E-commerce Customer dataset, it navigates the complexities of customer interactions and behaviors, offering a rich context for analysis. The methodology focuses on meticulous data preprocessing to ensure data integrity, setting the stage for applying and evaluating Logistic Regression and Random Forest. Both models were assessed using accuracy, precision, recall, F1-Score, and AUC-ROC. Logistic Regression showed an accuracy of 90%, precision of 91% for class 0 and 82% for class 1, recall of 98% for class 0 and 50% for class 1, F1-Score of 94% for class 0 and 62% for class 1, and AUC-ROC of 0.88. Random Forest, with its ability to handle complex patterns, demonstrated higher overall performance with an accuracy of 95%, precision of 95% for class 0 and 93% for class 1, recall of 99% for class 0 and 74% for class 1, F1-Score of 97% for class 0 and 82% for class 1, and an AUC-ROC of 0.97. This comparative analysis offers insights into each model's strengths and suitability for predicting customer churn. The findings contribute to a deeper understanding of machine learning applications in e-commerce, guiding stakeholders in enhancing customer retention strategies. This research provides a foundation for further exploration into the digital social dynamics that shape customer behavior in the evolving digital marketplace.
KW - Customer Behavior Analysis
KW - E-commerce Churn Prediction
KW - Logistic Regression
KW - Machine Learning Algorithms
KW - Predictive Modeling
KW - Random Forest
UR - http://www.scopus.com/inward/record.url?scp=85184871808&partnerID=8YFLogxK
U2 - 10.47738/jads.v5i1.155
DO - 10.47738/jads.v5i1.155
M3 - Article
AN - SCOPUS:85184871808
SN - 2723-6471
VL - 5
SP - 100
EP - 113
JO - Journal of Applied Data Sciences
JF - Journal of Applied Data Sciences
IS - 1
ER -