Feature Selection and Adaptive Synthetic Sampling Approach for Optimizing Online Shopper Purchase Intent Prediction

EasyChair Preprint 6624
5 pages • Date: September 16, 2021

Abstract

This paper proposes a novel approach for optimizing online shopper purchase intent prediction using feature selection combined with Adaptive Synthetic Sampling (ADASYN). A supervised learning technique is applied to predict, from the session features, whether a customer's visit ends with a purchase. However, not all features are important for predicting the classes. In addition, the class imbalance problem may lead to suboptimal performance. Therefore, we propose Information Gain and Correlation feature selection to select the most important features. ADASYN is additionally used to deal with the class imbalance problem by adaptively generating new synthetic samples of the minority class, taking the density distribution into account. The proposed approach is run with a Random Forest classifier. The results indicate that ADASYN effectively improves classification performance in terms of accuracy, precision, recall, and F1-score. We compare feature selection combined with ADASYN against previous works, and the results indicate that our proposed approach outperforms all of them. We additionally use a statistical test to show that our results are statistically significant. These results suggest that our proposed approach is promising for optimizing classification performance.

Keyphrases: ADASYN, Adaptive synthetic sampling, Class Imbalance Problem, Filter-based feature selection, Imbalanced dataset, Information Gain, Random Forest, feature selection, machine learning, online shoppers' purchasing intention, statistical test
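To illustrate the Information Gain filter named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes discrete feature values and scores each feature by how much it reduces the entropy of the class label. The function names (`entropy`, `information_gain`), the feature names, and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy once the
    (discrete) feature value is known."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: 1 = visit ended with a purchase, 0 = it did not.
labels = [1, 1, 0, 0]
features = {
    "feature_a": ["x", "x", "y", "y"],  # separates the classes perfectly
    "feature_b": ["p", "q", "p", "q"],  # carries no information about the class
}

# Rank features from most to least informative.
scores = {name: information_gain(vals, labels) for name, vals in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

A filter-based selector would keep only the top-ranked features before training the classifier. Continuous features would need to be discretized first, or scored with an estimator designed for continuous inputs.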
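The adaptive sampling idea described in the abstract can be sketched as follows. This is a simplified, standard-library illustration of the general ADASYN scheme and not the authors' code: minority points whose neighborhoods contain more majority points receive a larger share of the synthetic samples, and each synthetic sample is an interpolation between a minority point and one of its minority neighbors. The function names, the parameters `k` and `beta`, and the toy coordinates are assumptions made for this example.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Indices of the k candidates closest to point (Euclidean distance)."""
    order = sorted(range(len(candidates)),
                   key=lambda j: math.dist(point, candidates[j]))
    return order[:k]


def adasyn(minority, majority, k=3, beta=1.0, seed=0):
    """Return (synthetic_points, hardness_ratios) for the minority class."""
    rng = random.Random(seed)
    # Minority points come first, so index i refers to the same point in both lists.
    points = list(minority) + list(majority)
    is_majority = [False] * len(minority) + [True] * len(majority)

    # Total number of synthetic samples; beta=1.0 aims at a balanced set.
    total = int(round((len(majority) - len(minority)) * beta))

    # Hardness ratio: share of majority points among the k nearest neighbors.
    ratios = []
    for i, x in enumerate(minority):
        neigh = [j for j in k_nearest(x, points, k + 1) if j != i][:k]
        ratios.append(sum(is_majority[j] for j in neigh) / k)

    ratio_sum = sum(ratios)
    if total <= 0 or ratio_sum == 0:
        return [], ratios

    # Per-point quota, proportional to the normalized hardness ratio.
    # Rounding means the quotas may not sum to exactly `total`.
    quotas = [int(round(r / ratio_sum * total)) for r in ratios]

    synthetic = []
    for i, (x, quota) in enumerate(zip(minority, quotas)):
        neigh = [j for j in k_nearest(x, minority, k + 1) if j != i][:k]
        if not neigh:
            continue
        for _ in range(quota):
            z = minority[rng.choice(neigh)]
            lam = rng.random()  # interpolation weight in [0, 1)
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, z)))
    return synthetic, ratios


# Hypothetical toy data: four "easy" minority points in their own cluster and
# one "hard" minority point surrounded by majority points.
minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
majority = [(5.1, 5.0), (4.9, 5.0), (5.0, 5.1), (5.0, 4.9), (5.2, 5.2),
            (4.8, 4.8), (5.3, 5.0), (4.7, 5.0), (5.0, 5.3)]

synthetic, ratios = adasyn(minority, majority, k=3)
print(ratios)
print(synthetic)
```

In this toy setup only the minority point surrounded by majority points has a nonzero hardness ratio, so all synthetic samples are generated from it. In practice a maintained implementation, such as the `ADASYN` class in the imbalanced-learn package, would normally be preferred over a hand-written version.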