A Survey on Machine Learning Algorithms
EasyChair Preprint 2882
20 pages • Date: March 6, 2020

Abstract

Heart disease is a significant public health problem that is responsible for many premature deaths around the world. In the present decade, there is a need for a system that tackles this problem using the latest technical developments. Many machine learning techniques have been employed independently to predict the presence of heart disease in individuals based on structured and unstructured healthcare data. This paper gives a detailed analysis of classification algorithms such as Naïve Bayes, KNN, Decision Tree, Random Forest, and Support Vector Machine (SVM). Studies have suggested that the accuracy of these classifiers is linked to the selection of features from the available dataset. Thus, appropriate feature selection increases the quality of the prediction model. In this research, the Cleveland heart disease dataset from the UCI repository will be used to find the best combination of six features, after testing every possible six-feature combination of the thirteen available features against all five classifiers mentioned above. Each combination will be assigned an individual score, which corresponds to the weighted mean of the accuracy it obtains on the five classifiers. Furthermore, this paper will find the most prominent of the thirteen features by counting the number of occurrences of each feature in the hundred combinations with the highest scores. Thus, this model provides insight into the feature that has the maximum correlation with the accuracy of the prediction models.

Keyphrases: Decision Tree, KNN, Random Forest, Support Vector Machine, machine learning
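The scoring procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `score_combinations` and `most_prominent` are hypothetical, the classifier weights are left to the caller because the abstract does not specify them, and each classifier is represented by a callable that returns an accuracy for a given feature subset. The attribute names are the ones commonly used for the thirteen Cleveland features in the UCI documentation.

```python
from collections import Counter
from itertools import combinations

# The thirteen Cleveland attributes, named as in the UCI documentation.
FEATURES = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
            "thalach", "exang", "oldpeak", "slope", "ca", "thal"]


def score_combinations(features, evaluators, weights, k=6):
    """Score every k-feature subset and return (score, combo) pairs, best first.

    evaluators: one callable per classifier, each mapping a tuple of feature
                names to an accuracy in [0, 1] (assumed interface).
    weights:    one weight per classifier (assumed; not given in the abstract).
    """
    total_weight = sum(weights)
    scored = []
    for combo in combinations(features, k):
        accuracies = [evaluate(combo) for evaluate in evaluators]
        # Weighted mean of the accuracies across all classifiers.
        score = sum(w * a for w, a in zip(weights, accuracies)) / total_weight
        scored.append((score, combo))
    # Highest score first; ties keep their enumeration order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def most_prominent(scored, top_n=100):
    """Count how often each feature occurs in the top_n scored combinations."""
    counts = Counter(name for _, combo in scored[:top_n] for name in combo)
    return counts.most_common()
```

With thirteen features and six per combination, the loop visits C(13, 6) = 1716 subsets. In practice each evaluator would train and test one of the five classifiers on the selected columns, for example with cross-validation, which is where nearly all of the running time would go.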