Classification of Water Quality Index Using Machine Learning Algorithm for Well Assessment: a Case Study in Dili, Timor-Leste

EasyChair Preprint 14743
6 pages • Date: September 6, 2024

Abstract

This paper investigates the use of information technology, specifically machine learning algorithms, for water assessment in Timor-Leste. Assessing groundwater quality is essential to ensure the safety and availability of well water. The Water Quality Index (WQI) is the standard tool for assessing water quality and is calculated from physicochemical and microbiological parameters. In developing countries, however, it is sometimes difficult to measure all of these parameters because of equipment malfunctions and limited human resources. In such cases, missing-value imputation and machine learning models are useful for classifying water samples as suitable or unsuitable with high accuracy. Several imputation methods were tested, and four machine learning algorithms were explored: logistic regression, support vector machine, random forest, and Gaussian naïve Bayes. We obtained a dataset of 368 observations from 26 groundwater sampling points in Dili, the capital city of Timor-Leste. The experimental results show that 64% of the water samples are suitable for human consumption. The combination of k-NN imputation and random forest performed best, achieving 96% accuracy with three-fold cross-validation. The analysis also revealed that some parameters significantly affected the classification results.

Keyphrases: Classification, Water Quality Index, machine learning, missing value imputation
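The workflow summarized in the abstract (k-NN imputation of missing measurements, followed by a random forest classifier evaluated with three-fold cross-validation) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, which the abstract does not name. The data are synthetic stand-ins, not the Dili measurements: the number of features, the labelling rule, the missing-value rate, and all hyperparameters are hypothetical choices made only for illustration, so the printed accuracy says nothing about the paper's reported 96%.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for the water-quality table (NOT the paper's data):
# 368 rows to mirror the dataset size, 6 made-up numeric parameters.
n_samples, n_features = 368, 6
X = rng.normal(size=(n_samples, n_features))

# Hypothetical labelling rule: 1 = "suitable", 0 = "unsuitable".
y = (X[:, 0] + 0.5 * X[:, 1] > -0.4).astype(int)

# Simulate unavailable measurements by blanking about 10% of the entries.
missing_mask = rng.random(X.shape) < 0.10
X_missing = X.copy()
X_missing[missing_mask] = np.nan

# Putting the imputer inside the pipeline means it is refit on the
# training part of each fold, so no information leaks from the test part.
model = make_pipeline(
    KNNImputer(n_neighbors=5),  # assumed k; the abstract does not state it
    RandomForestClassifier(n_estimators=200, random_state=0),
)

# Three-fold cross-validation, stratified to keep class proportions similar.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
scores = cross_val_score(model, X_missing, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()

print(f"fold accuracies: {scores}, mean: {mean_accuracy:.3f}")
```

Wrapping the imputer and classifier in one pipeline is a design choice of this sketch rather than something the abstract specifies. Imputing the whole table before splitting would also run, but it would let test-fold values influence the imputed training values.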