A comparative study of the performance of K-nearest neighbors and support vector machines for classification of groundwater

Authors: Mohammad Sakizadeh, Rouhollah Mirzaei Mohammadabadi
Publication date: 0-0-01
Journal rank: Scientific-Research
Journal indexing: ISC, SID

Abstract

The aim of this paper is to examine the feasibility of support vector machine (SVM) and K-nearest neighbor (K-NN) classifiers for the classification of an aquifer in Khuzestan Province, Iran. For this purpose, 17 groundwater quality variables (EC, TDS, turbidity, pH, total hardness, Ca, Mg, total alkalinity, sulfate, nitrate, nitrite, fluoride, phosphate, Fe, Mn, Cu, Cr(VI)) from 41 wells and springs over an eight-year period (2006 to 2013) were used. Cluster analysis produced a dendrogram that differentiated two distinct groups. Factor analysis extracted eight factors that cumulatively account for 90.97 percent of the total variance, so the variation of the 17 variables can be covered by just eight factors. K-NN and SVMs were then applied to classify the aquifer. For the SVMs, the best-performing model was the one with an exponent of degree one, which reached an accuracy of 94% on the test data set, with a sensitivity of 1.00 and a specificity of 0.87. In addition, there was no significant difference among the results of the different kernels, indicating that an acceptable result can be achieved with any kernel once its parameters are optimized. K-NN performed slightly worse than the SVMs: its sensitivity and specificity were 0.90 and 0.88, respectively, and its accuracy was 93%. A sensitivity analysis of the groundwater quality variables suggested that calcium and nitrate are the most influential parameters in the classification of this aquifer.
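To make the reported metrics concrete, the following is a minimal sketch of how a K-NN majority vote and the accuracy, sensitivity and specificity figures can be computed for a two-class problem. It is not the authors' implementation, which the abstract does not describe. The function names `knn_predict` and `binary_metrics`, the Euclidean distance, the choice of k = 3 and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Toy, made-up data (NOT the study's measurements): two features per sample,
# class 0 and class 1 standing in for the two groups found by cluster analysis.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(1.1, 0.9), (2.9, 3.1), (1.0, 1.2), (3.2, 3.0)]
test_y = [0, 1, 0, 1]

pred_y = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
accuracy, sensitivity, specificity = binary_metrics(test_y, pred_y)
print(accuracy, sensitivity, specificity)
```

Sensitivity is the share of positive samples that are correctly labeled and specificity is the share of negative samples that are correctly labeled, so the two can move in opposite directions even when accuracy barely changes. In practice the variables would normally be standardized before computing distances, since measurements such as EC and pH sit on very different scales.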