Enhancing clustering performance: an analysis of the clustering based on arithmetic optimization algorithm
Hakam Singh and Ashutosh Kumar Dubey
Abstract
This study explored the clustering based on arithmetic optimization algorithm (CAOA) and its potential for addressing challenging clustering problems. CAOA is based on the arithmetic optimization algorithm (AOA), which utilizes the arithmetic operators Addition, Subtraction, Multiplication, and Division to optimize solutions. The performance of CAOA was investigated by applying it to diverse real-life datasets and analysing its clustering performance. Two primary evaluation metrics, namely the average distance among cluster members (intra-cluster distance) and the F-measure, were employed to gauge the clustering quality. Statistical validation was conducted using the Friedman test to assess whether the differences among the algorithms were significant. In terms of average intra-cluster distance, CAOA consistently recorded the lowest values among all tested clustering algorithms. This outcome indicated CAOA's ability to form tightly packed, well-defined clusters, enhancing its suitability for applications like pattern recognition and data segmentation. Regarding F-measure, CAOA delivered competitive clustering quality. Notably, its F-measure values were among the highest, especially on datasets such as "Cancer" and "LR," signifying its potential for accurate cluster identification, which is crucial in domains such as medical diagnosis and customer segmentation. Overall, the study indicated the effectiveness of CAOA in addressing real-world clustering challenges: it consistently outperformed the other algorithms in minimizing the average intra-cluster distance while remaining competitive on the F-measure. Statistical validation through the Friedman test confirmed the distinctiveness of CAOA's performance.
Keywords
CAOA, AOA, F-measure, Average Intra-cluster Distance, Friedman test.
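The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: the intra-cluster distance is taken here as the mean Euclidean distance from each point to its cluster centroid, and the F-measure is taken as the common class-weighted clustering F-measure (for each true class, the best F-score over all clusters, weighted by class size). The function names `avg_intra_cluster_distance` and `clustering_f_measure` are hypothetical and are not from the paper.

```python
import math
from collections import Counter, defaultdict


def avg_intra_cluster_distance(points, labels):
    """Mean Euclidean distance from each point to its cluster centroid.

    Assumption: this centroid-based definition may differ from the
    exact formula used in the paper.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum(math.dist(m, centroid) for m in members)
    return total / len(points)


def clustering_f_measure(true_labels, pred_labels):
    """Class-weighted clustering F-measure (assumed definition).

    For each true class, take the best F-score over all predicted
    clusters, then average these scores weighted by class size.
    """
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(pred_labels)
    overlap = Counter(zip(true_labels, pred_labels))

    score = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        score += (n_i / n) * best
    return score


# Toy data (illustrative only, not from the paper).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
pred = [0, 0, 1, 1]
truth = ["a", "a", "b", "b"]
print(avg_intra_cluster_distance(points, pred))
print(clustering_f_measure(truth, pred))
```

Under these definitions, a lower intra-cluster distance indicates more compact clusters, while an F-measure closer to 1 indicates closer agreement between the clusters and the known class labels.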
Cite this article
Singh H, Dubey AK. Enhancing clustering performance: an analysis of the clustering based on arithmetic optimization algorithm. International Journal of Advanced Technology and Engineering Exploration. 2024;11(117):1169-1182. DOI:10.19101/IJATEE.2023.10102298