Unite and conquer approach for data clustering based on particle swarm optimization and moth flame optimization

Document Type: Research Article

Authors

1 Department of Computer Science, Yazd University, Yazd, Iran.

2 Parallel Processing Laboratory, Yazd University, Yazd, Iran.

3 Researcher at Oncober, Basel, Switzerland.

Abstract

Data clustering is a widely used technique in various domains for grouping data objects according to their similarity. Clustering molecules is a useful process for subdividing large datasets into smaller clusters of compounds with similar properties, which are easier to manipulate. Comparing molecules and their similarities in this way can help to discover new molecules with optimal properties and the desired biological activity. A prominent clustering technique is the k-means algorithm, which assigns each data object to the nearest cluster center. However, this algorithm depends on the initial selection of the cluster centers, which can affect both its convergence and the quality of its result. To address this issue, metaheuristic algorithms, a type of approximate optimization algorithm capable of identifying near-optimal solutions, have been proposed. In this paper, a new metaheuristic approach is proposed that combines two algorithms, particle swarm optimization (PSO) and moth flame optimization (MFO), and it is then used to improve data clustering. The efficiency of the proposed approach is evaluated on the benchmark functions F1-F23 and compared with that of the PSO and MFO algorithms on different datasets. Our experimental results show that the proposed approach outperforms the PSO and MFO algorithms in terms of convergence speed and clustering quality.
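The nearest-center assignment described in the abstract, and the sensitivity to the choice of centers, can be illustrated with a minimal sketch. The function names, the toy data, and the use of the sum of squared Euclidean distances as the clustering cost are assumptions made here for illustration; the abstract does not state which objective the proposed PSO-MFO approach minimizes.

```python
import math

def assign_to_nearest(points, centers):
    """Return, for each point, the index of its closest center (Euclidean distance)."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]

def clustering_cost(points, centers):
    """Sum of squared distances from each point to its nearest center.

    Assumed objective: a metaheuristic such as PSO or MFO could treat a set
    of centers as one candidate solution and try to minimize this value.
    """
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)

# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]

# Two candidate sets of centers: one per group, and both inside the same group.
good_centers = [(0.0, 0.5), (10.0, 10.5)]
bad_centers = [(0.0, 0.0), (0.0, 1.0)]

labels = assign_to_nearest(points, good_centers)
good_cost = clustering_cost(points, good_centers)
bad_cost = clustering_cost(points, bad_centers)
```

With these toy values the second set of centers yields a much larger cost than the first, which is the kind of poor starting point that a search over candidate centers is meant to escape.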

