A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data
Journal of AI and Data Mining
Volume 8, Issue 4, Bahman 2020, Pages 515-523
Article type: Original/Review Paper
DOI: 10.22044/jadm.2020.9021.2038
Authors
J. Tayyebi 1; E. Hosseinzadeh* 2
1 Department of Industrial Engineering, Birjand University of Technology, Birjand, Iran.
2 Department of Mathematics, Kosar University of Bojnord, Bojnord, Iran.
Abstract
The fuzzy c-means (FCM) clustering algorithm is a useful clustering tool, but in its standard form it applies only to crisp, complete data. This article proposes an extension of the algorithm that is suitable for clustering trapezoidal fuzzy data, using a linear ranking function to define a distance between trapezoidal fuzzy numbers. As an application, a method based on the proposed algorithm is presented for clustering incomplete fuzzy data. The method replaces each missing attribute with a trapezoidal fuzzy number determined from the corresponding attribute of the q nearest neighbors. Comparisons and analysis of the experimental results demonstrate the capability of the proposed method.
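The two ingredients named in the abstract, a ranking-based distance and nearest-neighbor imputation by a trapezoidal fuzzy number, can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the ranking function (the mean of the four defining points), the partial distance used to find neighbors, the rule that builds the trapezoid from the neighbor values (minimum, lower median, upper median, maximum), and all function names.

```python
from typing import Optional, Sequence, Tuple

Trapezoid = Tuple[float, float, float, float]  # (a1, a2, a3, a4), a1 <= a2 <= a3 <= a4
Record = Sequence[Optional[float]]             # crisp record; None marks a missing attribute


def rank(t: Trapezoid) -> float:
    """Assumed linear ranking function: mean of the four defining points."""
    return sum(t) / 4.0


def crisp(x: float) -> Trapezoid:
    """Embed a crisp value as a degenerate trapezoidal fuzzy number."""
    return (x, x, x, x)


def fuzzy_distance(u: Sequence[Trapezoid], v: Sequence[Trapezoid]) -> float:
    """Euclidean distance between two fuzzy records after ranking each attribute."""
    return sum((rank(a) - rank(b)) ** 2 for a, b in zip(u, v)) ** 0.5


def partial_distance(x: Record, y: Record) -> Optional[float]:
    """Mean squared difference over attributes observed in both records."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return None
    return sum((a - b) ** 2 for a, b in shared) / len(shared)


def impute(records: Sequence[Record], i: int, j: int, q: int) -> Trapezoid:
    """Build a trapezoid for missing attribute j of record i from its q nearest neighbors."""
    candidates = []
    for k, other in enumerate(records):
        if k == i or other[j] is None:
            continue  # a neighbor must have attribute j observed
        d = partial_distance(records[i], other)
        if d is not None:
            candidates.append((d, other[j]))
    if not candidates:
        raise ValueError("no neighbor has attribute j observed")
    candidates.sort(key=lambda pair: pair[0])
    values = sorted(value for _, value in candidates[:q])
    n = len(values)
    # Assumed rule: support = [min, max], core = [lower median, upper median].
    return (values[0], values[(n - 1) // 2], values[n // 2], values[-1])


records = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.4],
    [1.2, 2.2, 3.2],
    [8.0, 9.0, 10.0],
]
print(impute(records, 0, 2, q=3))  # (3.0, 3.2, 3.2, 3.4)
```

Once every missing entry is replaced this way and every observed entry is embedded with `crisp`, all records are vectors of trapezoidal fuzzy numbers, so a standard FCM loop could run on them with `fuzzy_distance` in place of the usual Euclidean distance. Whether this matches the authors' construction in detail cannot be confirmed from the abstract alone.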
Keywords
Fuzzy c-means algorithm; Incomplete data; Fuzzy data; Ranking function