International Journal of Advanced Technology and Engineering Exploration (IJATEE) ISSN (P): 2394-5443 ISSN (O): 2394-7454 Vol - 8, Issue - 75, February 2021
Comparison of affinity degree classification with four different classifiers in several data sets

Rosyazwani Mohd Rosdan, Wan Suryani Wan Awang and Wan Aezwani Wan Abu Bakar

Abstract

The notion of affinity has been widely used across research fields. In this research, affinity is employed to find the degree between two data sets and to classify through prediction. Because Affinity Degree (AD) classification is a new technique, it must be compared with other classification types to test its compatibility. This study therefore compares several machine learning techniques and determines the most efficient classification technique for each data set. Four classification algorithms, K-Nearest Neighbour (KNN), Naive Bayes (NB), Decision Tree (J48), and Support Vector Machine (SVM), were compared with AD classification. Three data sets, breast cancer, acute inflammation, and iris plant, were used in the experiments. The results show that J48 achieves the best performance measures compared with the other four classifiers. However, the results of AD classification are significant enough to suggest that further study could improve it.
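As a rough illustration of the kind of evaluation described above, the sketch below implements one of the compared classifiers, KNN, in plain Python and measures its accuracy on a small held-out set. The toy data, the function names `knn_predict` and `accuracy`, and the choice of k = 3 are assumptions made for illustration only; this is not the AD technique and not the authors' experimental setup.

```python
import math
from collections import Counter

# Toy two-class data (hypothetical values, not from the paper's data sets).
TRAIN = [
    ((1.0, 1.1), "a"), ((1.2, 0.9), "a"), ((0.8, 1.0), "a"),
    ((3.0, 3.2), "b"), ((3.1, 2.9), "b"), ((2.9, 3.0), "b"),
]
TEST = [
    ((1.1, 1.0), "a"), ((3.0, 3.0), "b"),
    ((0.9, 1.2), "a"), ((2.8, 3.1), "b"),
]


def knn_predict(train, x, k=3):
    """Return the majority label among the k training points nearest to x."""
    # Sort training points by Euclidean distance to x and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


if __name__ == "__main__":
    print(f"KNN accuracy (k=3): {accuracy(TRAIN, TEST):.2f}")
```

The same `accuracy` helper could be reused with any other prediction function, which is one simple way to put several classifiers on an equal footing when comparing them on the same data set.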

Keywords

Affinity degree (AD), K-nearest neighbour (KNN), Naive Bayes (NB), Decision tree (J48), Support vector machine (SVM).

Cite this article

Rosdan RM, Awang WS, Abu Bakar WA
