International Journal of Advanced Computer Research (IJACR) ISSN (P): 2249-7277 ISSN (O): 2277-7970 Vol - 13, Issue - 62, March 2023
An analysis and literature review of algorithms for frequent itemset mining

Mrinabh Kumar and Animesh Kumar Dubey

Abstract

The data mining process should be guided by domain knowledge. It involves several aspects, including data selection, interpretation, extraction, and transformation. This paper analyzes various data mining algorithms across different domains. The main emphasis is on algorithms used to extract and discover interesting patterns and relationships. The algorithms discussed include sequential pattern discovery using equivalence classes (SPADE), k-means, Apriori, and FP-Growth, among others. The advantages and limitations of these approaches are reviewed and analyzed. In summary, this paper provides a comprehensive overview of data mining approaches and their potential applications in various fields.
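
As an illustration of the frequent itemset mining named in the title, the following is a minimal sketch of the level-wise Apriori search. It is not the authors' implementation. The function name `apriori`, the parameter `min_support` (taken here as an absolute transaction count), and the toy transactions are assumptions introduced for this example only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset whose count
    is at least min_support (an absolute count, assumed for this sketch)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(frequent), 2)
                      if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # Count step: scan the transactions once per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy data, not taken from the paper.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
for itemset, count in apriori(baskets, min_support=2).items():
    print(sorted(itemset), count)
```

The sketch rescans the transaction list at every level. FP-Growth avoids that repeated scanning by compressing the transactions into a prefix tree.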

Keywords

Data mining, Domain knowledge, Preprocessing, Knowledge discovery.

Cite this article

Kumar M, Dubey AK
