International Journal of Advanced Computer Research (IJACR) ISSN (Print): 2249-7277 ISSN (Online): 2277-7970 Volume - 14 Issue - 67 June - 2024
Enhancing kNN classification with crow search optimization for dynamic text-based data categorization

Anam Atique and MD. Adil Hashmi

Abstract

This paper proposes integrating the k-nearest neighbors (kNN) algorithm with crow search optimization (CSO) to tackle the challenges of text-based data classification. The kNN algorithm is combined with CSO's robust optimization capabilities to dynamically select the optimal 'k' value and feature set, enhancing the adaptability and accuracy of text classification. The evolving nature of text data, particularly in high-volume academic and business contexts, demands efficient and adaptive classification methods that can handle varying data distributions and feature relevance. The kNN-CSO method addresses these requirements by using CSO to fine-tune kNN parameters so that performance metrics remain high across different datasets. The results demonstrate the method's efficacy, particularly in handling ambiguities and optimizing classification under varying conditions, yielding an accuracy of up to 96%, precision of up to 95%, and an F1-score of up to 95%. Beyond improving classification accuracy, the approach enhances the model's ability to adapt to new or evolving data, providing a significant advancement in automated document processing and categorization.
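
The abstract does not give implementation details, so the following is only a minimal sketch of how a crow search could select 'k' and a feature subset for kNN. It is not the authors' implementation. The position encoding (first coordinate mapped to k, remaining coordinates thresholded into a feature mask), the parameter names (awareness, flight, k_max), the validation-accuracy fitness, and the synthetic three-feature dataset standing in for text features are all assumptions made for illustration.

```python
import math
import random

def knn_predict(train, query, k, mask):
    """Majority vote among the k nearest training points, using masked features only."""
    dists = sorted(
        (math.sqrt(sum((a - b) ** 2 for a, b, m in zip(x, query, mask) if m)), y)
        for x, y in train
    )
    votes = {}
    for _, y in dists[:k]:
        votes[y] = votes.get(y, 0) + 1
    return max(votes, key=votes.get)

def decode(position, k_max):
    """Assumed encoding: position[0] -> k in 1..k_max, the rest -> feature mask."""
    k = 1 + int(position[0] * (k_max - 1) + 0.5)
    mask = [p > 0.5 for p in position[1:]]
    return k, mask

def fitness(position, train, valid, k_max):
    """Validation accuracy of kNN under the decoded k and feature mask."""
    k, mask = decode(position, k_max)
    if not any(mask):
        return 0.0  # an empty feature set cannot classify anything
    hits = sum(knn_predict(train, x, k, mask) == y for x, y in valid)
    return hits / len(valid)

def crow_search(train, valid, n_features, k_max=9, n_crows=8,
                n_iter=20, awareness=0.1, flight=2.0, seed=0):
    rng = random.Random(seed)
    dim = n_features + 1
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_crows)]
    mem = [p[:] for p in pos]  # each crow's best position found so far
    mem_fit = [fitness(p, train, valid, k_max) for p in pos]
    for _ in range(n_iter):
        for i in range(n_crows):
            j = rng.randrange(n_crows)  # crow i follows a randomly chosen crow j
            if rng.random() >= awareness:
                # crow j is unaware: move toward its memorized position
                r = rng.random()
                new = [min(1.0, max(0.0, x + r * flight * (m - x)))
                       for x, m in zip(pos[i], mem[j])]
            else:
                # crow j is aware: crow i is sent to a random position
                new = [rng.random() for _ in range(dim)]
            pos[i] = new
            f = fitness(new, train, valid, k_max)
            if f > mem_fit[i]:
                mem[i], mem_fit[i] = new[:], f
    best = max(range(n_crows), key=lambda i: mem_fit[i])
    k, mask = decode(mem[best], k_max)
    return k, mask, mem_fit[best]

def make_data(n, rng):
    """Synthetic stand-in data: feature 0 is informative, features 1 and 2 are noise."""
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        data.append(([y + rng.gauss(0, 0.3), rng.random(), rng.random()], y))
    return data

data_rng = random.Random(1)
train, valid = make_data(40, data_rng), make_data(20, data_rng)
k, mask, acc = crow_search(train, valid, n_features=3)
print("selected k:", k, "feature mask:", mask, "validation accuracy:", acc)
```

In a real text pipeline the feature vectors would come from a document representation such as term weights, and the fitness would normally be cross-validated rather than measured on a single hold-out split.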

Keywords

kNN, Crow search optimization, Text classification, Dynamic data adaptation, Feature optimization.

Cite this article

Atique A, Hashmi MA. Enhancing kNN classification with crow search optimization for dynamic text-based data categorization. International Journal of Advanced Computer Research. 2024;14(67):50-55. DOI:10.19101/IJACR.2024.1466009
