International Journal of Advanced Computer Research (IJACR) ISSN (P): 2249-7277 ISSN (O): 2277-7970 Vol - 8, Issue - 35, March 2018
Arabic root extraction using a hybrid technique

Hayel Khafajeh, Nidal Yousef and Mahmoud Abdeldeen

Abstract

Root extraction is one of the main text operations: it conflates word variants by reducing each word to its root. This process aims to overcome the morphological richness problem of the Arabic language. Root extraction provides valuable support to many natural language processing applications, such as information retrieval, machine translation, and text summarization. In this research, a hybrid technique for extracting Arabic word roots has been developed. The proposed technique relies on an optimization function, an enhancement step that applies a set of non-morphological rules to improve the n-gram technique. The proposed technique is tested on a dataset containing more than 6000 distinct words belonging to 141 different roots. The results show a marked improvement with the hybrid method: the proposed technique correctly extracts about 99% of strong triliteral (tripartite) roots and about 86% of triliteral roots that contain vowel letters.
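
As a rough illustration of the n-gram idea mentioned in the abstract, the sketch below scores a word against a list of candidate roots and returns the best match. It is not the authors' method: the abstract does not specify the n-gram variant, the similarity measure, or the non-morphological rules. The use of ordered letter pairs, the Dice coefficient, the diacritic-stripping step, and all names (`normalize`, `ordered_pairs`, `dice`, `extract_root`, `ROOTS`) are assumptions made only for this example.

```python
from itertools import combinations

# Assumption: short-vowel marks (U+064B..U+0652) are stripped before matching.
# This is a placeholder preprocessing step, not one of the paper's rules.
DIACRITICS = {chr(code) for code in range(0x064B, 0x0653)}


def normalize(word):
    """Remove Arabic diacritics so only base letters are compared."""
    return "".join(ch for ch in word if ch not in DIACRITICS)


def ordered_pairs(text):
    """Return the set of letter pairs (a, b) where a precedes b in text.

    Non-adjacent pairs are included because Arabic root letters are
    often separated by infixed letters.
    """
    return {a + b for a, b in combinations(text, 2)}


def dice(word, root):
    """Dice coefficient between the ordered-pair sets of word and root."""
    word_pairs, root_pairs = ordered_pairs(word), ordered_pairs(root)
    if not word_pairs or not root_pairs:
        return 0.0
    shared = len(word_pairs & root_pairs)
    return 2 * shared / (len(word_pairs) + len(root_pairs))


def extract_root(word, candidate_roots):
    """Pick the candidate root most similar to the normalized word."""
    word = normalize(word)
    return max(candidate_roots, key=lambda root: dice(word, root))


# Hypothetical toy root list; the paper's dataset covers 141 roots.
ROOTS = ["كتب", "درس", "علم"]

for sample in ["كاتب", "مدرسة", "معلوم"]:
    print(sample, "->", extract_root(sample, ROOTS))
```

A similarity-only matcher like this one has no special handling for roots whose vowel letters change or drop out in derived forms, which is the kind of case where additional rules would be expected to help.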

Keywords

Arabic root extraction, Natural language processing, Hybrid technique, Similarity.
