Enhancing Arabic fake news detection with a hybrid MLP-SVM approach and Doc2Vec embeddings
Noralhuda Alabid and Hawraa Ali Taher
Abstract
The rapid global spread of the COVID-19 pandemic was accompanied by widespread misinformation and fake news about the virus, which caused confusion, fear, and anxiety among citizens. Automatic methods that can detect such misinformation effectively are therefore needed. Various machine learning (ML) and deep learning (DL) approaches have been applied to this task, each with its own strengths and weaknesses, and combining multiple approaches has shown promising results. This study proposes a hybrid model that combines a multi-layer perceptron (MLP) with a support vector machine (SVM) to detect Arabic fake news in the ArCOV19-Rumors dataset. The MLP extracts relevant features, and the SVM makes the final classification decision. The method uses Doc2Vec, a widely used document embedding technique, to convert documents into numerical vectors while preserving their semantic and syntactic information. In the experiments, the hybrid MLP-SVM model achieved an accuracy of 87%, compared with 84% for the standalone MLP and 83% for the standalone SVM. The models were evaluated using accuracy, precision, recall, and F1-score. The results suggest that combining complementary algorithms can improve classification performance in fake news detection.
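The two-stage idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: synthetic vectors from scikit-learn's `make_classification` stand in for Doc2Vec embeddings of the tweets, the helper `hidden_features` is a hypothetical name, and using the first hidden layer of the MLP as the feature extractor, along with every hyperparameter shown, is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Assumption: synthetic 100-dimensional vectors stand in for Doc2Vec
# document embeddings; the labels stand in for fake/real annotations.
X, y = make_classification(
    n_samples=600, n_features=100, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Stage 1: train an MLP so that its hidden layer learns task-specific features.
mlp = MLPClassifier(
    hidden_layer_sizes=(64,), activation="relu", max_iter=500, random_state=0
)
mlp.fit(X_train, y_train)


def hidden_features(model, inputs):
    """Return the ReLU activations of the MLP's first hidden layer."""
    return np.maximum(0.0, inputs @ model.coefs_[0] + model.intercepts_[0])


# Stage 2: train an SVM on the hidden-layer features and let it make
# the final classification decision.
svm = SVC(kernel="rbf", C=1.0)
svm.fit(hidden_features(mlp, X_train), y_train)
y_pred = svm.predict(hidden_features(mlp, X_test))

# Evaluate with the four metrics named in the abstract.
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary"
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

With real data, the synthetic matrix `X` would be replaced by one embedding vector per document. The scores printed here describe the synthetic data only and say nothing about the figures reported in the study.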
Keywords
Fake news detection, Hybrid model, Multi-layer perceptron (MLP), Support vector machine (SVM), Doc2Vec embeddings, Arabic text classification.
Cite this article
Alabid N, Taher HA. Enhancing Arabic fake news detection with a hybrid MLP-SVM approach and Doc2Vec embeddings. International Journal of Advanced Technology and Engineering Exploration. 2024; 11(121):1732-1746. DOI:10.19101/IJATEE.2024.111100949