International Journal of Advanced Computer Research (IJACR) ISSN (P): 2249-7277 ISSN (O): 2277-7970 Vol - 8, Issue - 39, November 2018
Word similarity score as augmented feature in sarcasm detection using deep learning

Joseph Tarigan and Abba Suganda Girsang

Abstract

Sarcasm detection is an important task in natural language processing (NLP). Sarcasm flips the polarity of a sentence and therefore degrades the accuracy of sentiment analysis. Recent research applies machine learning and deep learning methods to detect sarcasm. Sarcasm can be detected through the occurrence of context disparity, which can be observed in the similarity scores between the words in a sentence. Word embedding vectors are used to calculate the word similarity scores. In this work, the word similarity score is incorporated as an augmented feature in a deep learning model, and three augmenting schemes are evaluated. Results show that, in general, the word similarity score boosts the performance of the classifier. The best configuration achieved an accuracy of 85.625% with an F-measure of 84.884%.
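
The idea of scoring word similarity from embedding vectors can be illustrated with a minimal sketch. It is not the authors' implementation: the toy three-dimensional vectors, the function names `cosine` and `similarity_features`, and the choice of minimum, maximum, and mean pairwise cosine similarity as the summary features are all assumptions made for illustration. A real system would load pretrained embeddings instead of the hand-written dictionary.

```python
import math
from itertools import combinations

# Toy 3-dimensional embeddings, invented for illustration only.
# A real system would load pretrained vectors (for example word2vec).
EMBEDDINGS = {
    "love": [0.9, 0.1, 0.0],
    "enjoy": [0.8, 0.2, 0.1],
    "traffic": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_features(tokens, embeddings=EMBEDDINGS):
    """Return (min, max, mean) pairwise cosine similarity over known words.

    Words without an embedding are skipped. If fewer than two known words
    remain, all three features default to 0.0.
    """
    known = [t for t in tokens if t in embeddings]
    scores = [
        cosine(embeddings[a], embeddings[b]) for a, b in combinations(known, 2)
    ]
    if not scores:
        return (0.0, 0.0, 0.0)
    return (min(scores), max(scores), sum(scores) / len(scores))


if __name__ == "__main__":
    # "love" and "traffic" point in nearly orthogonal directions here,
    # so the minimum similarity is low, hinting at context disparity.
    print(similarity_features(["i", "love", "traffic"]))
    print(similarity_features(["love", "enjoy"]))
```

Under these assumptions, a low minimum similarity between words in the same sentence is the signal of context disparity, and the resulting tuple is the kind of vector that could be supplied to a classifier alongside its learned features.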

Keywords

Sarcasm detection, Word incongruity, Deep learning, Augmented feature.
