Generation of relation-extraction-rules based on Markov logic network for document classification
M.D.S Seneviratne, K.S.D Fernando and D.D Karunaratne
Abstract
Classifying documents into predefined classes is an essential task, especially when extracting information from huge resources such as the web. Although a considerable amount of work has been carried out to classify documents into groups according to subject domain or other attributes, it remains a big challenge in large-scale, high-dimensional document spaces. A number of techniques have been presented, along with suggested improvements, to achieve a higher degree of success in document classification. In this paper, a novel rule-based method for document classification that incorporates relation extraction techniques is proposed. Text classification techniques that involve thousands of words, document features, or numerous patterns of word combinations can be replaced by a set of rules involving a much smaller number of entities and relations. We further discuss the effectiveness of relation extraction rules in document classification, using Markov logic networks to learn the weights of the rules efficiently. Our experimental results show that applying relation extraction rules to document classification yields very high precision in the selected domain. We also demonstrate the applicability of our method on a benchmark text corpus with good performance measures.
Keywords
Document classification, Relation extraction, Entity, Markov logic network, Relation.
Cite this article
Seneviratne M, Fernando K, Karunaratne D. Generation of relation-extraction-rules based on Markov logic network for document classification. International Journal of Advanced Computer Research. 2019; 9(41):94-111. DOI:10.19101/IJACR.2018.838015