International Journal of Advanced Technology and Engineering Exploration (IJATEE) ISSN (P): 2394-5443 ISSN (O): 2394-7454 Vol - 8, Issue - 84, November 2021
An application of logistic model tree (LMT) algorithm to ameliorate prediction accuracy of meteorological data

Sheikh Amir Fayaz, Majid Zaman and Muheet Ahmed Butt

Abstract

Traditional and ensemble methods built on linear models are among the most popular techniques for learning tasks that predict nominal or numerical values. In this study, we demonstrate the concept and working of the Logistic Model Tree (LMT) algorithm, which adapts the model-tree idea to classification problems by using logistic regression in place of linear regression. The study briefly describes the analytical and mathematical implementation of LMT on geographical data for the prediction of rainfall. A step-wise approach is used to construct the LMT: a decision tree inducer (C4.5) supplies the splitting criterion, and logistic regression functions are used for pruning, with standard regression errors calculated at each node using Cost-Complexity Pruning (CCP). This work assesses the ability of LMT to predict rainfall across the Kashmir province of the Union Territory of Jammu & Kashmir, India. The implementation is based on six years of historical geographical data for Kashmir province. The data were collected from three substations and comprise four explanatory variables (maximum temperature, minimum temperature, and humidity measured at 12 A.M. and at 3 P.M.) together with a target variable indicating the presence or absence of rain. Overall, LMT achieves an accuracy of 87.23%. We then compare the performance of LMT with several other algorithms on the same data set and show that LMT produces more accurate and more compact results.
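The two building blocks named in the abstract, an entropy-based split and a logistic function fitted at a node, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The humidity values and rain labels are made-up toy data, and the function names (`information_gain`, `best_threshold`, `fit_leaf_logistic`, `predict_leaf`) are hypothetical. Two simplifications are assumed: the split uses plain information gain rather than the full C4.5 procedure, and the node model is fitted by batch gradient descent on a single centered feature instead of the boosting-based fitting used in published LMT descriptions.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

def best_threshold(values, labels):
    """Try midpoints between consecutive distinct values; keep the best gain."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: information_gain(values, labels, t))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_leaf_logistic(values, labels, lr=0.5, epochs=2000):
    """Fit p(rain) = sigmoid(b + w * (x - mean)) by batch gradient descent."""
    mean = sum(values) / len(values)
    b, w = 0.0, 0.0
    for _ in range(epochs):
        grad_b = grad_w = 0.0
        for x, y in zip(values, labels):
            err = sigmoid(b + w * (x - mean)) - y
            grad_b += err
            grad_w += err * (x - mean)
        b -= lr * grad_b / len(values)
        w -= lr * grad_w / len(values)
    return mean, b, w

def predict_leaf(model, x):
    """Probability of rain for feature value x under a fitted node model."""
    mean, b, w = model
    return sigmoid(b + w * (x - mean))

# Toy data (assumed for illustration): humidity as a fraction, rain as 0/1.
humidity = [0.40, 0.45, 0.50, 0.55, 0.80, 0.85, 0.90, 0.95]
rain = [0, 0, 0, 0, 1, 1, 1, 1]

split = best_threshold(humidity, rain)
model = fit_leaf_logistic(humidity, rain)
print("split threshold:", split)
print("gain at split:", information_gain(humidity, rain, split))
print("p(rain | humidity=0.90):", predict_leaf(model, 0.90))
```

In a full LMT, the tree is grown recursively over all explanatory variables, and each node carries its own logistic model; the sketch shows only a single split and a single node model on one feature.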

Keywords

C4.5, Logistic regression, Logistic model tree, Pruning, Information gain.

Cite this article

Fayaz SA, Zaman M, Butt MA
