Prediction of mathematics performance using educational data mining techniques
Paul K Mushi and Daniel Ngondya
Abstract
Higher Learning Institutions (HLIs) now store large amounts of student data. However, these data are seldom used to address students’ academic problems at the institutions, such as poor performance in particular courses. Educational Data Mining (EDM) can be applied to predict student performance from the datasets held at HLIs. This study aimed to address poor performance in mathematics among management degree students at HLIs using EDM techniques, with Mzumbe University (MU) in Morogoro, Tanzania, as a case study. A quantitative research approach was applied, following the design science steps. Secondary data were collected to create the dataset through a review of documents from the examination, admission, accommodation, and accounts offices, as well as the Department of Mathematics and Statistics, at the Main and Mbeya campuses of MU. Several Machine Learning (ML) algorithms were applied to the training set (60%), including K-Nearest Neighbor (K-NN), Random Forest (RF), Decision Tree (DT), Support Vector Classification (SVC), and Multilayer Perceptron (MLP). The algorithms were validated using 10-fold cross-validation and a validation set (20%), and RF, DT, and K-NN were found to perform best. Further evaluation of these three algorithms using 20% of the dataset showed that RF was the best algorithm for developing a model to predict mathematics performance, with an accuracy of 99% and F1-scores of 99% and 100% for the fail and pass classes, respectively. Moreover, DT could generate rules that can be used to recommend a minimum grade of D in ordinary level mathematics for admission to management degrees at the University, in order to reduce failure rates at HLIs.
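The workflow summarised in the abstract (a 60/20/20 split, 10-fold cross-validation on the training set, comparison of the five classifiers, and rule extraction from the decision tree) might be sketched as follows. This is only an illustrative sketch using scikit-learn, which the abstract does not name. The data are synthetic, the feature names are placeholders, and the labelling of class 0 as fail and class 1 as pass is an assumption, as are all hyperparameters and the choice to pick the final model by validation accuracy alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the student dataset (assumption: the MU data are not
# reproduced here). Class 0 is treated as "fail" and class 1 as "pass".
X, y = make_classification(n_samples=500, n_features=8, n_informative=5,
                           random_state=0)

# 60% training, 20% validation, 20% test, stratified by class.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.6, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)

# The five algorithm families named in the abstract; settings are illustrative.
models = {
    "K-NN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "DT": DecisionTreeClassifier(max_depth=3, random_state=0),
    "SVC": make_pipeline(StandardScaler(), SVC()),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=1000, random_state=0)),
}

results = {}
for name, model in models.items():
    # 10-fold cross-validation on the training set only.
    cv_acc = cross_val_score(model, X_train, y_train, cv=10,
                             scoring="accuracy").mean()
    # Refit on the full training set and score on the validation set.
    model.fit(X_train, y_train)
    val_acc = accuracy_score(y_val, model.predict(X_val))
    results[name] = (cv_acc, val_acc)
    print(f"{name}: cv_accuracy={cv_acc:.3f} val_accuracy={val_acc:.3f}")

# Simplification: choose the final model by validation accuracy alone.
best_name = max(results, key=lambda n: results[n][1])
y_pred = models[best_name].predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
f1_per_class = f1_score(y_test, y_pred, average=None)  # [class 0, class 1]
print(f"best={best_name} test_accuracy={test_accuracy:.3f} f1={f1_per_class}")

# Human-readable if/else rules from the fitted decision tree.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
tree_rules = export_text(models["DT"], feature_names=feature_names)
print(tree_rules)
```

With real admission data, the thresholds printed by `export_text` are the kind of rule that could be read as an entry requirement. The scores from this synthetic example say nothing about the results reported in the study.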
Keywords
Educational data mining, Machine learning, Random forest, Decision tree, Mathematics, Performance prediction.
Cite this article
Mushi PK, Ngondya D. Prediction of mathematics performance using educational data mining techniques. International Journal of Advanced Computer Research. 2021;11(56):83-102. DOI:10.19101/IJACR.2021.1152024