International Journal of Advanced Technology and Engineering Exploration (IJATEE) ISSN (P): 2394-5443 ISSN (O): 2394-7454 Vol - 8, Issue - 75, February 2021
Development of data-to-text (D2T) on generic data using fuzzy sets

Lala Septem Riza, Muhammad Ridwan, Enjun Junaeti and Khyrina Airin Fariza Abu Samah

Abstract

Data-to-text (D2T) translates non-linguistic data into textual form. As technology develops, however, D2T systems must accommodate data from many fields and a wide variety of users. This study aims to develop a D2T system that takes generic data as input, so that it can receive data from any field or domain, whether or not the data come with header information, data types, or rules. Fuzzy rule-based systems are then used to interpret the data in general terms. The system produces three kinds of information: data summaries, information on the newest data, and predictive information. It is implemented in the R programming language using several available packages. The experiments measure the readability of the generated news, measure the computation time, and compare the results with related research. The experimental results show that the generated information represents the data provided and can be understood by students, even at the elementary school level, and that the computation time is quite good.
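As a rough illustration of how fuzzy sets can turn a numeric reading into a word when no domain rules are supplied, the sketch below derives three linguistic terms from the observed range of a column and describes the newest value. It is not the authors' R implementation: it is written in Python, and the function names (`memberships`, `describe`), the choice of three terms, and the triangular membership shapes are all assumptions made for this example.

```python
def memberships(x, lo, hi):
    """Degrees of membership of x in 'low', 'medium' and 'high' over [lo, hi].

    Assumed shapes: 'low' and 'high' are shoulders that fall to zero at the
    midpoint, and 'medium' is a triangle peaking at the midpoint.
    """
    if hi == lo:
        # A constant column has no spread, so treat every value as 'medium'.
        return {"low": 0.0, "medium": 1.0, "high": 0.0}
    # Normalised position of x in [0, 1], clamped for out-of-range values.
    t = min(max((x - lo) / (hi - lo), 0.0), 1.0)
    return {
        "low": max(0.0, 1.0 - 2.0 * t),
        "medium": 1.0 - abs(2.0 * t - 1.0),
        "high": max(0.0, 2.0 * t - 1.0),
    }


def describe(values, name="value"):
    """Return a one-sentence description of the newest item in values."""
    lo, hi = min(values), max(values)
    latest = values[-1]
    degrees = memberships(latest, lo, hi)
    # Pick the term with the largest membership degree.
    label = max(degrees, key=degrees.get)
    return f"The latest {name} is {label} ({latest})."


if __name__ == "__main__":
    print(describe([21, 24, 30, 29, 22], name="temperature"))
```

Because the range is taken from the data itself, the same function can be applied to any numeric column, which is the kind of domain independence the abstract describes; a real system would likely let known domain rules override these default partitions.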

Keywords

Data-to-text, Natural language generation, Machine learning, General purpose, General corpora, Fuzzy rule-based system, Time-series analysis, Linear regression, Knuth-Morris-Pratt.

Cite this article

Riza LS, Ridwan M, Junaeti E, Samah KA
