Deciphering the past: enhancing Assyrian Cuneiform recognition with YOLOv8 object detection
Elaf A. Saeed, Ammar D. Jasim and Munther A. Abdul Malik
Abstract
Cuneiform writing offers insight into our distant past. Emerging in the latter part of the 4th millennium BCE, cuneiform script is among the earliest known writing systems, alongside Egyptian hieroglyphs. It is believed to have originated with the Sumerians in southern Mesopotamia. Used for nearly three thousand years, it was eventually replaced by more accessible alphabet-based systems. Cuneiform texts were inscribed on various materials, but clay tablets were preferred because clay was readily available. Over 500,000 cuneiform documents have been found, and many have yet to be analyzed by philologists. This backlog highlights the need for effective methods to study the extensive body of cuneiform writings, which has traditionally been examined manually. Deciphering ancient tablets by hand is time-consuming and requires extensive expertise. This study aimed to detect signs on Assyrian cuneiform tablets using a pretrained YOLOv8 object detection model. About 900 images of Assyrian tablets from the Iraq Museum were compiled and expanded to over 2000 through preprocessing and augmentation. This led to the identification of 11 new Assyrian references, with a mean average precision at an intersection-over-union threshold of 50% (mAP@50) of 82.7%, a precision of 71.3%, and a recall of 85.6%. The research facilitates the detection of cuneiform signs, as well as the selection and pronunciation of the modern Assyrian dialect, and helps researchers read tablets with the aid of a pretrained model.
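As an illustration of the reported metrics, the following is a minimal sketch, not the authors' evaluation code. It shows how the intersection over union (IoU) between a predicted and a ground-truth box is computed, which is the quantity thresholded at 50% in mAP at 50%, and how an F1 score can be derived from the reported precision and recall. The function names `iou` and `f1_score` and the example boxes are hypothetical, and the F1 value is derived here for illustration rather than reported in the article.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical boxes: a prediction shifted halfway across its ground truth.
predicted = (5, 0, 15, 10)
ground_truth = (0, 0, 10, 10)
overlap = iou(predicted, ground_truth)
# At the 50% threshold, a detection counts as correct only if IoU >= 0.5.
is_match_at_50 = overlap >= 0.5

# Precision and recall as reported in the abstract (71.3% and 85.6%).
derived_f1 = f1_score(0.713, 0.856)
print(f"IoU = {overlap:.3f}, match at 50% = {is_match_at_50}, F1 = {derived_f1:.3f}")
```

Under these assumptions, a detection that overlaps its ground-truth box by only a third of their union would not count as a match at the 50% threshold, which is why mAP at 50% rewards reasonably well-localized sign boxes.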
Keywords
Cuneiform writing, YOLOv8, Assyrian tablets, Philological analysis, Ancient script decipherment.
Cite this article
Saeed EA, Jasim AD, Malik MA. Deciphering the past: enhancing Assyrian Cuneiform recognition with YOLOv8 object detection. International Journal of Advanced Technology and Engineering Exploration. 2023; 10(109):1604-1621. DOI:10.19101/IJATEE.2023.10102331