Advancing Digital Papyrology: Machine Learning and Blockchain Tools for Modernizing the Study of Ancient Greek Manuscripts

Date
2024
Authors
Swindall, Matthew I.
Publisher
Middle Tennessee State University
Abstract
The study and preservation of ancient Greek papyri pose unique challenges due to the degraded and fragmented state of these manuscripts. While digital imaging has aided in documenting these texts, manual transcription by experts remains a formidable bottleneck. In this work I present novel machine learning and blockchain-based approaches designed to accelerate and streamline the transcription and archiving of papyrological manuscripts. Key to this work is the creation of datasets of ancient Greek character images, annotated through a crowdsourcing initiative, that enable the training of deep learning models for character detection, segmentation, and recognition. Other contributions include augmenting datasets with synthetically generated characters to reduce sampling bias, and techniques for identifying annotation uncertainty via ensemble modeling to improve classification accuracy. The models and algorithms created in these works form the core of a pipeline that combines human oversight with automated processes for diplomatic transcription of papyrus fragments. Current and future work, based on these contributions, includes advances in optical manuscript dating and novel approaches to character spotting. To support collaborative scholarship within the field of papyrology, a blockchain framework utilizing smart contracts and decentralized storage is proposed for managing versions of transcribed texts. Implemented as a prototype, this framework demonstrates feasibility and potential benefits over traditional editorial workflows. Collectively, the methods developed aim to provide an AI-assisted platform tailored for papyrologists and other humanities researchers. By uniting machine learning, human computation, and distributed ledger technologies, this interdisciplinary research proposes a modernized paradigm for studying the ancient world through its surviving manuscripts.
Keywords
Blockchain, Ensemble Modeling, Generative AI, Greek, Optical Character Recognition, Papyrology, Artificial intelligence, Ancient languages, Computer science