Advanced Document Structure Recognition in Devanagari Script Using an Optimized YOLOv8 Pipeline
Published: 2025
Author(s) Name: Shweta Singh, Sudeep Varshney and Ankur Choudhary
Author(s) Affiliation: Sharda University, Greater Noida, Uttar Pradesh, India.
Abstract
Accurate recognition of structural regions in printed documents is vital for downstream processes such as optical character recognition, document retrieval, and digital archiving. Devanagari script poses additional challenges, including dense glyph morphology, the headline stroke (shirorekha) that joins characters, and multi-tiered character constructions, which often reduce the effectiveness of generic document-analysis models. This paper presents a deep learning model based on YOLOv8 for identifying structural elements in printed Devanagari documents. The model is trained and tested on PubLayNet, a large-scale benchmark covering a variety of document layouts, and then adapted to Devanagari script through targeted fine-tuning and region-specific annotations. It detects the fundamental structural components (text blocks, titles, tables, lists, and figures) with high precision and generalizes well to complex Devanagari pages. The experimental findings indicate that YOLOv8 is well suited to fast and accurate document-structure detection and provides a solid basis for larger Indic-language document processing pipelines. The framework supports efficient document digitization and the structural preservation needed by high-quality, script-specific OCR systems.
Keywords: Deep learning, Devanagari script, Document structure detection, Object detection, PubLayNet, YOLOv8.
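As an illustration of the data preparation such a pipeline typically needs, the sketch below converts a PubLayNet-style (COCO-format) bounding box into the normalized label line that YOLO-family detectors are usually trained on. It is a minimal sketch, not the authors' code: the function names are hypothetical, and the category mapping assumes the standard PubLayNet ids (1 = text, 2 = title, 3 = list, 4 = table, 5 = figure).

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: standard PubLayNet category ids (1..5), remapped to 0-based class ids.
PUBLAYNET_TO_YOLO: Dict[int, int] = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
CLASS_NAMES: List[str] = ["text", "title", "list", "table", "figure"]


def coco_box_to_yolo(
    bbox: Sequence[float], img_w: float, img_h: float
) -> Tuple[float, float, float, float]:
    """Convert a COCO box [x_min, y_min, width, height] in pixels into
    YOLO form (x_center, y_center, width, height), normalized to [0, 1]."""
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def annotation_to_label_line(
    category_id: int, bbox: Sequence[float], img_w: float, img_h: float
) -> str:
    """Build one line of a YOLO label file: 'class cx cy w h'."""
    cls = PUBLAYNET_TO_YOLO[category_id]
    cx, cy, w, h = coco_box_to_yolo(bbox, img_w, img_h)
    return f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # Hypothetical table region on a 1000 x 1000 pixel page.
    print(annotation_to_label_line(4, [100, 200, 300, 100], 1000, 1000))
```

The same conversion would apply to region-specific annotations made on Devanagari pages, provided they are stored in the same COCO-style layout; that storage format is an assumption here, not something the abstract states.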