A COMPREHENSIVE EVALUATION OF YOLOv5s AND YOLOv5m FOR DOCUMENT LAYOUT ANALYSIS

Huda Salim; Fadwa S. Mustafa

Published: Jan 19, 2024

Keywords:

Computer Vision, Document Layout Analysis, YOLOv5s, YOLOv5m.

Huda Salim

Department of Computer Engineering Technology, Technical College of Mosul, Northern Technical University, Mosul, Iraq

Fadwa S. Mustafa

Technical Institute of Mosul, Northern Technical University, Mosul, Iraq

Abstract

Document Layout Analysis (DLA) in images is a highly active area of computer vision. At present, deep learning architectures, particularly YOLOv5s and YOLOv5m, are at the forefront of addressing this challenge. This paper examines their performance both qualitatively and quantitatively, measured by Average Precision (AP) on COCO datasets. Significant improvements are observed after fine-tuning on specific datasets, notably books in Arabic and English. A comparative evaluation of YOLOv5m and YOLOv5s for DLA is then presented. Although YOLOv5s reaches 123 Frames Per Second (FPS), exceeding YOLOv5m by 2 FPS, the latter proves to be the better model for DLA systems. It achieves the best overall performance in this study, with a mean Average Precision (mAP) of 94.2%. YOLOv5m's slightly lower FPS is offset by its still respectable detection speed, making it a pragmatic choice for real-world applications where accuracy is paramount.
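To put the reported speed gap in perspective, the following minimal Python sketch converts throughput into average per-frame latency. It is an illustration only, not the authors' evaluation code. The helper name `frame_latency_ms` is hypothetical, and the 121 FPS figure for YOLOv5m is an assumption derived from the abstract's statement that YOLOv5s (123 FPS) is faster by 2.

```python
def frame_latency_ms(fps: float) -> float:
    """Return the average time per frame in milliseconds for a given FPS."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# YOLOv5s: 123 FPS, as stated in the abstract.
latency_s = frame_latency_ms(123)
# YOLOv5m: 121 FPS is an assumption (123 minus the stated gap of 2).
latency_m = frame_latency_ms(121)

# Extra time YOLOv5m spends on each frame under these assumptions.
extra_ms = latency_m - latency_s

print(f"YOLOv5s: {latency_s:.2f} ms per frame")
print(f"YOLOv5m: {latency_m:.2f} ms per frame")
print(f"Difference: {extra_ms:.2f} ms per frame")
```

Under these assumptions the two models differ by well under a millisecond per frame, which is consistent with the abstract's view that the speed cost of YOLOv5m is small relative to its accuracy.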

How to Cite

Huda Salim, & Fadwa S. Mustafa. (2024). A COMPREHENSIVE EVALUATION OF YOLOv5s AND YOLOv5m FOR DOCUMENT LAYOUT ANALYSIS. European Journal of Interdisciplinary Research and Development, 23, 21–33. Retrieved from http://ejird.journalspark.org/index.php/ejird/article/view/957

Issue

Vol. 23 (2024)

Section

Articles
