A COMPREHENSIVE EVALUATION OF YOLOv5s AND YOLOv5m FOR DOCUMENT LAYOUT ANALYSIS

Authors

  • Huda Salim, Department of Computer Engineering Technology, Technical College of Mosul, Northern Technical University, Mosul, Iraq
  • Fadwa S. Mustafa, Technical Institute of Mosul, Northern Technical University, Mosul, Iraq

Keywords:

Computer Vision, Document Layout Analysis, YOLOv5s, YOLOv5m.

Abstract

Document Layout Analysis (DLA) is an active research area within computer vision. Deep learning architectures, particularly YOLOv5s and YOLOv5m, are currently at the forefront of addressing this task. This paper examines their performance both qualitatively and quantitatively, measured by Average Precision (AP) on COCO datasets. Fine-tuning on specific datasets, notably books in Arabic and English, yields significant improvements. In a comparative evaluation for DLA, YOLOv5s reaches 123 Frames Per Second (FPS), exceeding YOLOv5m by 2 FPS, yet YOLOv5m proves to be the better model for DLA systems: it achieves an mAP of 94.2%, the highest of the models in this study. Its slightly lower FPS is offset by a still respectable detection speed, making it a pragmatic choice for real-world applications where accuracy is paramount.
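
To put the reported speed gap in perspective, the following minimal Python sketch converts throughput into mean per-image latency. It is an illustration only, not the authors' benchmarking code. The 123 FPS figure comes from the abstract; the 121 FPS figure for YOLOv5m is an assumption obtained by reading "by 2 units" as 2 FPS, and the helper name `latency_ms` is hypothetical.

```python
def latency_ms(fps: float) -> float:
    """Convert throughput (frames per second) to mean per-image latency in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# 123 FPS is stated in the abstract for YOLOv5s.
FPS_YOLOV5S = 123.0
# ASSUMPTION: "surpassing YOLOv5m by 2 units" is read as a 2 FPS gap.
FPS_YOLOV5M = FPS_YOLOV5S - 2.0

lat_s = latency_ms(FPS_YOLOV5S)
lat_m = latency_ms(FPS_YOLOV5M)
extra_ms = lat_m - lat_s  # additional time YOLOv5m spends per image

print(f"YOLOv5s: {lat_s:.2f} ms/image")
print(f"YOLOv5m: {lat_m:.2f} ms/image")
print(f"Extra latency for YOLOv5m: {extra_ms:.2f} ms/image")
```

Under that assumption the gap is roughly 0.13 ms per image, which is consistent with the abstract's argument that the accuracy advantage of YOLOv5m outweighs its small speed penalty.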

Published

2024-01-19

Section

Articles

How to Cite

A COMPREHENSIVE EVALUATION OF YOLOv5s AND YOLOv5m FOR DOCUMENT LAYOUT ANALYSIS. (2024). European Journal of Interdisciplinary Research and Development, 23, 21-33. https://ejird.journalspark.org/index.php/ejird/article/view/957