This paper focuses on enhancing Bengali Document Layout Analysis (DLA) using
the YOLOv8 model and innovative post-processing techniques. We tackle
challenges unique to the complex Bengali script by employing data augmentation
for model robustness. After meticulous validation set evaluation, we fine-tune
our approach on the complete dataset, leading to a two-stage prediction
strategy for accurate element segmentation. Our ensemble model, combined with
post-processing, outperforms individual base architectures, addressing issues
identified in the BaDLAD dataset. By leveraging this approach, we aim to
advance Bengali document analysis, contributing to improved OCR and document
comprehension and BaDLAD serves as a foundational resource for this endeavor,
aiding future research in the field. Furthermore, our experiments provided key
insights to incorporate new strategies into the established solution