YOLOrtho -- A Unified Framework for Teeth Enumeration and Dental Disease Detection
Detecting dental diseases in panoramic X-ray images is a standard
procedure for dentists. Normally, a dentist needs to identify diseases and find
the infected teeth. While numerous machine learning models adopting this
two-step procedure have been developed, there has not been an end-to-end model
that can identify teeth and their associated diseases at the same time. To fill
the gap, we develop YOLOrtho, a unified framework for teeth enumeration and
dental disease detection. We develop our model on Dentex Challenge 2023 data,
which consists of three distinct types of annotated data. The first part is
labeled with quadrant, the second part is labeled with quadrant and
enumeration, and the third part is labeled with quadrant, enumeration, and
disease. To further improve detection, we make use of the public Tufts Dental
dataset. To fully utilize the data and learn both teeth detection and disease
identification simultaneously, we formulate diseases as attributes attached to
their corresponding teeth. Because teeth enumeration depends on positional
relations, we replace convolution layers with CoordConv to
provide the model with more position information. We also adjust the model
architecture and insert one more upsampling layer in the FPN in favor of large
object detection. Finally, we propose a post-processing strategy for teeth layout
that corrects teeth enumeration based on linear sum assignment. Results from
experiments show that our model exceeds large Diffusion-based model
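The post-processing idea mentioned in the abstract, correcting enumeration with a linear sum assignment, can be illustrated with a minimal sketch. The abstract does not specify the cost function or the data layout, so everything here is an assumption for illustration: `scores[i][j]` is a hypothetical confidence that detection `i` carries tooth label `j`, and the helper `assign_unique_labels` is not from the paper. The sketch uses brute force over a tiny example; for realistic sizes, `scipy.optimize.linear_sum_assignment` solves the same assignment problem efficiently.

```python
from itertools import permutations


def assign_unique_labels(scores):
    """Give each detection a distinct label index, maximizing total confidence.

    scores[i][j] is a hypothetical confidence that detection i has label j.
    Brute force is used only to keep this sketch dependency-free.
    """
    n_det = len(scores)
    n_lab = len(scores[0])
    if n_det > n_lab:
        raise ValueError("more detections than available labels")

    best_total, best_assignment = float("-inf"), None
    # Try every ordered selection of n_det distinct labels.
    for labels in permutations(range(n_lab), n_det):
        total = sum(scores[i][j] for i, j in enumerate(labels))
        if total > best_total:
            best_total, best_assignment = total, list(labels)
    return best_assignment


# Made-up confidences for three detections over four candidate labels.
scores = [
    [0.10, 0.80, 0.05, 0.05],  # detection 0: most confident in label 1
    [0.05, 0.60, 0.30, 0.05],  # detection 1: also prefers label 1 (a clash)
    [0.05, 0.05, 0.20, 0.70],  # detection 2: most confident in label 3
]

# Independent per-detection argmax can assign the same label twice.
greedy = [max(range(len(row)), key=row.__getitem__) for row in scores]

# The assignment formulation forces labels to be distinct.
assigned = assign_unique_labels(scores)

print(greedy)    # [1, 1, 3]
print(assigned)  # [1, 2, 3]
```

In this toy case the per-detection argmax labels two detections as the same tooth, while the assignment moves the less confident one to its next-best label.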
OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images
Enhancing the robustness of vision algorithms in real-world scenarios is
challenging. One reason is that existing robustness benchmarks are limited, as
they either rely on synthetic data or ignore the effects of individual nuisance
factors. We introduce OOD-CV, a benchmark dataset that includes
out-of-distribution examples of 10 object categories in terms of pose, shape,
texture, context, and weather conditions, and enables benchmarking models
for image classification, object detection, and 3D pose estimation. In addition
to this novel dataset, we contribute extensive experiments using popular
baseline methods, which reveal that: 1. Some nuisance factors have a much
stronger negative effect on performance than others, also depending
on the vision task. 2. Current approaches to enhance robustness have only
marginal effects, and can even reduce robustness. 3. We do not observe
significant differences between convolutional and transformer architectures. We
believe our dataset provides a rich testbed to study robustness and will help
push forward research in this area.
Comment: Project webpage: https://bzhao.me/ROBIN/, this work is accepted as
Oral at ECCV 202