Lesion detection and Grading of Diabetic Retinopathy via Two-stages Deep Convolutional Neural Networks
We propose an automatic diabetic retinopathy (DR) analysis algorithm based on
two-stage deep convolutional neural networks (DCNN). Compared to existing
DCNN-based DR detection methods, the proposed algorithm has the following
advantages: (1) Our method can point out the location and type of lesions in
the fundus images, as well as grade the severity of DR. Moreover, since
retinal lesions and DR severity appear at different scales in fundus images,
the integration of local and global networks learns more complete and
specific features for DR analysis. (2) By introducing an imbalanced weighting map,
more attention is given to lesion patches for DR grading, which
significantly improves the performance of the proposed algorithm. In this study,
we label 12,206 lesion patches and re-annotate the DR grades of 23,595 fundus
images from the Kaggle competition dataset. Under the guidance of clinical
ophthalmologists, the experimental results show that our local lesion detection
net achieves performance comparable to that of trained human observers, and the
proposed imbalanced weighting scheme is also shown to significantly improve the
capability of our DCNN-based DR grading algorithm.
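The weighting idea in point (2) can be illustrated with a minimal sketch. The abstract does not say how the imbalanced weighting map is built, so everything below is an assumption for illustration: the names `weighting_map` and `apply_weights`, the label convention (0 for "no lesion"), and the weight values 2.0 and 1.0 are hypothetical and not taken from the paper.

```python
def weighting_map(patch_labels, lesion_weight=2.0, normal_weight=1.0):
    """Build a per-patch weight grid from a grid of patch labels.

    Assumption: label 0 means "no lesion"; any other integer is a
    hypothetical lesion-type id produced by a local detection stage.
    """
    return [
        [lesion_weight if label != 0 else normal_weight for label in row]
        for row in patch_labels
    ]


def apply_weights(patch_values, weights):
    """Scale each patch value by its weight, element by element."""
    return [
        [value * weight for value, weight in zip(value_row, weight_row)]
        for value_row, weight_row in zip(patch_values, weights)
    ]


# A 2x3 grid of patches with two lesion patches (types 1 and 2).
labels = [
    [0, 0, 1],
    [0, 2, 0],
]
weights = weighting_map(labels)
weighted = apply_weights([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], weights)
```

In this sketch the lesion patches end up with twice the influence of normal patches before a grading stage sees them; how the real method chooses and applies its weights is not stated in the abstract.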
Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition
A dramatic increase in real-world video volume with extremely diverse and
emerging topics naturally forms a long-tailed video distribution in terms of
their categories, and it spotlights the need for Video Long-Tailed Recognition
(VLTR). In this work, we summarize the challenges in VLTR and explore how to
overcome them. The challenges are: (1) it is impractical to re-train the whole
model for high-quality features, (2) acquiring frame-wise labels is
costly, and (3) long-tailed data triggers biased training. Yet, most
existing works for VLTR unavoidably utilize image-level features extracted from
pretrained models which are task-irrelevant, and learn by video-level labels.
Therefore, to deal with such (1) task-irrelevant features and (2) video-level
labels, we introduce two complementary learnable feature aggregators. The learnable
layers in each aggregator produce task-relevant representations, and
each aggregator assembles the snippet-wise knowledge into a video
representative. Then, we propose Minority-Oriented Vicinity Expansion (MOVE),
which explicitly uses class frequency when approximating the vicinity
distributions to alleviate (3) biased training. By combining these solutions,
our approach achieves state-of-the-art results on large-scale VideoLT and
synthetically induced Imbalanced-MiniKinetics200. With VideoLT features from
ResNet-50, it attains 18% and 58% relative improvements on head and tail
classes over the previous state-of-the-art method, respectively.
Comment: Accepted to AAAI 2023. Code is available at
https://github.com/wjun0830/MOV
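As a rough illustration of assembling snippet-wise features into one video-level vector, the sketch below uses generic softmax attention pooling. It is not the paper's aggregator, whose design the abstract does not describe; the function name `attentive_aggregate` and the `query` vector (standing in for a learned parameter) are hypothetical.

```python
import math


def attentive_aggregate(snippets, query):
    """Pool snippet feature vectors into one vector with softmax attention.

    snippets: list of equal-length feature vectors, one per snippet.
    query: hypothetical learned vector used to score each snippet.
    """
    # Dot-product score of every snippet against the query.
    scores = [sum(f * q for f, q in zip(snippet, query)) for snippet in snippets]
    # Numerically stable softmax over the scores.
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    # Attention-weighted sum of the snippet features.
    dim = len(snippets[0])
    video = [
        sum(a * snippet[d] for a, snippet in zip(attention, snippets))
        for d in range(dim)
    ]
    return video, attention


video, attention = attentive_aggregate([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With an all-zero query every snippet scores the same, so the pooled vector is the plain average of the snippets; a trained query would instead emphasize the snippets most relevant to the task.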
Constructing Balance from Imbalance for Long-tailed Image Recognition
Long-tailed image recognition presents massive challenges to deep learning
systems since the imbalance between majority (head) classes and minority (tail)
classes severely skews the data-driven deep neural networks. Previous methods
tackle data imbalance from the viewpoints of data distribution, feature
space, and model design. In this work, instead of directly learning a
recognition model, we suggest confronting the bottleneck of head-to-tail bias
before classifier learning, from the previously omitted perspective of
balancing label space. To alleviate the head-to-tail bias, we propose a concise
paradigm by progressively adjusting label space and dividing the head classes
and tail classes, dynamically constructing balance from imbalance to facilitate
the classification. With flexible data filtering and label space mapping, we
can easily embed our approach into most classification models, especially the
decoupled training methods. Besides, we find the separability of head-tail
classes varies among different features with different inductive biases. Hence,
our proposed model also provides a feature evaluation method and paves the way
for long-tailed feature learning. Extensive experiments show that our method
can boost the performance of state-of-the-art methods of different types on
widely-used benchmarks. Code is available at https://github.com/silicx/DLSA.
Comment: Accepted to ECCV 202
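The notion of dividing head classes from tail classes can be sketched very simply. The abstract does not specify how the label space is adjusted, so the fixed count threshold below is a stand-in assumption, and the name `split_head_tail` and the toy labels are hypothetical.

```python
from collections import Counter


def split_head_tail(labels, threshold):
    """Split class names into head and tail groups by sample count.

    Assumption: a class is "head" when it has at least `threshold`
    samples; the real method may use a different, learned criterion.
    """
    counts = Counter(labels)
    head = sorted(name for name, n in counts.items() if n >= threshold)
    tail = sorted(name for name, n in counts.items() if n < threshold)
    return head, tail


# Toy long-tailed label list: two frequent classes and one rare class.
toy_labels = ["cat"] * 5 + ["dog"] * 4 + ["lynx"]
head, tail = split_head_tail(toy_labels, threshold=3)
```

Each group could then be handled by its own classifier over a more balanced label subset, which is one way to read "constructing balance from imbalance"; the published code at the URL above is the authoritative reference.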