11,733 research outputs found
Incorporating Intra-Class Variance to Fine-Grained Visual Recognition
Fine-grained visual recognition aims to capture discriminative
characteristics amongst visually similar categories. The state-of-the-art
research work has significantly improved the fine-grained recognition
performance by deep metric learning using triplet network. However, the impact
of intra-category variance on the performance of recognition and robust feature
representation has not been well studied. In this paper, we propose to leverage
intra-class variance in metric learning of triplet network to improve the
performance of fine-grained recognition. Through partitioning training images
within each category into a few groups, we form the triplet samples across
different categories as well as different groups, which is called Group
Sensitive TRiplet Sampling (GS-TRS). Accordingly, the triplet loss function is
strengthened by incorporating intra-class variance with GS-TRS, which may
contribute to the optimization objective of triplet network. Extensive
experiments over benchmark datasets CompCar and VehicleID show that the
proposed GS-TRS has significantly outperformed state-of-the-art approaches in
both classification and retrieval tasks.Comment: 6 pages, 5 figure
Fuzzy Clustering of Histopathological Images Using Deep Learning Embeddings
Metric learning is a machine learning approach that aims to learn a new distance metric by increasing (reducing) the similarity of examples belonging to the same (different) classes. The output of these approaches are embeddings, where the input data are mapped to improve a crisp or fuzzy classification process. The deep metric learning approaches regard metric learning, implemented by using deep neural networks. Such models have the advantage to discover very representative nonlinear embeddings. In this work, we propose a triplet network deep metric learning approach, based on ResNet50, to find a representative embedding for the unsupervised fuzzy classification of benign and malignant histopathological images of breast cancer tissues. Experiments computed on the BreakHis benchmark dataset, using Fuzzy C-Means Clustering, show the benefit of using very low dimensional embeddings found by the deep metric learning approach
Analysis of Few-Shot Techniques for Fungal Plant Disease Classification and Evaluation of Clustering Capabilities Over Real Datasets
[EN] Plant fungal diseases are one of the most important causes of crop yield losses. Therefore, plant disease identification algorithms have been seen as a useful tool to detect them at early stages to mitigate their effects. Although deep-learning based algorithms can achieve high detection accuracies, they require large and manually annotated image datasets that is not always accessible, specially for rare and new diseases. This study focuses on the development of a plant disease detection algorithm and strategy requiring few plant images (Few-shot learning algorithm). We extend previous work by using a novel challenging dataset containing more than 100,000 images. This dataset includes images of leaves, panicles and stems of five different crops (barley, corn, rape seed, rice, and wheat) for a total of 17 different diseases, where each disease is shown at different disease stages. In this study, we propose a deep metric learning based method to extract latent space representations from plant diseases with just few images by means of a Siamese network and triplet loss function. This enhances previous methods that require a support dataset containing a high number of annotated images to perform metric learning and few-shot classification. The proposed method was compared over a traditional network that was trained with the cross-entropy loss function. Exhaustive experiments have been performed for validating and measuring the benefits of metric learning techniques over classical methods. Results show that the features extracted by the metric learning based approach present better discriminative and clustering properties. Davis-Bouldin index and Silhouette score values have shown that triplet loss network improves the clustering properties with respect to the categorical-cross entropy loss. Overall, triplet loss approach improves the DB index value by 22.7% and Silhouette score value by 166.7% compared to the categorical cross-entropy loss model. Moreover, the F-score parameter obtained from the Siamese network with the triplet loss performs better than classical approaches when there are few images for training, obtaining a 6% improvement in the F-score mean value. Siamese networks with triplet loss have improved the ability to learn different plant diseases using few images of each class. These networks based on metric learning techniques improve clustering and classification results over traditional categorical cross-entropy loss networks for plant disease identification.This project was partially supported by the Spanish Government through CDTI Centro para el Desarrollo Tecnológico e Industrial project AI4ES (ref CER-20211030), by the University of the Basque Country (UPV/EHU) under grant COLAB20/01 and by the Basque Government through grant IT1229-19
Analysis of Few-Shot Techniques for Fungal Plant Disease Classification and Evaluation of Clustering Capabilities Over Real Datasets
Plant fungal diseases are one of the most important causes of crop yield losses. Therefore, plant disease identification algorithms have been seen as a useful tool to detect them at early stages to mitigate their effects. Although deep-learning based algorithms can achieve high detection accuracies, they require large and manually annotated image datasets that is not always accessible, specially for rare and new diseases. This study focuses on the development of a plant disease detection algorithm and strategy requiring few plant images (Few-shot learning algorithm). We extend previous work by using a novel challenging dataset containing more than 100,000 images. This dataset includes images of leaves, panicles and stems of five different crops (barley, corn, rape seed, rice, and wheat) for a total of 17 different diseases, where each disease is shown at different disease stages. In this study, we propose a deep metric learning based method to extract latent space representations from plant diseases with just few images by means of a Siamese network and triplet loss function. This enhances previous methods that require a support dataset containing a high number of annotated images to perform metric learning and few-shot classification. The proposed method was compared over a traditional network that was trained with the cross-entropy loss function. Exhaustive experiments have been performed for validating and measuring the benefits of metric learning techniques over classical methods. Results show that the features extracted by the metric learning based approach present better discriminative and clustering properties. Davis-Bouldin index and Silhouette score values have shown that triplet loss network improves the clustering properties with respect to the categorical-cross entropy loss. Overall, triplet loss approach improves the DB index value by 22.7% and Silhouette score value by 166.7% compared to the categorical cross-entropy loss model. Moreover, the F-score parameter obtained from the Siamese network with the triplet loss performs better than classical approaches when there are few images for training, obtaining a 6% improvement in the F-score mean value. Siamese networks with triplet loss have improved the ability to learn different plant diseases using few images of each class. These networks based on metric learning techniques improve clustering and classification results over traditional categorical cross-entropy loss networks for plant disease identification.This project was partially supported by the Spanish Government through CDTI Centro para el Desarrollo Tecnológico e Industrial project AI4ES (ref CER-20211030), by the University of the Basque Country (UPV/EHU) under grant COLAB20/01 and by the Basque Government through grant IT1229-19
Active Learning of Ordinal Embeddings: A User Study on Football Data
Humans innately measure distance between instances in an unlabeled dataset
using an unknown similarity function. Distance metrics can only serve as proxy
for similarity in information retrieval of similar instances. Learning a good
similarity function from human annotations improves the quality of retrievals.
This work uses deep metric learning to learn these user-defined similarity
functions from few annotations for a large football trajectory dataset. We
adapt an entropy-based active learning method with recent work from triplet
mining to collect easy-to-answer but still informative annotations from human
participants and use them to train a deep convolutional network that
generalizes to unseen samples. Our user study shows that our approach improves
the quality of the information retrieval compared to a previous deep metric
learning approach that relies on a Siamese network. Specifically, we shed light
on the strengths and weaknesses of passive sampling heuristics and active
learners alike by analyzing the participants' response efficacy. To this end,
we collect accuracy, algorithmic time complexity, the participants' fatigue and
time-to-response, qualitative self-assessment and statements, as well as the
effects of mixed-expertise annotators and their consistency on model
performance and transfer-learning.Comment: 23 pages, 17 figure
Deep Attributes Driven Multi-Camera Person Re-identification
The visual appearance of a person is easily affected by many factors like
pose variations, viewpoint changes and camera parameter differences. This makes
person Re-Identification (ReID) among multiple cameras a very challenging task.
This work is motivated to learn mid-level human attributes which are robust to
such visual appearance variations. And we propose a semi-supervised attribute
learning framework which progressively boosts the accuracy of attributes only
using a limited number of labeled data. Specifically, this framework involves a
three-stage training. A deep Convolutional Neural Network (dCNN) is first
trained on an independent dataset labeled with attributes. Then it is
fine-tuned on another dataset only labeled with person IDs using our defined
triplet loss. Finally, the updated dCNN predicts attribute labels for the
target dataset, which is combined with the independent dataset for the final
round of fine-tuning. The predicted attributes, namely \emph{deep attributes}
exhibit superior generalization ability across different datasets. By directly
using the deep attributes with simple Cosine distance, we have obtained
surprisingly good accuracy on four person ReID datasets. Experiments also show
that a simple metric learning modular further boosts our method, making it
significantly outperform many recent works.Comment: Person Re-identification; 17 pages; 5 figures; In IEEE ECCV 201
- …