9 research outputs found
Effective Data Augmentation with Multi-Domain Learning GANs
For deep learning applications, large-scale data development (e.g., collecting
and labeling), an essential step in building practical applications, still
incurs very high costs. In this work, we propose an
effective data augmentation method based on generative adversarial networks
(GANs), called Domain Fusion. Our key idea is to import the knowledge contained
in an outer dataset to a target model by using a multi-domain learning GAN. The
multi-domain learning GAN simultaneously learns the outer and target dataset
and generates new samples for the target tasks. The simultaneous learning
process makes GANs generate the target samples with high fidelity and variety.
As a result, we can obtain accurate models for the target tasks by using these
generated samples even if we only have an extremely low volume target dataset.
We experimentally evaluate the advantages of Domain Fusion in image
classification tasks on 3 target datasets: CIFAR-100, FGVC-Aircraft, and Indoor
Scene Recognition. When each target dataset is reduced to 5,000 images, Domain
Fusion achieves better classification accuracy than
data augmentation using fine-tuned GANs. Furthermore, we show that Domain
Fusion improves the quality of generated samples, and the improvements can
contribute to higher accuracy. Comment: AAAI-202
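The simultaneous two-domain training at the heart of Domain Fusion can be sketched at the data-pipeline level. This is a minimal illustration, not the paper's implementation: the function name, domain labels, and mixing ratio are all assumptions.

```python
import random

# Hypothetical sketch of the data-handling idea behind Domain Fusion:
# one GAN is trained on minibatches drawn from BOTH the small target
# dataset and a larger outer dataset, carrying a domain label so the
# generator can later be asked for target-domain samples only.

TARGET, OUTER = 0, 1  # illustrative domain ids

def make_minibatch(target_data, outer_data, batch_size, target_ratio=0.5):
    """Draw a mixed minibatch of (sample, domain_label) pairs."""
    n_target = int(batch_size * target_ratio)
    batch = [(random.choice(target_data), TARGET) for _ in range(n_target)]
    batch += [(random.choice(outer_data), OUTER)
              for _ in range(batch_size - n_target)]
    random.shuffle(batch)
    return batch

target_data = ["t0", "t1", "t2"]            # tiny target dataset
outer_data = [f"o{i}" for i in range(100)]  # larger outer dataset
batch = make_minibatch(target_data, outer_data, batch_size=8)
```

In the real method, the domain label would condition the GAN's generator and discriminator, so that after training only target-domain samples are generated for augmentation.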
AugNet: Dynamic Test-Time Augmentation via Differentiable Functions
Distribution shifts, which often occur in the real world, degrade the
accuracy of deep learning systems, and thus improving robustness is essential
for practical applications. To improve robustness, we study an image
enhancement method that generates recognition-friendly images without
retraining the recognition model. We propose a novel image enhancement method,
AugNet, which is based on differentiable data augmentation techniques and
generates a blended image from many augmented images to improve the recognition
accuracy under distribution shifts. In addition to standard data augmentations,
AugNet can also incorporate deep neural network-based image transformation,
which further improves the robustness. Because AugNet is composed of
differentiable functions, AugNet can be directly trained with the
classification loss of the recognition model. AugNet is evaluated on widely
used image recognition datasets using various classification models, including
Vision Transformer and MLP-Mixer. AugNet improves robustness with almost no
reduction in classification accuracy on clean images, a better result than
existing methods. Furthermore, we show that interpreting distribution shifts
using AugNet, and retraining based on that interpretation, can greatly improve
robustness.
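The blending mechanism the abstract describes can be sketched with NumPy. This is an illustrative reading of the idea, not the published implementation: the augmentation set and the weight values are assumptions.

```python
import numpy as np

# Sketch: several differentiable augmentations are applied to the input,
# and the results are combined with learnable weights normalized by a
# softmax. Because every step is differentiable, the weights could be
# trained end-to-end with the classifier's loss.

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

def augnet_blend(image, weights):
    augmented = [
        image,                                         # identity
        np.clip(image * 1.2, 0.0, 1.0),                # brightness
        (image - image.mean()) * 1.5 + image.mean(),   # contrast
    ]
    alphas = softmax(weights)  # learnable in the real method
    return sum(a * img for a, img in zip(alphas, augmented))

rng = np.random.default_rng(0)
image = rng.random((4, 4))
blended = augnet_blend(image, np.array([0.2, -0.1, 0.3]))
```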
Signaling emotion in tagclouds
ABSTRACT In order to create more attractive tagclouds that draw people to the tagged content, we propose a simple but novel tagcloud in which a tag's font size is determined by its entropy value rather than by its popularity within the content. Our method raises users' emotional interest in the content by emphasizing more emotional tags. Our initial experiments show that emotional tagclouds attract more attention at first glance than normal tagclouds; thus they can strengthen the tagcloud's role as a social signaller.
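The entropy-driven font sizing can be sketched in a few lines. The size range, entropy cap, and usage counts below are illustrative assumptions; the abstract does not specify them.

```python
import math

# Sketch: a tag's font size is driven by the Shannon entropy of how its
# uses are distributed (e.g., over users or documents), rather than by
# its raw frequency.

def entropy(counts):
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

def font_size(counts, min_px=10, max_px=40, max_entropy=6.0):
    h = min(entropy(counts), max_entropy)  # cap, then scale linearly
    return min_px + (max_px - min_px) * h / max_entropy

# A tag used evenly by eight users: entropy log2(8) = 3 bits.
evenly_used = font_size([5, 5, 5, 5, 5, 5, 5, 5])   # mid-sized font
single_user = font_size([40])                       # minimum font size
```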
A Dynamic Range Labeling Method for XML Documents
http://library.naist.jp/mylimedio/dllimedio/show.cgi?bookid=100038191&oldid=70022 (Master's thesis, Engineering, thesis no. 2175)
Learning to Cascade: Confidence Calibration for Improving the Accuracy and Computational Cost of Cascade Inference Systems
Recently, deep neural networks have come to be used in a variety of applications.
As the accuracy of deep neural networks increases, the confidence score, which indicates the reliability of the prediction results, is becoming more important.
Deep neural networks are highly accurate but known to be overconfident, which makes it important to calibrate their confidence scores.
Many studies have been conducted on confidence calibration.
These methods calibrate a model's confidence scores to match its accuracy, but it is unclear whether such calibrated scores actually improve the performance of the systems that consume them.
This paper focuses on cascade inference systems, one kind of systems using confidence scores, and discusses the desired confidence score to improve system performance in terms of inference accuracy and computational cost.
Based on the discussion, we propose a new confidence calibration method, Learning to Cascade.
Learning to Cascade is a simple but novel method that optimizes the loss term for confidence calibration simultaneously with the original loss term.
Experiments are conducted using two datasets, CIFAR-100 and ImageNet, in two system settings, and show that naive application of existing calibration methods to cascade inference systems sometimes performs worse.
However, Learning to Cascade always achieves a better trade-off between inference accuracy and computational cost.
The simplicity of Learning to Cascade allows it to be easily applied to improve the performance of existing systems.
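A two-stage cascade of the kind discussed above can be sketched as follows. The stand-in models and the threshold are assumptions; Learning to Cascade's contribution is training the first model so that its confidence makes this routing decision well.

```python
import math

# Sketch of cascade inference: a cheap model answers when its confidence
# clears a threshold; otherwise the input is forwarded to a more accurate,
# more expensive model.

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cascade_predict(x, small_model, large_model, threshold=0.9):
    probs = softmax(small_model(x))
    conf = max(probs)
    if conf >= threshold:
        return probs.index(conf), "small"    # cheap path taken
    probs = softmax(large_model(x))
    return probs.index(max(probs)), "large"  # expensive fallback

# Illustrative stand-in models (not trained networks):
small = lambda x: [3.0, 0.1] if x > 0 else [0.2, 0.1]
large = lambda x: [0.0, 5.0]
confident = cascade_predict(1.0, small, large)    # stays on the small model
uncertain = cascade_predict(-1.0, small, large)   # falls back to the large model
```

The trade-off the paper optimizes is visible here: every input routed to `"small"` saves the cost of the large model, but only helps accuracy if the small model's confidence is trustworthy.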
Accelerating Instant Question Search with Database Techniques
Distributed question answering services, like Yahoo! Answers and Aardvark, are known to be useful for end users and have opened up numerous research topics across many fields. In this paper, we propose a user-support tool for composing questions in such services. Our system incrementally recommends similar questions while users are typing their question as a sentence, giving users the opportunity to discover similar questions that have already been solved. A question database is semantically analyzed and searched in the semantic space, with the performance of similarity searches boosted by database techniques such as server/client caching and LSH (Locality-Sensitive Hashing). The more text the user enters, the more similar the recommendations become to the ultimately desired question. This unconscious editing-as-a-sequence-of-searches approach helps users form their question incrementally through interactive supplementary information. Not only askers and repliers but also service providers benefit: the knowledge in the service is autonomously refined because novice users avoid repeating questions that have already been solved.
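The random-hyperplane flavor of LSH mentioned above can be sketched with NumPy. The dimensionality, signature width, and embeddings are illustrative assumptions; the paper does not specify its LSH variant in this abstract.

```python
import numpy as np

# Sketch: each question embedding is hashed to a short bit signature so
# that similar vectors tend to collide, letting the server narrow a
# similarity search to one bucket as the user types.

rng = np.random.default_rng(42)
DIM, BITS = 64, 16
planes = rng.standard_normal((DIM, BITS))  # shared random hyperplanes

def lsh_signature(vec):
    """Hash a vector to a BITS-bit integer via hyperplane signs."""
    bits = (vec @ planes) >= 0
    return int("".join("1" if b else "0" for b in bits), 2)

q = rng.standard_normal(DIM)                 # partially typed question
near = q + 0.01 * rng.standard_normal(DIM)   # slightly edited question
sig_q, sig_near = lsh_signature(q), lsh_signature(near)
```

Candidates sharing a bucket (or within a small Hamming distance of `sig_q`) are then re-ranked by exact similarity, which is much cheaper than scanning the whole question database on every keystroke.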
Improving Image Coding for Machines through Optimizing Encoder via Auxiliary Loss
Image coding for machines (ICM) aims to compress images for machine analysis
using recognition models rather than human vision. Hence, in ICM, it is
important for the encoder to recognize and compress the information necessary
for the machine recognition task. There are two main approaches to learned ICM:
optimization of the compression model based on task loss, and Region of
Interest (ROI) based bit allocation. These approaches provide the encoder with
the recognition capability. However, optimization with task loss becomes
difficult when the recognition model is deep, and ROI-based methods often
involve extra overhead during evaluation. In this study, we propose a novel
training method for learned ICM models that applies auxiliary loss to the
encoder to improve its recognition capability and rate-distortion performance.
Our method achieves Bjontegaard Delta rate improvements of 27.7% and 20.3% in
object detection and semantic segmentation tasks, compared to the conventional
training method. Comment: This version has been removed by arXiv administrators as the
submitter did not have the right to agree to the license at the time of
submission.
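The training objective implied by the abstract might be sketched as a weighted sum. The exact loss terms and the weights `lmbda` and `beta` are assumptions, not the paper's formulation; the key point is that the auxiliary recognition loss is applied to the encoder.

```python
# Sketch: the usual rate-distortion objective is augmented with an
# auxiliary recognition-oriented loss on the encoder, so the encoder
# learns to preserve features the recognition model needs.

def icm_training_loss(rate, distortion, aux_recognition_loss,
                      lmbda=0.01, beta=0.1):
    """Total loss = R + lambda * D + beta * auxiliary recognition loss."""
    return rate + lmbda * distortion + beta * aux_recognition_loss

# Illustrative values for one training step:
loss = icm_training_loss(rate=1.2, distortion=30.0, aux_recognition_loss=0.8)
```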
Test-time Adaptation Meets Image Enhancement: Improving Accuracy via Uncertainty-aware Logit Switching
Deep neural networks have achieved remarkable success in a variety of
computer vision applications. However, there is a problem of degrading accuracy
when the data distribution shifts between training and testing. As a solution
of this problem, Test-time Adaptation~(TTA) has been well studied because of
its practicality. Although TTA methods increase accuracy under distribution
shift by updating the model at test time, using high-uncertainty predictions is
known to degrade accuracy. Since the input image is the root of the
distribution shift, we incorporate a new perspective on enhancing the input
image into TTA methods to reduce prediction uncertainty. We hypothesize
that enhancing the input image reduces the prediction's uncertainty and increases
the accuracy of TTA methods. On the basis of our hypothesis, we propose a novel
method: Test-time Enhancer and Classifier Adaptation~(TECA). In TECA, the
classification model is combined with the image enhancement model that
transforms input images into recognition-friendly ones, and these models are
updated by existing TTA methods. Furthermore, we found that the prediction from
the enhanced image does not always have lower uncertainty than the prediction
from the original image. Thus, we propose logit switching, which compares the
uncertainty measure of these predictions and outputs the lower one. In our
experiments, we evaluate TECA with various TTA methods and show that TECA
reduces the prediction's uncertainty and increases the accuracy of TTA methods despite
having no hyperparameters and little parameter overhead. Comment: Accepted to IJCNN202