452 research outputs found
Data Dropout: Optimizing Training Data for Convolutional Neural Networks
Deep learning models learn to fit the training data while being expected to
generalize well to test data. Most works aim at finding such
models by creatively designing architectures and fine-tuning parameters. To
adapt to particular tasks, hand-crafted information such as image priors has
also been incorporated into end-to-end learning. However, very little progress
has been made on investigating how an individual training sample will influence
the generalization ability of a model. In other words, to achieve high
generalization accuracy, do we really need all the samples in a training
dataset? In this paper, we demonstrate that deep learning models such as
convolutional neural networks may not favor all training samples, and
generalization accuracy can be further improved by dropping those unfavorable
samples. Specifically, the influence of removing a training sample is
quantifiable, and we propose a Two-Round Training approach, aiming to achieve
higher generalization accuracy. We locate unfavorable samples after the first
round of training, and then retrain the model from scratch with the reduced
training dataset in the second round. Since our approach is essentially
different from fine-tuning or further training, the computational cost should
not be a concern. Our extensive experimental results indicate that, with
identical settings, the proposed approach can boost the performance of
well-known networks on both high-level computer vision problems such as image
classification and low-level vision problems such as image denoising.
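The Two-Round Training loop described above can be sketched in a few lines. Everything in this sketch is a toy stand-in: `train`, `influence_of_removal`, and the score values are hypothetical placeholders, not the paper's actual influence computation for CNNs.

```python
# Sketch of Two-Round Training: train on all samples, score each
# sample's removal influence, drop unfavorable samples, retrain from
# scratch on the reduced dataset.

def train(dataset):
    """Placeholder for full model training (a CNN in the paper's setting)."""
    return {"trained_on": len(dataset)}

def influence_of_removal(model, sample_idx):
    """Toy stand-in for the influence of removing one training sample.
    A positive score means removal is expected to improve
    generalization, i.e. the sample is 'unfavorable'."""
    return (sample_idx % 3) - 1  # deterministic placeholder values

def two_round_training(dataset):
    model = train(dataset)                       # round 1: full dataset
    scores = [influence_of_removal(model, i) for i in range(len(dataset))]
    kept = [x for x, s in zip(dataset, scores) if s <= 0]
    return train(kept), kept                     # round 2: from scratch

model, kept = two_round_training(list(range(100)))
```

The key design point is that round two restarts from scratch rather than fine-tuning the round-one model, so the final model never fits the dropped samples.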
Statistical inference using SGD
We present a novel method for frequentist statistical inference in
M-estimation problems, based on stochastic gradient descent (SGD) with a
fixed step size: we demonstrate that the average of such SGD sequences can be
used for statistical inference, after proper scaling. An intuitive analysis
using the Ornstein-Uhlenbeck process suggests that such averages are
asymptotically normal. From a practical perspective, our SGD-based inference
procedure is a first order method, and is well-suited for large scale problems.
To show its merits, we apply it to both synthetic and real datasets, and
demonstrate that its accuracy is comparable to classical statistical methods,
while requiring potentially far less computation.
Comment: To appear in AAAI 201
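A minimal sketch of the idea, on the toy M-estimation problem of estimating a mean: run SGD with a fixed step size and average the post-burn-in iterates. The function name, the burn-in choice, and the plug-in standard error below are all illustrative assumptions; the paper's asymptotic scaling for the iterate average is more careful than this naive estimate.

```python
import numpy as np

def sgd_mean_inference(data, step=0.05, burn_in=100):
    """Fixed-step SGD for min_theta E[(x - theta)^2 / 2]; the
    stochastic gradient at a sample x is (theta - x). The estimator is
    the average of the post-burn-in iterates; the standard error is a
    naive plug-in for illustration (iterates are correlated, so this
    understates the true uncertainty)."""
    theta = 0.0
    iterates = []
    for x in data:
        theta -= step * (theta - x)     # one fixed-step SGD update
        iterates.append(theta)
    tail = np.asarray(iterates[burn_in:])
    est = tail.mean()
    se = tail.std(ddof=1) / np.sqrt(len(tail))   # illustrative only
    return est, (est - 1.96 * se, est + 1.96 * se)

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=5000)
est, ci = sgd_mean_inference(data)
```

Because each update touches one sample with a single first-order step, the cost per data point is constant, which is what makes the procedure attractive at large scale.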
Prompt Learning for Oriented Power Transmission Tower Detection in High-Resolution SAR Images
Detecting transmission towers in synthetic aperture radar (SAR) images remains
a challenging task: the towers are comparatively small, the imaging geometry is
side-looking, and a large number of interfering signals is superimposed on the
return signal from the tower, so background clutter frequently hinders tower
identification. We found that localizing, or prompting, the positions of power
transmission towers helps address this obstacle. Based on this observation,
this paper introduces prompt learning into the oriented object
detector (P2Det) for multimodal information learning. P2Det combines sparse
prompt encoding with cross-attention over the multimodal data. Specifically,
the sparse prompt encoder (SPE) is proposed to represent point locations,
converting prompts into sparse embeddings. The image embeddings are generated
through the Transformer layers. Then a two-way fusion module (TWFM) is proposed
to calculate the cross-attention of the two different embeddings. The
interaction of image-level and prompt-level features is utilized to address the
clutter interference. A shape-adaptive refinement module (SARM) is proposed to
reduce the effect of extreme aspect ratios. Extensive experiments demonstrate the
effectiveness of the proposed model on high-resolution SAR images. P2Det
provides a novel insight for multimodal object detection due to its competitive
performance.
Comment: 22 pages, 12 figures
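The cross-attention at the heart of the two-way fusion step can be illustrated with a single-head sketch. The shapes, dimensions, and the numpy implementation below are assumptions for illustration; this shows the generic cross-attention mechanism, not TWFM's exact architecture.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Single-head cross-attention: each query attends over the other
    modality's embeddings. Shapes: queries (Nq, d), keys_values (Nk, d)."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys_values.T / np.sqrt(d), axis=-1)
    return weights @ keys_values

rng = np.random.default_rng(0)
image_emb  = rng.standard_normal((64, 32))  # e.g. flattened SAR feature map
prompt_emb = rng.standard_normal((4, 32))   # sparse point-prompt embeddings
# Two-way fusion: each stream attends to the other.
img_fused    = cross_attention(image_emb, prompt_emb)
prompt_fused = cross_attention(prompt_emb, image_emb)
```

Running the attention in both directions is what lets prompt-level location cues modulate image features while image context refines the prompt embeddings.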
Urban air quality: What is the optimal place to reduce transport emissions?
We develop a linear model based on a complex network approach that predicts the effect of emission changes on air pollution exposure in urban street networks
including NO–NO2–O3 chemistry. The operational air quality model SIRANE is used to create a weighted adjacency matrix A describing the relation between
emissions of a passive scalar inside streets and the resulting concentrations in the street network. A case study in South Kensington (London) is used, and the
adjacency matrix A0 is determined for one wind speed and eight different wind directions. The physics of the underlying problem is used to infer A for different wind
speeds. Good agreement between SIRANE predictions and the model is observed for all but the lowest wind speed, despite non-linearities in SIRANE’s model
formulation. An indicator for exposure in the street is developed, and it is shown that the out-degree of the exposure matrix E represents the effect of a change in
emissions on the exposure reduction in all streets in the network. The approach is then extended to NO–NO2–O3-chemisty, which introduces a non-linearity. It is
shown that a linearised model agrees well with the fully nonlinear SIRANE predictions. The model shows that roads with large height-to-width ratios are the first in
which emissions should be reduced in order to maximise exposure reduction.
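The linear structure of the model can be sketched directly. The matrix values and the per-street exposure weighting below are made-up toy numbers, not SIRANE output; the sketch only shows how an emissions-to-concentration matrix and the out-degree of an exposure matrix yield a ranking of streets.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                                   # streets in a toy network
# A[i, j]: concentration in street i caused by a unit emission in
# street j (stand-in for the SIRANE-derived weighted matrix).
A = rng.uniform(0.0, 1.0, size=(n, n))
emissions = np.ones(n)
concentrations = A @ emissions          # linear concentration model

# Assumed per-street exposure weighting (e.g. pedestrian density).
population = rng.uniform(0.5, 2.0, size=n)
E = A * population[:, None]             # exposure matrix

# Out-degree of street j: total network-wide exposure caused by a unit
# emission in street j; the street with the largest out-degree is the
# most effective place to cut emissions first.
out_degree = E.sum(axis=0)
best_street = int(np.argmax(out_degree))
```

Because the model is linear, the effect of any emission change is just a matrix-vector product, which is what makes the out-degree ranking cheap to compute for a whole network.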
SDF-GA: a service domain feature-oriented approach for manufacturing cloud service composition