Urban Land Cover Classification with Missing Data Modalities Using Deep Convolutional Neural Networks
Automatic urban land cover classification is a fundamental problem in remote
sensing, e.g. for environmental monitoring. The problem is highly challenging,
as classes generally have low inter-class and high intra-class variance.
Techniques to improve urban land cover classification performance in remote
sensing include fusion of data from different sensors with different data
modalities. However, such techniques require all modalities to be available to
the classifier in the decision-making process, i.e. at test time, as well as in
training. If a data modality is missing at test time, current state-of-the-art
approaches have in general no procedure available for exploiting information
from these modalities. This represents a waste of potentially useful
information. We propose as a remedy a convolutional neural network (CNN)
architecture for urban land cover classification which is able to embed all
available training modalities in a so-called hallucination network. The network
will in effect replace missing data modalities in the test phase, enabling
fusion capabilities even when data modalities are missing in testing. We
demonstrate the method using two datasets consisting of optical and digital
surface model (DSM) images. We simulate missing modalities by assuming that DSM
images are missing during testing. Our method outperforms both standard CNNs
trained only on optical images as well as an ensemble of two standard CNNs. We
further evaluate the potential of our method to handle situations where only
some DSM images are missing during testing. Overall, we show that we can
clearly exploit training time information of the missing modality during
testing.
Training Curricula for Open Domain Answer Re-Ranking
In precision-oriented tasks like answer ranking, it is more important to rank
many relevant answers highly than to retrieve all relevant answers. It follows
that a good ranking strategy would be to learn how to identify the easiest
correct answers first (i.e., assign a high ranking score to answers that have
characteristics that usually indicate relevance, and a low ranking score to
those with characteristics that do not), before incorporating more complex
logic to handle difficult cases (e.g., semantic matching or reasoning). In this
work, we apply this idea to the training of neural answer rankers using
curriculum learning. We propose several heuristics to estimate the difficulty
of a given training sample. We show that the proposed heuristics can be used to
build a training curriculum that down-weights difficult samples early in the
training process. As the training process progresses, our approach gradually
shifts to weighting all samples equally, regardless of difficulty. We present a
comprehensive evaluation of our proposed idea on three answer ranking datasets.
Results show that our approach leads to superior performance of two leading
neural ranking architectures, namely BERT and ConvKNRM, using both pointwise
and pairwise losses. When applied to a BERT-based ranker, our method yields up
to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the model
trained without a curriculum). This results in models that can achieve
comparable performance to more expensive state-of-the-art techniques.
Comment: Accepted at SIGIR 2020 (long
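The weighting schedule described in this abstract (down-weight difficult samples early, then move gradually toward equal weights) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the linear interpolation, the `ease` score in [0, 1] (1 meaning easiest), and the names `curriculum_weight`, `weighted_loss`, and `warmup_epochs` are all assumptions introduced here.

```python
def curriculum_weight(ease: float, epoch: int, warmup_epochs: int) -> float:
    """Weight for one training sample at a given epoch.

    `ease` is a hypothetical heuristic score in [0, 1]; 1 means "easy".
    During warm-up the weight moves linearly from `ease` toward 1, so
    difficult samples (low ease) count less early in training.
    """
    if not 0.0 <= ease <= 1.0:
        raise ValueError("ease must lie in [0, 1]")
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return 1.0  # curriculum finished: all samples weighted equally
    progress = epoch / warmup_epochs
    return ease + progress * (1.0 - ease)


def weighted_loss(losses, eases, epoch, warmup_epochs):
    """Mean of per-sample losses, each scaled by its curriculum weight."""
    weights = [curriculum_weight(e, epoch, warmup_epochs) for e in eases]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)
```

With `warmup_epochs=10`, a sample with `ease=0.2` starts at weight 0.2 in epoch 0 and reaches weight 1.0 from epoch 10 onward, at which point the loss reduces to an ordinary unweighted mean.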
A Spectral Network Model of Pitch Perception
A model of pitch perception, called the Spatial Pitch Network or SPINET model, is developed and analyzed. The model neurally instantiates ideas from the spectral pitch modeling literature and joins them to basic neural network signal processing designs to simulate a broader range of perceptual pitch data than previous spectral models. The components of the model are interpreted as peripheral mechanical and neural processing stages, which are capable of being incorporated into a larger network architecture for separating multiple sound sources in the environment.
The core of the new model transforms a spectral representation of an acoustic source into a spatial distribution of pitch strengths. The SPINET model uses a weighted "harmonic sieve" whereby the strength of activation of a given pitch depends upon a weighted sum of narrow regions around the harmonics of the nominal pitch value, and higher harmonics contribute less to a pitch than lower ones. Suitably chosen harmonic weighting functions enable computer simulations of pitch perception data involving mistuned components, shifted harmonics, and various types of continuous spectra including rippled noise. It is shown how the weighting functions produce the dominance region, how they lead to octave shifts of pitch in response to ambiguous stimuli, and how they lead to a pitch region in response to the octave-spaced Shepard tone complexes and Deutsch tritones without the use of attentional mechanisms to limit pitch choices. An on-center off-surround network in the model helps to produce noise suppression, partial masking and edge pitch. Finally, it is shown how peripheral filtering and short-term energy measurements produce a model pitch estimate that is sensitive to certain component phase relationships.
Air Force Office of Scientific Research (F49620-92-J-0225); American Society for Engineering Educatio
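The weighted "harmonic sieve" idea in this abstract can be illustrated with a toy calculation. The sketch below is not the SPINET model: it omits the peripheral filtering and on-center off-surround stages entirely, and the 1/h harmonic weighting, the 3% window half-width, and the function names `pitch_strength` and `estimate_pitch` are assumptions chosen only to show the shape of the computation.

```python
def pitch_strength(components, pitch_hz, n_harmonics=8, tolerance=0.03):
    """Toy harmonic-sieve score for one candidate pitch.

    `components` is a list of (frequency_hz, amplitude) pairs.
    Each harmonic h of the candidate pitch collects the amplitude that
    falls inside a narrow window around h * pitch_hz, scaled by an
    assumed weight of 1/h so higher harmonics contribute less.
    """
    strength = 0.0
    for h in range(1, n_harmonics + 1):
        center = h * pitch_hz
        half_width = tolerance * center  # narrow region around the harmonic
        weight = 1.0 / h                 # assumed weighting function
        for freq, amp in components:
            if abs(freq - center) <= half_width:
                strength += weight * amp
    return strength


def estimate_pitch(components, candidates):
    """Return the candidate pitch with the largest sieve score."""
    return max(candidates, key=lambda p: pitch_strength(components, p))
```

For a complex tone with equal-amplitude components at 200, 400, 600, and 800 Hz, the candidate 200 Hz scores highest among 100, 200, and 400 Hz under these assumed weights. Reproducing the perceptual data the abstract mentions would require the weighting functions and network stages of the actual model.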