Nonlinear Online Learning with Adaptive Nyström Approximation
Nonlinear feature maps obtained via kernel approximation have led to success in
many online learning tasks. Nyström approximation, a popular kernel
approximation method, has been well investigated, and various landmark point
selection methods have been proposed to improve the approximation quality.
However, these improved Nyström methods cannot be directly applied to the
online learning setting, as they need access to the entire dataset to learn the
landmark points, whereas the model must be updated on the fly in the online
setting. To address this challenge, we propose Adaptive Nyström approximation
for solving nonlinear online learning problems. The key idea is to adaptively
modify the landmark points via online k-means and adjust the model accordingly
by solving a least-squares problem followed by a gradient descent step. We show
that the resulting algorithm outperforms state-of-the-art online learning
methods under the same budget.
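The update described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the RBF kernel, the squared loss, the choice to refit the weights so that the new feature map reproduces the old predictions at the updated landmarks, and all function names (`rbf_kernel`, `nystrom_map`, `online_kmeans_step`, `adaptive_nystrom_fit`) are hypothetical choices made for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_map(X, landmarks, gamma=1.0, eps=1e-8):
    """Nystrom features z(x) = W^{-1/2} k(landmarks, x), one row per sample."""
    W = rbf_kernel(landmarks, landmarks, gamma)
    vals, vecs = np.linalg.eigh(W)
    vals = np.maximum(vals, eps)  # guard against tiny or negative eigenvalues
    W_inv_sqrt = (vecs / np.sqrt(vals)) @ vecs.T
    return rbf_kernel(X, landmarks, gamma) @ W_inv_sqrt

def online_kmeans_step(landmarks, counts, x):
    """Sequential k-means: pull the nearest landmark toward x, in place."""
    j = int(np.argmin(((landmarks - x) ** 2).sum(axis=1)))
    counts[j] += 1
    landmarks[j] += (x - landmarks[j]) / counts[j]
    return j

def adaptive_nystrom_fit(X, y, m=10, gamma=1.0, lr=0.5):
    """One pass over (X, y); the first m samples seed the landmarks (assumption)."""
    landmarks = X[:m].astype(float).copy()
    counts = np.ones(m)
    w = np.zeros(m)
    for x, target in zip(X[m:], y[m:]):
        old_landmarks = landmarks.copy()
        online_kmeans_step(landmarks, counts, x)
        # Least-squares refit (assumed form): make the new feature map
        # reproduce the old model's predictions at the updated landmarks.
        old_preds = nystrom_map(landmarks, old_landmarks, gamma) @ w
        Z_new = nystrom_map(landmarks, landmarks, gamma)
        w = np.linalg.lstsq(Z_new, old_preds, rcond=1e-6)[0]
        # Gradient step on the squared loss 0.5 * (w.z - target)^2.
        z = nystrom_map(x[None, :], landmarks, gamma)[0]
        w -= lr * (w @ z - target) * z
    return landmarks, w
```

Predictions for new points would then be `nystrom_map(X_new, landmarks, gamma) @ w`. Each step costs roughly O(m^3) here because the landmark kernel matrix is re-factorized from scratch; a budget-conscious implementation would presumably update it incrementally.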
Area Attention
Existing attention mechanisms are trained to attend to individual items in a
collection (the memory) with a predefined, fixed granularity, e.g., a word
token or an image grid. We propose area attention: a way to attend to areas in
the memory, where each area contains a group of items that are structurally
adjacent, e.g., spatially for a 2D memory such as images, or temporally for a
1D memory such as natural language sentences. Importantly, the shape and the
size of an area are dynamically determined via learning, which enables a model
to attend to information with varying granularity. Area attention can easily
work with existing model architectures such as multi-head attention for
simultaneously attending to multiple areas in the memory. We evaluate area
attention on two tasks: neural machine translation (both character and
token-level) and image captioning, and improve upon strong (state-of-the-art)
baselines in all the cases. These improvements are obtainable with a basic form
of area attention that is parameter free.
Comment: @InProceedings{pmlr-v97-li19e, title = {Area Attention}, author =
{Li, Yang and Kaiser, Lukasz and Bengio, Samy and Si, Si}, booktitle =
{Proceedings of the 36th International Conference on Machine Learning}, pages
= {3846--3855}, year = {2019}, volume = {97}, series = {Proceedings of
Machine Learning Research}, publisher = {PMLR}}
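A parameter-free form of area attention over a 1D memory might look like the sketch below. The abstract does not spell out how an area is summarized, so the rule used here (area key = mean of the item keys, area value = sum of the item values), the scaled dot-product scoring, and the names `area_features_1d` and `area_attention_1d` are assumptions made for illustration.

```python
import numpy as np

def area_features_1d(keys, values, max_width=3):
    """Enumerate every contiguous span of up to max_width items.

    Assumed summary rule: area key = mean of item keys,
    area value = sum of item values (no learned parameters).
    """
    n = keys.shape[0]
    area_keys, area_values, spans = [], [], []
    for width in range(1, max_width + 1):
        for start in range(n - width + 1):
            area_keys.append(keys[start:start + width].mean(axis=0))
            area_values.append(values[start:start + width].sum(axis=0))
            spans.append((start, start + width))
    return np.stack(area_keys), np.stack(area_values), spans

def area_attention_1d(query, keys, values, max_width=3):
    """Softmax attention of a single query over all enumerated areas."""
    area_keys, area_values, spans = area_features_1d(keys, values, max_width)
    scores = area_keys @ query / np.sqrt(query.shape[0])
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ area_values, weights, spans
```

With `max_width=1` every area is a single item, so the function reduces to ordinary single-query dot-product attention; larger widths let the same query place weight on multi-item spans.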
Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks
Neural language models have been widely used in various NLP tasks, including
machine translation, next word prediction and conversational agents. However,
it is challenging to deploy these models on mobile devices due to their slow
prediction speed, where the bottleneck is to compute top candidates in the
softmax layer. In this paper, we introduce a novel softmax layer approximation
algorithm by exploiting the clustering structure of context vectors. Our
algorithm uses a light-weight screening model to predict a much smaller set of
candidate words based on the given context, and then conducts an exact softmax
only within that subset. Training such a procedure end-to-end is challenging as
traditional clustering methods are discrete and non-differentiable, and thus
unable to be used with back-propagation in the training process. Using the
Gumbel softmax, we are able to train the screening model end-to-end on the
training set to exploit the data distribution. The algorithm achieves an order
of magnitude faster inference than the original softmax layer for predicting
top-$k$ words in various tasks such as beam search in machine translation or
next-word prediction. For example, for a machine translation task on a
German-to-English dataset with a vocabulary of around 25K words, we achieve a
20.4x speedup with 98.9% precision@1 and 99.3% precision@5 with respect to the
original softmax layer prediction, while the state of the art
\citep{MSRprediction} achieves only a 6.7x speedup with 98.7% precision@1 and
98.1% precision@5 on the same task.
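The inference path described above (pick a cluster from the context vector, then run an exact softmax only over that cluster's candidate words) can be sketched as follows. This is an illustration under assumptions, not the paper's method: the candidate sets here are built by a simple counting heuristic rather than learned end-to-end with the Gumbel softmax, and the names `screened_topk` and `build_candidate_sets`, as well as the linear cluster scoring `centers @ h`, are hypothetical.

```python
import numpy as np

def screened_topk(h, centers, candidate_sets, W, b, k=5):
    """Top-k words for context h using only one cluster's candidate words."""
    cluster = int(np.argmax(centers @ h))      # light-weight screening step
    cand = candidate_sets[cluster]             # word ids to score exactly
    logits = W[cand] @ h + b[cand]             # softmax restricted to the subset
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(-logits)[:k]
    return cand[order], probs[order]

def build_candidate_sets(contexts, centers, W, b, k=5):
    """Heuristic stand-in for the learned screening model (assumption).

    Each cluster's candidate set is the union of the exact top-k words of
    the sample contexts assigned to that cluster.
    """
    sets = [set() for _ in range(centers.shape[0])]
    for h in contexts:
        cluster = int(np.argmax(centers @ h))
        sets[cluster].update(np.argsort(-(W @ h + b))[:k].tolist())
    full_vocab = np.arange(W.shape[0])         # fallback for empty clusters
    return [np.array(sorted(s), dtype=int) if s else full_vocab for s in sets]
```

The saving comes from the size of `cand`: the restricted product `W[cand] @ h` touches only the candidate rows of the output embedding instead of the full vocabulary. The returned probabilities are normalized over the candidate subset, so they are approximations of the full-softmax probabilities.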
Localization Trajectory and Chern-Simons axion coupling for Bilayer Quantum Anomalous Hall Systems
Quantum anomalous Hall (QAH) multilayers provide a platform of topological
materials with high Chern numbers. We investigate the localization routes of
bilayer QAH systems with Chern number C = 2 under strong disorder, by numerical
simulations on their quantum transport properties and the Chern-Simons axion
coupling. Compared to the single layer counterpart with C = 2, the localization
trajectories present much richer behaviors, for example, the existence of the
stable intermediate state with C = 1 can be tuned by model parameters. This
state was always unstable in the single layer case. Furthermore, the two
parameter scaling trajectories also exhibit multiple patterns, some of which
were not captured by the standard Pruisken picture. During the process towards
localization, the Chern-Simons axion coupling shows a remarkable peak, which
is even higher and sharper in the large-size limit. Therefore, the
disordered bilayer QAH system can be a good candidate for this nontrivial
magnetoelectric coupling mediated by orbital motions.
Comment: 11 pages, 11 figures
Cyclone intensity estimate with context-aware CycleGAN
Deep learning approaches to cyclone intensity estimation have recently shown
promising results. However, suffering from the extreme scarcity of cyclone data
for specific intensities, most existing deep learning methods fail to achieve
satisfactory performance on cyclone intensity estimation, especially on
classes with few instances. To avoid the degradation of recognition
performance caused by scarce samples, we propose a context-aware CycleGAN which
learns the latent evolution features from adjacent cyclone intensities and
synthesizes CNN features of classes lacking samples from unpaired source
classes. Specifically, our approach synthesizes features conditioned on the
learned evolution features, while no extra information is required.
Experimental results on several evaluation methods show the effectiveness of
our approach, which can even predict unseen classes.
Comment: 5 pages
Cosmic Reionization Study : Principle Component Analysis After Planck
The study of reionization history plays an important role in understanding
the evolution of our universe. It is commonly believed that the intergalactic
medium (IGM) in our universe is fully ionized today; however, the reionization
process remains mysterious. A simple instantaneous reionization process
is usually adopted in modern cosmology without direct observational evidence.
However, the history of the ionization fraction, $x_e(z)$, will influence
cosmic microwave background (CMB) observables and constraints on the optical
depth $\tau$. With mock future data sets based on a featured reionization
model, we find that the bias on $\tau$ introduced by the instantaneous model
cannot be neglected. In this paper, we study the cosmic reionization history in
a model-independent way, the so-called principal component analysis (PCA)
method, and reconstruct $x_e(z)$ at different redshifts with the data sets of
Planck, WMAP 9-year temperature and polarization power spectra, combined with
the baryon acoustic oscillation (BAO) data from galaxy surveys and the type Ia
supernovae (SN) Union 2.1 sample, respectively. The results show that the
reconstructed $x_e(z)$ is consistent with instantaneous behavior; however,
there is a slight deviation from this behavior at some epochs. With the PCA
method, after abandoning the noisy modes, we get stronger constraints, and the
hints for a featured $x_e(z)$ evolution could become a little more obvious.
Comment: 12 pages, 10 figures
Monogamy deficit for quantum correlation in multipartite quantum system
We introduce the concept of monogamy deficit for quantum correlation by
combining together two types of monogamy inequalities depending on different
measurement sides. For tripartite pure states, we demonstrate a relation which
connects two types of monogamy inequalities for quantum discord and provide the
difference between them. By using this relation, we obtain a unified physical
interpretation for these two monogamy deficits. In addition, we find the
interesting fact that there is a general monogamy condition for several quantum
correlations for tripartite pure states. We then provide a necessary and
sufficient condition for the establishment of one kind of monogamy inequality
for tripartite mixed states and generalize it to multipartite quantum states.
Comment: 8 pages, 1 figure
Deformed Legendre Polynomial and Its Application
A new kind of deformed calculus was introduced recently in the study of
parabosonic coordinate representation. Based on this deformed calculus, a new
deformation of Legendre polynomials is proposed in this paper, and some of its
properties and applications are also discussed.
Comment: 11 pages, LaTeX
Tibet's Window on Primordial Gravitational Waves
As an essential part of China’s Gravitational Waves Program, the Ali CMB
Polarization Telescope (AliCPT) is a ground-based experiment targeting
Primordial Gravitational Waves (PGWs) by measuring the B-mode polarization of
the Cosmic Microwave Background (CMB). First proposed in 2014 and currently in a
fast construction phase, AliCPT is China’s first CMB project, with
commissioning planned for 2019. Led by the Institute of High Energy Physics
(IHEP) under the Chinese Academy of Sciences (CAS), the project is a worldwide
collaboration of more than fifteen universities and research institutes. The
Ali CMB Project is briefly introduced.
Measurement of weak static magnetic fields with nitrogen-vacancy color center
We propose a strategy to measure weak static magnetic fields with a
nitrogen-vacancy color center in diamond. Inspired by avian magnetoreception
models, we consider the feasibility of utilizing quantum coherence phenomena to
measure weak static magnetic fields. Nitrogen-vacancy (NV) color centers are
regarded as an ideal platform to study quantum sciences as a result of their
long coherence times, up to the millisecond timescale. In high-purity diamond,
hyperfine interaction with 13C nuclear spins dominates the decoherence process.
In this paper, we numerically simulate the decoherence process between 0 and +1
of the individual NV color center spin in 13C nuclear baths under various
magnitudes of external magnetic field. By applying a Hahn echo to the system,
we obtain the coherence of the NV color center spin as a function of total
evolution time and magnetic field. Furthermore, we obtain a high-accuracy
relationship between the three decoherence-characteristic timescales, i.e. T_W,
T_R, T_2, and the magnetic field B, and we conclude that T_R has the
highest sensitivity to the magnetic field among the three timescales. Thus, for
a given NV color center, T_R can serve as the scale for the magnitude of the
magnetic field, or rather, of the component along the NV electronic spin axis.
When measuring an unknown magnetic field, we adjust the NV axis to three
mutually orthogonal directions in turn. By this means, we obtain the three
components of the magnetic field and thus the magnitude and direction of the
actual magnetic field. The accuracy could reach 60 nT/Hz^{1/2}, and could be
greatly improved by using an ensemble of NV color centers or diamond crystals
purified with 12C atoms.
Comment: 17 pages, 5 figures, 1 table