14,552 research outputs found

    Nonlinear Online Learning with Adaptive Nyström Approximation

    The use of nonlinear feature maps via kernel approximation has led to success in many online learning tasks. Nyström approximation, a popular kernel approximation method, has been well investigated, and various landmark point selection methods have been proposed to improve the approximation quality. However, these improved Nyström methods cannot be directly applied to the online learning setting, as they need access to the entire dataset to learn the landmark points, while the model must be updated on the fly in the online setting. To address this challenge, we propose Adaptive Nyström approximation for solving nonlinear online learning problems. The key idea is to adaptively modify the landmark points via online k-means and adjust the model accordingly by solving a least-squares problem followed by a gradient descent step. We show that the resulting algorithm outperforms state-of-the-art online learning methods under the same budget.
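The two building blocks named in this abstract, a Nyström feature map and an online k-means landmark update, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the RBF kernel, the `gamma` and `eps` parameters, and all function names are assumptions, and the least-squares model adjustment and gradient step mentioned in the abstract are omitted.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel matrix between the rows of X and Y (kernel choice is an assumption)."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, landmarks, gamma=1.0, eps=1e-10):
    """Map X to features Z such that Z @ Z.T approximates the full kernel matrix.

    Uses Z = K_nm V diag(w)^(-1/2), where K_mm = V diag(w) V^T is the
    eigendecomposition of the landmark kernel matrix.
    """
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    K_nm = rbf_kernel(X, landmarks, gamma)
    w, V = np.linalg.eigh(K_mm)
    w = np.maximum(w, eps)  # guard against tiny or negative eigenvalues
    return K_nm @ (V / np.sqrt(w))

def online_kmeans_update(landmarks, counts, x):
    """Move the nearest landmark toward x (running-mean update), in place."""
    j = int(np.argmin(((landmarks - x) ** 2).sum(1)))
    counts[j] += 1
    landmarks[j] += (x - landmarks[j]) / counts[j]
    return j
```

When the landmarks coincide with the data points, `Z @ Z.T` reproduces the exact kernel matrix; with fewer landmarks it is a low-rank approximation whose quality depends on where the landmarks sit, which is why the landmark update matters.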

    Area Attention

    Existing attention mechanisms are trained to attend to individual items in a collection (the memory) with a predefined, fixed granularity, e.g., a word token or an image grid. We propose area attention: a way to attend to areas in the memory, where each area contains a group of items that are structurally adjacent, e.g., spatially for a 2D memory such as images, or temporally for a 1D memory such as natural language sentences. Importantly, the shape and the size of an area are dynamically determined via learning, which enables a model to attend to information with varying granularity. Area attention can easily work with existing model architectures such as multi-head attention for simultaneously attending to multiple areas in the memory. We evaluate area attention on two tasks: neural machine translation (both character and token-level) and image captioning, and improve upon strong (state-of-the-art) baselines in all cases. These improvements are obtainable with a basic form of area attention that is parameter-free. Comment: @InProceedings{pmlr-v97-li19e, title = {Area Attention}, author = {Li, Yang and Kaiser, Lukasz and Bengio, Samy and Si, Si}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, pages = {3846--3855}, year = {2019}, volume = {97}, series = {Proceedings of Machine Learning Research}, publisher = {PMLR}}

    Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks

    Neural language models have been widely used in various NLP tasks, including machine translation, next-word prediction and conversational agents. However, it is challenging to deploy these models on mobile devices due to their slow prediction speed, where the bottleneck is computing the top candidates in the softmax layer. In this paper, we introduce a novel softmax layer approximation algorithm that exploits the clustering structure of context vectors. Our algorithm uses a light-weight screening model to predict a much smaller set of candidate words based on the given context, and then conducts an exact softmax only within that subset. Training such a procedure end-to-end is challenging, as traditional clustering methods are discrete and non-differentiable, and thus cannot be used with back-propagation in the training process. Using the Gumbel softmax, we are able to train the screening model end-to-end on the training set to exploit the data distribution. The algorithm achieves an order of magnitude faster inference than the original softmax layer for predicting the top-$k$ words in various tasks such as beam search in machine translation or next-word prediction. For example, for a machine translation task on a German-to-English dataset with a vocabulary of around 25K words, we achieve a 20.4x speedup with 98.9% precision@1 and 99.3% precision@5 relative to the original softmax layer's predictions, while the state of the art \citep{MSRprediction} achieves only a 6.7x speedup with 98.7% precision@1 and 98.1% precision@5 on the same task.
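The inference-time idea in this abstract (screen first, then run an exact softmax over the surviving candidates) can be sketched as below. This is a hedged illustration only: `centroids`, `candidate_sets`, and the linear argmax screening rule are hypothetical stand-ins for the paper's learned screening model, and the Gumbel-softmax training procedure is not shown.

```python
import numpy as np

def screened_topk(h, W, b, centroids, candidate_sets, k=5):
    """Return the top-k word ids and their probabilities within a screened subset.

    h              : context vector, shape (d,)
    W, b           : softmax weights (V, d) and biases (V,)
    centroids      : hypothetical screening weights, shape (r, d)
    candidate_sets : list of r integer arrays of word ids, one per cluster
    """
    # Screening step: assign the context to one cluster (assumed linear rule).
    c = int(np.argmax(centroids @ h))
    cand = np.asarray(candidate_sets[c])

    # Exact softmax, but only over the candidate words of that cluster.
    logits = W[cand] @ h + b[cand]
    logits = logits - logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()

    order = np.argsort(-probs)[:k]
    return cand[order], probs[order]
```

The cost of the softmax step scales with the size of the candidate set rather than the full vocabulary, which is where the speedup comes from; the returned probabilities are normalized over the subset, not over the whole vocabulary.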

    Localization Trajectory and Chern-Simons axion coupling for Bilayer Quantum Anomalous Hall Systems

    Quantum anomalous Hall (QAH) multilayers provide a platform of topological materials with high Chern numbers. We investigate the localization routes of bilayer QAH systems with Chern number C = 2 under strong disorder, by numerical simulations of their quantum transport properties and the Chern-Simons axion coupling. Compared to the single-layer counterpart with C = 2, the localization trajectories present much richer behaviors; for example, the existence of the stable intermediate state with C = 1 can be tuned by model parameters. This state was always unstable in the single-layer case. Furthermore, the two-parameter scaling trajectories also exhibit multiple patterns, some of which are not captured by the standard Pruisken picture. During the process towards localization, the Chern-Simons axion coupling shows a remarkable peak, which becomes even higher and sharper in the large-size limit. Therefore, the disordered bilayer QAH system can be a good candidate for this nontrivial magnetoelectric coupling mediated by orbital motions. Comment: 11 pages, 11 figures

    Cyclone intensity estimate with context-aware cyclegan

    Deep learning approaches to cyclone intensity estimation have recently shown promising results. However, suffering from the extreme scarcity of cyclone data at specific intensities, most existing deep learning methods fail to achieve satisfactory performance on cyclone intensity estimation, especially on classes with few instances. To avoid the degradation of recognition performance caused by scarce samples, we propose a context-aware CycleGAN which learns the latent evolution features from adjacent cyclone intensities and synthesizes CNN features of classes lacking samples from unpaired source classes. Specifically, our approach synthesizes features conditioned on the learned evolution features, and no extra information is required. Experimental results under several evaluation methods show the effectiveness of our approach, which can even predict unseen classes. Comment: 5 pages

    Monogamy deficit for quantum correlation in multipartite quantum system

    We introduce the concept of monogamy deficit for quantum correlation by combining two types of monogamy inequalities that depend on different measurement sides. For tripartite pure states, we demonstrate a relation which connects the two types of monogamy inequalities for quantum discord and provide the difference between them. Using this relation, we obtain a unified physical interpretation of these two monogamy deficits. In addition, we find that there is a general monogamy condition for several quantum correlations for tripartite pure states. We then provide a necessary and sufficient condition for one kind of monogamy inequality to hold for tripartite mixed states and generalize it to multipartite quantum states. Comment: 8 pages, 1 figure

    Cosmic Reionization Study : Principle Component Analysis After Planck

    The study of reionization history plays an important role in understanding the evolution of our universe. It is commonly believed that the intergalactic medium (IGM) in our universe is fully ionized today; however, the reionization process remains mysterious. A simple instantaneous reionization process is usually adopted in modern cosmology without direct observational evidence. However, the history of the ionization fraction, $x_e(z)$, will influence cosmic microwave background (CMB) observables and constraints on the optical depth $\tau$. With mock future data sets based on a featured reionization model, we find that the bias on $\tau$ introduced by the instantaneous model cannot be neglected. In this paper, we study the cosmic reionization history in a model-independent way, the so-called principal component analysis (PCA) method, and reconstruct $x_e(z)$ at different redshifts $z$ with the data sets of Planck and WMAP 9-year temperature and polarization power spectra, combined with the baryon acoustic oscillation (BAO) data from galaxy surveys and the type Ia supernovae (SN) Union 2.1 sample, respectively. The results show that the reconstructed $x_e(z)$ is consistent with instantaneous behavior; however, there is a slight deviation from this behavior at some epochs. With the PCA method, after discarding the noisy modes, we get stronger constraints, and the hints of a featured $x_e(z)$ evolution become slightly more apparent. Comment: 12 pages, 10 figures

    Deformed Legendre Polynomial and Its Application

    A new kind of deformed calculus was introduced recently in the study of the parabosonic coordinate representation. Based on this deformed calculus, a new deformation of Legendre polynomials is proposed in this paper, and some of its properties and applications are discussed. Comment: 11 pages, LaTeX

    Tibet's Window on Primordial Gravitational Waves

    As an essential part of China’s Gravitational Waves Program, the Ali CMB Polarization Telescope (AliCPT) is a ground-based experiment aimed at detecting Primordial Gravitational Waves (PGWs) by measuring the B-mode polarization of the Cosmic Microwave Background (CMB). First proposed in 2014 and currently in a fast construction phase, AliCPT is China’s first CMB project, with commissioning planned for 2019. Led by the Institute of High Energy Physics (IHEP) under the Chinese Academy of Sciences (CAS), the project is a worldwide collaboration of more than fifteen universities and research institutes. The Ali CMB Project is briefly introduced.

    Measurement of weak static magnetic fields with nitrogen-vacancy color center

    We propose a strategy to measure weak static magnetic fields with a nitrogen-vacancy color center in diamond. Inspired by avian magnetoreception models, we consider the feasibility of utilizing quantum coherence phenomena to measure weak static magnetic fields. Nitrogen-vacancy (NV) color centers are regarded as an ideal platform for studying quantum sciences because of their long coherence times, up to the millisecond timescale. In high-purity diamond, the hyperfine interaction with 13C nuclear spins dominates the decoherence process. In this paper, we numerically simulate the decoherence process between 0 and +1 of an individual NV color center spin in 13C nuclear baths under external magnetic fields of various magnitudes. By applying a Hahn echo to the system, we obtain the coherence of the NV color center spin as a function of total evolution time and magnetic field. Furthermore, we obtain a high-accuracy relationship between the three decoherence-characteristic timescales, i.e. T_W, T_R, T_2, and the magnetic field B, and we conclude that T_R is the most sensitive to the magnetic field among the three timescales. Thus, for a given NV color center, T_R can serve as a measure of the magnitude of the magnetic field, or rather, of its component along the NV electronic spin axis. When measuring an unknown magnetic field, we align the NV axis along three mutually orthogonal directions in turn. In this way, we obtain the three components of the magnetic field and thus the magnitude and direction of the actual magnetic field. The accuracy could reach 60 nT/Hz^{1/2}, and could be greatly improved by using an ensemble of NV color centers or diamond crystals purified with 12C atoms. Comment: 17 pages, 5 figures, 1 table