    Scalable and Sustainable Deep Learning via Randomized Hashing

    Current deep learning architectures are growing larger in order to learn from complex datasets. These architectures require giant matrix multiplication operations to train millions of parameters. Conversely, there is another growing trend to bring deep learning to low-power, embedded devices. The matrix operations associated with both training and testing of deep networks are very expensive from a computational and energy standpoint. We present a novel hashing-based technique to drastically reduce the amount of computation needed to train and test deep networks. Our approach combines recent ideas from adaptive dropout and randomized hashing for maximum inner product search to efficiently select the nodes with the highest activations. Our new algorithm for deep learning reduces the overall computational cost of forward and back-propagation by operating on significantly fewer (sparse) nodes. As a consequence, our algorithm uses only 5% of the total multiplications while staying, on average, within 1% of the accuracy of the original model. A unique property of the proposed hashing-based back-propagation is that the updates are always sparse. Because the gradient updates are sparse, our algorithm is ideally suited for asynchronous and parallel training, leading to near-linear speedup with an increasing number of cores. We demonstrate the scalability and sustainability (energy efficiency) of our proposed algorithm via rigorous experimental evaluations on several real datasets.
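
    The node-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in plain signed random projections (SimHash), which approximate angular similarity rather than true maximum inner product search, and every name in it (`make_planes`, `build_tables`, `select_active_nodes`, `sparse_forward`) is hypothetical. The point is only that hash-table lookups return a small candidate set of nodes, and the forward pass is computed for those nodes alone.

```python
import random
from collections import defaultdict


def make_planes(dim, n_bits, rng):
    """Draw n_bits random hyperplanes (assumed hash family: SimHash)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def simhash(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_tables(weights, all_planes):
    """Index every node's weight vector in one hash table per plane set."""
    tables = []
    for planes in all_planes:
        table = defaultdict(list)
        for node_id, w in enumerate(weights):
            table[simhash(w, planes)].append(node_id)
        tables.append(table)
    return tables


def select_active_nodes(x, tables, all_planes):
    """Union of the buckets the input falls into: the candidate nodes."""
    active = set()
    for table, planes in zip(tables, all_planes):
        active.update(table.get(simhash(x, planes), []))
    return active


def sparse_forward(x, weights, active):
    """ReLU activations computed only for the selected nodes."""
    return {
        j: max(0.0, sum(w * v for w, v in zip(weights[j], x)))
        for j in active
    }


# Toy usage: 200 nodes, 4 hash tables of 6 bits each.
rng = random.Random(0)
dim, n_nodes = 16, 200
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_nodes)]
all_planes = [make_planes(dim, 6, rng) for _ in range(4)]
tables = build_tables(weights, all_planes)
x = list(weights[7])  # an input aligned with node 7
active = select_active_nodes(x, tables, all_planes)
activations = sparse_forward(x, weights, active)
```

    In this toy setup only the nodes in `active` are multiplied against the input, so the work per layer scales with the size of the candidate set rather than with the layer width. The number of tables and bits per table are illustrative choices, not values from the paper.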

    On the Nature of X-ray Variability in Ark 564

    We use data from a recent long ASCA observation of the Narrow Line Seyfert 1 Ark 564 to investigate its timing properties in detail. We show that a thorough analysis of the time series, employing techniques not generally applied to AGN light curves, can provide useful information to characterize the engines of these powerful sources. We searched for signs of non-stationarity in the data, but did not find strong evidence for it. We find that the process causing the variability is very likely nonlinear, suggesting that variability models based on many active regions, such as the shot noise model, may not be applicable to Ark 564. The complex light curve can be viewed, for a limited range of time scales, as a fractal object with non-trivial fractal dimension and statistical self-similarity. Finally, using a nonlinear statistic based on the scaling index as a tool to discriminate time series, we demonstrate that the high and low count rate states, which are indistinguishable on the basis of their autocorrelation, structure and probability density functions, are intrinsically different, with the high state characterized by higher complexity. Comment: 13 pages, 13 figures, accepted for publication in A&

    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that search time scales with metric entropy (the number of covering hyperspheres) if the fractal dimension of the dataset is low, and that space scales with the sum of metric entropy and information-theoretic entropy (the randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology. Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
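
    The covering-hypersphere idea in the abstract above can be sketched as a coarse-then-fine search. This is a minimal illustration under stated assumptions, not the code of Ammolite, MICA, or esFragBag: it assumes Euclidean points, a greedy covering with cluster radius `r_c`, and hypothetical function names (`build_clusters`, `search`). A query at radius `r` first scans only the cluster centers, keeps those within `r + r_c`, and then checks the members of the surviving clusters. By the triangle inequality, no true hit can sit in a discarded cluster.

```python
import math


def dist(a, b):
    """Euclidean distance (assumed metric; any true metric would work)."""
    return math.dist(a, b)


def build_clusters(points, r_c):
    """Greedy covering: each point joins the first center within r_c,
    otherwise it becomes a new center."""
    clusters = []  # list of (center, members)
    for p in points:
        for center, members in clusters:
            if dist(p, center) <= r_c:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters


def search(query, clusters, r, r_c):
    """Return every indexed point within distance r of the query."""
    hits = []
    for center, members in clusters:
        # Coarse step: a cluster can hold a hit only if its center
        # lies within r + r_c of the query (triangle inequality).
        if dist(query, center) <= r + r_c:
            # Fine step: exact check on the members of that cluster.
            hits.extend(m for m in members if dist(query, m) <= r)
    return hits
```

    The coarse step costs one distance computation per cluster, which is where the dependence on the number of covering hyperspheres comes from. How much the fine step costs depends on how many clusters survive the coarse filter, which this sketch does not analyze.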

    Turbulence in the Solar Atmosphere: Manifestations and Diagnostics via Solar Image Processing

    Intermittent magnetohydrodynamical turbulence is most likely at work in the magnetized solar atmosphere. As a result, an array of scaling and multi-scaling image-processing techniques can be used to measure the expected self-organization of solar magnetic fields. While these techniques advance our understanding of the physical system at work, it is unclear whether they can be used to predict solar eruptions, thus obtaining a practical significance for space weather. We address part of this problem by focusing on solar active regions and by investigating the usefulness of scaling and multi-scaling image-processing techniques in solar flare prediction. Since solar flares exhibit spatial and temporal intermittency, we suggest that they are the products of instabilities subject to a critical threshold in a turbulent magnetic configuration. The identification of this threshold in scaling and multi-scaling spectra would then contribute meaningfully to the prediction of solar flares. We find that the fractal dimension of solar magnetic fields and their multi-fractal spectrum of generalized correlation dimensions do not have significant predictive ability. The respective multi-fractal structure functions and their inertial-range scaling exponents, however, probably provide some statistical distinguishing features between flaring and non-flaring active regions. More importantly, the temporal evolution of the above scaling exponents in flaring active regions probably shows a distinct behavior starting a few hours prior to a flare and therefore this temporal behavior may be practically useful in flare prediction. The results of this study need to be validated by more comprehensive works over a large number of solar active regions. Comment: 26 pages, 7 figure