
    Speculative Approximations for Terascale Analytics

    Model calibration is a major challenge faced by the plethora of statistical analytics packages that are increasingly used in Big Data applications. Identifying the optimal model parameters is a time-consuming process that has to be executed from scratch for every dataset/model combination, even by experienced data scientists. We argue that the inability to evaluate multiple parameter configurations simultaneously and the lack of support for quickly identifying sub-optimal configurations are the principal causes. In this paper, we develop two database-inspired techniques for efficient model calibration. Speculative parameter testing applies advanced parallel multi-query processing methods to evaluate several configurations concurrently. The number of configurations is determined adaptively at runtime, while the configurations themselves are drawn from a distribution that is continuously learned following a Bayesian process. Online aggregation is applied to identify sub-optimal configurations early in the processing by incrementally sampling the training dataset and estimating the objective function corresponding to each configuration. We design concurrent online aggregation estimators and define halting conditions to stop the execution accurately and promptly. We apply the proposed techniques to distributed gradient descent optimization -- batch and incremental -- for support vector machine and logistic regression models. We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big Data analytics system -- and evaluate their performance over terascale-size synthetic and real datasets. The results confirm that as many as 32 configurations can be evaluated concurrently almost as fast as one, while sub-optimal configurations are detected accurately in as little as a $1/20^{\text{th}}$ fraction of the time.
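    The early-stopping side of this approach lends itself to a short illustration. The sketch below is an assumption-laden toy, not the GLADE PF-OLA implementation: it grades every surviving configuration on an incrementally growing sample of the training data and halts a configuration once its optimistic (lower) loss bound is already worse than the best configuration's pessimistic (upper) bound. The function and parameter names (`speculative_calibration`, `loss_fn`, `chunk`, `z`) are hypothetical.

    ```python
    import numpy as np

    def speculative_calibration(X, y, configs, loss_fn, chunk=1000, z=3.0):
        """Toy online-aggregation estimator for early termination.

        loss_fn(config, X_batch, y_batch) must return per-example losses,
        e.g. hinge loss for an SVM or log loss for logistic regression.
        """
        n = len(X)
        order = np.random.permutation(n)   # incremental sample of the training set
        alive = set(range(len(configs)))   # configurations still being evaluated
        sums = np.zeros(len(configs))
        sqs = np.zeros(len(configs))
        m = 0
        for start in range(0, n, chunk):
            idx = order[start:start + chunk]
            m += len(idx)
            for c in alive:
                losses = loss_fn(configs[c], X[idx], y[idx])
                sums[c] += losses.sum()
                sqs[c] += (losses ** 2).sum()
            means = sums / m               # meaningful only for configs in `alive`
            stderr = np.sqrt(np.maximum(sqs / m - means ** 2, 0.0) / m)
            best_upper = min(means[c] + z * stderr[c] for c in alive)
            # halt any configuration whose optimistic bound cannot beat the leader
            alive = {c for c in alive if means[c] - z * stderr[c] <= best_upper}
            if len(alive) == 1:
                break
        return alive, means
    ```

    A real system would interleave this test with the gradient descent steps themselves; the sketch only shows the halting condition on the objective estimates.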

    Subtraction-noise projection in gravitational-wave detector networks

    In this paper, we present a successful implementation of a subtraction-noise projection method in a simple, simulated data-analysis pipeline of a gravitational-wave search. We investigate the problem of revealing a weak stochastic background signal that is covered by a strong foreground of compact-binary coalescences. The foreground, which is estimated by matched filters, has to be subtracted from the data. Even an optimal analysis of the foreground signals will leave subtraction noise due to estimation errors in the template parameters, which may corrupt the measurement of the background signal. The subtraction noise can be removed by a noise projection. We apply our analysis pipeline to the proposed future-generation space-borne Big Bang Observer (BBO) mission, which searches for a stochastic background of primordial GWs in the frequency range $\sim 0.1$-$1$ Hz covered by a foreground of black-hole and neutron-star binaries. Our analysis is based on a simulation code which provides a dynamical model of a time-delay interferometer (TDI) network. It generates the data as time series and incorporates the analysis pipeline together with the noise projection. Our results confirm previous ad hoc predictions that BBO will be sensitive to backgrounds with fractional energy densities below $\Omega = 10^{-16}$. Comment: 54 pages, 15 figures
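    To first order, the subtraction noise lies in the subspace spanned by the derivatives of the best-fit template waveforms with respect to their estimated parameters, so it can be removed by projecting the residual data onto the orthogonal complement of that subspace. Below is a minimal sketch of that idea, assuming white noise, real-valued time series, and precomputed derivative waveforms; the function and argument names are hypothetical, and this is not the paper's pipeline code.

    ```python
    import numpy as np

    def project_out_subtraction_noise(residual, dh_dtheta):
        """Apply P = I - H (H^T H)^+ H^T to the post-subtraction residual.

        residual  : (N,) data after subtracting the best-fit foreground templates
        dh_dtheta : (N, p) partial derivatives of the template waveforms with
                    respect to their p estimated parameters (columns of H)
        """
        H = dh_dtheta
        # least-squares coefficients of the residual on the derivative basis
        coef, *_ = np.linalg.lstsq(H, residual, rcond=None)
        # removing the fitted component projects out the subtraction noise,
        # since parameter estimation errors enter the residual as H @ delta_theta
        return residual - H @ coef
    ```

    For colored detector noise the inner products would be noise-weighted, but the structure of the projection is the same.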

    All Transients, All the Time: Real-Time Radio Transient Detection with Interferometric Closure Quantities

    We demonstrate a new technique for detecting radio transients based on interferometric closure quantities. The technique uses the bispectrum, the product of visibilities around a closed loop of baselines of an interferometer. The bispectrum is calibration-independent, resistant to interference, and computationally efficient, so it can be built into correlators for real-time transient detection. Our technique could find celestial transients anywhere in the field of view and localize them to arcsecond precision. At the Karl G. Jansky Very Large Array (VLA), such a system would have a high survey speed and a 5-sigma sensitivity of 38 mJy on 10 ms timescales with 1 GHz of bandwidth. The ability to localize dispersed millisecond pulses to arcsecond precision in large volumes of interferometer data has several unique science applications. Localizing individual pulses from Galactic pulsars will help find X-ray counterparts that define their physical properties, while finding the host galaxies of extragalactic transients will measure the electron density of the intergalactic medium with a single dispersed pulse. Exoplanets and active stars have distinct millisecond variability that can be used to identify them and probe their magnetospheres. We use millisecond-timescale visibilities from the Allen Telescope Array (ATA) and the VLA to show that the bispectrum can detect dispersed pulses and reject local interference. The computational and data efficiency of the bispectrum will help find transients on a range of timescales with next-generation radio interferometers. Comment: Accepted to ApJ. 8 pages, 5 figures, 2 tables. Revised to include a discussion of the non-Gaussian statistics of the technique.
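    The core detection statistic is simple enough to sketch directly. In the toy version below (the array layout and function name are assumptions, not the authors' correlator code), the bispectrum is formed on every closed triangle of antennas and averaged; per-antenna gain phases cancel around each loop, which is why no calibration is needed, and a point source of flux S at the phase center contributes roughly S^3 to the mean.

    ```python
    import numpy as np
    from itertools import combinations

    def mean_bispectrum(vis):
        """Average bispectrum over all antenna triangles for one integration.

        vis : (n_ant, n_ant) complex Hermitian matrix of visibilities,
              with vis[i, j] the baseline between antennas i and j and
              vis[j, i] = conj(vis[i, j]).
        """
        n_ant = vis.shape[0]
        triples = [vis[i, j] * vis[j, k] * vis[k, i]   # product around a closed loop
                   for i, j, k in combinations(range(n_ant), 3)]
        return np.mean(triples).real
    ```

    One could threshold this statistic per integration (after dedispersion trials) to trigger in real time; averaging over triangles keeps the statistic far cheaper than imaging every integration.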

    Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications

    Wireless sensor networks monitor dynamic environments that change rapidly over time. This dynamic behavior is either caused by external factors or initiated by the system designers themselves. To adapt to such conditions, sensor networks often adopt machine learning techniques to eliminate the need for unnecessary redesign. Machine learning also inspires many practical solutions that maximize resource utilization and prolong the lifespan of the network. In this paper, we present an extensive literature review, covering the period 2002-2013, of machine learning methods that have been used to address common issues in wireless sensor networks (WSNs). The advantages and disadvantages of each proposed algorithm are evaluated against the corresponding problem. We also provide a comparative guide to aid WSN designers in developing suitable machine learning solutions for their specific application challenges. Comment: Accepted for publication in IEEE Communications Surveys and Tutorials