Speculative Approximations for Terascale Analytics
Model calibration is a major challenge faced by the plethora of statistical
analytics packages that are increasingly used in Big Data applications.
Identifying the optimal model parameters is a time-consuming process that has
to be executed from scratch for every dataset/model combination even by
experienced data scientists. We argue that the incapacity to evaluate multiple
parameter configurations simultaneously and the lack of support to quickly
identify sub-optimal configurations are the principal causes. In this paper, we
develop two database-inspired techniques for efficient model calibration.
Speculative parameter testing applies advanced parallel multi-query processing
methods to evaluate several configurations concurrently. The number of
configurations is determined adaptively at runtime, while the configurations
themselves are extracted from a distribution that is continuously learned
following a Bayesian process. Online aggregation is applied to identify
sub-optimal configurations early in the processing by incrementally sampling
the training dataset and estimating the objective function corresponding to
each configuration. We design concurrent online aggregation estimators and
define halting conditions to stop the execution accurately and promptly. We apply
the proposed techniques to distributed gradient descent optimization -- batch
and incremental -- for support vector machines and logistic regression models.
We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big
Data analytics system -- and evaluate their performance over terascale-size
synthetic and real datasets. The results confirm that as many as 32
configurations can be evaluated concurrently almost as fast as one, while
sub-optimal configurations are detected accurately in as little as a
fraction of the time
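
The combination of concurrent evaluation and early halting described in this abstract can be illustrated with a minimal sketch. It is not the GLADE PF-OLA implementation: the function names (`per_example_loss`, `prune_configs`), the normal-approximation confidence bounds, the chunk size, and the pruning rule are all assumptions chosen for illustration. Several candidate logistic regression models share one pass over pre-shuffled data, and a candidate is dropped once its estimated loss is confidently worse than the best one.

```python
import numpy as np

def per_example_loss(w, X, y):
    """Logistic loss of each example for labels y in {-1, +1}."""
    return np.logaddexp(0.0, -y * (X @ w))

def prune_configs(candidates, X, y, chunk_size=200, z=2.0):
    """Estimate the mean loss of all candidates in one shared data pass.

    Hypothetical helper: assumes the rows of X are already in random order,
    so each prefix is a uniform sample. Returns the indices of surviving
    candidates and the number of examples scanned before halting.
    """
    k = len(candidates)
    total = np.zeros(k)      # running sum of losses per candidate
    total_sq = np.zeros(k)   # running sum of squared losses per candidate
    alive = np.ones(k, dtype=bool)
    n = 0
    for start in range(0, len(y), chunk_size):
        Xc = X[start:start + chunk_size]
        yc = y[start:start + chunk_size]
        n += len(yc)
        idx = np.flatnonzero(alive)
        # Every surviving candidate is updated from the same chunk.
        for i in idx:
            terms = per_example_loss(candidates[i], Xc, yc)
            total[i] += terms.sum()
            total_sq[i] += np.square(terms).sum()
        # Normal-approximation bounds on the mean loss (an assumption).
        mean = total[idx] / n
        var = np.maximum(total_sq[idx] / n - mean ** 2, 0.0)
        half = z * np.sqrt(var / n)
        best_upper = np.min(mean + half)
        # Drop candidates whose lower bound exceeds the best upper bound.
        alive[idx[mean - half > best_upper]] = False
        if alive.sum() == 1:
            break            # halting condition: a single candidate remains
    return np.flatnonzero(alive), n

# Synthetic, noise-free data: candidate 0 is the generating model.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = np.sign(X @ w_true)
candidates = [w_true, -w_true, np.zeros(3)]
survivors, scanned = prune_configs(candidates, X, y)
```

The candidate with the smallest upper bound can never be pruned, so at least one configuration always survives. A real system would also need estimators that remain valid when many configurations are tested at once, which this sketch does not address.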
Subtraction-noise projection in gravitational-wave detector networks
In this paper, we present a successful implementation of a subtraction-noise
projection method into a simple, simulated data analysis pipeline of a
gravitational-wave search. We investigate the problem of revealing a weak
stochastic background signal that is covered by a strong foreground of
compact-binary coalescences. The foreground, which is estimated by matched
filters, has to be subtracted from the data. Even an optimal analysis of
foreground signals will leave subtraction noise due to estimation errors of
template parameters which may corrupt the measurement of the background signal.
The subtraction noise can be removed by a noise projection. We apply our
analysis pipeline to the proposed future-generation space-borne Big Bang
Observer (BBO) mission, which seeks a stochastic background of primordial
GWs in the frequency range Hz covered by a foreground of
black-hole and neutron-star binaries. Our analysis is based on a simulation
code which provides a dynamical model of a time-delay interferometer (TDI)
network. It generates the data as time series and incorporates the analysis
pipeline together with the noise projection. Our results confirm previous ad
hoc predictions which say that BBO will be sensitive to backgrounds with
fractional energy densities below
Comment: 54 pages, 15 figures
All Transients, All the Time: Real-Time Radio Transient Detection with Interferometric Closure Quantities
We demonstrate a new technique for detecting radio transients based on
interferometric closure quantities. The technique uses the bispectrum, the
product of visibilities around a closed loop of baselines of an interferometer.
The bispectrum is calibration independent, resistant to interference, and
computationally efficient, so it can be built into correlators for real-time
transient detection. Our technique could find celestial transients anywhere in
the field of view and localize them to arcsecond precision. At the Karl G.
Jansky Very Large Array (VLA), such a system would have a high survey speed and
a 5-sigma sensitivity of 38 mJy on 10 ms timescales with 1 GHz of bandwidth.
The ability to localize dispersed millisecond pulses to arcsecond precision in
large volumes of interferometer data has several unique science applications.
Localizing individual pulses from Galactic pulsars will help find X-ray
counterparts that define their physical properties, while finding host galaxies
of extragalactic transients will measure the electron density of the
intergalactic medium with a single dispersed pulse. Exoplanets and active stars
have distinct millisecond variability that can be used to identify them and
probe their magnetospheres. We use millisecond time scale visibilities from the
Allen Telescope Array (ATA) and VLA to show that the bispectrum can detect
dispersed pulses and reject local interference. The computational and data
efficiency of the bispectrum will help find transients on a range of time
scales with next-generation radio interferometers.
Comment: Accepted to ApJ. 8 pages, 5 figures, 2 tables. Revised to include
discussion of non-Gaussian statistics of technique
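
The claim that the bispectrum is calibration independent can be checked with a small numerical sketch. It assumes a noise-free point source at the phase centre and phase-only antenna errors; the helper names (`simulate_visibilities`, `bispectrum`) are hypothetical and are not taken from the paper. Each visibility picks up the phase error of its two antennas, but those errors cancel in the product around a closed triangle of baselines.

```python
import numpy as np

def simulate_visibilities(flux, antenna_phase):
    """Visibilities V_ij = flux * g_i * conj(g_j) with phase-only gains g.

    Assumes a point source at the phase centre and no thermal noise.
    """
    g = np.exp(1j * np.asarray(antenna_phase))
    return flux * np.outer(g, np.conj(g))

def bispectrum(vis, i, j, k):
    """Product of visibilities around the closed loop i->j, j->k, k->i."""
    return vis[i, j] * vis[j, k] * vis[k, i]

rng = np.random.default_rng(0)
flux = 2.0
phase_errors = rng.uniform(-np.pi, np.pi, size=4)  # uncalibrated antennas

corrupted = simulate_visibilities(flux, phase_errors)
ideal = simulate_visibilities(flux, np.zeros(4))

b_corrupted = bispectrum(corrupted, 0, 1, 2)
b_ideal = bispectrum(ideal, 0, 1, 2)
# Both products equal flux**3, because g_i * conj(g_i) = 1 for each antenna.
```

Antenna amplitude errors would not cancel in this product, so the invariance shown here covers phase errors only.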
Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications
Wireless sensor networks monitor dynamic environments that change rapidly
over time. This dynamic behavior is either caused by external factors or
initiated by the system designers themselves. To adapt to such conditions,
sensor networks often adopt machine learning techniques to eliminate the need
for unnecessary redesign. Machine learning also inspires many practical
solutions that maximize resource utilization and prolong the lifespan of the
network. In this paper, we present an extensive literature review over the
period 2002-2013 of machine learning methods that were used to address common
issues in wireless sensor networks (WSNs). The advantages and disadvantages of
each proposed algorithm are evaluated against the corresponding problem. We
also provide a comparative guide to aid WSN designers in developing suitable
machine learning solutions for their specific application challenges.
Comment: Accepted for publication in IEEE Communications Surveys and Tutorials