Speculative Approximations for Terascale Analytics
Model calibration is a major challenge faced by the plethora of statistical
analytics packages that are increasingly used in Big Data applications.
Identifying the optimal model parameters is a time-consuming process that has
to be executed from scratch for every dataset/model combination even by
experienced data scientists. We argue that the inability to evaluate multiple
parameter configurations simultaneously and the lack of support for quickly
identifying sub-optimal configurations are the principal causes. In this paper, we
develop two database-inspired techniques for efficient model calibration.
Speculative parameter testing applies advanced parallel multi-query processing
methods to evaluate several configurations concurrently. The number of
configurations is determined adaptively at runtime, while the configurations
themselves are extracted from a distribution that is continuously learned
following a Bayesian process. Online aggregation is applied to identify
sub-optimal configurations early in the processing by incrementally sampling
the training dataset and estimating the objective function corresponding to
each configuration. We design concurrent online aggregation estimators and
define halting conditions that stop the execution accurately and promptly. We apply
the proposed techniques to distributed gradient descent optimization -- batch
and incremental -- for support vector machines and logistic regression models.
We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big
Data analytics system -- and evaluate their performance over terascale-size
synthetic and real datasets. The results confirm that as many as 32
configurations can be evaluated concurrently almost as fast as one, while
sub-optimal configurations are detected accurately in as little as a
fraction of the time
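The early-detection idea in the abstract above can be illustrated with a minimal sketch. It is not the GLADE PF-OLA implementation: the 1-D logistic model, the normal-approximation confidence intervals, the chunk size, and the names `prune_configurations` and `make_data` are all assumptions made for illustration. Each chunk of randomly ordered training data is scanned once and shared by every surviving configuration, and a configuration is dropped as soon as the lower bound of its estimated loss exceeds the best upper bound.

```python
import math
import random


def logistic_loss(w, x, y):
    """Numerically stable logistic loss for a 1-D model; y is -1 or +1."""
    z = y * w * x
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def interval(n, total, total_sq, z_score):
    """Normal-approximation confidence interval for the mean loss."""
    mean = total / n
    var = max(total_sq / n - mean * mean, 0.0) * n / (n - 1)
    half = z_score * math.sqrt(var / n)
    return mean - half, mean + half


def prune_configurations(data, configs, chunk=200, z_score=2.576):
    """Return the configurations that survive pruning (assumes chunk >= 2).

    `data` must already be in random order so that every prefix is a sample.
    """
    stats = {w: [0, 0.0, 0.0] for w in configs}  # count, sum, sum of squares
    active = set(configs)
    for start in range(0, len(data), chunk):
        # One shared scan of the chunk updates all surviving configurations.
        for x, y in data[start:start + chunk]:
            for w in active:
                loss = logistic_loss(w, x, y)
                s = stats[w]
                s[0] += 1
                s[1] += loss
                s[2] += loss * loss
        bounds = {w: interval(*stats[w], z_score) for w in active}
        best_upper = min(hi for _, hi in bounds.values())
        # Halting condition: drop configurations that are confidently worse.
        active = {w for w in active if bounds[w][0] <= best_upper}
        if len(active) == 1:
            break
    return active


def make_data(n, true_w, seed):
    """Synthetic (x, y) pairs drawn from a 1-D logistic model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-true_w * x))
        data.append((x, 1 if rng.random() < p else -1))
    return data


if __name__ == "__main__":
    sample = make_data(5000, 2.0, seed=0)
    print(prune_configurations(sample, [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]))
```

Comparing independent intervals is conservative here, because all configurations are evaluated on the same examples; a production estimator would also need to account for repeated testing across chunks.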
Early Accurate Results for Advanced Analytics on MapReduce
Approximate results based on samples often provide the only way in which
advanced analytical applications on very massive data sets can satisfy their
time and resource constraints. Unfortunately, methods and tools for the
computation of accurate early results are currently not supported in
MapReduce-oriented systems, although these are intended for 'big data'.
Therefore, we proposed and implemented a non-parametric extension of Hadoop
which allows the incremental computation of early results for arbitrary
work-flows, along with reliable on-line estimates of the degree of accuracy
achieved so far in the computation. These estimates are based on a technique
called bootstrapping that has been widely employed in statistics and can be
applied to arbitrary functions and data distributions. In this paper, we
describe our Early Accurate Result Library (EARL) for Hadoop that was designed
to minimize the changes required to the MapReduce framework. Various tests of
EARL for Hadoop are presented to characterize the frequent situations where EARL
can provide major speed-ups over the current version of Hadoop.
Comment: VLDB201
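The bootstrap-based accuracy estimate described in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the EARL library or any Hadoop API: the function names `bootstrap_cv` and `early_result`, the doubling schedule, and the use of the coefficient of variation as the accuracy measure are all hypothetical choices. The sample grows until the bootstrap estimate of the relative error falls below a target.

```python
import random
import statistics


def bootstrap_cv(sample, stat, n_boot, rng):
    """Coefficient of variation of `stat` across bootstrap resamples.

    Assumes the statistic is not close to zero, since the spread is
    divided by the mean of the resampled estimates.
    """
    n = len(sample)
    estimates = [stat(rng.choices(sample, k=n)) for _ in range(n_boot)]
    return statistics.stdev(estimates) / abs(statistics.fmean(estimates))


def early_result(data, stat, target_cv=0.01, start=100, n_boot=200, seed=0):
    """Return (estimate, cv, sample_size) once the estimate is accurate enough."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)  # every prefix is now a uniform random sample
    size = min(start, len(shuffled))
    while True:
        sample = shuffled[:size]
        cv = bootstrap_cv(sample, stat, n_boot, rng)
        if cv <= target_cv or size == len(shuffled):
            return stat(sample), cv, size
        size = min(2 * size, len(shuffled))  # not accurate enough: grow sample


if __name__ == "__main__":
    source = random.Random(1)
    values = [source.gauss(50.0, 10.0) for _ in range(20000)]
    print(early_result(values, statistics.fmean))
```

Because the bootstrap only needs to re-evaluate the statistic on resampled data, the same loop works for a median or any other function passed as `stat`, which is the property the abstract highlights.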
Positioning of High-speed Trains using 5G New Radio Synchronization Signals
We study positioning of high-speed trains in 5G new radio (NR) networks by
utilizing specific NR synchronization signals. The studies are based on
simulations with 3GPP-specified radio channel models including path loss,
shadowing and fast fading effects. The considered positioning approach exploits
measurement of Time-Of-Arrival (TOA) and Angle-Of-Departure (AOD), which are
estimated from beamformed NR synchronization signals. Based on the given
measurements and the assumed train movement model, the train position is
tracked by using an Extended Kalman Filter (EKF), which is able to handle the
non-linear relationship between the TOA and AOD measurements, and the estimated
train position parameters. In the considered scenario, the TOA measurements
yield better accuracy than the AOD measurements. However, the results show that
the best tracking performance is achieved when both measurements are used. In
this case, a very high, sub-meter tracking accuracy can be achieved for most
(>75%) of the
tracking time, thus achieving the positioning accuracy requirements envisioned
for the 5G NR. Such high-accuracy, high-availability positioning is expected to
play a key role in several envisioned high-speed train (HST) use cases, such as
mission-critical autonomous train systems.
Comment: 6 pages, 5 figures, IEEE WCNC 2018 (Wireless Communications and
Networking Conference)
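To make the tracking step in the abstract above concrete, here is a minimal EKF sketch that fuses TOA and AOD measurements. It is not the paper's simulator, and everything specific in it is an assumption: the train moves along a straight 1-D track, a single base station sits at a known offset `bs`, the clocks are perfectly synchronized, the noise levels are invented, and the two measurements are applied as sequential scalar updates (valid when their noises are uncorrelated) to avoid matrix inversion.

```python
import math
import random

C = 299_792_458.0  # speed of light in m/s


def predict(x, P, dt, q):
    """Constant-velocity prediction for state x = [position, velocity]."""
    x_pred = [x[0] + x[1] * dt, x[1]]
    p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q * dt ** 3 / 3
    p01 = P[0][1] + dt * P[1][1] + q * dt ** 2 / 2
    p10 = P[1][0] + dt * P[1][1] + q * dt ** 2 / 2
    p11 = P[1][1] + q * dt
    return x_pred, [[p00, p01], [p10, p11]]


def scalar_update(x, P, residual, H, r):
    """Kalman update for one scalar measurement with Jacobian row H."""
    PHt = [P[0][0] * H[0] + P[0][1] * H[1], P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + r
    K = [PHt[0] / S, PHt[1] / S]
    HP = [H[0] * P[0][0] + H[1] * P[1][0], H[0] * P[0][1] + H[1] * P[1][1]]
    x_new = [x[0] + K[0] * residual, x[1] + K[1] * residual]
    P_new = [[P[i][j] - K[i] * HP[j] for j in range(2)] for i in range(2)]
    return x_new, P_new


def ekf_step(x, P, toa, aod, bs, dt, q, sigma_toa, sigma_aod):
    """One predict/update cycle using a TOA (s) and an AOD (rad) measurement."""
    x, P = predict(x, P, dt, q)
    # TOA update, expressed as a range in meters: h(p) = sqrt((p-bx)^2 + by^2)
    dx = x[0] - bs[0]
    d = math.hypot(dx, bs[1])
    x, P = scalar_update(x, P, toa * C - d, [dx / d, 0.0], (sigma_toa * C) ** 2)
    # AOD update: h(p) = atan2(-by, p - bx), with dh/dp = by / d^2
    dx = x[0] - bs[0]
    d2 = dx * dx + bs[1] ** 2
    err = aod - math.atan2(-bs[1], dx)
    err = math.atan2(math.sin(err), math.cos(err))  # wrap to (-pi, pi]
    x, P = scalar_update(x, P, err, [bs[1] / d2, 0.0], sigma_aod ** 2)
    return x, P


def simulate(steps=100, dt=0.1, seed=0):
    """Track a train at 100 m/s and return the final absolute position error."""
    rng = random.Random(seed)
    bs = (500.0, 50.0)             # hypothetical base station, 50 m off the track
    sigma_toa = 1.0 / C            # about 3.3 ns, i.e. 1 m of range error
    sigma_aod = math.radians(1.0)  # assumed 1 degree of angle noise
    true_p, true_v = 0.0, 100.0
    x, P = [10.0, 90.0], [[100.0, 0.0], [0.0, 100.0]]  # deliberately wrong start
    for _ in range(steps):
        true_p += true_v * dt
        toa = math.hypot(true_p - bs[0], bs[1]) / C + rng.gauss(0.0, sigma_toa)
        aod = math.atan2(-bs[1], true_p - bs[0]) + rng.gauss(0.0, sigma_aod)
        x, P = ekf_step(x, P, toa, aod, bs, dt, 0.1, sigma_toa, sigma_aod)
    return abs(x[0] - true_p)


if __name__ == "__main__":
    print(simulate())
```

The two measurements complement each other in this geometry: when the train passes the base station the range barely changes with position, so the angle carries most of the information, while far from the base station the opposite holds.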