    Speculative Approximations for Terascale Analytics

    Model calibration is a major challenge faced by the plethora of statistical analytics packages that are increasingly used in Big Data applications. Identifying the optimal model parameters is a time-consuming process that has to be executed from scratch for every dataset/model combination even by experienced data scientists. We argue that the incapacity to evaluate multiple parameter configurations simultaneously and the lack of support to quickly identify sub-optimal configurations are the principal causes. In this paper, we develop two database-inspired techniques for efficient model calibration. Speculative parameter testing applies advanced parallel multi-query processing methods to evaluate several configurations concurrently. The number of configurations is determined adaptively at runtime, while the configurations themselves are extracted from a distribution that is continuously learned following a Bayesian process. Online aggregation is applied to identify sub-optimal configurations early in the processing by incrementally sampling the training dataset and estimating the objective function corresponding to each configuration. We design concurrent online aggregation estimators and define halting conditions to accurately and timely stop the execution. We apply the proposed techniques to distributed gradient descent optimization -- batch and incremental -- for support vector machines and logistic regression models. We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big Data analytics system -- and evaluate their performance over terascale-size synthetic and real datasets. The results confirm that as many as 32 configurations can be evaluated concurrently almost as fast as one, while sub-optimal configurations are detected accurately in as little as a 1/20th fraction of the time.
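
The early-halting idea in this abstract can be illustrated with a minimal single-process sketch. It is not the GLADE PF-OLA implementation: the function name `prune_configurations`, the normal-approximation confidence intervals, and the batch size are all assumptions made for illustration. Every surviving configuration is scored on the same incrementally drawn sample, and a configuration is dropped once its estimated mean loss is confidently worse than that of the current best one.

```python
import math
import random


def prune_configurations(data, configs, loss, batch=200, z=1.96, seed=0):
    """Return (surviving config names, number of samples used).

    Hypothetical helper: `configs` maps a name to a parameter value and
    `loss(param, item)` returns the per-item loss. Intervals use a plain
    normal approximation, which is an assumption of this sketch.
    """
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)            # random sampling order
    stats = {name: [0, 0.0, 0.0] for name in configs}  # count, sum, sum of squares
    alive = set(configs)
    used = 0
    for start in range(0, len(order), batch):
        chunk = order[start:start + batch]
        used += len(chunk)
        # Score every surviving configuration on the same new samples.
        for i in chunk:
            for name in alive:
                value = loss(configs[name], data[i])
                s = stats[name]
                s[0] += 1
                s[1] += value
                s[2] += value * value
        # Confidence interval on the mean loss of each surviving configuration.
        bounds = {}
        for name in alive:
            n, total, squares = stats[name]
            mean = total / n
            variance = max(squares / n - mean * mean, 0.0)
            half = z * math.sqrt(variance / n)
            bounds[name] = (mean - half, mean + half)
        # Halting condition: drop any configuration whose lower bound
        # already exceeds the smallest upper bound.
        best_upper = min(upper for _, upper in bounds.values())
        alive = {name for name in alive if bounds[name][0] <= best_upper}
        if len(alive) == 1:
            break
    return alive, used


def squared_error(w, item):
    """Loss of the one-parameter model y = w * x on a single (x, y) pair."""
    x, y = item
    return (w * x - y) ** 2
```

On synthetic data generated from y = 2x plus small noise, candidates far from w = 2 are typically discarded after the first batch, so only a small fraction of the data is read before a single configuration remains.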

    Early Accurate Results for Advanced Analytics on MapReduce

    Approximate results based on samples often provide the only way in which advanced analytical applications on very massive data sets can satisfy their time and resource constraints. Unfortunately, methods and tools for the computation of accurate early results are currently not supported in MapReduce-oriented systems although these are intended for 'big data'. Therefore, we proposed and implemented a non-parametric extension of Hadoop which allows the incremental computation of early results for arbitrary work-flows, along with reliable on-line estimates of the degree of accuracy achieved so far in the computation. These estimates are based on a technique called bootstrapping that has been widely employed in statistics and can be applied to arbitrary functions and data distributions. In this paper, we describe our Early Accurate Result Library (EARL) for Hadoop that was designed to minimize the changes required to the MapReduce framework. Various tests of EARL of Hadoop are presented to characterize the frequent situations where EARL can provide major speed-ups over the current version of Hadoop. Comment: VLDB201
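
The bootstrap-based accuracy estimate described in this abstract can be sketched in a few lines of plain Python. This is an illustration only and does not reflect EARL's actual API or its Hadoop integration: the names `bootstrap_error` and `early_result`, the doubling schedule, and the relative-error stopping rule are assumptions. The sample grows until the bootstrap estimate of the relative standard error falls below a target.

```python
import random
import statistics


def bootstrap_error(sample, estimator, resamples=200, seed=0):
    """Return (point estimate, relative standard error) for `estimator`.

    The error is the standard deviation of the estimator over resamples
    drawn with replacement, divided by the magnitude of the point estimate.
    """
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(resamples):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    point = estimator(sample)
    spread = statistics.stdev(estimates)
    relative = spread / abs(point) if point != 0 else float("inf")
    return point, relative


def early_result(data, estimator, target_error=0.01, start=100, seed=0):
    """Grow a random sample until the bootstrap error meets the target.

    Returns (estimate, relative error, sample size used). Hypothetical
    driver loop: the sample size doubles after every failed check.
    """
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    size = start
    while True:
        size = min(size, len(shuffled))
        point, error = bootstrap_error(shuffled[:size], estimator, seed=seed)
        if error <= target_error or size == len(shuffled):
            return point, error, size
        size *= 2
```

Because the resampling step only calls `estimator` on lists of items, the same loop works for a mean, a median, or any other function of the sample, which is the property the abstract attributes to bootstrapping.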

    Positioning of High-speed Trains using 5G New Radio Synchronization Signals

    We study positioning of high-speed trains in 5G new radio (NR) networks by utilizing specific NR synchronization signals. The studies are based on simulations with 3GPP-specified radio channel models including path loss, shadowing and fast fading effects. The considered positioning approach exploits measurement of Time-Of-Arrival (TOA) and Angle-Of-Departure (AOD), which are estimated from beamformed NR synchronization signals. Based on the given measurements and the assumed train movement model, the train position is tracked by using an Extended Kalman Filter (EKF), which is able to handle the non-linear relationship between the TOA and AOD measurements, and the estimated train position parameters. It is shown that in the considered scenario the TOA measurements are able to achieve better accuracy compared to the AOD measurements. However, as shown by the results, the best tracking performance is achieved when both of the measurements are considered. In this case, a very high, sub-meter, tracking accuracy can be achieved for most (>75%) of the tracking time, thus achieving the positioning accuracy requirements envisioned for the 5G NR. The pursued high-accuracy and high-availability positioning technology is considered to play a key role in several envisioned HST use cases, such as mission-critical autonomous train systems. Comment: 6 pages, 5 figures, IEEE WCNC 2018 (Wireless Communications and Networking Conference)
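
The EKF tracking step described in this abstract can be illustrated with a heavily simplified sketch. Everything about the setup is an assumption and not the paper's simulation: the train moves along a straight one-dimensional track, a single base station sits at a known offset `bs_d` from the track, the TOA is already converted to a range in metres, the motion model is constant velocity, and the two measurements are processed as sequential scalar updates. All function names are hypothetical.

```python
import math
import random


def predict(x, P, dt, q):
    """Constant-velocity prediction for state x = [position, velocity]."""
    x = [x[0] + x[1] * dt, x[1]]
    # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
    p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1]
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1]
    return x, [[p00 + q * dt ** 3 / 3, p01 + q * dt ** 2 / 2],
               [p10 + q * dt ** 2 / 2, p11 + q * dt]]


def scalar_update(x, P, innovation, h0, r_var):
    """EKF update for one scalar measurement with Jacobian H = [h0, 0]."""
    s = h0 * h0 * P[0][0] + r_var                 # innovation variance
    k0, k1 = P[0][0] * h0 / s, P[1][0] * h0 / s   # Kalman gain
    x = [x[0] + k0 * innovation, x[1] + k1 * innovation]
    a, b = k0 * h0, k1 * h0                       # P <- (I - K H) P
    P = [[(1 - a) * P[0][0], (1 - a) * P[0][1]],
         [P[1][0] - b * P[0][0], P[1][1] - b * P[0][1]]]
    return x, P


def track(measurements, bs_x, bs_d, dt, x0, P0, sig_r, sig_a, q=0.1):
    """Run the filter over (range, angle) pairs and return the final state."""
    x, P = list(x0), [row[:] for row in P0]
    for rng_m, aod in measurements:
        x, P = predict(x, P, dt, q)
        # Range (TOA times the speed of light): r = sqrt(dx^2 + d^2)
        dx = x[0] - bs_x
        r = math.hypot(dx, bs_d)
        x, P = scalar_update(x, P, rng_m - r, dx / r, sig_r ** 2)
        # Angle of departure, relinearised at the updated position
        dx = x[0] - bs_x
        err = aod - math.atan2(-bs_d, dx)
        err = (err + math.pi) % (2 * math.pi) - math.pi   # wrap to [-pi, pi)
        x, P = scalar_update(x, P, err, bs_d / (dx * dx + bs_d * bs_d),
                             sig_a ** 2)
    return x


def simulate(p0, v, steps, dt, bs_x, bs_d, sig_r, sig_a, seed=1):
    """Generate noisy (range, angle) pairs for a train at constant speed."""
    rng = random.Random(seed)
    p, out = p0, []
    for _ in range(steps):
        p += v * dt
        dx = p - bs_x
        out.append((math.hypot(dx, bs_d) + rng.gauss(0.0, sig_r),
                    math.atan2(-bs_d, dx) + rng.gauss(0.0, sig_a)))
    return out, p
```

In this geometry the range Jacobian vanishes when the train passes the base station, while the angle Jacobian is largest there, so the two measurements complement each other, which is in line with the abstract's observation that combining them gives the best tracking performance.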