
    Near-Optimal Straggler Mitigation for Distributed Gradient Methods

    Modern learning algorithms use gradient descent updates to train inferential models that best explain data. Scaling these approaches to massive data sizes requires distributed gradient descent schemes in which worker nodes compute partial gradients on their local data sets and send the results to a master node, which aggregates them into a full gradient and updates the learning model. A major performance bottleneck is that some worker nodes may run slowly. These nodes, known as stragglers, can significantly slow down computation because the slowest node may dictate the overall computation time. We propose a distributed computing scheme, called Batched Coupon's Collector (BCC), to alleviate the effect of stragglers in gradient methods. We prove that our BCC scheme is robust to a near-optimal number of random stragglers. We also empirically demonstrate that BCC reduces the run-time by up to 85.4% over Amazon EC2 clusters when compared with other straggler mitigation strategies. Finally, we generalize BCC to minimize the completion time when implementing gradient descent-based algorithms over heterogeneous worker nodes.
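The coupon-collector idea suggested by the scheme's name can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: it assumes the data is split into batches, each worker independently picks one batch at random, and the master stops waiting as soon as every batch has been reported at least once. The function name `bcc_round`, its parameters, and the scalar "gradients" are all hypothetical.

```python
import random


def bcc_round(num_batches, num_workers, batch_gradients, rng):
    """Simulate one aggregation round (hypothetical sketch).

    Returns (full_gradient, workers_waited_for). full_gradient is None
    when some batch was never picked by any worker.
    """
    # Assumption: each worker independently picks one batch uniformly at random.
    choices = [rng.randrange(num_batches) for _ in range(num_workers)]

    # A random arrival order stands in for unpredictable worker speeds;
    # workers late in this order play the role of stragglers.
    arrival_order = list(range(num_workers))
    rng.shuffle(arrival_order)

    received = {}  # batch index -> partial gradient, first arrival wins
    waited = 0
    for worker in arrival_order:
        waited += 1
        batch = choices[worker]
        if batch not in received:
            received[batch] = batch_gradients[batch]
        if len(received) == num_batches:
            # Every batch is covered, so the remaining workers can be ignored.
            return sum(received.values()), waited
    return None, waited


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [1.0, 2.0, 3.0, 4.0]  # toy scalar partial gradients
    full, waited = bcc_round(len(grads), 50, grads, rng)
    print(full, waited)
```

In this toy model the master typically needs results from only a fraction of the workers, which is why slow nodes late in the arrival order do not delay the update.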

    Direct numerical simulation of turbulence on a Connection Machine CM-5

    In this paper we report on our first experiences with direct numerical simulation of turbulent flow on a 16-node Connection Machine CM-5. The CM-5 has been programmed at a global level using data parallel Fortran. A two-dimensional direct simulation, where the pressure is solved using a Conjugate Gradient method without preconditioning, runs at 23% of the peak. Due to higher communication costs, 3D simulations run at 13% of the peak. A diagonalwise re-ordered Incomplete Choleski Conjugate Gradient method cannot compete with a standard CG-method on the CM-5.
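For readers unfamiliar with the solver named in the abstract, the following is a minimal sketch of the standard Conjugate Gradient iteration without preconditioning, for a symmetric positive definite system. It is written in plain Python rather than the data parallel Fortran used in the paper, and the function name, tolerance, and dense list-of-lists matrix are illustrative assumptions.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (unpreconditioned CG)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)  # first search direction is the residual
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break  # residual norm is small enough
        # Matrix-vector product A p
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        # New direction is conjugate to the previous ones with respect to A
        beta = rs_new / rs_old
        p = [r[i] + beta * p[i] for i in range(n)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))
```

Each iteration needs one matrix-vector product and a few inner products. On a distributed machine the inner products are global reductions, which is one place communication costs can enter.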

    Optical boundaries for LED-based indoor positioning system

    Overlapping the footprints of light emitting diodes (LEDs) increases the positioning accuracy of wearable LED indoor positioning systems (IPS), but such an approach assumes that the footprint boundaries are defined. In this work, we develop a mathematical model for defining the footprint boundaries of an LED in terms of a threshold angle instead of the conventional half or full angle. To show the effect of the threshold angle, we compare how overlaps and receiver tilts affect the performance of an LED-based IPS when the optical boundary is defined at the threshold angle and at the full angle. Using experimental measurements, simulations, and theoretical analysis, the effect of the defined threshold angle is estimated. The results show that the positioning time when using the newly defined threshold angle is 12 times shorter than the time when the full angle is used. When the effect of tilt is considered, the threshold angle positioning time is 22 times shorter than the full angle positioning time. Regarding accuracy, it is shown in this work that a positioning error as low as 230 mm can be obtained. Consequently, while the IPS gives a very low positioning error, a defined threshold angle reduces delays in an overlap-based LED IPS.
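The abstract does not give the threshold-angle model itself, so the sketch below only illustrates the basic geometry of an angle-defined footprint. It assumes a downward-pointing LED at a known height above a flat receiver plane, with the boundary angle measured from the optical axis to the footprint edge. The function names and numeric values are hypothetical.

```python
import math


def footprint_radius(height_m, edge_angle_deg):
    """Radius of the circular footprint on the floor (assumed geometry)."""
    return height_m * math.tan(math.radians(edge_angle_deg))


def footprints_overlap(spacing_m, height_m, edge_angle_deg):
    """True if two identical LEDs at the same height have overlapping footprints."""
    return spacing_m < 2.0 * footprint_radius(height_m, edge_angle_deg)


if __name__ == "__main__":
    # Illustrative values only: a smaller boundary angle shrinks the footprint,
    # which changes whether neighbouring LEDs overlap.
    for angle in (20.0, 45.0):
        print(angle, footprint_radius(2.0, angle), footprints_overlap(3.0, 2.0, angle))
```

Under these assumptions, the boundary angle directly sets the footprint radius, so it also determines which regions count as overlaps between neighbouring LEDs.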