2,924 research outputs found

    Building Wavelet Histograms on Large Data in MapReduce

    MapReduce is becoming the de facto framework for storing and processing massive data, due to its excellent scalability, reliability, and elasticity. In many MapReduce applications, obtaining a compact, accurate summary of the data is essential. Among the various data summarization tools, histograms have proven to be particularly important and useful, and the wavelet histogram is one of the most widely used. In this paper, we investigate the problem of building wavelet histograms efficiently on large datasets in MapReduce. We measure the efficiency of the algorithms by both end-to-end running time and communication cost. We demonstrate that straightforward adaptations of existing exact and approximate methods for building wavelet histograms to MapReduce clusters are highly inefficient. To address this, we design new algorithms for computing exact and approximate wavelet histograms and discuss their implementation in MapReduce. We implement our techniques in Hadoop and compare them with baseline solutions in extensive experiments on a heterogeneous Hadoop cluster of 16 nodes, using large real and synthetic datasets of up to hundreds of gigabytes. The results suggest that our new algorithms achieve significant (often orders of magnitude) performance improvements. Comment: VLDB201
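For readers unfamiliar with the term, a wavelet histogram is commonly defined as the k largest-magnitude Haar wavelet coefficients of a frequency vector. The sketch below illustrates that definition on a single machine; it is not the paper's MapReduce algorithm, and the function names `haar_transform` and `top_k` are hypothetical. It assumes the input length is a power of two and uses the orthonormal Haar basis.

```python
import math

def haar_transform(freq):
    """Orthonormal Haar wavelet coefficients of a frequency vector.

    Assumption: len(freq) is a power of two. coeffs[0] holds the overall
    scaled average; detail coefficients follow from coarse to fine.
    """
    n = len(freq)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [0.0] * n
    current = [float(v) for v in freq]
    while len(current) > 1:
        half = len(current) // 2
        # Pairwise scaled sums (averages) and differences (details).
        averages = [(current[2 * i] + current[2 * i + 1]) / math.sqrt(2)
                    for i in range(half)]
        details = [(current[2 * i] - current[2 * i + 1]) / math.sqrt(2)
                   for i in range(half)]
        coeffs[half:2 * half] = details
        current = averages
    coeffs[0] = current[0]
    return coeffs

def top_k(coeffs, k):
    """Keep the k coefficients of largest magnitude as {index: value}."""
    keep = sorted(range(len(coeffs)),
                  key=lambda i: abs(coeffs[i]), reverse=True)[:k]
    return {i: coeffs[i] for i in keep}

if __name__ == "__main__":
    frequencies = [2, 2, 0, 2, 3, 5, 4, 4]
    print(top_k(haar_transform(frequencies), 3))
```

Because the basis is orthonormal, the transform preserves the sum of squares, so keeping the largest-magnitude coefficients minimizes the squared reconstruction error for a given k.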

    Chebyshev Polynomial Approximation for Distributed Signal Processing

    Unions of graph Fourier multipliers are an important class of linear operators for processing signals defined on graphs. We present a novel method to efficiently distribute the application of these operators to the high-dimensional signals collected by sensor networks. The proposed method approximates the graph Fourier multipliers by shifted Chebyshev polynomials, whose recurrence relations make them readily amenable to distributed computation. We demonstrate how the proposed method can be used in a distributed denoising task, and show that the communication requirements of the method scale gracefully with the size of the network. Comment: 8 pages, 5 figures, to appear in the Proceedings of the IEEE International Conference on Distributed Computing in Sensor Systems (DCOSS), June 2011, Barcelona, Spain
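The role of the recurrence can be illustrated with a small centralized sketch, given here as an assumption-laden example rather than the authors' implementation. It approximates g(L)f for a graph Laplacian L using Chebyshev polynomials shifted to [0, lam_max], via the standard three-term recurrence T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x). The names `cheb_coeffs`, `matvec`, and `cheb_apply`, the dense list-of-lists Laplacian, and the quadrature size are all hypothetical choices.

```python
import math

def cheb_coeffs(g, lam_max, order, n_quad=200):
    """Chebyshev coefficients of g on [0, lam_max] (Gauss-Chebyshev quadrature)."""
    alpha = lam_max / 2.0
    coeffs = []
    for k in range(order + 1):
        total = 0.0
        for j in range(n_quad):
            theta = math.pi * (j + 0.5) / n_quad
            total += g(alpha * (math.cos(theta) + 1.0)) * math.cos(k * theta)
        coeffs.append(2.0 * total / n_quad)
    return coeffs

def matvec(L, x):
    """Dense matrix-vector product; row i only needs x at neighbors of node i."""
    return [sum(a * b for a, b in zip(row, x)) for row in L]

def cheb_apply(L, f, g, lam_max, order):
    """Approximate g(L) f with a shifted Chebyshev expansion (order >= 1)."""
    alpha = lam_max / 2.0
    c = cheb_coeffs(g, lam_max, order)
    t_prev = list(f)  # shifted T_0(L) f = f
    # Shifted T_1(L) f = (L - alpha I) f / alpha
    t_curr = [(lf - alpha * fi) / alpha for lf, fi in zip(matvec(L, f), f)]
    out = [0.5 * c[0] * a + c[1] * b for a, b in zip(t_prev, t_curr)]
    for k in range(2, order + 1):
        lt = matvec(L, t_curr)
        # Three-term recurrence: one Laplacian product per polynomial order.
        t_next = [2.0 * (l - alpha * tc) / alpha - tp
                  for l, tc, tp in zip(lt, t_curr, t_prev)]
        out = [o + c[k] * t for o, t in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out

if __name__ == "__main__":
    # Path graph on three nodes; its largest Laplacian eigenvalue is 3.
    L = [[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]
    smooth = cheb_apply(L, [1.0, 0.0, 2.0], lambda x: 1.0 / (1.0 + x), 3.0, 20)
    print(smooth)
```

Each pass through the loop multiplies by L once, and row i of L is nonzero only at node i and its neighbors, so in a sensor network each step would require only an exchange of values between neighboring nodes.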

    The Parabolic variance (PVAR), a wavelet variance based on least-square fit

    This article introduces the Parabolic Variance (PVAR), a wavelet variance similar to the Allan variance, based on the Linear Regression (LR) of phase data. The companion article arXiv:1506.05009 [physics.ins-det] details the $\Omega$ frequency counter, which implements the LR estimate. The PVAR combines the advantages of AVAR and MVAR. PVAR is good for long-term analysis because the wavelet spans $2\tau$, the same as the AVAR wavelet; and good for short-term analysis because the response to white and flicker PM is $1/\tau^3$ and $1/\tau^2$, the same as the MVAR. After setting the theoretical framework, we study the degrees of freedom and the confidence interval for the most common noise types. Then, we focus on the detection of a weak noise process at the transition - or corner - where a faster process rolls off. This new perspective raises the question of which variance detects the weak process with the shortest data record. Our simulations show that PVAR is a fortunate tradeoff. PVAR is superior to MVAR in all cases, exhibits the best ability to divide between fast noise phenomena (up to flicker FM), and is almost as good as AVAR for the detection of random walk and drift.
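As a point of reference for the comparison with AVAR, the following is a minimal sketch of the classical overlapping Allan variance computed from phase (time-error) samples. It is not the PVAR estimator, whose definition is given in the article itself; the function name `allan_variance` and its parameters are hypothetical, and evenly spaced samples are assumed.

```python
def allan_variance(phase, tau0, m):
    """Overlapping Allan variance at tau = m * tau0 from phase samples.

    phase: time-error samples x_i in seconds, spaced tau0 seconds apart.
    Uses the second difference x[i+2m] - 2 x[i+m] + x[i], whose support
    spans 2 * tau.
    """
    terms = len(phase) - 2 * m
    if m < 1 or terms < 1:
        raise ValueError("need m >= 1 and more than 2*m phase samples")
    tau = m * tau0
    total = sum((phase[i + 2 * m] - 2.0 * phase[i + m] + phase[i]) ** 2
                for i in range(terms))
    return total / (2.0 * tau ** 2 * terms)

if __name__ == "__main__":
    # Pure linear frequency drift D gives phase x(t) = D t^2 / 2.
    drift = 2e-3
    samples = [0.5 * drift * (i * 1.0) ** 2 for i in range(50)]
    print(allan_variance(samples, 1.0, 4))
```

For a pure linear frequency drift D, the second difference equals D times tau squared, so this estimator returns D squared times tau squared over 2, which gives a quick sanity check.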

    On the application of raised-cosine wavelets for multicarrier systems design

    New orthogonal wavelet transforms can be designed by changing the wavelet basis functions or by constructing new low-pass filters (LPF). One family of wavelets may suit a particular application better than another. In this study, the wavelet transform based on the raised-cosine spectrum is used as an independent orthogonal wavelet to study the behaviour of multicarrier modulation over a multipath channel environment. The raised-cosine wavelet is then compared with other well-known orthogonal wavelets that are also used to build multicarrier modulation systems. Traditional orthogonal wavelets do not have side-lobes, while the raised-cosine wavelets have many side-lobes; these characteristics influence the wavelet behaviour. It will be shown that the raised-cosine wavelet transform, as an orthogonal wavelet, does not support the design of multicarrier applications as well as the existing well-known orthogonal wavelets.

    Discrete multitone modulation with principal component filter banks

    Discrete multitone (DMT) modulation is an attractive method for communication over a nonflat channel with possibly colored noise. The uniform discrete Fourier transform (DFT) filter bank and the cosine modulated filter bank have in the past been used in this system because of their low complexity. We show in this paper that principal component filter banks (PCFB), which are known to be optimal for data compression and denoising applications, are also optimal for a number of criteria in DMT modulation communication. For example, the PCFB of the effective channel noise power spectrum (noise PSD weighted by the inverse of the channel gain) is optimal for DMT modulation in the sense of maximizing bit rate for fixed power and error probabilities. We also establish an optimality property of the PCFB when scalar prefilters and postfilters are used around the channel. The difference between the PCFB and a traditional filter bank such as the brickwall filter bank or DFT filter bank is significant for effective power spectra which depart considerably from monotonicity. The twisted pair channel, with its bridged taps, NEXT and FEXT noises, and AM interference, therefore appears to be a good candidate for the application of a PCFB. This is demonstrated with the help of numerical results for the case of the ADSL channel.