Building Wavelet Histograms on Large Data in MapReduce
MapReduce is becoming the de facto framework for storing and processing
massive data, due to its excellent scalability, reliability, and elasticity. In
many MapReduce applications, obtaining a compact accurate summary of data is
essential. Among various data summarization tools, histograms have proven to be
particularly important and useful for summarizing data, and the wavelet
histogram is one of the most widely used histograms. In this paper, we
investigate the problem of building wavelet histograms efficiently on large
datasets in MapReduce. We measure the efficiency of the algorithms by both
end-to-end running time and communication cost. We demonstrate that straightforward
adaptations of existing exact and approximate methods for building wavelet
histograms to MapReduce clusters are highly inefficient. To address this, we design
new algorithms for computing exact and approximate wavelet histograms and
discuss their implementation in MapReduce. We illustrate our techniques in
Hadoop, and compare to baseline solutions with extensive experiments performed
in a heterogeneous Hadoop cluster of 16 nodes, using large real and synthetic
datasets, up to hundreds of gigabytes. The results suggest significant (often
orders of magnitude) performance improvement achieved by our new algorithms.
Comment: VLDB201
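As a point of reference for what a wavelet histogram is, the following is a minimal single-machine sketch, not the MapReduce algorithms of the paper. It assumes the common formulation in which the histogram keeps the k largest-magnitude coefficients of an orthonormal Haar transform of a frequency vector. The function names (`haar_transform`, `top_k`, `inverse_haar`) are hypothetical and chosen only for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_transform(freq):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    n = len(freq)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(v) for v in freq]
    length = n
    while length > 1:
        half = length // 2
        # Pairwise scaled sums (coarser signal) and differences (detail).
        sums = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        diffs = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:length] = sums + diffs
        length = half
    return coeffs

def top_k(coeffs, k):
    """Keep the k largest-magnitude coefficients as an {index: value} map."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return {i: coeffs[i] for i in order[:k]}

def inverse_haar(sparse, n):
    """Rebuild an approximate frequency vector from the retained coefficients."""
    coeffs = [0.0] * n
    for i, c in sparse.items():
        coeffs[i] = c
    length = 2
    while length <= n:
        half = length // 2
        out = []
        for i in range(half):
            s, d = coeffs[i], coeffs[half + i]
            out += [(s + d) / SQRT2, (s - d) / SQRT2]
        coeffs[:length] = out
        length *= 2
    return coeffs
```

Because the transform is orthonormal, dropping the small coefficients gives the smallest possible squared reconstruction error for a given k. A step-shaped frequency vector such as `[4, 4, 4, 4, 0, 0, 0, 0]` is recovered exactly from only two coefficients.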
Chebyshev Polynomial Approximation for Distributed Signal Processing
Unions of graph Fourier multipliers are an important class of linear
operators for processing signals defined on graphs. We present a novel method
to efficiently distribute the application of these operators to the
high-dimensional signals collected by sensor networks. The proposed method
features approximations of the graph Fourier multipliers by shifted Chebyshev
polynomials, whose recurrence relations make them readily amenable to
distributed computation. We demonstrate how the proposed method can be used in
a distributed denoising task, and show that the communication requirements of
the method scale gracefully with the size of the network.
Comment: 8 pages, 5 figures, to appear in the Proceedings of the IEEE
International Conference on Distributed Computing in Sensor Systems (DCOSS),
June, 2011, Barcelona, Spain
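The recurrence idea in the abstract above can be illustrated with a small centralized sketch. It rests on assumptions that the abstract does not state: the operator is a single multiplier g applied to a graph Laplacian L whose spectrum lies in [0, lam_max], the coefficients are computed by Chebyshev-Gauss quadrature, and the function names (`cheb_coeffs`, `cheb_apply`) are hypothetical. Each product with L touches only neighbouring nodes, which is what would make the scheme suitable for distribution.

```python
import math

def cheb_coeffs(g, lam_max, order, n_quad=200):
    """Coefficients of g on [0, lam_max] in shifted Chebyshev polynomials."""
    alpha = lam_max / 2.0
    thetas = [math.pi * (j + 0.5) / n_quad for j in range(n_quad)]
    samples = [g(alpha * (math.cos(t) + 1.0)) for t in thetas]
    return [2.0 / n_quad * sum(s * math.cos(k * t) for s, t in zip(samples, thetas))
            for k in range(order + 1)]

def matvec(L, x):
    """Dense matrix-vector product; in a network this is one neighbour exchange."""
    return [sum(a * b for a, b in zip(row, x)) for row in L]

def cheb_apply(L, f, coeffs, lam_max):
    """Approximate g(L) f using only repeated products with L (needs order >= 1)."""
    if len(coeffs) < 2:
        raise ValueError("need at least two coefficients")
    alpha = lam_max / 2.0

    def shifted(x):
        # Applies (L - alpha*I) / alpha, which maps the spectrum into [-1, 1].
        Lx = matvec(L, x)
        return [(lx - alpha * xi) / alpha for lx, xi in zip(Lx, x)]

    t_prev = list(f)        # T_0 term
    t_curr = shifted(f)     # T_1 term
    out = [0.5 * coeffs[0] * a + coeffs[1] * b for a, b in zip(t_prev, t_curr)]
    for c in coeffs[2:]:
        s = shifted(t_curr)
        # Three-term recurrence: T_k = 2 * x * T_{k-1} - T_{k-2}.
        t_next = [2.0 * a - b for a, b in zip(s, t_prev)]
        out = [o + c * t for o, t in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out
```

For example, with g(λ) = 1/(1 + 2λ), a smoothing filter of the kind used for denoising, `cheb_apply` approximates the solution y of (I + 2L) y = f without forming a matrix inverse.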
The Parabolic variance (PVAR), a wavelet variance based on least-square fit
This article introduces the Parabolic Variance (PVAR), a wavelet variance
similar to the Allan variance, based on the Linear Regression (LR) of phase
data. The companion article arXiv:1506.05009 [physics.ins-det] details the
frequency counter, which implements the LR estimate.
The PVAR combines the advantages of AVAR and MVAR. PVAR is good for long-term
analysis because the wavelet spans over $2\tau$, the same as the AVAR wavelet;
and good for short-term analysis because the response to white and flicker PM
is $1/\tau^3$ and $1/\tau^2$, the same as the MVAR.
After setting the theoretical framework, we study the degrees of freedom and
the confidence interval for the most common noise types. Then, we focus on the
detection of a weak noise process at the transition - or corner - where a
faster process rolls off. This new perspective raises the question of which
variance detects the weak process with the shortest data record. Our
simulations show that PVAR is a fortunate tradeoff. PVAR is superior to MVAR in
all cases, exhibits the best ability to distinguish between fast noise phenomena (up
to flicker FM), and is almost as good as AVAR for the detection of random walk
and drift.
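To make the idea of a variance built on linear regression concrete, here is a rough sketch under stated assumptions, not the estimator of the article. It assumes that the variance is half the mean squared difference between least-squares frequency estimates (slopes of the phase data) taken over adjacent blocks, that the blocks do not overlap, and that a block of m samples spaced tau0 apart stands for an averaging time of about m * tau0. The names `lr_slope` and `block_pvar` are hypothetical.

```python
def lr_slope(t, x):
    """Least-squares slope of x against t (a frequency estimate from phase data)."""
    n = len(t)
    t_mean = sum(t) / n
    x_mean = sum(x) / n
    num = sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den

def block_pvar(phase, tau0, m):
    """Half the mean squared difference of slopes over adjacent m-sample blocks."""
    if m < 2:
        raise ValueError("need at least two samples per block")
    slopes = []
    for start in range(0, len(phase) - m + 1, m):
        t = [(start + i) * tau0 for i in range(m)]
        slopes.append(lr_slope(t, phase[start:start + m]))
    if len(slopes) < 2:
        raise ValueError("need at least two full blocks")
    sq_diffs = [(b - a) ** 2 for a, b in zip(slopes, slopes[1:])]
    return 0.5 * sum(sq_diffs) / len(sq_diffs)
```

In this sketch a constant frequency offset (linear phase) gives a result of zero, and a pure linear frequency drift D gives D² (m·tau0)² / 2, since the slopes of adjacent blocks differ by D·m·tau0.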
On the application of raised-cosine wavelets for multicarrier systems design
New orthogonal wavelet transforms can be designed by changing the wavelet basis functions or by constructing new low-pass filters (LPF). One family of wavelets may suit a particular application better than another. In this study, the wavelet transform based on the raised-cosine spectrum is used as an independent orthogonal wavelet to study multicarrier modulation behaviour over a multipath channel environment. The raised-cosine wavelet is then compared with other well-known orthogonal wavelets that are also used to build multicarrier modulation systems. Traditional orthogonal wavelets do not have side-lobes, while the raised-cosine wavelets have many side-lobes; these characteristics influence the wavelet behaviour. It will be shown that the raised-cosine wavelet transform, as an orthogonal wavelet, does not support the design of multicarrier applications as well as the existing well-known orthogonal wavelets.
Discrete multitone modulation with principal component filter banks
Discrete multitone (DMT) modulation is an attractive method for communication over a nonflat channel with possibly colored noise. The uniform discrete Fourier transform (DFT) filter bank and cosine modulated filter bank have in the past been used in this system because of their low complexity. We show in this paper that principal component filter banks (PCFB), which are known to be optimal for data compression and denoising applications, are also optimal for a number of criteria in DMT modulation communication. For example, the PCFB of the effective channel noise power spectrum (noise psd weighted by the inverse of the channel gain) is optimal for DMT modulation in the sense of maximizing bit rate for fixed power and error probabilities. We also establish an optimality property of the PCFB when scalar prefilters and postfilters are used around the channel. The difference between the PCFB and a traditional filter bank such as the brickwall filter bank or DFT filter bank is significant for effective power spectra which depart considerably from monotonicity. The twisted pair channel with its bridged taps, NEXT and FEXT noise, and AM interference therefore appears to be a good candidate for the application of a PCFB. This is demonstrated with the help of numerical results for the case of the ADSL channel.