55 research outputs found
Nonparametric Bayesian Modeling for Automated Database Schema Matching
The problem of merging databases arises in many government and commercial
applications. Schema matching, a common first step, identifies equivalent
fields between databases. We introduce a schema matching framework that builds
nonparametric Bayesian models for each field and compares them by computing the
probability that a single model could have generated both fields. Our
experiments show that our method is more accurate and faster than existing
instance-based matching algorithms, in part because of the use of nonparametric
Bayesian models.
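The abstract does not spell out the models, but the matching criterion it describes (the probability that a single model could have generated both fields) can be sketched with a much simpler stand-in. The snippet below scores a pair of columns with a Dirichlet-multinomial log Bayes factor, comparing "one shared distribution generated both columns" against "each column has its own distribution". This is an illustrative toy, not the authors' nonparametric framework; the function names, the toy data, and the prior parameter alpha are all assumptions.

```python
from math import lgamma
from collections import Counter

def log_marginal(counts, categories, alpha=1.0):
    """Dirichlet-multinomial log marginal likelihood of observed category counts."""
    K = len(categories)
    N = sum(counts.get(c, 0) for c in categories)
    out = lgamma(K * alpha) - lgamma(K * alpha + N)
    for c in categories:
        out += lgamma(alpha + counts.get(c, 0)) - lgamma(alpha)
    return out

def log_match_score(field_a, field_b, alpha=1.0):
    """Log Bayes factor of 'one shared distribution' vs. 'two independent
    distributions'; larger values mean the columns are more plausibly a match."""
    ca, cb = Counter(field_a), Counter(field_b)
    cats = sorted(set(ca) | set(cb))
    return (log_marginal(ca + cb, cats, alpha)
            - log_marginal(ca, cats, alpha)
            - log_marginal(cb, cats, alpha))

# Toy example: two columns of US state codes vs. a column of ages (as strings)
states_a = ["TX", "CA", "CA", "NY", "TX", "WA"]
states_b = ["CA", "NY", "TX", "CA", "OR"]
ages     = ["34", "27", "61", "45", "29", "52"]
print(log_match_score(states_a, states_b))  # higher: plausibly the same field
print(log_match_score(states_a, ages))      # much lower: different fields
```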
The Pros and Cons of Compressive Sensing for Wideband Signal Acquisition: Noise Folding vs. Dynamic Range
Compressive sensing (CS) exploits the sparsity present in many signals to
reduce the number of measurements needed for digital acquisition. With this
reduction would come, in theory, commensurate reductions in the size, weight,
power consumption, and/or monetary cost of both signal sensors and any
associated communication links. This paper examines the use of CS in the design
of a wideband radio receiver in a noisy environment. We formulate the problem
statement for such a receiver and establish a reasonable set of requirements
that a receiver should meet to be practically useful. We then evaluate the
performance of a CS-based receiver in two ways: via a theoretical analysis of
its expected performance, with a particular emphasis on noise and dynamic
range, and via simulations that compare the CS receiver against the performance
expected from a conventional implementation. On the one hand, we show that
CS-based systems that aim to reduce the number of acquired measurements are
somewhat sensitive to signal noise, exhibiting a 3 dB SNR loss per octave of
subsampling, which parallels the classic noise-folding phenomenon. On the other
hand, we demonstrate that since they sample at a lower rate, CS-based systems
can potentially attain a significantly larger dynamic range. Hence, we conclude
that while a CS-based system has inherent limitations that do impose some
restrictions on its potential applications, it also has attributes that make it
highly desirable in a number of important practical settings.
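The quoted 3 dB loss per octave of subsampling is consistent with the usual noise-folding back-of-the-envelope argument; a rough version (not a substitute for the paper's analysis) runs as follows. Compressing $N$ Nyquist-rate samples of signal plus white noise into $M$ random measurements folds the noise from the $N - M$ discarded dimensions into the $M$ retained ones, inflating the per-measurement noise variance by roughly $N/M$, so

$$ \mathrm{SNR}_{\mathrm{CS}} \;\approx\; \mathrm{SNR}_{\mathrm{Nyquist}} - 10\log_{10}\!\left(\frac{N}{M}\right)\ \mathrm{dB}, $$

and each doubling of the subsampling factor $N/M$ (one octave) costs about $10\log_{10}2 \approx 3$ dB, matching the behavior reported above.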
Electronic Commerce Fraud: Towards an Understanding of the Phenomenon
The objective of this paper is to determine the factors that contribute to electronic commerce fraud. We present a model that identifies five causes: the incentives of criminals, the characteristics of victims, the role of technology, the role of enforcement, and system-related factors. The Internet has lowered the barriers to entry for criminal enterprises. Victims are unable to determine which sites are real and which are fraudulent, and a lack of reporting further facilitates this type of crime. The lack of enforcement, resulting from inadequate resources and laws, contributes to the lowering of entry barriers for fraudulent businesses. An analysis of FTC cases shows that most of these crimes are not technologically sophisticated and that, with greater awareness of and experience with such schemes, people will avoid being victimized.
Regime Change: Sampling Rate vs. Bit-Depth in Compressive Sensing
The compressive sensing (CS) framework aims to ease the burden on analog-to-digital converters (ADCs) by exploiting inherent structure in natural and man-made signals. It has been demonstrated that structured signals can be acquired with just a small number of linear measurements, on the order of the signal complexity. In practice, this enables lower sampling rates that can be more easily achieved by current hardware designs. The primary bottleneck that limits ADC sampling rates is quantization, i.e., higher bit-depths impose lower sampling rates. Thus, the decreased sampling rates of CS ADCs accommodate the otherwise limiting quantizer of conventional ADCs. In this thesis, we consider a different approach to CS ADC by shifting towards lower quantizer bit-depths rather than lower sampling rates. We explore the extreme case where each measurement is quantized to just one bit, representing its sign. We develop a new theoretical framework to analyze this extreme case and develop new algorithms for signal reconstruction from such coarsely quantized measurements. The 1-bit CS framework leads us to scenarios where it may be more appropriate to reduce bit-depth instead of sampling rate. We find that there exist two distinct regimes of operation that correspond to high/low signal-to-noise ratio (SNR). In the measurement compression (MC) regime, a high SNR favors acquiring fewer measurements with more bits per measurement (as in conventional CS); in the quantization compression (QC) regime, a low SNR favors acquiring more measurements with fewer bits per measurement (as in this thesis). A surprise from our analysis and experiments is that in many practical applications it is better to operate in the QC regime, even acquiring as few as 1 bit per measurement. The above philosophy extends further to practical CS ADC system designs. We propose two new CS architectures, one of which takes advantage of the fact that the sampling and quantization operations are performed by two different hardware components. The former (time discretization) can be employed at high rates at minimal cost, while the latter (quantization) cannot. Thus, we develop a system that discretizes in time, performs CS preconditioning techniques, and then quantizes at a low rate.
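To make reconstruction from sign-only measurements concrete, the sketch below generates y = sign(Φx) for a K-sparse, unit-norm x and recovers its direction with binary iterative hard thresholding (BIHT), a standard 1-bit CS algorithm used here purely for illustration rather than the thesis's own algorithms; the dimensions N, M, K and the step size tau are arbitrary assumptions. Note M > N, reflecting the "more measurements, fewer bits per measurement" regime.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic K-sparse signal on the unit sphere (1-bit measurements lose amplitude)
N, M, K = 256, 512, 8
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x /= np.linalg.norm(x)

Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = np.sign(Phi @ x)           # each measurement quantized to a single bit

def biht(y, Phi, K, iters=100, tau=1.0):
    """Binary iterative hard thresholding: push toward sign consistency,
    then keep only the K largest entries at each iteration."""
    M, N = Phi.shape
    xhat = np.zeros(N)
    for _ in range(iters):
        g = xhat + (tau / M) * Phi.T @ (y - np.sign(Phi @ xhat))
        idx = np.argsort(np.abs(g))[-K:]      # indices of the K largest entries
        xhat = np.zeros(N)
        xhat[idx] = g[idx]
    return xhat / np.linalg.norm(xhat)        # only the direction is recoverable

xhat = biht(y, Phi, K)
print("correlation with true signal:", float(x @ xhat))
```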
Beyond Nyquist: Efficient Sampling of Sparse Bandlimited Signals
Wideband analog signals push contemporary analog-to-digital conversion
systems to their performance limits. In many applications, however, sampling at
the Nyquist rate is inefficient because the signals of interest contain only a
small number of significant frequencies relative to the bandlimit, although the
locations of the frequencies may not be known a priori. For this type of sparse
signal, other sampling strategies are possible. This paper describes a new type
of data acquisition system, called a random demodulator, that is constructed
from robust, readily available components. Let K denote the total number of
frequencies in the signal, and let W denote its bandlimit in Hz. Simulations
suggest that the random demodulator requires just O(K log(W/K)) samples per
second to stably reconstruct the signal. This sampling rate is exponentially
lower than the Nyquist rate of W Hz. In contrast with Nyquist sampling, one
must use nonlinear methods, such as convex programming, to recover the signal
from the samples taken by the random demodulator. This paper provides a
detailed theoretical analysis of the system's performance that supports the
empirical observations.
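A toy discrete-time model helps make the random demodulator pipeline concrete: the Nyquist-rate signal is multiplied by a random ±1 chipping sequence, integrated over blocks (integrate-and-dump), and read out at the low rate R. The sketch below uses a greedy orthogonal matching pursuit as a stand-in for the convex program the paper analyzes; all dimensions, and the use of complex tone amplitudes, are illustrative assumptions rather than the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy discrete model: W Nyquist-rate samples per second, R << W output samples
W, R, K = 512, 64, 5                      # bandlimit, output rate, number of active tones
F = np.fft.ifft(np.eye(W)) * np.sqrt(W)   # unitary inverse DFT: tone amplitudes -> time samples
d = rng.choice([-1.0, 1.0], size=W)       # random +/-1 chipping sequence
H = np.kron(np.eye(R), np.ones(W // R))   # integrate-and-dump over blocks of W/R chips
Phi = H @ np.diag(d) @ F                  # overall R x W measurement matrix

# Random K-tone signal and its low-rate measurements
support = rng.choice(W, K, replace=False)
a = np.zeros(W, dtype=complex)
a[support] = rng.standard_normal(K) + 1j * rng.standard_normal(K)
y = Phi @ a

def omp(y, A, K):
    """Greedy stand-in for the paper's convex recovery: orthogonal matching pursuit."""
    resid, idx = y.copy(), []
    for _ in range(K):
        idx.append(int(np.argmax(np.abs(A.conj().T @ resid))))
        coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        resid = y - A[:, idx] @ coef
    ahat = np.zeros(A.shape[1], dtype=complex)
    ahat[idx] = coef
    return ahat

ahat = omp(y, Phi, K)
print("recovered the correct tones:", set(np.flatnonzero(ahat)) == set(support))
print("relative error:", np.linalg.norm(ahat - a) / np.linalg.norm(a))
```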
- …