55 research outputs found

    Nonparametric Bayesian Modeling for Automated Database Schema Matching

    The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than existing instance-based matching algorithms, in part because of the use of nonparametric Bayesian models.

    The Pros and Cons of Compressive Sensing for Wideband Signal Acquisition: Noise Folding vs. Dynamic Range

    Compressive sensing (CS) exploits the sparsity present in many signals to reduce the number of measurements needed for digital acquisition. With this reduction would come, in theory, commensurate reductions in the size, weight, power consumption, and/or monetary cost of both signal sensors and any associated communication links. This paper examines the use of CS in the design of a wideband radio receiver in a noisy environment. We formulate the problem statement for such a receiver and establish a reasonable set of requirements that a receiver should meet to be practically useful. We then evaluate the performance of a CS-based receiver in two ways: via a theoretical analysis of its expected performance, with a particular emphasis on noise and dynamic range, and via simulations that compare the CS receiver against the performance expected from a conventional implementation. On the one hand, we show that CS-based systems that aim to reduce the number of acquired measurements are somewhat sensitive to signal noise, exhibiting a 3 dB SNR loss per octave of subsampling, which parallels the classic noise-folding phenomenon. On the other hand, we demonstrate that since they sample at a lower rate, CS-based systems can potentially attain a significantly larger dynamic range. Hence, we conclude that while a CS-based system has inherent limitations that do impose some restrictions on its potential applications, it also has attributes that make it highly desirable in a number of important practical settings.
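
    The "3 dB per octave" figure quoted in this abstract can be read as a simple rule of thumb. The sketch below assumes the loss follows 10*log10(subsampling factor), which gives about 3 dB for every halving of the measurement rate; the function name and the exact model are illustrative assumptions, not taken from the paper.

```python
import math

def noise_folding_loss_db(subsampling_factor: float) -> float:
    """SNR penalty in dB under the assumed 10*log10(factor) model."""
    if subsampling_factor < 1:
        raise ValueError("subsampling_factor must be >= 1")
    return 10.0 * math.log10(subsampling_factor)

# One octave = one halving of the measurement rate (a factor of 2).
for octaves in range(4):
    factor = 2 ** octaves
    print(f"{factor}x subsampling: {noise_folding_loss_db(factor):.2f} dB loss")
```

    Under this assumed model the penalty grows linearly in the number of octaves, so an 8x reduction in measurements costs roughly three times the loss of a 2x reduction.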

    Electronic Commerce Fraud: Towards an Understanding of the Phenomenon

    The objective of this paper is to determine the factors that contribute to electronic commerce fraud. We present a model that identifies five causes: the incentives of criminals, the characteristics of victims, the role of technology, the role of enforcement, and system-related factors. The Internet has lowered the barriers to entry for criminal enterprises. Victims are unable to determine which sites are real and which are fraudulent, and a lack of reporting further facilitates this type of crime. The lack of enforcement, resulting from inadequate resources and laws, contributes to the lowering of entry barriers for fraudulent businesses. An analysis of FTC cases shows that most crimes are not technologically sophisticated and that, with greater awareness of and experience with these types of schemes, people will avoid being victimized.

    Regime Change: Sampling Rate vs. Bit-Depth in Compressive Sensing

    The compressive sensing (CS) framework aims to ease the burden on analog-to-digital converters (ADCs) by exploiting inherent structure in natural and man-made signals. It has been demonstrated that structured signals can be acquired with just a small number of linear measurements, on the order of the signal complexity. In practice, this enables lower sampling rates that can be more easily achieved by current hardware designs. The primary bottleneck that limits ADC sampling rates is quantization, i.e., higher bit-depths impose lower sampling rates. Thus, the decreased sampling rates of CS ADCs accommodate the otherwise limiting quantizer of conventional ADCs. In this thesis, we consider a different approach to CS ADC by shifting towards lower quantizer bit-depths rather than lower sampling rates. We explore the extreme case where each measurement is quantized to just one bit, representing its sign. We develop a new theoretical framework to analyze this extreme case and develop new algorithms for signal reconstruction from such coarsely quantized measurements. The 1-bit CS framework leads us to scenarios where it may be more appropriate to reduce bit-depth instead of sampling rate. We find that there exist two distinct regimes of operation that correspond to high/low signal-to-noise ratio (SNR). In the measurement compression (MC) regime, a high SNR favors acquiring fewer measurements with more bits per measurement (as in conventional CS); in the quantization compression (QC) regime, a low SNR favors acquiring more measurements with fewer bits per measurement (as in this thesis). A surprise from our analysis and experiments is that in many practical applications it is better to operate in the QC regime, even acquiring as few as 1 bit per measurement. The above philosophy extends further to practical CS ADC system designs. 
    We propose two new CS architectures, one of which takes advantage of the fact that the sampling and quantization operations are performed by two different hardware components. The former can be employed at high rates with minimal costs while the latter cannot. Thus, we develop a system that discretizes in time, performs CS preconditioning techniques, and then quantizes at a low rate.
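
    The 1-bit measurement model described in this abstract, where each measurement keeps only its sign, can be sketched in a few lines. The code below is a minimal illustration and not the thesis's reconstruction algorithm: the Gaussian measurement matrix, the problem sizes, and the function name `one_bit_measurements` are all assumptions made for the example.

```python
import random

def one_bit_measurements(phi, x):
    """Return sign(phi @ x) as a list of +1/-1 values (zero maps to +1)."""
    signs = []
    for row in phi:
        projection = sum(a * b for a, b in zip(row, x))
        signs.append(1 if projection >= 0 else -1)
    return signs

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n, m, k = 50, 200, 3     # signal length, measurements, nonzeros (hypothetical)

# Sparse test signal: k nonzero entries at random positions.
x = [0.0] * n
for idx in rng.sample(range(n), k):
    x[idx] = rng.gauss(0.0, 1.0)

# Random Gaussian measurement matrix (an assumed, commonly used choice).
phi = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

y = one_bit_measurements(phi, x)
y_scaled = one_bit_measurements(phi, [2.0 * v for v in x])

# Scaling the signal by a positive constant leaves every sign unchanged,
# so 1-bit measurements carry no amplitude information.
print(y == y_scaled)
```

    The sketch takes more measurements than the signal has entries (m > n), in the spirit of the QC regime described above: many measurements, one bit each.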

    Beyond Nyquist: Efficient Sampling of Sparse Bandlimited Signals

    Wideband analog signals push contemporary analog-to-digital conversion systems to their performance limits. In many applications, however, sampling at the Nyquist rate is inefficient because the signals of interest contain only a small number of significant frequencies relative to the bandlimit, although the locations of the frequencies may not be known a priori. For this type of sparse signal, other sampling strategies are possible. This paper describes a new type of data acquisition system, called a random demodulator, that is constructed from robust, readily available components. Let K denote the total number of frequencies in the signal, and let W denote its bandlimit in Hz. Simulations suggest that the random demodulator requires just O(K log(W/K)) samples per second to stably reconstruct the signal. This sampling rate is exponentially lower than the Nyquist rate of W Hz. In contrast with Nyquist sampling, one must use nonlinear methods, such as convex programming, to recover the signal from the samples taken by the random demodulator. This paper provides a detailed theoretical analysis of the system's performance that supports the empirical observations. Comment: 24 pages, 8 figures
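
    To get a feel for the O(K log(W/K)) scaling quoted in this abstract, the sketch below compares it with the Nyquist rate W. The abstract does not state the constant hidden by the O-notation or the base of the logarithm, so the constant `c` (default 1.0), the natural log, and the example values of K and W are all assumptions.

```python
import math

def random_demodulator_rate(k: int, w: float, c: float = 1.0) -> float:
    """Estimated sampling rate c * K * log(W / K), in samples per second."""
    if not 0 < k < w:
        raise ValueError("need 0 < K < W")
    return c * k * math.log(w / k)

k, w = 10, 1_000_000.0   # hypothetical: 10 tones within a 1 MHz bandlimit
rate = random_demodulator_rate(k, w)
print(f"Nyquist rate: {w:.0f} Hz, K log(W/K) estimate: {rate:.1f} Hz")
```

    Whatever the true constant is, the estimate grows only logarithmically in W for fixed K, which is the sense in which the abstract calls the rate exponentially lower than Nyquist.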