1,105 research outputs found

    Learning Optimization-inspired Image Propagation with Control Mechanisms and Architecture Augmentations for Low-level Vision

    In recent years, building deep learning models from optimization perspectives has become a promising direction for solving low-level vision problems. The main idea of most existing approaches is to straightforwardly combine numerical iterations with manually designed network architectures to generate image propagations for specific kinds of optimization models. However, these heuristic learning models often lack mechanisms to control the propagation and rely heavily on architecture engineering. To mitigate these issues, this paper proposes a unified optimization-inspired deep image propagation framework that aggregates Generative, Discriminative and Corrective (GDC for short) principles for a variety of low-level vision tasks. Specifically, we first formulate low-level vision tasks using a generic optimization objective and construct our fundamental propagative modules from three different viewpoints, i.e., the solution could be obtained/learned 1) in a generative manner; 2) based on a discriminative metric; and 3) with domain knowledge correction. By designing control mechanisms to guide image propagations, we then obtain convergence guarantees of GDC for both fully- and partially-defined optimization formulations. Furthermore, we introduce two architecture augmentation strategies (i.e., normalization and automatic search) to respectively enhance the propagation stability and task/data-adaption ability. Extensive experiments on different low-level vision applications demonstrate the effectiveness and flexibility of GDC. Comment: 15 pages

    Single- and multi-microphone speech dereverberation using spectral enhancement

    In speech communication systems, such as voice-controlled systems, hands-free mobile telephones, and hearing aids, the received microphone signals are degraded by room reverberation, background noise, and other interferences. This signal degradation may lead to total unintelligibility of the speech and decreases the performance of automatic speech recognition systems. In the context of this work reverberation is the process of multi-path propagation of an acoustic sound from its source to one or more microphones. The received microphone signal generally consists of a direct sound, reflections that arrive shortly after the direct sound (commonly called early reverberation), and reflections that arrive after the early reverberation (commonly called late reverberation). Reverberant speech can be described as sounding distant with noticeable echo and colouration. These detrimental perceptual effects are primarily caused by late reverberation, and generally increase with increasing distance between the source and microphone. Conversely, early reverberations tend to improve the intelligibility of speech. In combination with the direct sound it is sometimes referred to as the early speech component. Reduction of the detrimental effects of reflections is evidently of considerable practical importance, and is the focus of this dissertation. More specifically the dissertation deals with dereverberation techniques, i.e., signal processing techniques to reduce the detrimental effects of reflections. In the dissertation, novel single- and multimicrophone speech dereverberation algorithms are developed that aim at the suppression of late reverberation, i.e., at estimation of the early speech component. This is done via so-called spectral enhancement techniques that require a specific measure of the late reverberant signal. 
This measure, called spectral variance, can be estimated directly from the received (possibly noisy) reverberant signal(s) using a statistical reverberation model and a limited amount of a priori knowledge about the acoustic channel(s) between the source and the microphone(s). In our work an existing single-channel statistical reverberation model serves as a starting point. The model is characterized by one parameter that depends on the acoustic characteristics of the environment. We show that the spectral variance estimator that is based on this model, can only be used when the source-microphone distance is larger than the so-called critical distance. This is, crudely speaking, the distance where the direct sound power is equal to the total reflective power. A generalization of the statistical reverberation model in which the direct sound is incorporated is developed. This model requires one additional parameter that is related to the ratio between the direct sound energy and the sound energy of all reflections. The generalized model is used to derive a novel spectral variance estimator. When the novel estimator is used for dereverberation rather than the existing estimator, and the source-microphone distance is smaller than the critical distance, the dereverberation performance is significantly increased. Single-microphone systems only exploit the temporal and spectral diversity of the received signal. Reverberation, of course, also induces spatial diversity. To additionally exploit this diversity, multiple microphones must be used, and their outputs must be combined by a suitable spatial processor such as the so-called delay and sum beamformer. It is not a priori evident whether spectral enhancement is best done before or after the spatial processor. For this reason we investigate both possibilities, as well as a merge of the spatial processor and the spectral enhancement technique. 
An advantage of the latter option is that the spectral variance estimator can be further improved. Our experiments show that the use of multiple microphones affords a significant improvement of the perceptual speech quality. The applicability of the theory developed in this dissertation is demonstrated using a hands-free communication system. Since hands-free systems are often used in a noisy and reverberant environment, the received microphone signal contains not only the desired signal but also interferences such as room reverberation that is caused by the desired source, background noise, and a far-end echo signal that results from a sound that is produced by the loudspeaker. Usually an acoustic echo canceller is used to cancel the far-end echo. Additionally a post-processor is used to suppress background noise and residual echo, i.e., echo which could not be cancelled by the echo canceller. In this work a novel structure and post-processor for an acoustic echo canceller are developed. The post-processor suppresses late reverberation caused by the desired source, residual echo, and background noise. The late reverberation and late residual echo are estimated using the generalized statistical reverberation model. Experimental results convincingly demonstrate the benefits of the proposed system for suppressing late reverberation, residual echo and background noise. The proposed structure and post-processor have a low computational complexity and a highly modular structure, can be seamlessly integrated into existing hands-free communication systems, and afford a significant increase of the listening comfort and speech intelligibility.
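
The abstract above names the delay and sum beamformer as one suitable spatial processor. As a rough illustration only, the following minimal sketch assumes integer sample delays that are already known for each microphone; the function name `delay_and_sum` and the toy signals are hypothetical and are not taken from the dissertation.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay, then average.

    channels: list of equal-rate sample lists, one per microphone (assumed input).
    delays:   list of non-negative integer delays in samples, one per channel,
              giving how late the source arrives at that microphone.
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    # Only keep the span where every delayed channel still has samples.
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    if n <= 0:
        raise ValueError("delays leave no overlapping samples")
    m = len(channels)
    # Advance each channel by its delay so the direct sound lines up, then average.
    return [sum(ch[d + i] for ch, d in zip(channels, delays)) / m for i in range(n)]


# Toy example: the same source reaches microphone 1 two samples later.
source = [0, 1, 0, -1, 0, 1, 0, -1]
mic0 = source + [0, 0]
mic1 = [0, 0] + source
aligned = delay_and_sum([mic0, mic1], [0, 2])
```

In this noise-free toy case the aligned average reproduces the source exactly; with real recordings the averaging would attenuate components that are not coherent across the aligned channels, which is the spatial diversity the abstract refers to.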

    Time of Arrival and Angle of Arrival Estimation of LTE Signals for Positioning Applications

    With the increase of services that need accurate location of the user, new techniques that cooperate with the Global Navigation Satellite System (GNSS) are necessary. Toward this objective, this thesis presents our research work on the estimation of the time of arrival (TOA) and of the angle of arrival (AOA) exploiting modern cellular signals. In particular, we focus on the Third Generation Partnership Project (3GPP) Long Term Evolution (LTE) standard, exploiting its uplink and downlink reference signals for this purpose. The current release of the 3GPP LTE specification supports a UTDOA localization technique based on the Sounding Reference Signal (SRS). In real environments, however, user equipments (UE) are rarely set up to transmit this particular signal. The main original contribution of this thesis consists in a new TOA estimation method based on uplink transmission. In particular, we explore the possibility of performing radio localization exploiting the uplink Demodulation Reference Signal (DM-RS), which is always sent by UEs during data transmission. Real uplink transmissions are modeled in simulations and the performance of known algorithms like SAGE and IAA-APES is evaluated for TOA estimation. A new method to estimate the initial conditions of the SAGE algorithm is proposed and the estimation performance in uplink scenarios is evaluated. The analysis revealed that the proposed method outperforms the non-coherent initial conditions estimation proposed in the literature when uplink transmissions are used. Then, the benefits of our proposal are evaluated and the feasibility of TOA estimation exploiting the DM-RS is demonstrated by means of experiments using real DM-RS signals generated by an LTE module. A second original contribution is given by AOA estimation. In particular, the independence of AOA estimation with respect to uplink and downlink transmission is verified. 
According to this result, the performance of IAA-APES and SAGE in real-world AOA experiments is evaluated in downlink scenarios. Based on the overall results, we conclude that the proposed radio localization method, exploiting the uplink Demodulation Reference Signal (DM-RS), can also be extended to joint TOA and AOA estimation using SAGE, for hybrid localization techniques. We can also conclude that the proposed method can be easily extended to downlink transmission exploiting the cell-specific reference signal (CRS).

    Design and Development of Intelligent Sensors

    In this project, we make an extensive study of Intelligent Sensors and devise methods for analyzing them through various proposed algorithms broadly classified into Direct and Inverse Modeling. We also look at the analysis of Blind Equalization in any sensor. A regular sensor is a device which simply measures a signal and converts it into another signal which can be read by an observer or an instrument. A sensor's sensitivity indicates how much the sensor's output changes when the measured quantity changes. Ideal sensors are designed to be linear. The output signal of such a sensor is linearly proportional to the value of the measured property. The sensitivity is then defined as the ratio between output signal and measured property. For example, if a sensor measures temperature and gives a voltage output, the sensitivity is a constant with the unit [V/K]; this sensor is linear because the ratio is constant at all points of measurement. If the sensor is not ideal, several types of deviations can occur which render the sensor results inaccurate. On the other hand, an intelligent sensor takes some predefined action when it senses the appropriate input (light, heat, sound, motion, touch, etc.). A sensor is intelligent when it is capable of correcting errors that occur during measurement at both the input and output ends. It generally processes the signal by means of suitable methods implemented in the device before communicating it. As discussed, an ideal sensor should have a linear relationship with the measured quantity. But since in practice there are several factors which introduce non-linearity in a system, we need intelligent sensors. 
This particular project concentrates on the compensation of difficulties faced due to the non-linear response characteristics of a capacitive pressure sensor (CPS). It studies the design of an intelligent CPS using direct and inverse modeling. A switched-capacitor circuit (SCC) converts the change in capacitance of the pressure sensor into an equivalent voltage output. The effect of change in environmental conditions on the CPS, and subsequently on the output of the SCC, is such that it makes the output non-linear in nature. In particular, change in ambient temperature causes the response characteristics of the CPS to become highly nonlinear, and complex signal processing may be required to obtain correct results. The performance of the control system depends on the performance of the sensing element. It is observed that many sensors exhibit nonlinear input-output characteristics. Due to such nonlinearities direct digital readout is not possible. As a result we are forced to employ the sensors only in the linear region of their characteristics. In other words their usable range gets restricted due to the presence of nonlinearity. If a sensor is used for the full range of its nonlinear characteristics, accuracy of measurement is severely affected. A similar effect is also observed in the case of the LVDT. The nonlinearity present is usually time-varying and unpredictable as it depends on many uncertain factors. Nonlinearity also creeps in due to change in environmental conditions such as temperature and humidity. In addition, ageing of the sensors also introduces nonlinearity. The proposed scheme incorporates intelligence into the sensor. We use many algorithms and ANN models to make the sensor ‘intelligent’. There is also an analysis of the Blind Deconvolution Techniques that may be used for Channel Estimation. As it is a relatively new field of work, the challenges are huge but the opportunities are many as well. 
We try to make sensors more intelligent, as this would allow them to be applied more widely in industrial, academic and domestic environments.
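
The inverse-modeling idea described above (mapping a nonlinear sensor output back to the measured quantity) can be sketched in a few lines. This is an illustration under stated assumptions, not the project's method: it swaps the ANN models mentioned in the abstract for a simple calibration table with linear interpolation, and the response curve `sensor_response` is a made-up monotonic nonlinearity rather than a real CPS characteristic.

```python
from bisect import bisect_right


def sensor_response(pressure):
    """Hypothetical nonlinear sensor output (volts) for a normalized pressure in [0, 1]."""
    return pressure / (1.0 - 0.5 * pressure)


def build_inverse_model(pressures, voltages):
    """Return a function that maps a measured voltage back to a pressure estimate.

    Assumes the calibration voltages increase strictly with pressure.
    """
    pairs = sorted(zip(voltages, pressures))
    vs = [v for v, _ in pairs]
    ps = [p for _, p in pairs]

    def inverse(voltage):
        # Clamp readings outside the calibrated range to the nearest endpoint.
        if voltage <= vs[0]:
            return ps[0]
        if voltage >= vs[-1]:
            return ps[-1]
        k = bisect_right(vs, voltage)  # now vs[k - 1] <= voltage < vs[k]
        t = (voltage - vs[k - 1]) / (vs[k] - vs[k - 1])
        return ps[k - 1] + t * (ps[k] - ps[k - 1])

    return inverse


# Calibrate at 11 evenly spaced pressures, then linearize a new reading.
cal_p = [i / 10 for i in range(11)]
cal_v = [sensor_response(p) for p in cal_p]
inverse = build_inverse_model(cal_p, cal_v)
estimate = inverse(sensor_response(0.55))
```

Cascading the sensor with such an inverse model yields an approximately linear overall characteristic over the calibrated range; a finer calibration grid, or a learned model as in the abstract, would reduce the residual interpolation error.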