100 research outputs found

    In Car Audio

    This chapter presents implementations of advanced in-car audio applications. The system comprises three main applications addressing the in-car listening and communication experience. Starting from a high-level description of the algorithms, several implementations at different levels of hardware abstraction are presented, along with empirical results on both the design process and the performance achieved.

    DNN-Based Source Enhancement to Increase Objective Sound Quality Assessment Score

    We propose a training method for deep neural network (DNN)-based source enhancement to increase objective sound quality assessment (OSQA) scores such as the perceptual evaluation of speech quality (PESQ). In many conventional studies, DNNs have been used as a mapping function to estimate time-frequency masks and trained to minimize an analytically tractable objective function such as the mean squared error (MSE). Since OSQA scores are widely used for sound-quality evaluation, constructing DNNs to increase OSQA scores would be better than using the minimum-MSE criterion to create high-quality output signals. However, since most OSQA scores are not analytically tractable, i.e., they are black boxes, the gradient of the objective function cannot be calculated by simply applying back-propagation. To calculate the gradient of the OSQA-based objective function, we formulated a DNN optimization scheme on the basis of black-box optimization, which is used for training a computer that plays a game. For the black-box-optimization scheme, we adopt the policy gradient method, which calculates the gradient on the basis of a sampling algorithm. To simulate output signals using the sampling algorithm, DNNs are used to estimate the probability-density function of the output signals that maximize OSQA scores. The OSQA scores are calculated from the simulated output signals, and the DNNs are trained to increase the probability of generating the simulated output signals that achieve high OSQA scores. Through several experiments, we found that OSQA scores significantly increased by applying the proposed method, even though the MSE was not minimized.
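The sampling-based gradient described in this abstract can be illustrated with a toy score-function (REINFORCE-style) estimator. This is only a sketch under stated assumptions, not the authors' training scheme: `black_box_score` is a hypothetical stand-in for an OSQA metric such as PESQ, the policy is a one-dimensional Gaussian rather than a DNN, and every hyperparameter is chosen purely for illustration.

```python
import random


def black_box_score(x):
    """Hypothetical stand-in for a non-differentiable quality score (peak at x = 3)."""
    return -(x - 3.0) ** 2


def policy_gradient_ascent(score_fn, mu=0.0, sigma=0.5, lr=0.05,
                           batch=64, steps=300, seed=0):
    """Tune the mean of a Gaussian sampling policy using only score evaluations."""
    rng = random.Random(seed)
    for _ in range(steps):
        # Simulate candidate outputs by sampling from the current policy.
        samples = [rng.gauss(mu, sigma) for _ in range(batch)]
        scores = [score_fn(x) for x in samples]
        # Subtracting the batch-mean score reduces the variance of the estimate.
        baseline = sum(scores) / batch
        # Score-function estimator: d/dmu log N(x; mu, sigma^2) = (x - mu) / sigma^2.
        grad = sum((s - baseline) * (x - mu) / sigma ** 2
                   for s, x in zip(scores, samples)) / batch
        mu += lr * grad  # gradient ascent on the expected score
    return mu


if __name__ == "__main__":
    print(policy_gradient_ascent(black_box_score))
```

The score function is never differentiated; only its values at sampled points are used, which is what makes this family of estimators applicable to black-box metrics.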

    Image Quality Modeling and Optimization for Non-Conventional Aperture Imaging Systems

    The majority of image quality studies have been performed on systems with conventional aperture functions. These systems have straightforward aperture designs and well-understood behavior. Image quality for these systems can be predicted by the General Image Quality Equation (GIQE). However, in order to continue pushing the boundaries of imaging, more control over the point spread function of an imaging system may be necessary. This requires modifications in the pupil plane of a system, causing a departure from the realm of most image quality studies. Examples include sparse apertures, synthetic apertures, coded apertures and phase elements. This work will focus on sparse aperture telescopes and the image quality issues associated with them; however, the methods presented will be applicable to other non-conventional aperture systems. In this research, an approach for modeling the image quality of non-conventional aperture systems will be introduced. While the modeling approach is based on previous work, a novel validation study will be performed, which accounts for the effects of both broadband illumination and wavefront error. One of the key image quality challenges for sparse apertures is post-processing ringing artifacts. These artifacts have been observed in modeled data, but a validation study will be performed to observe them in measured data and to compare them to model predictions. Once validated, the modeling approach will be used to perform a small set of design studies for sparse aperture systems, including spectral bandpass selection and aperture layout optimization.

    An Evaluation of multispectral earth-observing multi-aperture telescope designs for target detection and characterization

    Earth-observing satellites have fundamental size and weight design limits since they must be launched into space. These limits serve to constrain the spatial resolutions that such imaging systems can achieve with traditional telescope design strategies. Segmented and sparse-aperture imaging system designs may offer solutions to this problem. Segmented and sparse-aperture designs can be viewed as competing technologies; both approaches offer solutions for achieving finer resolution imaging from space. Segmented-aperture systems offer greater fill factor, and therefore greater signal-to-noise ratio (SNR), for a given encircled diameter than their sparse aperture counterparts, though their larger segments often suffer from greater optical aberration than those of smaller, sparse designs. Regardless, the use of any multi-aperture imaging system comes at a price; their increased effective aperture size and improvement in spatial resolution are offset by a reduction in image quality due to signal loss (less photon-collecting area) and aberrations introduced by misalignments between individual sub-apertures as compared with monolithic collectors. Introducing multispectral considerations to a multi-aperture imaging system further starves the system of photons and reduces SNR in each spectral band. This work explores multispectral design considerations inherent in 9-element tri-arm sparse aperture, hexagonal-element segmented aperture, and monolithic aperture imaging systems. The primary thrust of this work is to develop an objective target detection-based metric that can be used to compare the achieved image utility of these competing multi-aperture telescope designs over a designated design parameter trade space. Characterizing complex multi-aperture system designs in this way may lead to improved assessment of programmatic risk and reward in the development of higher-resolution imaging capabilities. 
This method assumes that the stringent requirements for limiting the wavefront error (WFE) associated with multi-aperture imaging systems when producing imagery for visual assessment can be relaxed when employing target detection-based metrics for evaluating system utility. Simple target detection algorithms were used to determine Receiver Operating Characteristic (ROC) curves for the various simulated multi-aperture system designs that could be used in an objective assessment of each system's ability to support target detection activities. Also, a set of regressed equations was developed that allow one to predict multi-aperture system target detection performance within the bounds of the designated trade space. Suitable metrics for comparing the shapes of two individual ROC curves, such as the total area under the curve (AUC) and the sample Pearson correlation coefficient, were found to be useful tools in validating the predicted results of the trade space regression models. Lastly, some simple rules of thumb relating to multi-aperture system design were identified from the inspection of various points of equivalency between competing system designs, as determined from the comparison metrics employed. The goal of this work, the development of a process for simulating multi-aperture imaging systems and comparing them in terms of target detection tasks, was successfully accomplished. The process presented here could be tailored to the needs of any specific multi-aperture development effort and used as a tool for system design engineers.
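As a rough illustration of the ROC and AUC metrics mentioned in this abstract, the sketch below sweeps a detection threshold over a list of scores and integrates the resulting curve with the trapezoidal rule. The function names and the toy data are assumptions made for illustration and are not taken from the work itself. Higher scores are assumed to indicate a target, and both classes must be present.

```python
def roc_curve(scores, labels):
    """Return (false-positive rate, true-positive rate) points for a threshold sweep.

    labels are 1 for target and 0 for background.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is detected
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    detector_scores = [0.9, 0.8, 0.7, 0.6]  # toy detector outputs
    truth = [1, 0, 1, 0]                    # toy ground truth
    print(auc(roc_curve(detector_scores, truth)))
```

Two system designs could then be compared by their AUC values, with the caveat that a single number hides where along the curve the designs differ.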

    Report on the meta-analysis of crop modelling for climate change and food security survey

    Denoising sparse images from GRAPPA using the nullspace method

    To accelerate magnetic resonance imaging using uniformly undersampled (nonrandom) parallel imaging beyond what is achievable with generalized autocalibrating partially parallel acquisitions (GRAPPA) alone, the DEnoising of Sparse Images from GRAPPA using the Nullspace method is developed. The trade-off between denoising and smoothing the GRAPPA solution is studied for different levels of acceleration. Several brain images reconstructed from uniformly undersampled k-space data using DEnoising of Sparse Images from GRAPPA using the Nullspace method are compared against reconstructions using existing methods in terms of difference images (a qualitative measure), peak-signal-to-noise ratio, and noise amplification (g-factors) as measured using the pseudo-multiple replica method. Effects of smoothing, including contrast loss, are studied in synthetic phantom data. In the experiments presented, the contrast loss and spatial resolution are competitive with existing methods. Results for several brain images demonstrate significant improvements over GRAPPA at high acceleration factors in denoising performance with limited blurring or smoothing artifacts. In addition, the measured g-factors suggest that DEnoising of Sparse Images from GRAPPA using the Nullspace method mitigates noise amplification better than both GRAPPA and L1 iterative self-consistent parallel imaging reconstruction (the latter limited here by uniform undersampling).
    National Science Foundation (U.S.) (CAREER Grant 0643836); National Institutes of Health (U.S.) (Grant NIH R01 EB007942); National Institutes of Health (U.S.) (Grant NIH R01 EB006847); National Center for Research Resources (U.S.) (Grant P41 RR014075); Siemens Corporation; National Science Foundation (U.S.). Graduate Research Fellowship Progra

    Subband beamforming with higher order statistics for distant speech recognition

    This dissertation presents novel beamforming methods for distant speech recognition (DSR). Such techniques can relieve users of the need to wear close-talking microphones. DSR systems are useful in many applications such as humanoid robots, voice control systems for automobiles, and automatic meeting transcription systems. A main problem in DSR is that recognition performance is seriously degraded when a speaker is far from the microphones. To avoid this degradation, noise and reverberation should be removed from the signals received by the microphones. Acoustic beamforming techniques have the potential to enhance speech from the far field with little distortion since they can maintain a distortionless constraint for a look direction. In beamforming, multiple signals propagating from a position are captured with multiple microphones. Typical conventional beamformers then adjust their weights so as to minimize the variance of their own outputs subject to a distortionless constraint in a look direction. The variance is the average of the second power (square) of the beamformer's outputs. Accordingly, the conventional beamformer is considered to use second-order statistics (SOS) of its outputs. Conventional beamforming techniques can effectively place a null on any source of interference. However, the desired signal is also canceled in reverberant environments, which is known as the signal cancellation problem. Many algorithms have been developed to avoid that problem, but none of them fundamentally solves it in reverberant environments. While many efforts have been made to overcome the signal cancellation problem in the field of acoustic beamforming, researchers have also addressed another research issue with microphone arrays, namely blind source separation (BSS) [1].
BSS techniques aim at separating sources from a mixture of signals without information about the geometry of the microphone array or the positions of the sources. This is achieved by multiplying the input signals by an un-mixing matrix, which is constructed so that the outputs are stochastically independent. Measuring the stochastic independence of the signals is based on the theory of independent component analysis (ICA) [1]. The field of ICA is based on the fact that distributions of information-bearing signals are not Gaussian, whereas distributions of sums of various signals are close to Gaussian. There are two popular criteria for measuring the degree of non-Gaussianity, namely kurtosis and negentropy. As described in detail in this thesis, both criteria use more than the second moment. Accordingly, they are referred to as higher-order statistics (HOS), in contrast to SOS. HOS has not been well studied in the field of acoustic beamforming, although Arai et al. showed the similarity between acoustic beamforming and BSS [2]. This thesis investigates new beamforming algorithms that take HOS into consideration. The new beamforming methods adjust the beamformer's weights based on one of the following criteria: (1) minimum mutual information of the two beamformers' outputs, (2) maximum negentropy of the beamformer's outputs, and (3) maximum kurtosis of the beamformer's outputs. As shown in this thesis, these algorithms do not suffer from signal cancellation. Notice that, in contrast to the BSS algorithms, the new beamforming techniques can keep the distortionless constraint for the direction of interest. The effectiveness of the new techniques is finally demonstrated through a series of distant automatic speech recognition experiments on real data recorded with real sensors, unlike other work in which signals artificially convolved with measured impulse responses are considered.
Significant improvements are achieved by the beamforming algorithms proposed here.
This dissertation presents new methods for distant speech recognition. These methods make it possible to dispense with close-talking microphones. Speech recognition systems that do without close-talking microphones are useful in many applications, such as humanoid robots, voice control systems for cars, or automatic meeting transcription systems. A main problem in distant speech recognition is that recognition accuracy drops sharply as the distance between speaker and microphone increases. For this reason, it is essential to remove the interference, namely background noise, reverberation, and echo, from the microphone signals. Using several microphones makes it possible to separate the desired signal spatially from the interference. This method is called acoustic beamforming. Conventional acoustic beamformers adapt their weights so that the variance of the output signal is minimized, while the signal in the "look direction" must satisfy the distortionless constraint. The variance is defined as the mean of the squared output signal. Conventional beamforming methods therefore use second-order statistics (SOS) of the output signal. Conventional beamformers can suppress interfering sources efficiently, but unfortunately they also suppress the desired signal. This unwanted suppression of the desired signal is called signal cancellation, and many algorithms have already been developed to avoid it. None of these algorithms, however, works effectively in reverberant environments. Another method for separating the desired signal from the interference, this time without using geometric information, is called blind source separation (BSS) [1]. Here, the input signal is multiplied by a matrix. The matrix must be constructed so that the output signals are statistically independent of one another. Statistical independence is measured with the theory of independent component analysis (ICA) [1]. ICA assumes that information-bearing signals, such as speech, are not Gaussian distributed, whereas the sum of signals, e.g., background noise, is Gaussian distributed. There are two common ways to determine the degree of non-Gaussianity: kurtosis and negentropy. As described in this work, these use moments higher than the second, and the methods are therefore referred to as higher-order statistics (HOS). Although Arai et al. showed that beamforming and BSS are similar, HOS have so far not been used in acoustic beamforming [2], which continues to rely on SOS. In this dissertation, new beamforming algorithms based on HOS are developed and evaluated. The new beamforming methods adapt their weights according to one of the following criteria: (1) minimum mutual information of two beamformer outputs, (2) maximum negentropy of the beamformer outputs, and (3) maximum kurtosis of the beamformer outputs. Speech recognition experiments (measured in word error rate) show that the beamforming techniques developed here also successfully suppress interfering sources in reverberant environments, which is a clear advantage over conventional methods.
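The non-Gaussianity argument in this abstract can be checked numerically with a small sketch. It is not the dissertation's beamforming algorithm, only an illustration of the kurtosis criterion: Laplacian samples are used as an assumed stand-in for a super-Gaussian speech-like signal, and `excess_kurtosis` and `kurtosis_demo` are hypothetical helper names.

```python
import random


def excess_kurtosis(samples):
    """Fourth standardized moment minus 3, which is zero for a Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 ** 2) - 3.0


def laplace_sample(rng):
    """Draw a Laplacian sample as the difference of two unit exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def kurtosis_demo(n=20000, n_sources=8, seed=0):
    """Compare a Gaussian signal, one Laplacian source, and a sum of sources."""
    rng = random.Random(seed)
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
    single = [laplace_sample(rng) for _ in range(n)]
    summed = [sum(laplace_sample(rng) for _ in range(n_sources)) for _ in range(n)]
    return {
        "gaussian": excess_kurtosis(gaussian),
        "single_source": excess_kurtosis(single),
        "sum_of_sources": excess_kurtosis(summed),
    }


if __name__ == "__main__":
    for name, value in kurtosis_demo().items():
        print(name, round(value, 3))
```

The single source shows clearly positive excess kurtosis, while the sum of independent sources moves toward the Gaussian value of zero, which is the intuition behind preferring outputs with maximum kurtosis.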

    Practical speech enhancement for diverse environments with varying noise characteristics (雑音特性の変動を伴う多様な環境で実用可能な音声強調)

    筑波大学 (University of Tsukuba)201