
    Universal rank-order transform to extract signals from noisy data

    We introduce an ordinate method for noisy data analysis, based solely on rank information and thus insensitive to outliers. The method is nonparametric and objective, and the required data processing is parsimonious. The main ingredients include a rank-order data matrix and its transform to a stable form, which provide linear trends in excellent agreement with least squares regression, despite the loss of magnitude information. A group-symmetry orthogonal decomposition of the 2D rank-order transform for iid (white) noise is further ordered by principal component analysis. This two-step procedure provides a noise “etalon” used to characterize arbitrary stationary stochastic processes. The method readily distinguishes both the Ornstein-Uhlenbeck process and chaos generated by the logistic map from white noise. Ranking within randomness differs fundamentally from that in deterministic chaos and signals, thus forming the basis for signal detection. To further illustrate the breadth of applications, we apply this ordinate method to the canonical nonlinear parameter estimation problem of two-species radioactive decay, outperforming special-purpose least squares software. We demonstrate that the method excels at extracting trends in heavy-tailed noise and, unlike the Theil-Sen estimator, is not limited to linear regression. A simple expression is given that yields a close approximation to an underlying, generally nonlinear, signal.
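The robustness that comes from discarding magnitudes and keeping only ranks can be illustrated with a small, self-contained sketch. The snippet below is only a toy comparison on synthetic Cauchy-contaminated data (the slope, sample size, and seed are arbitrary choices); it is not an implementation of the paper's 2D rank-order transform.

```python
# Toy illustration of rank-based robustness for trend extraction in
# heavy-tailed noise; NOT the paper's rank-order transform.
import numpy as np
from scipy.stats import spearmanr, theilslopes

rng = np.random.default_rng(0)
n = 500
x = np.arange(n, dtype=float)
y = 0.05 * x + rng.standard_cauchy(n)   # linear trend buried in Cauchy noise

ols_slope = np.polyfit(x, y, 1)[0]      # least squares: sensitive to outliers
ts_slope = theilslopes(y, x)[0]         # Theil-Sen: median of pairwise slopes
rho, _ = spearmanr(x, y)                # rank correlation: uses ranks only

print(f"OLS slope       : {ols_slope:+.3f}   (true slope 0.050)")
print(f"Theil-Sen slope : {ts_slope:+.3f}")
print(f"Spearman rho    : {rho:+.3f}   (monotone trend detected from ranks alone)")
```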

    The Role of Loss Functions in Regression Problems

    In regression analysis, the goal is to capture the influence of one or more explanatory variables X1, ..., Xm on a response variable Y in terms of a regression function g: R^m -> R. An estimate ĝ of g is then found or evaluated in terms of its ability to predict a prespecified statistical functional T of the conditional distribution L(Y | X1, ..., Xm). This is done with the help of a loss function that penalizes estimates that perform poorly in predicting T(L(Y | X1, ..., Xm)); more precisely, it is done by using loss functions that are consistent for T. Clearly, the outcome of the evaluation or estimation strongly depends on the functional T. However, even when we focus on a specific functional T, a vast collection of suitable loss functions may be available, and the result can still be sensitive to the choice of loss function. There are several viable strategies for approaching this issue. We can, for instance, impose additional properties on the loss function or the resulting estimate so that only one of the possible loss functions remains reasonable. In this doctoral thesis we adopt another approach. The underlying idea is that we would naturally prefer an estimate ĝ that is optimal with respect to several consistent loss functions for T, as then the choice of loss function impacts the outcome less severely. In Chapter 1, we consider the nonparametric isotonic regression problem. We show that this regression problem is special in that, for identifiable functionals T, solutions which are simultaneously optimal with respect to an entire class of consistent losses exist and can be characterized. There are, however, several functionals of interest that are not identifiable; the expected shortfall is one prominent example. Some of these functionals can nevertheless be obtained as a function of a vector-valued elicitable functional. In Chapter 2, we investigate when simultaneous optimality with respect to a class of consistent losses holds for these functionals, and we introduce the solution to the isotonic regression problem for a specific loss in the case where simultaneous optimality is not fulfilled. In parametric regression, on the other hand, different consistent loss functions often yield different parameter estimates under misspecification. This motivates considering the set of such parameters as a way to measure misspecification. We introduce this approach in Chapter 3 and show how the set of these model parameters can be calculated at the population and at the sample level for an isotonic regression function g.
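To make the objects in this abstract concrete, the sketch below fits a monotone (isotonic) regression with scikit-learn's pool-adjacent-violators implementation and scores the same fit under two different losses that are both consistent for the conditional mean (squared error and an exponential Bregman divergence). The data and loss choices are invented for illustration; the sketch does not reproduce the thesis' simultaneous-optimality results.

```python
# Hedged sketch: one isotonic fit, scored under two mean-consistent losses.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 200)
y = np.log1p(x) + rng.normal(scale=0.3, size=x.size)   # increasing signal + noise

iso = IsotonicRegression(increasing=True)
g_hat = iso.fit_transform(x, y)          # pool-adjacent-violators solution

def squared_loss(y, z):
    # Bregman loss generated by phi(t) = t^2; consistent for the mean
    return np.mean((y - z) ** 2)

def exp_bregman(y, z):
    # Bregman loss generated by phi(t) = exp(t); also consistent for the mean
    return np.mean(np.exp(y) - np.exp(z) - np.exp(z) * (y - z))

print("squared error       :", squared_loss(y, g_hat))
print("exponential Bregman :", exp_bregman(y, g_hat))
```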

    Applications and enhancements of aircraft design optimization techniques

    The aircraft industry has been at the forefront of developing design optimization strategies ever since the advent of high performance computing. Thanks to the large computational resources now available, many new as well as more mature optimization methods have become well established. However, the same cannot be said for other stages of the optimization process, chiefly the geometry parameterization stage, which is where the present thesis seeks to make its first main contribution.

    The first major part of the thesis is dedicated to the goal of reducing the size of the search space by reducing the dimensionality of existing parameterization schemes, thus improving the effectiveness of search strategies based upon them. Specifically, a refinement to the Kulfan parameterization method is presented, based on using Genetic Programming and a local search within a Baldwinian learning strategy to evolve a set of analytical expressions to replace the standard 'class function' at the basis of the Kulfan method. The method is shown to significantly reduce the number of parameters and to improve optimization performance; this is demonstrated using a simple aerodynamic design case study.

    The second part describes an industrial-level case study, combining sophisticated, high-fidelity as well as fast, low-fidelity numerical analysis with a complex physical experiment. The objective is the analysis of a topical design question relating to reducing the environmental impact of aviation: what is the optimum layout of an over-the-wing turbofan engine installation designed to enable the airframe to shield near-airport communities on the ground from fan noise? An experiment in an anechoic chamber reveals that a simple half-barrier noise model can be used as a first-order approximation to how inlet broadband noise shielding by the airframe changes with engine position, which can be used within design activities. Moreover, the experimental results are condensed into an acoustic shielding performance metric to be used in a Multidisciplinary Design Optimization study, together with drag and engine performance values acquired through CFD. By using surrogate models of these three performance metrics we are able to find a set of non-dominated engine positions comprising a Pareto front of these objectives. This may give designers of future aircraft insight into an appropriate engine position above a wing, as well as a template for blending multiple levels of computational analysis with physical experiments into a multidisciplinary design optimization framework.
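For context, the baseline scheme being refined is Kulfan's class-shape transformation (CST), in which a surface ordinate is the product of a fixed 'class function' psi^N1 (1 - psi)^N2 and a Bernstein-polynomial 'shape function'. The sketch below is a generic CST implementation with made-up coefficient values, not the thesis' evolved replacement for the class function.

```python
# Generic Kulfan CST airfoil surface; coefficients are illustrative only.
import numpy as np
from math import comb

def cst_surface(psi, coeffs, n1=0.5, n2=1.0, dz_te=0.0):
    """Surface ordinates at chordwise stations psi in [0, 1]."""
    psi = np.asarray(psi, dtype=float)
    class_fn = psi**n1 * (1.0 - psi)**n2                   # round nose, sharp trailing edge
    n = len(coeffs) - 1
    shape_fn = sum(a * comb(n, i) * psi**i * (1.0 - psi)**(n - i)
                   for i, a in enumerate(coeffs))           # Bernstein shape function
    return class_fn * shape_fn + psi * dz_te                # optional trailing-edge thickness

psi = np.linspace(0.0, 1.0, 101)
upper = cst_surface(psi, coeffs=[0.17, 0.16, 0.15, 0.14])   # arbitrary example values
lower = cst_surface(psi, coeffs=[-0.14, -0.12, -0.10, -0.08])
print(f"max thickness/chord ~ {np.max(upper - lower):.3f}")
```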

    A multimodal pattern recognition framework for speaker detection

    Speaker detection is an important component of a speech-based user interface. Audiovisual speaker detection, speech and speaker recognition, and speech synthesis, for example, find many applications in human-computer interaction, multimedia content indexing, and biometrics. Generally speaking, any interface which relies on speech for communication requires an estimate of the user's speaking state (i.e. whether or not the user is speaking to the system) in order to function reliably. The system therefore needs to identify the speaker and distinguish him or her from other users or background noise. A human observer would perform such a task very easily, although this judgement results from a complex cognitive process referred to as decision-making. Generally speaking, this process starts with the acquisition of information about the environment through the five senses. The brain then integrates these multiple sources of information. A remarkable property of this multi-sensory integration, as pointed out by the cognitive sciences, is that stimuli of different modalities are perceived as originating from a single source, provided they are synchronized in space and time. A speaker is a bimodal source emitting jointly an auditory signal and a visual signal (the motion of the articulators during speech production), and the two signals obviously co-occur in space and time. This property allows us, as human observers, to discriminate between a speaking mouth and a mouth whose motion is unrelated to the auditory signal. This dissertation deals with modelling such a complex decision-making process using a pattern recognition procedure. A pattern recognition process comprises all the stages of an investigation, from data acquisition to classification and assessment of the results. In the audiovisual speaker detection problem, tackled more specifically in this thesis, the data are acquired using only one microphone and one camera. The pattern recognizer integrates and combines these two modalities and is therefore termed "multimodal". This multimodal approach is expected to increase the performance of the system, but it also raises questions such as what should be fused, when in the decision process the fusion should take place, and how it is to be achieved. This thesis addresses each of these issues by proposing detailed solutions for each step of the classification process. The basic principle is to evaluate the synchrony between the audio and video features extracted from potentially speaking mouths, in order to classify each mouth as speaking or not. This synchrony is evaluated through a mutual-information-based function. A key to success is the extraction of suitable features. After being acquired and represented in a tractable way, the audiovisual data are processed through an information-theoretic feature extraction framework. This framework uses the two modalities jointly in a feature-level fusion scheme: the information originating from the common source is recovered while the independent noise is discarded. This approach is shown to minimize the probability of committing an error on the source estimate. These optimized features are fed to the classifier, which is defined through a hypothesis-testing approach. Using the two modalities jointly, it outputs a single decision about the class label of each candidate mouth region ("speaker" or "non-speaker").

    The acoustic and visual information are therefore combined at both the feature and the decision levels, so the method can be described as hybrid fusion. The hypothesis-testing approach provides a means of evaluating the performance not only of the classifier itself but also of the whole pattern recognition system. In particular, the added value offered by the feature extraction step can be assessed. The framework is first applied with particular emphasis on the audio modality: the information-theoretic feature extraction optimizes the audio features using the video information jointly. As a result, audio features specific to speech production are obtained. The system evaluation framework establishes that using these features as classifier input increases its discrimination power compared with equivalent non-optimized features. The enhancement of the video content is then addressed more specifically. Mouth motion is obviously the suitable video representation for a task such as speaker detection; however, only an estimate of this motion, the optical flow, can be obtained, and this estimation relies on the intensity gradient of the image sequence. Graph theory is used to establish a probabilistic model of the relationships between the audio, the motion, and the image intensity gradient in the particular case of a speaking mouth. The interpretation of this model leads back to the optimization function defined for the information-theoretic feature extraction. As a result, a scale-space approach is proposed for estimating the optical flow, where the strength of the smoothness constraint is controlled via a mutual-information-based criterion involving both the audio and the video information. First results are promising, although more extensive tests should be carried out, particularly in noisy conditions. In conclusion, this thesis proposes a complete pattern recognition framework dedicated to audiovisual speaker detection that minimizes the probability of misclassifying a mouth as "speaker" or "non-speaker". The importance of fusing the audio and video content as early as the feature level is demonstrated through the system evaluation stage included in the pattern recognition process.
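The central quantity of the detector, audio-visual synchrony measured through mutual information, can be sketched in a few lines. The example below scores two hypothetical mouth regions with a simple joint-Gaussian mutual information estimate between an audio energy track and a motion track; the signals and names are invented, and the thesis' information-theoretic feature extraction and hypothesis test are considerably more elaborate.

```python
# Toy audio-visual synchrony score based on a Gaussian MI approximation,
# I = -0.5 * log(1 - rho^2); illustrative only.
import numpy as np

def gaussian_mi(a, v):
    """Mutual information (nats) between two 1-D streams, assuming joint Gaussianity."""
    rho = np.corrcoef(a, v)[0, 1]
    return -0.5 * np.log(1.0 - min(rho**2, 0.999999))

rng = np.random.default_rng(2)
t = np.arange(200)
audio_energy = np.abs(np.sin(0.3 * t)) + 0.1 * rng.standard_normal(t.size)

# Candidate mouths: one moving in sync with the audio, one moving independently.
speaking_mouth = np.abs(np.sin(0.3 * t)) + 0.2 * rng.standard_normal(t.size)
silent_mouth = 0.2 * rng.standard_normal(t.size)

for name, motion in [("speaking", speaking_mouth), ("silent", silent_mouth)]:
    print(f"{name:9s} mouth: synchrony = {gaussian_mi(audio_energy, motion):.3f} nats")
```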