1,988 research outputs found

    Efficient duration modelling in the hierarchical hidden semi-Markov models and their applications

    Modeling patterns in temporal data has arisen as an important problem in engineering and science. This has led to the popularity of several dynamic models, in particular the renowned hidden Markov model (HMM) [Rabiner, 1989]. Despite its widespread success, the standard HMM often fails to model more complex data whose elements are correlated hierarchically or over a long period. Such problems are, however, frequently encountered in practice. Existing efforts to overcome this weakness usually address only one of these two aspects at a time, mainly because of computational intractability. Motivated by this modeling challenge in many real-world problems, in particular video surveillance and segmentation, this thesis aims to develop tractable probabilistic models that can jointly model duration and hierarchical information in a unified framework. We believe that jointly exploiting statistical strength from both properties will lead to more accurate and robust models for the task at hand. To tackle the modeling aspect, we base our work on the intersection between dynamic graphical models and the statistics of lifetime modeling. Realizing that the key bottleneck in existing work lies in the choice of the duration distribution for a state, we have integrated the discrete Coxian distribution [Cox, 1955], a special class of phase-type distributions, into the HMM to form a novel and powerful stochastic model termed the Coxian Hidden Semi-Markov Model (CxHSMM). We show that this model can still be expressed as a dynamic Bayesian network, and that inference and learning can be derived analytically. Most importantly, it has four advantages over existing semi-Markov models: the parameter space is compact, computation is fast (almost the same as for the HMM), closed-form estimation can be derived, and the Coxian is flexible enough to approximate a large class of distributions. 
Next, we exploit hierarchical decomposition in the data by borrowing an analogy from the hierarchical hidden Markov model in [Fine et al., 1998; Bui et al., 2004] and introduce a new type of shallow structured graphical model that combines both duration and hierarchical modelling into a unified framework, termed the Coxian Switching Hidden Semi-Markov Model (CxSHSMM). The top layer is a Markov sequence of switching variables, while the bottom layer is a sequence of concatenated CxHSMMs whose parameters are determined by the switching variable at the top. Again, we provide a thorough analysis along with inference and learning machinery. We also show that semi-Markov models with arbitrary depth structure can easily be developed. In all cases we further address two practical issues: missing observations due to unstable tracking, and the use of partially labelled data to improve training accuracy. Motivated by real-world problems, our application contribution is a framework to recognize complex activities of daily living (ADLs) and detect anomalies in order to provide better intelligent caring services for the elderly. Coarser activities with their own duration distributions are represented using the CxHSMM. Complex activities are made up of a sequence of coarser activities and are represented at the top level in the CxSHSMM. Intensive experiments are conducted to evaluate our solutions against existing methods. In many cases, the superiority of the joint modeling and the Coxian parameterization over traditional methods is confirmed. The robustness of our proposed models is further demonstrated in a series of more challenging experiments, in which the tracking is often lost and activities overlap considerably. Our final contribution is an application of the switching Coxian model to segment education-oriented videos into coherent topical units. Our results again demonstrate that such segmentation processes can benefit greatly from the joint modeling of duration and hierarchy.
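
To make the duration model in the abstract above more concrete, here is a minimal sketch of a discrete Coxian distribution. It assumes a common left-to-right parameterization: a duration starts in phase i with probability mu[i] and is the sum of independent geometric sojourn times for phases i through M. The function names and the parameter values in the usage note are hypothetical and are not taken from the thesis.

```python
import random

def coxian_mean(mu, lam):
    """Expected duration: sum_i mu[i] * sum_{j >= i} 1 / lam[j]."""
    total = 0.0
    for i, p_start in enumerate(mu):
        total += p_start * sum(1.0 / l for l in lam[i:])
    return total

def sample_geometric(p, rng):
    """Number of trials up to and including the first success (1, 2, ...)."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

def sample_coxian(mu, lam, rng):
    """Draw one duration: pick a starting phase, then pass through
    every remaining phase, spending a geometric time in each."""
    start = rng.choices(range(len(mu)), weights=mu)[0]
    return sum(sample_geometric(l, rng) for l in lam[start:])
```

For example, with mu = [0.5, 0.5] and lam = [0.5, 0.25], coxian_mean returns 0.5 * (2 + 4) + 0.5 * 4 = 5.0. The distribution needs only 2M numbers for M phases, which illustrates the compact parameter space the abstract refers to.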

    Some New Results on the Estimation of Sinusoids in Noise

    Characterization of damage evolution on metallic components using ultrasonic non-destructive methods

    When fatigue is considered, structures and machinery are expected to fail eventually. Still, when this damage is unexpected, besides its negative economic impact, people's lives could be at risk. It is therefore imperative that infrastructure managers schedule regular inspection and maintenance of their assets, and that designers and materials manufacturers have access to appropriate diagnostic tools in order to build better and more reliable materials. In this regard, and for a number of applications, non-destructive evaluation techniques have proven to be an efficient and helpful alternative to traditional destructive assays of materials. In materials design in particular, researchers have recently exploited the Acoustic Emission (AE) phenomenon as an additional assessment tool with which to characterize the mechanical properties of specimens. Nevertheless, several challenges arise when treating this phenomenon, since its intensity, duration and arrival behavior are essentially stochastic from the standpoint of traditional signal processing, which leads to inaccurate assessments. In this dissertation, efforts are focused on assisting in the characterization of the mechanical properties of advanced high strength steels during uniaxial tensile tests. Of particular interest is the ability to detect the nucleation and growth of a crack throughout the test. Therefore, the AE waves generated by the specimen during the test are assessed with the aim of characterizing their evolution. To this end, the introduction gives a brief review of non-destructive methods, emphasizing the AE phenomenon. Next, an exhaustive analysis is presented of the challenges and deficiencies of detecting and segmenting each AE event in a continuous data stream with the traditional threshold detection method and with current state-of-the-art methods. 
Following this, a novel AE event detection method is proposed with the aim of overcoming the aforementioned limitations. Evidence showed that the proposed method, which is based on the short-time features of the AE signal waveform, exceeds the detection capabilities of current state-of-the-art methods in onset and end-time precision, as well as in quality of detection and computational speed. Finally, a methodology for analyzing the evolution of the frequency spectrum of the AE phenomenon during the tensile test is proposed. Results indicate that it is feasible to correlate the nucleation and growth of a crack with the evolution of the frequency content of AE events.
Postprint (published version)
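
For orientation, the traditional threshold detection method mentioned in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the dissertation's implementation: an event opens when the rectified amplitude first exceeds a fixed threshold and closes once the signal has stayed at or below the threshold for `hold` consecutive samples. The names `detect_events`, `threshold` and `hold` are hypothetical.

```python
def detect_events(signal, threshold, hold):
    """Return (onset, end) sample-index pairs for threshold-based events."""
    events = []
    onset = None       # index of the first crossing of the current event
    last_above = None  # index of the most recent crossing
    for i, x in enumerate(signal):
        if abs(x) > threshold:
            if onset is None:
                onset = i
            last_above = i
        elif onset is not None and i - last_above >= hold:
            # The signal has been quiet for `hold` samples: close the event.
            events.append((onset, last_above))
            onset = None
    if onset is not None:
        # The stream ended while an event was still open.
        events.append((onset, last_above))
    return events
```

The sketch makes the sensitivity of this approach visible: both the detected onset and the detected end depend directly on the chosen threshold and hold time, which is the kind of limitation the abstract attributes to the traditional method.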

    Representation of statistical sound properties in human auditory cortex

    The work carried out in this doctoral thesis investigated the representation of statistical sound properties in human auditory cortex. It addressed four key aspects in auditory neuroscience: the representation of different analysis time windows in auditory cortex; mechanisms for the analysis and segregation of auditory objects; information-theoretic constraints on pitch sequence processing; and the analysis of local and global pitch patterns. The majority of the studies employed a parametric design in which the statistical properties of a single acoustic parameter were altered along a continuum, while keeping other sound properties fixed. The thesis is divided into four parts. Part I (Chapter 1) examines principles of anatomical and functional organisation that constrain the problems addressed. Part II (Chapter 2) introduces approaches to digital stimulus design, principles of functional magnetic resonance imaging (fMRI), and the analysis of fMRI data. Part III (Chapters 3-6) reports five experimental studies. Study 1 controlled the spectrotemporal correlation in complex acoustic spectra and showed that activity in auditory association cortex increases as a function of spectrotemporal correlation. Study 2 demonstrated a functional hierarchy of the representation of auditory object boundaries and object salience. Studies 3 and 4 investigated cortical mechanisms for encoding entropy in pitch sequences and showed that the planum temporale acts as a computational hub, requiring more computational resources for sequences with high entropy than for those with high redundancy. Study 5 provided evidence for a hierarchical organisation of local and global pitch pattern processing in neurologically normal participants. Finally, Part IV (Chapter 7) concludes with a general discussion of the results and future perspectives
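
As a rough illustration of the contrast between high-entropy and highly redundant pitch sequences drawn in the abstract above, the sketch below computes the Shannon entropy of the empirical pitch distribution of a sequence. This is an assumption made for illustration only: the studies may have defined entropy differently (for example over pitch transitions), and the function name and the MIDI-style pitch numbers in the usage note are hypothetical.

```python
from collections import Counter
from math import log2

def sequence_entropy(pitches):
    """Shannon entropy, in bits, of the empirical pitch distribution."""
    counts = Counter(pitches)
    n = len(pitches)
    return -sum((c / n) * log2(c / n) for c in counts.values())
```

A sequence that repeats a single pitch, such as [60, 60, 60, 60], has zero entropy (maximal redundancy), while a sequence that uses four pitches equally often, such as [60, 62, 64, 65], has an entropy of 2 bits.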

    NMF-based compositional models for audio source separation

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2017. 김남수. Many classes of data can be represented by constructive combinations of parts. Most signals and data from nature have nonnegative values and can be explained and reconstructed by constructive models. In constructive models, only additive combinations are allowed, so parts are never subtracted. Compositional models include dictionary learning, exemplar-based approaches, and nonnegative matrix factorization (NMF). Compositional models are desirable in many areas, including image and visual signal processing, text information processing, audio signal processing, and music information retrieval. In this dissertation, we choose NMF as the compositional model and apply it to NMF-based target source separation. Target source separation is the extraction or reconstruction of the target signals from a mixture that consists of target and interfering signals. It can be regarded as a form of blind source separation (BSS), which aims to extract the original unknown source signals with no or very limited prior information. Nowadays, however, prior information is frequently utilized, and various approaches have been proposed for single-channel source separation. NMF approximates a nonnegative data matrix V with the product of a nonnegative basis matrix W and a nonnegative encoding matrix H, i.e., V ≈ WH. Since both W and H are nonnegative, NMF often leads to a parts-based representation of the data. Methods based on NMF have shown impressive results in single-channel source separation. The objective function of NMF is generally expressed using the Euclidean distance, the Kullback-Leibler divergence, or the Itakura-Saito divergence. Many optimization methods have been proposed and utilized, e.g., the multiplicative update rule, projected gradient descent, and NeNMF. 
However, NMF-based audio source separation has the following issues: non-uniqueness of the bases, a high dependence on prior information, overlapping subspaces between the target bases and the interfering bases, disregard of the encoding vectors obtained in the training phase, and insufficient analysis of sparse NMF. In this dissertation, we propose new approaches to resolve these issues. In Section 4, we propose a novel speech enhancement method that combines the statistical model-based enhancement scheme with the NMF-based gain function. For better performance in time-varying noise environments, both the speech and noise bases of NMF are adapted simultaneously with the help of the estimated speech presence probability. In Section 5, we propose a discriminative NMF (DNMF) algorithm that exploits the reconstruction error for the interfering signals as well as for the target signal based on the target bases. In Section 6, we propose an approach to robust bases estimation in which an incremental strategy is adopted. Based on an analogy between clustering and NMF analysis, we incrementally estimate the NMF bases in a manner similar to the modified k-means and Linde-Buzo-Gray algorithms popular in data clustering. In Section 7, the distribution of the encoding vector is modeled as a multivariate exponential PDF (MVE) with a single scaling factor for each source. In Section 8, several sparse penalty terms for NMF are analyzed and compared in terms of signal-to-distortion ratio, sparseness of encoding vectors, reconstruction error, and entropy of basis vectors. 
A new objective function that combines sparse representation with discriminative NMF (DNMF) is also proposed.
1 Introduction
  1.1 Audio source separation
  1.2 Speech enhancement
  1.3 Measurements
  1.4 Outline of the dissertation
2 Compositional model and NMF
  2.1 Compositional model
  2.2 NMF
    2.2.1 Update rules: MuR, PGD
    2.2.2 Modified NMF
3 NMF-based audio source separation and issues
  3.1 NMF-based audio source separation
  3.2 Problems of NMF in audio source separation
    3.2.1 A high dependency on the prior knowledge
    3.2.2 An overlapped subspace between the target and interfering basis matrices
    3.2.3 A non-uniqueness of the bases
    3.2.4 A prior knowledge of the encoding vectors
    3.2.5 Sparse NMF for the source separation
4 Online bases update
  4.1 Introduction
  4.2 NMF-based speech enhancement using spectral gain function
  4.3 Speech enhancement combining statistical model-based and NMF-based methods with the on-line bases update
    4.3.1 On-line update of speech and noise bases
    4.3.2 Determining maximum update rates
  4.4 Experiment result
5 Discriminative NMF
  5.1 Introduction
  5.2 Discriminative NMF utilizing cross reconstruction error
    5.2.1 DNMF using the reconstruction error of the other source
    5.2.2 DNMF using the interference factors
  5.3 Experiment result
6 Incremental approach for bases estimate
  6.1 Introduction
  6.2 Incremental approach based on modified k-means clustering and Linde-Buzo-Gray algorithm
    6.2.1 Based on modified k-means clustering
    6.2.2 LBG based incremental approach
  6.3 Experiment result
    6.3.1 Modified k-means clustering based approach
    6.3.2 LBG based approach
7 Prior model of encoding vectors
  7.1 Introduction
  7.2 Prior model of encoding vectors based on multivariate exponential distribution
  7.3 Experiment result
8 Conclusions
Bibliography
Abstract in Korean
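
The abstract of this entry describes NMF as approximating a nonnegative matrix V by the product WH and names the multiplicative update rule as one optimization method. The sketch below illustrates that rule for the Euclidean objective in dependency-free Python. It is a small illustration, not the dissertation's code: the function names, the random initialization, the iteration count and the `eps` guard against division by zero are all assumptions.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Squared Euclidean distance between V and the product WH."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))

def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Factor nonnegative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H
```

Because every update multiplies the current entry by a ratio of nonnegative quantities, W and H stay nonnegative without any explicit projection step. In a source separation setting, the columns of W would play the role of spectral bases and the rows of H the role of their time-varying encodings.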

    NASA Thesaurus. Volume 1: Hierarchical listing

    There are 16,713 postable terms and 3,716 nonpostable terms approved for use in the NASA scientific and technical information system in the Hierarchical Listing of the NASA Thesaurus. The generic structure is presented for many terms. The broader term and narrower term relationships are shown in an indented fashion that illustrates the generic structure better than the more widely used BT and NT listings. Related terms are generously applied, thus enhancing the usefulness of the Hierarchical Listing. Greater access to the Hierarchical Listing may be achieved with the collateral use of Volume 2 - Access Vocabulary