
    Video Feature Extraction Based on Modified LLE Using Adaptive Nearest Neighbor Approach

    Locally linear embedding (LLE) is an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional data. LLE attempts to discover non-linear structure in high-dimensional data by exploiting the local symmetries of linear reconstructions. In this paper, video feature extraction is performed using a modified LLE together with an adaptive nearest neighbor approach for finding the nearest neighbors and the connected components. The proposed feature extraction method is applied to a video, and the resulting video feature description provides a new tool for video analysis.
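    A minimal sketch of the idea in Python with scikit-learn and SciPy: the neighborhood size k is grown until the nearest-neighbor graph of the frames forms a single connected component, and standard LLE is then run with that k. The data, parameter values, and the specific adaptivity rule are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from sklearn.neighbors import kneighbors_graph
from sklearn.manifold import LocallyLinearEmbedding

def adaptive_lle(frames, n_components=2, k_start=5, k_max=50):
    """frames: (n_frames, n_features) matrix of vectorized video frames."""
    k = k_start
    for k in range(k_start, k_max + 1):
        graph = kneighbors_graph(frames, n_neighbors=k, include_self=False)
        n_comp, _ = connected_components(graph, directed=False)
        if n_comp == 1:  # neighborhood graph forms a single connected component
            break
    lle = LocallyLinearEmbedding(n_neighbors=k, n_components=n_components)
    return lle.fit_transform(frames), k

# Random data standing in for 200 vectorized frames of 64 dimensions each.
X = np.random.rand(200, 64)
features, k_used = adaptive_lle(X)
print(features.shape, k_used)
```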

    A Comprehensive Review on Speech Recognition and Its Techniques

    Abstract: This paper provides a review of speech recognition systems and their techniques, and surveys advancements in the field. Speech is a means of communication between a sender and a receiver; a speech recognition system takes a speech signal as input and produces output in the form of text. The paper describes the basic Automatic Speech Recognition (ASR) system and presents various speech recognition techniques, including speech analysis, feature extraction techniques, and matching techniques. It gives a brief description of feature extraction techniques such as Linear Predictive Coding (LPC), Mel-Frequency Cepstral Coefficients (MFCC), and the Perceptual Linear Prediction (PLP) technique.
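    As an illustration of one of the reviewed feature extraction techniques, the sketch below computes MFCC features from an audio file using the librosa package. The file name and frame parameters are illustrative, not taken from the paper.

```python
import librosa

# Load an utterance at 16 kHz (file name is hypothetical).
y, sr = librosa.load("utterance.wav", sr=16000)

# 13 MFCCs per frame, 25 ms analysis windows with a 10 ms hop.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=400, hop_length=160)
print(mfcc.shape)  # (13, n_frames)
```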

    An Evaluation Framework and Database for MoCap-Based Gait Recognition Methods

    As a contribution to reproducible research, this paper presents a framework and a database to improve the development, evaluation, and comparison of methods for gait recognition from Motion Capture (MoCap) data. The evaluation framework provides implementation details and source code for state-of-the-art human-interpretable geometric features, as well as for our own approaches in which gait features are learned by a modification of Fisher's Linear Discriminant Analysis with the Maximum Margin Criterion, and by a combination of Principal Component Analysis and Linear Discriminant Analysis. It includes a description and source code of a mechanism for evaluating four class-separability coefficients of the feature space and four rank-based classifier performance metrics. The framework also contains a tool for learning a custom classifier and for classifying a custom query on a custom gallery. We provide an experimental database, along with source code for its extraction from the general CMU MoCap database.
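    For orientation, the sketch below shows a plain PCA-followed-by-LDA feature-learning pipeline in scikit-learn, one of the combinations mentioned above. The random data stands in for MoCap gait samples, and the Maximum Margin Criterion modification from the paper is not reproduced here.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 120))    # 300 gait samples, 120 raw MoCap-derived features
y = rng.integers(0, 10, size=300)  # 10 walker identities

# Reduce dimensionality with PCA, then project onto class-discriminative axes with LDA.
pipeline = make_pipeline(PCA(n_components=40),
                         LinearDiscriminantAnalysis(n_components=9))
gait_features = pipeline.fit_transform(X, y)
print(gait_features.shape)  # (300, 9): at most n_classes - 1 discriminant components
```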

    Gaussian Processes Based Data Augmentation and Expected Signature for Time Series Classification

    Time series classification tasks play a crucial role in extracting relevant information from data equipped with a temporal structure. In various scientific domains, such as biology or finance, this kind of data comes from complex and hardly predictable phenomena. Therefore, classification algorithms for time series should be able to deal with the uncertainty contained in the data and capture the relevant statistical properties of the underlying phenomenon. The main object of interest of this work is the development of a model for time series that tackles the classification task by interpreting time series as realisations of stochastic processes, the natural mathematical description of chaotic behaviour. The focus is thus on time series that can be thought of as signals of some nature and that convey some kind of statistical information. We propose a data-driven feature extraction model for time series built upon Gaussian process based data augmentation and on the expected signature. The signature is a fundamental object that describes paths, much like a Fourier or wavelet expansion, but in a non-linear fashion. Likewise, the expected signature provides a statistical description of the law of stochastic processes. One of the main features of the model is that an optimal feature extraction is learnt through the supervised task that uses it. The model can be adapted to more complicated supervised tasks, as it integrates seamlessly into a neural network architecture and is fully compatible with back-propagation, and it can be easily accommodated to perform regression tasks. The effectiveness of the model is demonstrated with numerical experiments on some benchmark time series.
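    A minimal sketch of the described pipeline, under the assumptions that a Gaussian process is fitted to each time series and sampled to produce augmented paths, and that the expected signature is approximated by averaging the signatures of those samples (computed here with the third-party iisignature package). Kernel choice and parameter values are illustrative, not the paper's.

```python
import numpy as np
import iisignature
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def expected_signature(times, values, depth=3, n_samples=20, grid_size=50):
    """Monte Carlo estimate of the expected signature of a GP fitted to one series."""
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(times.reshape(-1, 1), values)
    grid = np.linspace(times.min(), times.max(), grid_size)
    # Draw augmented sample paths from the fitted GP posterior.
    samples = gp.sample_y(grid.reshape(-1, 1), n_samples=n_samples)  # (grid_size, n_samples)
    sigs = [iisignature.sig(np.column_stack([grid, samples[:, i]]), depth)
            for i in range(n_samples)]
    return np.mean(sigs, axis=0)

t = np.sort(np.random.rand(30))                # irregular observation times
x = np.sin(6 * t) + 0.1 * np.random.randn(30)  # noisy observed values
features = expected_signature(t, x)
print(features.shape)  # depth-3 signature of a 2-D (time, value) path: 2 + 4 + 8 = 14 terms
```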

    Force field feature extraction for ear biometrics

    The overall objective in defining a feature space is to reduce the dimensionality of the original pattern space, whilst maintaining discriminatory power for classification. To meet this objective in the context of ear biometrics, a new force field transformation treats the image as an array of mutually attracting particles that act as the source of a Gaussian force field. Underlying the force field there is a scalar potential energy field, which in the case of an ear takes the form of a smooth surface that resembles a small mountain with a number of peaks joined by ridges. The peaks correspond to potential energy wells and, to extend the analogy, the ridges correspond to potential energy channels. Since the transform also turns out to be invertible, and since the surface is otherwise smooth, information theory suggests that much of the information is transferred to these features, thus confirming their efficacy. We previously described how field line feature extraction, using an algorithm similar to gradient descent, exploits the directional properties of the force field to automatically locate these channels and wells, which then form the basis of characteristic ear features. We now show how an analysis of the mechanism of this algorithmic approach leads to a closed analytical description based on the divergence of force direction, which reveals that channels and wells are really manifestations of the same phenomenon. We further show that this new operator, with its own distinct advantages, has a striking similarity to the Marr-Hildreth operator, but with the important difference that it is non-linear. As well as addressing faster implementation, invertibility, and brightness sensitivity, the technique is also validated by performing recognition on a database of ears selected from the XM2VTS face database, and by comparing the results with the more established technique of Principal Components Analysis. This confirms not only that ears do indeed appear to have potential as a biometric, but also that the new approach is well suited to their description, being robust especially in the presence of noise, and having the advantage that the ear does not need to be explicitly extracted from the background.
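    The sketch below illustrates the first step only: computing a potential energy surface by superposing a per-pixel potential that falls off with distance. The inverse-distance kernel is an assumption made for concreteness and may differ from the exact field used in the paper; locating wells and channels is not shown.

```python
import numpy as np
from scipy.signal import fftconvolve

def potential_energy_surface(image, eps=1.0):
    """Superpose a smoothed inverse-distance potential contributed by every pixel."""
    h, w = image.shape
    yy, xx = np.mgrid[-h + 1:h, -w + 1:w]            # all pairwise pixel offsets
    kernel = 1.0 / np.sqrt(xx**2 + yy**2 + eps**2)   # smoothed inverse-distance kernel
    return fftconvolve(image, kernel, mode="valid")  # same size as the input image

ear = np.random.rand(64, 48)  # stand-in for a cropped ear image
surface = potential_energy_surface(ear)
print(surface.shape)          # (64, 48)
```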

    Emotion Recognition from Acted and Spontaneous Speech

    This doctoral thesis deals with emotion recognition from speech signals. The thesis is divided into two main parts: the first part describes proposed approaches for emotion recognition using two different multilingual databases of acted emotional speech. The main contributions of this part are a detailed analysis of a large set of acoustic features, new classification schemes for vocal emotion recognition such as "emotion coupling", and a new method for mapping discrete emotions into a two-dimensional space. The second part of the thesis is devoted to emotion recognition using a database of spontaneous emotional speech based on telephone records obtained from real call centers. The knowledge gained from the experiments on emotion recognition from acted speech was exploited to design a new approach for classifying seven spontaneous emotional states. The core of the proposed approach is a complex classification architecture based on the fusion of different systems. The thesis also examines the influence of the speaker's emotional state on gender recognition performance and proposes a system for automatic identification of successful phone calls in call centers by means of dialogue features between the call participants.
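    As a toy illustration of classifier fusion for emotion recognition, the sketch below combines three base classifiers by soft voting in scikit-learn. The features, labels, and base classifiers are stand-ins; the thesis's actual fusion architecture is more elaborate.

```python
import numpy as np
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))    # 500 utterances, 40 acoustic features each
y = rng.integers(0, 7, size=500)  # 7 emotional states

# Fuse three heterogeneous classifiers by averaging their class probabilities.
fusion = VotingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("rf", RandomForestClassifier(n_estimators=100)),
                ("lr", LogisticRegression(max_iter=1000))],
    voting="soft",
)
fusion.fit(X, y)
print(fusion.predict(X[:5]))
```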

    Inferring Missing Entity Type Instances for Knowledge Base Completion: New Dataset and Methods

    Most previous work on knowledge base (KB) completion has focused on the problem of relation extraction. In this work, we focus on the task of inferring missing entity type instances in a KB, a fundamental task for KB completion that has received little attention. Due to the novelty of this task, we construct a large-scale dataset and design an automatic evaluation methodology. Our knowledge base completion method uses information within the existing KB and external information from Wikipedia. We show that individual methods trained with a global objective that considers unobserved cells from both the entity and the type side give consistently higher-quality predictions than baseline methods. We also perform a manual evaluation on a small subset of the data to verify the effectiveness of our knowledge base completion methods and the correctness of our proposed automatic evaluation method.
    Comment: North American Chapter of the Association for Computational Linguistics - Human Language Technologies, 201
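    A toy sketch of the underlying idea, framing entity-type inference as completion of a binary entity-by-type matrix with a global objective that also penalises unobserved cells (down-weighted as weak negatives). This is an illustrative stand-in, not the paper's actual model or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_types, dim = 200, 30, 16
M = (rng.random((n_entities, n_types)) < 0.05).astype(float)  # observed type assignments

E = 0.1 * rng.normal(size=(n_entities, dim))  # entity embeddings
T = 0.1 * rng.normal(size=(n_types, dim))     # type embeddings
neg_weight, lr = 0.1, 0.05

for step in range(200):
    scores = 1.0 / (1.0 + np.exp(-E @ T.T))  # predicted membership probabilities
    W = np.where(M > 0, 1.0, neg_weight)     # unobserved cells act as weak negatives
    grad = W * (scores - M)                  # weighted logistic-loss gradient
    grad_E, grad_T = grad @ T, grad.T @ E
    E -= lr * grad_E
    T -= lr * grad_T

# Rank candidate missing types for entity 0 among its unobserved cells.
raw = E[0] @ T.T
candidates = np.where(M[0] == 0, raw, -np.inf)
print(np.argsort(-candidates)[:5])
```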