7 research outputs found

    GMM Mapping Of Visual Features of Cued Speech From Speech Spectral Features

    In this paper, we present a statistical method based on Gaussian mixture model (GMM) modeling that maps acoustic speech spectral features to the visual features of Cued Speech under the Minimum Mean-Square Error (MMSE) regression criterion. The mapping operates at a low signal level, which is innovative and differs from the classic text-to-visual approach. Two training methods for the GMM are discussed: the Expectation-Maximization (EM) approach and a supervised training method. For comparison with the GMM-based mapping, we first present results obtained with a Multiple-Linear Regression (MLR) model, also at the low signal level, and study the limitations of that approach. The experimental results demonstrate that the GMM-based mapping significantly improves mapping performance over the MLR model, especially when the linear correlation between the target and the predictor is weak, as it is between the hand positions of Cued Speech and the acoustic speech spectral features.
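The contrast between the two mappings can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scalar predictor and target values, uses a hypothetical toy dataset with known class labels (standing in for the supervised training variant), and all function names are invented for illustration. Each class gets one joint Gaussian over (x, y), and the MMSE estimate is the posterior-weighted sum of the per-class conditional means. A single global linear fit is included for comparison.

```python
import math

def fit_supervised_gmm(xs, ys, labels):
    """Estimate one joint Gaussian over (x, y) per class label (supervised training)."""
    model = []
    for k in sorted(set(labels)):
        pts = [(x, y) for x, y, lab in zip(xs, ys, labels) if lab == k]
        n = len(pts)
        mx = sum(x for x, _ in pts) / n
        my = sum(y for _, y in pts) / n
        vx = sum((x - mx) ** 2 for x, _ in pts) / n
        cxy = sum((x - mx) * (y - my) for x, y in pts) / n
        model.append({"w": n / len(xs), "mx": mx, "my": my, "vx": vx, "cxy": cxy})
    return model

def gmm_mmse_predict(model, x):
    """MMSE estimate E[y | x]: posterior-weighted per-component conditional means."""
    # Log of w_k * N(x; mx_k, vx_k), kept in the log domain to avoid underflow.
    logs = [
        math.log(c["w"])
        - 0.5 * math.log(2 * math.pi * c["vx"])
        - (x - c["mx"]) ** 2 / (2 * c["vx"])
        for c in model
    ]
    top = max(logs)
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return sum(
        (u / total) * (c["my"] + c["cxy"] / c["vx"] * (x - c["mx"]))
        for u, c in zip(unnorm, model)
    )

def fit_linear(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept (single-predictor MLR)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

# Hypothetical toy data: two classes with opposite local slopes, so the
# global linear correlation between x and y is zero.
offsets = [-0.5, -0.25, 0.0, 0.25, 0.5]
xs = [-2 + d for d in offsets] + [2 + d for d in offsets]
ys = [2 * d for d in offsets] + [-2 * d for d in offsets]
labels = [0] * 5 + [1] * 5

model = fit_supervised_gmm(xs, ys, labels)
slope, intercept = fit_linear(xs, ys)

gmm_mse = mse([gmm_mmse_predict(model, x) for x in xs], ys)
mlr_mse = mse([slope * x + intercept for x in xs], ys)
print("GMM MSE:", gmm_mse)
print("MLR MSE:", mlr_mse)
```

On this toy data the global line cannot capture the class-dependent relation, while the mixture follows each class's local trend. With vector-valued features the same formula applies, with the variance ratio replaced by the corresponding covariance-matrix product.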


    Automatic detection of falls of persons based on spatio-temporal descriptors (definition of the method, performance evaluation and real-time implementation)

    We propose a supervised method for detecting falls of persons in real time in a home environment, robust to changes in viewpoint and environment. First, we made publicly available a realistic video dataset, DSFD, recorded in four different locations and containing a large number of manual annotations suitable for comparing methods. We also defined an evaluation metric, adapted to real-time constraints, that evaluates fall detection in a continuous video stream while taking into account the nature of the stream and the duration of a fall. Second, we built and evaluated the spatio-temporal descriptors STHF, computed from geometric attributes of the moving shape in the scene and their transformations, and derived an optimised fall descriptor through an automatic feature selection step. Robustness to environment changes was evaluated using SVMs and Boosting. We propose a realistic and pragmatic protocol that improves performance by updating the training with videos of normal activities (without falls) recorded in the final environment. Finally, we implemented the detector on an embedded system comparable to a smart camera, based on a Zynq SoC. An Algorithm-Architecture Adequacy approach yielded a good trade-off between classification performance and processing time.