Audio Inpainting
(c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published version: IEEE Transactions on Audio, Speech and Language Processing 20(3): 922-932, Mar 2012. DOI: 10.1109/TASL.2011.2168211
Robust density modelling using the student's t-distribution for human action recognition
The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show that experiments over two well-known datasets (Weizmann, MuHAVi) yielded a remarkable improvement in classification accuracy. © 2011 IEEE
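The robustness argument in the abstract above can be illustrated with a minimal sketch. This is not the paper's HMM with mixtures of t-distributions: it fits a single univariate Student's t with a fixed, assumed degrees-of-freedom value `nu` using EM-style reweighting, and compares the resulting location with the plain Gaussian (sample) mean. The function name `fit_t_location` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_t_location(x, nu=3.0, iters=100):
    """EM-style fit of location and scale for a univariate Student's t.

    nu (degrees of freedom) is held fixed; this is an illustrative
    assumption, not a setting taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    mu, var = np.median(x), np.var(x)
    for _ in range(iters):
        # Points far from mu receive small weights, which limits their pull.
        w = (nu + 1.0) / (nu + (x - mu) ** 2 / var)
        mu = np.sum(w * x) / np.sum(w)
        var = np.sum(w * (x - mu) ** 2) / x.size
    return mu, np.sqrt(var)

# Synthetic data: 200 inliers around 0 plus 10 gross outliers at 50.
rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=1.0, size=200)
data = np.concatenate([clean, np.full(10, 50.0)])

gauss_mean = data.mean()              # Gaussian ML estimate of the mean
t_mean, t_scale = fit_t_location(data)

print(f"Gaussian mean: {gauss_mean:.3f}")
print(f"Student's t location: {t_mean:.3f}")
```

On this synthetic data the sample mean is dragged towards the outliers, while the t-based location stays close to the inlier centre, because each outlier's weight shrinks as its distance from the current estimate grows.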
Joint sparse learning with nonlocal and local image priors for image error concealment
The joint sparse representation (JSR) model has recently emerged as a powerful technique with a wide variety of applications. In this paper, the JSR model is extended to error concealment (EC), where it is used to recover the original image from its corrupted version. The model is based on jointly learning a dictionary pair and two mapping matrices that are trained offline from external training images. Given the trained dictionaries and mappings, restoration is done by transferring the recovery problem into the sparse representation domain with respect to the trained dictionaries, which is further transformed into a common space using the respective mapping matrices. The reconstructed image is then obtained by back projection into the spatial domain. To improve the accuracy and stability of the proposed JSR-based EC algorithm and to avoid unexpected artifacts, local and non-local priors are integrated into the JSR model. The non-local prior is based on the self-similarity within natural images and helps to find an accurate sparse representation by taking a weighted average of similar areas throughout the image. The local prior is based on learning the local structural regularity of natural images and helps to regularize the sparse representation by exploiting the strong correlation in small local areas within the image. Compared with state-of-the-art EC algorithms, the results show that the proposed method has better reconstruction performance in both objective and subjective evaluations.
Construction of super-resolution mosaics from low-resolution video. Application to video summarization and transmission error concealment.
The digitization of existing videos and the explosive growth of networked multimedia services, such as digital television broadcasting and mobile communications, have produced an enormous quantity of compressed video. This calls for efficient indexing and navigation tools, but indexing before encoding is not common practice. The usual approach is to decode these videos completely and then build indexes. This is very costly and therefore not feasible in real time. Moreover, important information such as motion, which is lost during decoding, is re-estimated even though it is already present in the compressed stream. Our aim in this thesis is therefore to reuse the data already present in the compressed MPEG stream for indexing and fast navigation. More precisely, we extract DC coefficients and motion vectors. Within this thesis, we focus in particular on constructing mosaics from the DC images extracted from I-frames. A mosaic is built by registering and fusing all the images of a video sequence into a single coordinate system. This coordinate system is generally aligned with one of the images of the sequence, the reference image. The result is a single image that gives a global view of the sequence. We thus propose a complete system for constructing mosaics from the MPEG-1/2 stream that accounts for various problems arising in real video sequences, such as moving objects or illumination changes. An essential task in constructing a mosaic is estimating the motion between each image of the sequence and the reference image. Our method is based on a robust estimation of the global camera motion from the motion vectors of P-frames.
However, the global camera motion estimated for a P-frame can be incorrect, because it depends strongly on the accuracy of the encoded vectors. We detect the affected P-frames by taking into account the DC coefficients of the associated encoded error, and we propose two methods for correcting these motions. A mosaic built from DC images has a very low resolution and suffers from aliasing effects due to the nature of DC images. To increase its resolution and improve its visual quality, we apply a super-resolution method based on iterative back-projections. Super-resolution methods are also based on registering and fusing the images of a video sequence, but are accompanied by image restoration. In this context, we have developed a new method for estimating the blur due to camera motion, together with a corresponding spectral restoration method. Spectral restoration handles blur globally, but local blurs appear for objects whose motion is independent of the camera motion. We therefore propose a new super-resolution algorithm, derived from the iterative spatial restoration of Van Cittert and Jansson, that can restore local blurs. Based on a segmentation of moving objects, we restore the background mosaic and the foreground objects separately. We have adapted our blur estimation method accordingly. We first applied our method to the construction of video summaries, with the aim of fast mosaic-based navigation in compressed video. We then establish how reusing the intermediate results serves other indexing tasks, notably shot change detection for I-frames and characterization of the camera motion.
Finally, we explored the field of transmission error recovery. Our approach consists of building a mosaic while decoding a shot; in case of data loss, the missing information can be concealed using this mosaic.
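The abstract above refers to the iterative spatial restoration of Van Cittert and Jansson. As a rough illustration only, the basic Van Cittert update, estimate ← estimate + beta · (blurred − kernel ∗ estimate), can be sketched in one dimension. This is not the thesis's super-resolution algorithm: it omits the Jansson constraint, the motion segmentation, and the blur estimation, and it assumes the blur kernel is known. The function name `van_cittert`, the kernel, and the test signal are hypothetical.

```python
import numpy as np

def van_cittert(blurred, kernel, beta=1.0, iters=50):
    """Basic Van Cittert deconvolution for a 1-D signal with a known kernel."""
    estimate = blurred.copy()
    for _ in range(iters):
        # Re-blur the current estimate and correct it by the residual.
        reblurred = np.convolve(estimate, kernel, mode="same")
        estimate = estimate + beta * (blurred - reblurred)
    return estimate

# Assumed symmetric blur kernel and a box-shaped test signal.
kernel = np.array([0.25, 0.5, 0.25])
sharp = np.zeros(64)
sharp[24:40] = 1.0
blurred = np.convolve(sharp, kernel, mode="same")

restored = van_cittert(blurred, kernel)

err_blurred = np.linalg.norm(blurred - sharp)
err_restored = np.linalg.norm(restored - sharp)
print(f"L2 error before: {err_blurred:.4f}, after: {err_restored:.4f}")
```

With this noise-free signal and a kernel whose frequency response is non-negative, each iteration reduces the reconstruction error. Noisy data would typically call for early stopping or an added constraint.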
Application and Theory of Multimedia Signal Processing Using Machine Learning or Advanced Methods
This Special Issue is a book that collects peer-reviewed papers on advanced technologies related to the applications and theory of signal processing for multimedia systems using ML or advanced methods. The multimedia topics covered include image, video, and audio signals, character recognition, and the optimization of communication channels for networks. The specific contents of the book are data hiding, encryption, object detection, image classification, and character recognition. Academics and colleagues who are interested in these topics will find it interesting to read.
Solutions for large scale, efficient, and secure Internet of Things
The design of a general architecture for the Internet of Things (IoT) is a complex task, due to the heterogeneity of the devices, communication technologies, and applications that are part of such systems. There are therefore significant opportunities to improve the state of the art, whether by improving system performance or by solving open issues in current systems. This thesis focuses, in particular, on three aspects of the IoT. First, issues of cyber-physical systems are analysed. In these systems, IoT technologies are widely used to monitor, control, and act on physical entities. One of the most important issues in these scenarios relates to the communication layer, which must be characterized by high reliability, low latency, and high energy efficiency. Some solutions for the channel access scheme of such systems are proposed, each tailored to a specific scenario. These solutions, which exploit the capabilities of state-of-the-art radio transceivers, prove effective in improving the performance of the considered systems. Positioning services for cyber-physical systems are also investigated, in order to improve the accuracy of such services. Next, the focus moves to network and service optimization for traffic-intensive applications, such as video streaming. This type of traffic is common amongst non-constrained devices, like smartphones and augmented/virtual reality headsets, which form an integral part of the IoT ecosystem. The proposed solutions are able to increase the video Quality of Experience while wasting less bandwidth than state-of-the-art strategies. Finally, the security of IoT systems is investigated. While often overlooked, this aspect is fundamental to enabling the ubiquitous deployment of IoT. Therefore, security issues of commonly used IoT protocols are presented, together with a proposal for an authentication mechanism based on physical channel features.
This authentication strategy proved to be effective as a standalone mechanism or as an additional security layer to improve the security level of legacy systems.