2 research outputs found

    Combining audio-visual features for viewers' perception classification of Youtube car commercials

    Proceedings of: 2nd International Workshop on Speech, Language and Audio in Multimedia, Penang, Malaysia, 11-12 September 2014. In this paper, we present a computational model capable of predicting viewer perception of YouTube car TV commercials from a set of low-level audio and visual descriptors. Our research rests on the hypothesis that these descriptors reflect, to some extent, the objective value of the videos and, in turn, the average viewer's perception. To that end, and as a novel approach to this problem, we automatically annotate our video corpus, grouping it into 2 classes corresponding to different satisfaction levels, by means of a regular k-means algorithm applied to the video metadata related to user feedback. Evaluation results show that simple linear logistic regression models based on the 10 best visual descriptors and on the 10 best audio descriptors perform reasonably well individually, achieving classification accuracies of roughly 70% and 75%, respectively. Combining audio and visual descriptors yields better performance, roughly 86% for the top 20 descriptors selected from the entire set, with the balance tipping in favor of the audio ones (17 vs. 3). The greater influence of audio content in this domain is also evidenced by a side analysis of the video comments.
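The automatic annotation step in this abstract (k-means with two clusters applied to user-feedback metadata) can be illustrated with a minimal sketch. This is not the authors' code. The two feedback features, the sample values, and the function name `two_means` are all assumptions made up for illustration, and the deterministic initialization is a simplification of what a library implementation would do.

```python
from statistics import mean


def two_means(points, iters=100):
    """Plain Lloyd's k-means with k=2 over tuples of floats."""
    # Assumption: seed the two centroids with the lexicographically smallest
    # and largest points so the result is deterministic.
    centroids = [min(points), max(points)]
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min((0, 1), key=lambda c: sum((x - m) ** 2
                                          for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in (0, 1):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids


# Hypothetical metadata per video: (like ratio, log10 of views per day).
feedback = [(0.95, 3.1), (0.91, 2.8), (0.97, 3.4),
            (0.55, 1.0), (0.60, 1.2), (0.48, 0.7)]
labels, centroids = two_means(feedback)
```

The resulting cluster labels would then serve as the two satisfaction classes for a downstream classifier. In practice the metadata features would likely need scaling before clustering, since k-means is sensitive to differences in feature range.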

    A client-server architecture for distributed and scalable multimedia content analysis: an Android app for assisting phone users in shooting aesthetically valuable pictures

    Nowadays, developing modern scientific image and video analysis algorithms raises the problem of distributing them to the open community in multiple versions for very different platforms. This requires software development skills that researchers outside computer science usually lack. Client/server communication has acquired a leading role by abstracting the business logic of applications away from thin clients running on small devices, such as the smartphones that end users carry with them. The present work describes the design, modeling, development and testing of a client/server architecture that can compute image and video characteristics on independent Matlab® instances and offers production-grade SQL persistence to store the results, all within a user-authenticated environment. This project has focused specifically on an ongoing study by researchers from Universidad Carlos III and Universidad Politécnica de Madrid, whose main goal is to estimate the aesthetic value of images and videos through the computation of objective descriptors of their audiovisual content. However, the architecture has been designed and built to be applicable to any biomedical, audiovisual or other engineering study that requires image or video analysis.
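The request flow this abstract outlines (authenticated client, computation delegated to an independent worker, results persisted in SQL) can be sketched in a few lines. This is an in-process illustration under stated assumptions, not the architecture from the work itself. The token table, the `analyze` function standing in for a Matlab® instance, the table schema, and the class name `AnalysisServer` are all hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical credential store; the real authentication scheme is not described.
TOKENS = {"token-alice": "alice"}


def analyze(pixels):
    """Stand-in for a feature computation delegated to a Matlab instance."""
    return {
        "mean_brightness": sum(pixels) / len(pixels),
        "contrast": max(pixels) - min(pixels),
    }


class AnalysisServer:
    def __init__(self, workers=2):
        # In-memory SQLite stands in for the production SQL database.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE results (user TEXT, image_id TEXT, feature TEXT, value REAL)"
        )
        # A thread pool stands in for the pool of independent compute instances.
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, token, image_id, pixels):
        # Reject requests from unauthenticated clients.
        user = TOKENS.get(token)
        if user is None:
            raise PermissionError("unknown token")
        # Run the computation on a worker and wait for its result.
        features = self.pool.submit(analyze, pixels).result()
        # Persist one row per computed feature inside a transaction.
        with self.db:
            self.db.executemany(
                "INSERT INTO results VALUES (?, ?, ?, ?)",
                [(user, image_id, name, value) for name, value in features.items()],
            )
        return features

    def stored(self, image_id):
        rows = self.db.execute(
            "SELECT feature, value FROM results WHERE image_id = ?", (image_id,)
        )
        return dict(rows.fetchall())


server = AnalysisServer()
result = server.submit("token-alice", "img1", [10, 20, 30])
```

The thin client only sends data and a credential. All analysis logic and storage live on the server side, which is the separation the abstract attributes to the client/server design.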