
    Towards an automatic semantic annotation of car aesthetics

    The design of a new car is guided by a set of directives indicating the target market and specific engineering and aesthetic constraints, which may also include the preservation of the company's brand identity or the restyling of products already on the market. When creating a new product, designers commonly evaluate existing products to take inspiration from them or to reuse successful solutions. From the perspective of an optimised styling workflow, the ability to easily retrieve related documentation and existing digital models, from both internal and external repositories, would be of great benefit. In fact, the rapid growth of web content and the widespread adoption of computer-assisted design tools have made a huge amount of digital data available, whose exploitation could be improved by more selective retrieval methods. In particular, the retrieval of aesthetic elements may help designers create digital models conforming to specific styling properties more efficiently. The aim of the research described in this document is the definition of a framework able to support a (semi-)automatic extraction of semantic data from 3D models and other multimedia data, allowing car designers to reuse knowledge and design solutions within the styling department. The first objective is to capture and structure the explicit and implicit elements that contribute to car aesthetics and can realistically be tackled through computational models and methods. The second step is the definition of a system architecture able to transfer such semantics through an automatic annotation of car models.
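
    As a purely illustrative sketch of what the output of such a framework might look like, the following Python fragment models an annotation record linking a region of a 3D car model to an aesthetic property, plus a simple retrieval query over it. All names and fields are hypothetical, not part of the framework described above.

        from dataclasses import dataclass, field

        @dataclass
        class StylingAnnotation:
            """One semantic annotation attached to a region of a car model."""
            model_id: str        # identifier of the 3D model in a repository
            region: str          # e.g. "front grille", "beltline", "wheel arch"
            prop: str            # aesthetic property, e.g. "aggressiveness"
            value: float         # normalized score in [0, 1]
            source: str = "auto" # "auto", "semi-auto", or "manual"

        @dataclass
        class AnnotatedModel:
            model_id: str
            brand: str
            annotations: list[StylingAnnotation] = field(default_factory=list)

            def regions_with(self, prop: str, threshold: float):
                """Retrieve regions whose score for a property exceeds a threshold."""
                return [a for a in self.annotations
                        if a.prop == prop and a.value >= threshold]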

    Scene extraction in motion pictures

    This paper addresses the challenge of bridging the semantic gap between the rich meaning users desire when they query to locate and browse media and the shallowness of the media descriptions that can be computed in today's content management systems. To facilitate high-level semantics-based content annotation and interpretation, we tackle the problem of automatically decomposing motion pictures into meaningful story units, namely scenes. Since a scene is a complicated and subjective concept, we first propose guidelines from film production to determine when a scene change occurs. We then investigate different rules and conventions followed as part of Film Grammar that would guide and shape an algorithmic solution for determining a scene. Two different techniques using intershot analysis are proposed as solutions in this paper. In addition, we present different refinement mechanisms, such as film-punctuation detection founded on Film Grammar, to further improve the results. These refinement techniques demonstrate significant improvements in overall performance. Furthermore, we analyze errors in the context of film-production techniques, an analysis that offers useful insights into the limitations of our method.
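
    The paper's two intershot-analysis techniques are not spelled out in the abstract, so the following Python sketch shows only the general idea of grouping shots into scenes by visual similarity within a sliding window; the histogram features, window size and threshold are assumptions for illustration, not the authors' algorithm.

        import numpy as np

        def shot_histogram(frames, bins=16):
            """Average colour histogram of a shot (frames: N x H x W x 3, uint8)."""
            hist = np.zeros(3 * bins)
            for c in range(3):
                h, _ = np.histogram(frames[..., c], bins=bins, range=(0, 256))
                hist[c * bins:(c + 1) * bins] = h
            return hist / hist.sum()

        def group_shots_into_scenes(shot_hists, window=4, threshold=0.6):
            """Greedy grouping: a shot joins the current scene if it is similar
            enough to any of the last `window` shots already in that scene."""
            def sim(a, b):                  # histogram intersection, in [0, 1]
                return np.minimum(a, b).sum()
            scenes, current = [], [0]
            for i in range(1, len(shot_hists)):
                recent = current[-window:]
                if max(sim(shot_hists[i], shot_hists[j]) for j in recent) >= threshold:
                    current.append(i)
                else:
                    scenes.append(current)
                    current = [i]
            scenes.append(current)
            return scenes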

    A Data-Driven Approach for Tag Refinement and Localization in Web Videos

    Tagging of visual content is becoming more and more widespread as web-based services and social networks have popularized tagging functionalities among their users. These user-generated tags are used to ease browsing and exploration of media collections, e.g. using tag clouds, or to retrieve multimedia content. However, not all media are equally tagged by users. With current systems it is easy to tag a single photo, and even tagging a part of a photo, such as a face, has become common on sites like Flickr and Facebook. On the other hand, tagging a video sequence is more complicated and time consuming, so users tend to tag only the overall content of a video. In this paper we present a method for automatic video annotation that increases the number of tags originally provided by users and localizes them temporally, associating tags with keyframes. Our approach exploits the collective knowledge embedded in user-generated tags and web sources, and the visual similarity of keyframes to images uploaded to social sites like YouTube and Flickr, as well as web sources like Google and Bing. Given a keyframe, our method selects on the fly, from these visual sources, the training exemplars that should be most relevant for this test sample, and proceeds to transfer labels across similar images. Compared to existing video tagging approaches that require training a classifier for each tag, our system has few parameters, is easy to implement and can deal with an open-vocabulary scenario. We demonstrate the approach on tag refinement and localization on DUT-WEBV, a large dataset of web videos, and show state-of-the-art results.
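
    A minimal sketch of the label-transfer step described above, assuming keyframes and web images are represented by fixed-length visual descriptors; the cosine-similarity ranking, the value of k and the vote threshold are illustrative choices, not the parameters used in the paper.

        import numpy as np
        from collections import Counter

        def transfer_tags(keyframe_feat, pool_feats, pool_tags, k=10, min_votes=3):
            """Rank web images by similarity to the keyframe and keep the tags
            that enough of the top-k neighbours agree on.

            keyframe_feat : (d,) descriptor of the keyframe
            pool_feats    : (n, d) descriptors of images from web sources
            pool_tags     : list of n tag lists, one per pool image
            """
            q = keyframe_feat / np.linalg.norm(keyframe_feat)
            p = pool_feats / np.linalg.norm(pool_feats, axis=1, keepdims=True)
            neighbours = np.argsort(-(p @ q))[:k]        # top-k most similar
            votes = Counter(t for i in neighbours for t in set(pool_tags[i]))
            return [t for t, n in votes.most_common() if n >= min_votes]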

    Aesthetics assessment of videos through visual descriptors and automatic polarity annotation

    In a world where new technologies are increasingly tied to multimedia information, developing tools that make this kind of data easy to handle has become an essential task, one that has attracted scientific interest in recent years. Among the research lines that have begun to develop recently, the study of subjective characteristics of audiovisual material from objective data is of special interest, since it can be applied to classification and recommendation systems. This document presents research focused on models that automatically predict the satisfaction or interest that a video, specifically a car commercial, arouses in the YouTube users who watch it, based on the video's low-level descriptors. A novel aspect of this work is a solution to this kind of problem based on a procedure to label the videos automatically through unsupervised learning techniques. To this end, a set of car commercials was collected together with the user-provided metadata associated with each video, which offers information about the satisfaction users perceive when watching them on YouTube. These metadata made it possible to design three cluster-analysis strategies to annotate the videos automatically, each using a different set of metadata according to the way it is provided by the users. In addition, a set of visual descriptors was extracted from each video using image and video processing techniques and then used to train a machine learning system, which allowed studying the relevance and usefulness of this descriptor set for predicting the aesthetic value of the videos as perceived by users.
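
    The pipeline described above (unsupervised labelling from engagement metadata, then supervised learning on visual descriptors) could be sketched in Python as follows; the choice of k-means with two clusters and a random forest is an assumption for illustration, not the three clustering strategies actually designed in the work.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.ensemble import RandomForestClassifier

        def auto_label_and_train(metadata, descriptors, seed=0):
            """Cluster per-video engagement metadata into two polarity groups,
            then fit a classifier on the low-level visual descriptors.

            metadata    : (n, m) engagement statistics (views, likes, comments, ...)
            descriptors : (n, d) visual descriptors, one row per video
            """
            metadata = np.asarray(metadata, dtype=float)
            km = KMeans(n_clusters=2, n_init=10, random_state=seed).fit(metadata)
            # Call the cluster with the higher mean engagement the positive class.
            means = [metadata[km.labels_ == c].mean() for c in (0, 1)]
            labels = (km.labels_ == int(np.argmax(means))).astype(int)
            clf = RandomForestClassifier(n_estimators=200, random_state=seed)
            clf.fit(descriptors, labels)
            return clf, labels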

    Value annotation of web resources: the ValueML Language

    In the multimedia design field, we have recently witnessed a shift of focus from products and the user's experience to social effects of technologies and the quality of life. In this context, values play an important role. They may be inscribed within an artifact as symbolic meanings or as a built-in use consequence. In spite of their growing relevance, there is not yet a markup language for value annotation. This paper describes a proposal for filling this gap. After a brief review of various perspectives on the concept of value and relevant taxonomies, we discuss the syntax and semantics of a preliminary version of the ValueML language, together with an example of annotation of a commercial video clip.

    Role of shot length in characterizing tempo and dramatic story sections in motion pictures

    Motivated by existing cinematic conventions known as film grammar, we propose a computational approach to determining tempo as a high-level movie content descriptor, as well as a means for deriving the dramatic story sections and events occurring in movies. In our approach, movie tempo is extracted from two easily computed aspects: shot length and motion. Story sections and events are generally associated with changes in tempo, and are thus identified by edges located in the tempo function. In this paper, we analyze our initial formulation of the tempo function, which was based on the assumption that the distributions of both shot length and motion in movies are normal. Given that the distribution of shot length is approximately Weibull, as confirmed in our experiments, we examine the impact of modelling and modifying the contribution of shot length to tempo. We derive an appropriate normalization function that faithfully encapsulates the role of shot length in tempo perception, and analyze the changes to the story sections identified in films.
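
    The tempo function itself is only outlined in the abstract; a hedged Python sketch of the general construction (normalized shot length and motion, combined and smoothed, with edges as candidate story-section boundaries) might look like this. The log transform stands in for the paper's derived normalization of the Weibull-distributed shot lengths and is an assumption, as are the weights and the edge rule.

        import numpy as np

        def tempo_function(shot_lengths, shot_motion, alpha=1.0, beta=1.0, smooth=5):
            """Per-shot tempo: shorter shots and more motion mean higher tempo."""
            s = np.log(np.asarray(shot_lengths, dtype=float))  # tame the Weibull tail
            m = np.asarray(shot_motion, dtype=float)
            z_s = (s.mean() - s) / s.std()   # shorter shot -> higher tempo
            z_m = (m - m.mean()) / m.std()   # more motion  -> higher tempo
            t = np.convolve(alpha * z_s + beta * z_m,
                            np.ones(smooth) / smooth, mode="same")
            d = np.abs(np.diff(t))           # edges = large local tempo changes
            edges = np.flatnonzero(d > d.mean() + 2 * d.std())
            return t, edges                  # edges index candidate boundaries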