1D-mosaics grouping using lattice vector quantization for a video browsing application
1D-mosaics have been introduced as a tool for structuring and navigation in video content. These objects can be considered as the spatio-temporal signatures of the video shots. Our work aims at automatically grouping the video shots into scenes using these signatures. The original method is based on the tree-structured lattice vector quantization of the 1D-mosaics. Because of the hierarchical structure of the codebooks, they can be compared progressively, and lattice use is time efficient. Indexing and retrieval results are given for two video sequences, and different mosaics are successively compared to each other in order to assess the presented scheme's effectiveness.
Deliverable D1.1 State of the art and requirements analysis for hypervideo
This deliverable presents a state-of-the-art and requirements analysis report for hypervideo, authored as part of WP1 of the LinkedTV project. Initially, we present some use-case (viewer) scenarios in the LinkedTV project and, through the analysis of the distinctive needs and demands of each scenario, we point out the technical requirements from a user-side perspective. Subsequently, we study methods for the automatic and semi-automatic decomposition of the audiovisual content in order to effectively support the annotation process. Considering that the multimedia content comprises different types of information, i.e., visual, textual and audio, we report various methods for the analysis of these three streams. Finally, we present various annotation tools which could integrate the developed analysis results so as to effectively support users (video producers) in the semi-automatic linking of hypervideo content, and based on them we report on the initial progress in building the LinkedTV annotation tool. For each of the classes of techniques discussed in the deliverable, we present the evaluation results from the application of one such method from the literature to a dataset well-suited to the needs of the LinkedTV project, and we indicate the future technical requirements that should be addressed in order to achieve higher levels of performance (e.g., in terms of accuracy and time-efficiency), as necessary.
Deliverable D1.2 Visual, text and audio information analysis for hypervideo, first release
Enriching videos by offering continuative and related information via, e.g., audiostreams, web pages, as well as other videos, is typically hampered by its demand for massive editorial work. While there exist several automatic and semi-automatic methods that analyze audio/video content, one needs to decide which method offers appropriate information for our intended use-case scenarios. We review the technology options for video analysis that we have access to, and describe which training material we opted for to feed our algorithms. For all methods, we offer extensive qualitative and quantitative results, and give an outlook on the next steps within the project
Interdisciplinarity in the Age of the Triple Helix: a Film Practitioner's Perspective
This integrative chapter contextualises my research including articles I have published as well as one of the creative artefacts developed from it, the feature film The Knife That Killed Me. I review my work considering the ways in which technology, industry methods and academic practice have evolved as well as how attitudes to interdisciplinarity have changed, linking these to Etzkowitz and Leydesdorff’s ‘Triple Helix’ model (1995). I explore my own experiences and observations of opportunities and challenges that have been posed by the intersection of different stakeholder needs and expectations, both from industry and academic perspectives, and argue that my work provides novel examples of the applicability of the ‘Triple Helix’ to the creative industries. The chapter concludes with a reflection on the evolution and direction of my work, the relevance of the ‘Triple Helix’ to creative practice, and ways in which this relationship could be investigated further
Change blindness: eradication of gestalt strategies
Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) objects may be stored in and retrieved from a pre-attentional store during this task.
Video interaction using pen-based technology
Dissertation submitted for the degree of Doctor in Informatics.
Video can be considered one of the most complete and complex media, and its manipulation
is still a difficult and tedious task. This research applies pen-based technology to
video manipulation, with the goal of improving this interaction. Despite human
familiarity with pen-based devices, how they can be used in video interaction to
improve it, making it more natural while fostering the user's creativity,
remains an open question.
Two types of interaction with video were considered in this work: video annotation
and video editing. Each interaction type allows the study of one of the interaction modes
of using pen-based technology: indirectly, through digital ink, or directly, through pen
gestures or pressure. This research contributes two approaches for pen-based video
interaction: pen-based video annotations and video as ink.
The first uses pen-based annotations combined with motion tracking algorithms, in
order to augment video content with sketches or handwritten notes. It aims to study how
pen-based technology can be used to annotate moving objects and how to maintain the
association between pen-based annotations and the annotated moving object.
The second concept replaces digital ink with video content, studying how pen gestures
and pressure can be used in video editing and what kind of changes are needed in the
interface, in order to provide a more familiar and creative interaction in this usage context.
This work was partially funded by the UTAustin-Portugal, Digital Media, Program
(Ph.D. grant: SFRH/BD/42662/2007 - FCT/MCTES); by the HP Technology for Teaching
Grant Initiative 2006; by the project "TKB - A Transmedia Knowledge Base for contemporary
dance" (PTDC/EAT/AVP/098220/2008 funded by FCT/MCTES); and by CITI/DI/FCT/UNL (PEst-OE/EEI/UI0527/2011).
Contributions for the automatic description of multimodal scenes
Doctoral thesis. Electrical and Computer Engineering. Faculdade de Engenharia. Universidade do Porto. 200
Multidimensional projections for the visual exploration of multimedia data
Multidimensional data analysis is considerably important when dealing with large and complex datasets. Among the possibilities for analyzing this kind of data, applying visualization techniques can help the user find and understand patterns and trends, and establish new goals. This thesis aims to present several visualization methods to interactively explore multidimensional datasets, aimed at users ranging from specialists to casual users, by making use of both static and dynamic representations created by multidimensional projections.