Methods for Affective Content Analysis and Recognition in Film
The research presented in this thesis resulted from the growing attention on the effects of emotion on users, raising questions about their potential application to computational systems.
This research investigates the best methods for determining affective scoring for video content, specifically films. This resulted in the affective video system (AVS) framework, AVS dataset and AVS systems being developed, leading to several contributions to knowledge about the best affective methods and systems.
This work presents the necessary theory to understand the subject area. It builds as the thesis matures, laying a pathway in the form of a methodology framework for viewing affective problems and systems, moving into a subsequent study reviewing the well-recognised affective methods such as the International Affective Picture System (IAPS) and how its well-defined processes and procedures could be adapted for a more modern approach using video content. The research then studies the most critical perceivable features from video clips for users, which were analysed using the repertory grid approach.
This led to the above contributions being combined to create the AVS system and database, a unique database comprising the affective scores for various film clips. The research concludes by presenting the best-performing regression methods found for these datasets, a summary of their performance, and a discussion of the results in relation to other research in this area.
Toward media collection-based storytelling
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2007. Includes bibliographical references (p. 113-118). Life is filled with stories. Modern technologies enable us to document and share life events with various kinds of media, such as photos, videos, etc. But people still find it time-consuming to select and arrange media fragments to create coherent and engaging narratives. This thesis proposes a novel storytelling system called Storied Navigation, which lets users assemble a sequence of video clips based on their roles in telling a story, rather than solely by explicit start and end times. Storied Navigation uses textual annotations expressed in unconstrained natural language, using parsing and Commonsense reasoning to deduce possible connections between the narrative intent of the storyteller, and descriptions of events and characters in the video. It helps users increase their familiarity with a documentary video corpus. It helps them develop story threads by prompting them with recommendations of alternatives as well as possible continuations for each selected video clip. We view it as a promising first step towards transforming today's fragmented media production experience into an enjoyable, integrated storytelling activity. Edward Yu-Te Chen. S.M.
ATM network impairment to video quality
Includes bibliographical references.
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.
“Why I Press Play:” A Phenomenological Study of Teachers Using Film for Literacy in Appalachian Schools
This study examines the experiences of teachers in rural, Appalachian classrooms who use film as a text. Film, in this study, was both an ensemble to be used for classroom viewing purposes and a creative writing opportunity for composition. The researcher used the methodology of hermeneutic phenomenology, drawing on the work of Merleau-Ponty, in constructing this work. In all, five teachers shared their thinking about how to use film most effectively with reading and writing tasks. These teachers shared a wide range of practices within the structure of their classrooms, and noted their own engagement with film. Popular films, short clips, educational videos, and teacher- and student-created projects were all considered, among other visual practices. Data collection involved an interview at the beginning of the research cycle, followed by teacher audio-recorded and/or written logs, collection of supplemental teaching documents, and a final interview. This dissertation explores four major themes that resulted from the research process, and provides an introduction to frame the conversation, a review of the literature to demonstrate what has already been done with film in reading and writing in specific content areas, and notes on implications for current practice, policy, and research drawn from the study.
A survey of digital television broadcast transmission techniques
This paper is a survey of the transmission techniques used in digital television (TV) standards worldwide. With the increase in the demand for High-Definition (HD) TV, video-on-demand and mobile TV services, there was a real need for more bandwidth-efficient transmission and for flawless, crisp video quality, which motivated the migration from analogue to digital broadcasting. In this paper we present a brief history of the development of TV and then we survey the transmission technology used in different digital terrestrial, satellite, cable and mobile TV standards in different parts of the world. First, we present the Digital Video Broadcasting standards developed in Europe for terrestrial (DVB-T/T2), for satellite (DVB-S/S2), for cable (DVB-C) and for hand-held transmission (DVB-H). We then describe the Advanced Television Systems Committee standards developed in the USA both for terrestrial (ATSC) and for hand-held transmission (ATSC-M/H). We continue by describing the Integrated Services Digital Broadcasting standards developed in Japan for Terrestrial (ISDB-T) and Satellite (ISDB-S) transmission and then present the International System for Digital Television (ISDTV), which was developed in Brazil by adopting the ISDB-T physical layer architecture. Following the ISDTV, we describe the Digital Terrestrial television Multimedia Broadcast (DTMB) standard developed in China. Finally, as a design example, we highlight the physical layer implementation of the DVB-T2 standard.
An overview of video recommender systems: state-of-the-art and research issues
Video platforms have become indispensable components within a diverse range of applications, serving various purposes in entertainment, e-learning, corporate training, online documentation, and news provision. As the volume and complexity of video content continue to grow, the need for personalized access features becomes an inevitable requirement to ensure efficient content consumption. To address this need, recommender systems have emerged as helpful tools providing personalized video access. By leveraging past user-specific video consumption data and the preferences of similar users, these systems excel in recommending videos that are highly relevant to individual users. This article presents a comprehensive overview of the current state of video recommender systems (VRS), exploring the algorithms used, their applications, and related aspects. In addition to an in-depth analysis of existing approaches, this review also addresses unresolved research challenges within this domain. These unexplored areas offer exciting opportunities for advancements and innovations, aiming to enhance the accuracy and effectiveness of personalized video recommendations. Overall, this article serves as a valuable resource for researchers, practitioners, and stakeholders in the video domain. It offers insights into cutting-edge algorithms, successful applications, and areas that merit further exploration to advance the field of video recommendation.
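As an illustration of the idea mentioned in this abstract (using a user's past consumption data together with the preferences of similar users), the following is a minimal sketch of user-based collaborative filtering. It is not taken from the article: the function names `cosine` and `recommend`, the rating scale, and the sample data are all hypothetical, and cosine similarity is only one of several possible similarity measures.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two {video_id: rating} dicts (assumed helper)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[v] * b[v] for v in shared)
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, k=2):
    """Rank videos the user has not seen by similarity-weighted ratings of other users."""
    seen = ratings[user]
    weighted_sum = {}
    weight_total = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, theirs)
        if sim <= 0:
            continue  # ignore users with no overlap in viewing history
        for video, rating in theirs.items():
            if video in seen:
                continue
            weighted_sum[video] = weighted_sum.get(video, 0.0) + sim * rating
            weight_total[video] = weight_total.get(video, 0.0) + sim
    # Predicted rating is the similarity-weighted average over contributing users.
    predicted = {v: weighted_sum[v] / weight_total[v] for v in weighted_sum}
    return sorted(predicted, key=predicted.get, reverse=True)[:k]


# Hypothetical sample data: ratings on an assumed 1-5 scale.
sample_ratings = {
    "ana": {"v1": 5, "v2": 4},
    "ben": {"v1": 5, "v2": 4, "v3": 5, "v4": 1},
    "cy": {"v1": 1, "v4": 5},
}
top_for_ana = recommend(sample_ratings, "ana")
```

In this toy data, "ben" has tastes close to "ana", so the videos he rated highly dominate her ranking. Production systems typically add rating normalisation, neighbourhood size limits, and handling for new users, none of which is shown here.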
Scalable Exploration of Complex Objects and Environments Beyond Plain Visual Replication
Digital multimedia content and presentation means are rapidly increasing their sophistication and are now capable of describing detailed representations of the physical world. 3D exploration experiences allow people to appreciate, understand and interact with intrinsically virtual objects.
Communicating information on objects requires the ability to explore them from different angles, as well as to mix highly photorealistic or illustrative presentations of the objects themselves with further data that provides additional insights on these objects, typically represented in the form of annotations. Effectively providing these capabilities requires the solution of important problems in visualization and user interaction.
In this thesis, I studied these problems in the cultural heritage computing domain, focusing on the very common and important special case of mostly planar, but visually, geometrically, and semantically rich objects. These could be generally roughly flat objects with a standard frontal viewing direction (e.g., paintings, inscriptions, bas-reliefs), as well as visualizations of fully 3D objects from a particular point of view (e.g., canonical views of buildings or statues). Selecting a precise application domain and a specific presentation mode allowed me to concentrate on the well-defined use case of the exploration of annotated relightable stratigraphic models (in particular, for local and remote museum presentation).
My main results and contributions to the state of the art have been a novel technique for interactively controlling visualization lenses while automatically maintaining good focus-and-context parameters, a novel approach for avoiding clutter in an annotated model and for guiding users towards interesting areas, and a method for structuring audio-visual object annotations into a graph and for using that graph to improve guidance and support storytelling and automated tours.
We demonstrated the effectiveness and potential of our techniques by performing interactive exploration sessions on various screen sizes and types ranging from desktop devices to large-screen displays for a walk-up-and-use museum installation.
KEYWORDS - Computer Graphics, Human-Computer Interaction, Interactive Lenses, Focus-and-Context, Annotated Models, Cultural Heritage Computing
Intelligent image cropping and scaling
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 2011. Nowadays, there exists a huge number of end devices with different screen properties for watching television content, which is either broadcast or transmitted over the internet. To allow the best viewing conditions on each of these devices, different image formats have to be provided by the broadcaster. Producing content for every single format is, however, not practicable for the broadcaster, as it is much too laborious and costly.
The most obvious solution for providing multiple image formats is to produce one high-resolution format and prepare formats of lower resolution from it. One possibility is simply to scale video images to the resolution of the target image format. Two significant drawbacks are the loss of image details through downscaling and possibly unused image areas due to letterboxes or pillarboxes. A preferable solution is first to find the contextually most important region in the high-resolution format and then to crop this area with the aspect ratio of the target image format. However, defining the contextually most important region manually is very time-consuming, and applying it to live productions would be nearly impossible. Therefore, some approaches exist that define cropping areas automatically. To do so, they extract visual features, such as moving areas in a video, and define regions of interest (ROIs) based on those. The ROIs are finally used to define an enclosing cropping area. The extraction of features is done without any knowledge about the type of content. Hence, these approaches are not able to distinguish between features that might be important in a given context and those that are not.
The work presented within this thesis tackles the problem of extracting visual features based on prior knowledge about the content. Such knowledge is fed into the system in the form of metadata that is available from TV production environments. Based on the extracted features, ROIs are then defined and filtered depending on the analysed content. As a proof of concept, this application adapts SDTV (Standard Definition Television) sports productions automatically to image formats with lower resolution through intelligent cropping and scaling. If no content information is available, the system can still be applied to any type of content through a default mode.
The presented approach is based on the principle of a plug-in system. Each plug-in represents a method for analysing video content information, either on a low level by extracting image features or on a higher level by processing extracted ROIs. The combination of plug-ins is determined by the incoming descriptive production metadata and hence can be adapted to each type of sport individually.
The application has been comprehensively evaluated by comparing the results of the system against alternative cropping methods. This evaluation utilised videos which were manually cropped by a professional video editor, statically cropped videos and simply scaled, non-cropped videos. In addition to purely subjective evaluations, the gaze positions of subjects watching sports videos have been measured and compared to the region-of-interest positions extracted by the system.
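The cropping step outlined in this abstract (ROIs used to define an enclosing cropping area that matches the aspect ratio of the target image format) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the `Rect` type, the `enclosing_crop` function, the choice to centre the crop on the ROI bounding box, and the sample values are all hypothetical, and ROIs are assumed to have positive width and height.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x: float
    y: float
    w: float
    h: float


def enclosing_crop(rois, target_aspect, frame_w, frame_h):
    """Return a crop with the target aspect ratio (width / height) around all ROIs."""
    if not rois:
        raise ValueError("at least one ROI is required")
    # Bounding box of all regions of interest.
    left = min(r.x for r in rois)
    top = min(r.y for r in rois)
    right = max(r.x + r.w for r in rois)
    bottom = max(r.y + r.h for r in rois)
    w, h = right - left, bottom - top
    # Grow the shorter dimension until the box has the target aspect ratio.
    if w / h < target_aspect:
        w = h * target_aspect
    else:
        h = w / target_aspect
    # If the grown box exceeds the frame, shrink it uniformly (ROIs may then be cut).
    scale = min(1.0, frame_w / w, frame_h / h)
    w, h = w * scale, h * scale
    # Centre on the ROI bounding box, then clamp the crop inside the frame.
    cx, cy = (left + right) / 2, (top + bottom) / 2
    x = min(max(cx - w / 2, 0.0), frame_w - w)
    y = min(max(cy - h / 2, 0.0), frame_h - h)
    return Rect(x, y, w, h)


# Hypothetical example: two ROIs in a 1920x1080 frame, cropped for a 4:3 target.
sample_rois = [Rect(800, 400, 200, 100), Rect(1000, 450, 100, 150)]
sample_crop = enclosing_crop(sample_rois, 4 / 3, 1920, 1080)
```

The resulting crop would then be scaled to the target resolution. A content-aware system as described in the abstract would additionally filter or weight the ROIs before this step, which the sketch does not attempt.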
Audio description of audiovisual programmes for the visually impaired in Hong Kong
Audio description (AD) is a means of translating visual and sound elements in audiovisual programmes, as well as in the performing and visual arts, into verbal elements, thus making these materials accessible to viewers with visual impairments. It has been a major area of interest within the field of audiovisual translation studies in recent years and a considerable amount of literature has been published on end users’ reception in Western countries. When it comes to the Chinese-speaking world, little literature is available on AD reception studies and no previous works have investigated the media uses and gratifications of the blind and the partially sighted in Hong Kong. The main purpose of this research is to examine the media use behaviour and motivations as well as the reception and preferences of visually impaired audiences when consuming AD. After examining the main characteristics of AD and its history in Hong Kong, the study focuses on a media accessibility survey under the uses and gratifications framework, and an AD reception study. The views of 44 blind and partially sighted participants are elicited through individual face-to-face interviews. During the reception study, a pre-questionnaire, a questionnaire proper, experimental clips with different versions of AD, and a post-questionnaire were used to identify their AD preferences. Both quantitative and qualitative data were collected. The results reveal that the participants are not satisfied with the current provision of AD services, that they demand higher volumes of materials with AD, and that they have certain AD preferences which, if taken properly into account by the industry, could help improve their comprehension of audiovisual programmes. The findings offer important insights into the situation of AD in Hong Kong and recommendations are put forward for future developments to serve the community, especially in terms of training audio describers.