
    Semantic analysis of field sports video using a petri-net of audio-visual concepts

    The most common approach to automatic summarisation and highlight detection in sports video is to train an automatic classifier to detect semantic highlights based on occurrences of low-level features such as action replays, excited commentators or changes in a scoreboard. We propose an alternative approach based on the detection of perception concepts (PCs) and the construction of Petri-Nets, which can be used for both semantic description and event detection within sports videos. Low-level algorithms for the detection of perception concepts using visual, aural and motion characteristics are proposed, and a series of Petri-Nets composed of perception concepts is formally defined to describe video content. We call this a Perception Concept Network-Petri Net (PCN-PN) model. Using PCN-PNs, personalized high-level semantic descriptions of video highlights can be facilitated and queries on high-level semantics can be achieved. A particular strength of this framework is that we can easily build semantic detectors based on PCN-PNs to search within sports videos and locate interesting events. Experimental results based on recorded sports video data across three types of sports game (soccer, basketball and rugby), each from multiple broadcasters, are used to illustrate the potential of this framework.
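
    As an illustration of the general idea only (and not the paper's formal PCN-PN definition), the sketch below models a toy Petri net whose places receive tokens when hypothetical perception-concept detectors fire, and whose single transition flags a candidate highlight once all of its input places are marked; the concept names and net structure are assumptions.

```python
# Toy Petri net over "perception concepts" (illustrative only; the concept names and
# the single-transition structure are assumptions, not the PCN-PN model from the paper).

class PetriNet:
    def __init__(self, places, transitions):
        self.places = dict(places)          # place name -> token count
        self.transitions = transitions      # list of (input places, output places)

    def mark(self, place):
        """Add a token when a low-level detector reports a perception concept."""
        self.places[place] += 1

    def fire_enabled(self):
        """Fire every transition whose input places all hold at least one token."""
        fired = []
        for inputs, outputs in self.transitions:
            if all(self.places[p] > 0 for p in inputs):
                for p in inputs:
                    self.places[p] -= 1
                for p in outputs:
                    self.places[p] += 1
                fired.append((inputs, outputs))
        return fired

# Hypothetical concepts for a soccer highlight: excited crowd + replay + scoreboard change.
net = PetriNet(
    places={"crowd_excited": 0, "action_replay": 0, "score_changed": 0, "goal_event": 0},
    transitions=[(("crowd_excited", "action_replay", "score_changed"), ("goal_event",))],
)
for detected in ("crowd_excited", "action_replay", "score_changed"):
    net.mark(detected)
net.fire_enabled()
print(net.places["goal_event"])  # 1 -> a candidate "goal" highlight was flagged
```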

    Examining the role of smart TVs and VR HMDs in synchronous at-a-distance media consumption

    This article examines synchronous at-a-distance media consumption from two perspectives: how it can be facilitated using existing consumer displays (TVs combined with smartphones) and imminently available consumer displays (virtual reality (VR) HMDs combined with RGBD sensing). First, we discuss results from an initial evaluation of a synchronous, shared, at-a-distance smart TV system, CastAway. Through week-long in-home deployments with five couples, we gain formative insights into the adoption and usage of at-a-distance media consumption and how couples communicated during it. We then examine, in a laboratory study of 12 pairs, how the imminent availability and potential adoption of consumer VR HMDs could affect preferences toward how synchronous at-a-distance media consumption is conducted, by enhancing media immersion and supporting embodied telepresence for communication. Finally, we discuss the implications these studies have for the near future of consumer synchronous at-a-distance media consumption. Taken together, these studies begin to explore a design space covering the varying ways in which at-a-distance media consumption can be supported and experienced (through music, TV content, augmenting existing TV content for immersion, and immersive VR content), the factors that might influence usage and adoption, and the implications for supporting communication and telepresence during media consumption.
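
    The abstract does not describe CastAway's internals, so purely as a sketch of what keeping two remote viewers in sync might involve, the snippet below exchanges a small playback-state message and nudges the local position toward the remote one; the message format, drift threshold and function names are assumptions, not the system's design.

```python
import json
import time

# Illustrative only: a minimal playback-state message and a reconciliation rule.
# The message format, threshold and seek behaviour are assumptions, not CastAway's design.

def make_state(position_s, playing):
    """Serialise the local playback state for the remote peer."""
    return json.dumps({"t": time.time(), "pos": position_s, "playing": playing})

def reconcile(local_pos, remote_msg, max_drift=0.5):
    """Return the position to seek to if the two peers have drifted too far apart."""
    remote = json.loads(remote_msg)
    # Compensate for the time the message spent in transit (if the remote is playing).
    elapsed = time.time() - remote["t"]
    expected_remote = remote["pos"] + (elapsed if remote["playing"] else 0.0)
    if abs(local_pos - expected_remote) > max_drift:
        return expected_remote  # seek to stay in sync
    return local_pos            # drift is tolerable; keep playing as-is

msg = make_state(42.0, playing=True)
print(reconcile(41.2, msg))  # drift > 0.5 s, so the caller would seek to ~42.0
```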

    Dialectical Polyptych: an interactive movie installation

    Most well-known video games developed by major software companies adopt an approach to cinematic language in an attempt to create a perfect combination of narrative, visual technique and interaction. Unlike most video games, interactive film narratives normally involve an interruption in time whenever the spectator has to make choices. “Dialectical Polyptych” is an interactive movie included in a project called “Characters looking for a spectactor”, which aims to give the spectator on-the-fly control over film editing, thus exploiting the role of the spectator as an active subject in the presented narrative. This paper presents an installation based on a mobile device, which allows seamless real-time interactivity with the movie. Different finger touches on the screen allow the spectator to alternate between two parallel narratives, which together produce a complementary narrative, and to change the angle or shot within each narrative.
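
    The interaction model described above (touches alternate between two parallel narratives and change the shot or angle within each) can be pictured as a small state machine; the gesture names and shot counts in the sketch below are assumptions for illustration, not the installation's actual implementation.

```python
# Illustrative state machine for the described interaction: one gesture swaps the active
# narrative, another cycles shots/angles within it. Gesture names and the number of
# shots per narrative are assumptions, not the actual installation.

class InteractiveMovie:
    def __init__(self, shots_per_narrative=3):
        self.narrative = 0                  # 0 or 1: which parallel narrative is playing
        self.shot = [0, 0]                  # current shot/angle within each narrative
        self.shots_per_narrative = shots_per_narrative

    def on_touch(self, gesture):
        if gesture == "swap":               # e.g., a two-finger tap (hypothetical mapping)
            self.narrative = 1 - self.narrative
        elif gesture == "next_shot":        # e.g., a one-finger tap (hypothetical mapping)
            n = self.narrative
            self.shot[n] = (self.shot[n] + 1) % self.shots_per_narrative
        return self.current_stream()

    def current_stream(self):
        return f"narrative{self.narrative}_shot{self.shot[self.narrative]}.mp4"

movie = InteractiveMovie()
print(movie.on_touch("next_shot"))  # narrative0_shot1.mp4
print(movie.on_touch("swap"))       # narrative1_shot0.mp4
```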

    A framework for realistic 3D tele-immersion

    Meeting, socializing and conversing online with a group of people using teleconferencing systems is still quite different from the experience of meeting face to face. We are abruptly aware that we are online and that the people we are engaging with are not in close proximity, much as talking on the telephone does not replicate the experience of talking in person. Several causes for these differences have been identified, and we propose inspiring and innovative solutions to these hurdles in an attempt to provide a more realistic, believable and engaging online conversational experience. We present the distributed and scalable framework REVERIE, which provides a balanced mix of these solutions. Applications built on top of the REVERIE framework will be able to provide interactive, immersive, photo-realistic experiences to a multitude of users, experiences that will feel much more like face-to-face meetings than what conventional teleconferencing systems offer.

    The perceptual and attentive impact of delay and jitter in multimedia delivery

    In this paper we present the results of a study that examines the user’s perception of multimedia quality, understood as both information assimilation and subjective satisfaction, when impacted by varying network-level parameters (delay and jitter). In addition, we integrate eye-tracking assessment to provide a more complete understanding of user perception of multimedia quality. Results show that delay and jitter significantly affect user satisfaction, and that the video eye path varies when either no single, obvious point of focus exists or the point of attention changes dramatically. Lastly, results showed that content variation significantly affected user satisfaction, as well as user information assimilation.
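
    As background on the network-level parameters varied in the study, the snippet below shows one standard way to quantify jitter, the interarrival-jitter estimator from RFC 3550, computed from per-packet send and receive timestamps; this is offered for context and is not the measurement procedure used in the paper.

```python
# Interarrival jitter as defined in RFC 3550: a smoothed mean deviation of packet
# transit times. Background only; not the study's methodology.

def interarrival_jitter(send_times, recv_times):
    """send_times/recv_times: per-packet timestamps in seconds, same length."""
    jitter = 0.0
    prev_transit = None
    for s, r in zip(send_times, recv_times):
        transit = r - s
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0   # RFC 3550 smoothing factor of 1/16
        prev_transit = transit
    return jitter

# Packets sent every 20 ms but arriving with variable network delay:
send = [0.00, 0.02, 0.04, 0.06, 0.08]
recv = [0.10, 0.125, 0.138, 0.165, 0.178]
print(f"{interarrival_jitter(send, recv) * 1000:.2f} ms")
```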

    SALSA: A Novel Dataset for Multimodal Group Behavior Analysis

    Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., a cocktail party) is gratifying due to the wealth of information available at the group level (mining social networks) and the individual level (recognizing native behavioral and personality traits). However, analyzing social scenes involving FCGs is also highly challenging due to the difficulty of extracting behavioral cues such as target locations, speaking activity and head/body pose in the presence of crowding and extreme occlusions. To this end, we propose SALSA, a novel dataset facilitating multimodal and Synergetic sociAL Scene Analysis, and make two main contributions to research on automated social interaction analysis: (1) SALSA records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under poster-presentation and cocktail-party contexts presenting difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations and interfering sound sources; (2) to alleviate these problems we facilitate multimodal analysis by recording the social interplay using four static surveillance cameras and sociometric badges worn by each participant, comprising microphone, accelerometer, Bluetooth and infrared sensors. In addition to raw data, we also provide annotations concerning individuals' personality as well as their position, head and body orientation, and F-formation information over the entire event duration. Through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. SALSA is available at http://tev.fbk.eu/salsa.
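
    To make concrete what working with such annotations might look like, the sketch below groups people into conversational clusters using a simple proximity-and-facing heuristic over (x, y, body orientation) tuples; the field names, thresholds and example data are assumptions for illustration, and this is neither SALSA's file format nor an F-formation detector from the paper.

```python
import math

# Toy grouping heuristic over (x, y, body orientation) annotations: two people are linked
# if they stand close and roughly face each other; linked people form clusters.
# Thresholds, field names and the example data are illustrative assumptions only.

def facing_each_other(a, b, max_dist=1.5, max_angle=math.radians(60)):
    dx, dy = b["x"] - a["x"], b["y"] - a["y"]
    if math.hypot(dx, dy) > max_dist:
        return False
    bearing_ab = math.atan2(dy, dx)          # direction from a toward b
    bearing_ba = math.atan2(-dy, -dx)        # direction from b toward a
    diff_a = abs(math.remainder(a["theta"] - bearing_ab, math.tau))
    diff_b = abs(math.remainder(b["theta"] - bearing_ba, math.tau))
    return diff_a < max_angle and diff_b < max_angle

def group(people):
    """Merge people linked by the pairwise test into clusters (connected components)."""
    clusters = []
    for pid, p in people.items():
        merged = [c for c in clusters if any(facing_each_other(p, people[q]) for q in c)]
        new = {pid}.union(*merged) if merged else {pid}
        clusters = [c for c in clusters if c not in merged] + [new]
    return clusters

people = {
    "p1": {"x": 0.0, "y": 0.0, "theta": 0.0},        # facing +x
    "p2": {"x": 1.0, "y": 0.0, "theta": math.pi},    # facing -x, toward p1
    "p3": {"x": 5.0, "y": 5.0, "theta": 0.0},        # far away, alone
}
print(group(people))  # e.g., [{'p1', 'p2'}, {'p3'}]
```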

    Sonification of experimental parameters as a new method for efficient coding of behavior

    Cognitive research is often focused on experimental condition-driven reactions. Ethological studies frequently rely on the observation of naturally occurring specific behaviors. In both cases, subjects are filmed during the study so that behaviors can afterwards be coded from video. Coding should typically be blind to experimental conditions, but often requires more information than is present in the video. We introduce a method for blind coding of behavioral videos that addresses both issues via three main innovations. First, of particular significance for playback studies, it allows the creation of a “soundtrack” of the study, that is, a track composed of synthesized sounds representing different aspects of the experimental conditions, or other events, over time. Second, it facilitates coding behavior using this audio track together with the possibly muted original video. This enables coding blindly to conditions as required, while not ignoring other relevant events. Third, our method makes use of freely available, multi-platform software, including scripts we developed.
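
    A minimal version of the core idea, rendering condition markers as tones at their event times so that the resulting track can be played alongside the muted video, can be sketched as follows; the tone frequencies, durations and example events are assumptions, and the authors' actual scripts are not reproduced here.

```python
import math
import struct
import wave

# Minimal sonification sketch: write a mono WAV "soundtrack" with a short tone at each
# event time, using a different pitch per experimental condition. Frequencies, durations
# and the example events are assumptions, not the authors' published scripts.

RATE = 44100
TONE_S = 0.25
FREQS = {"condition_A": 440.0, "condition_B": 880.0}   # hypothetical condition -> pitch

def render(events, total_s, path="soundtrack.wav"):
    samples = [0.0] * int(total_s * RATE)
    for t, condition in events:
        f = FREQS[condition]
        start = int(t * RATE)
        for i in range(int(TONE_S * RATE)):
            if start + i < len(samples):
                samples[start + i] += 0.4 * math.sin(2 * math.pi * f * i / RATE)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(RATE)
        w.writeframes(b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples))

# Events: (time in seconds, condition label) taken from a hypothetical experiment log.
render([(1.0, "condition_A"), (3.5, "condition_B")], total_s=5.0)
```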