
    On-line video abstraction

    Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, April 201

    Audio-visual football video analysis, from structure detection to attention analysis

    Sport video is an important video genre, and content-based sports video analysis attracts great interest from both industry and academia. A sports video is characterised by repetitive temporal structures, relatively plain content, and strong spatio-temporal variations, such as quick camera switches and swift local motions. Specific techniques are needed for content-based sports video analysis to exploit these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are the key stories in sports videos; (2) what attracts viewers’ interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approach them from two different perspectives and in turn present three research contributions, namely replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important content in sports videos, so detecting replay segments is an efficient way to collect game highlights. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of a replay is complex and includes logo transitions, slow motion, viewpoint switches and normal-speed video clips. Since logo transition clips are pervasive in the game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement for replay detection. A two-pass system was developed, comprising a five-layer AdaBoost classifier and logo template matching throughout an entire video. The five-layer AdaBoost classifier uses shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions to filter out logo transition candidates. 
Subsequently, a logo template is constructed and employed to find all transition logo sequences. The precision and recall of this system in replay detection are both 100% on a five-game evaluation collection. An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as of other sports videos. We review the literature on content-based temporal structures, such as the play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely play, focus, replay and break, were identified using low-level visual features. A four-state hidden Markov model was trained to model the transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. The occurrences of this substring are regarded as the kernel of an attack hidden Markov process. The decomposition of attack structures therefore becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice, and attention is a psychological measurement of “notice”. A brief survey of the psychological background of attention, attention estimation from visual and auditory signals, and multiple-modality attention fusion is presented. We propose two attention models for sports video analysis, namely the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the perception structure while watching video. This model removes reflection bias among modality salient signals and combines these signals by reflectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are filled with noise. 
This framework tries to estimate a noise-free signal from these coarse, noisy observations by multiple-resolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real-time event detection. The experiments show that these attention-based approaches can find goal events with high precision. Moreover, the results of MAR-based highlight detection on the final games of FIFA 2002 and 2006 are highly similar to the highlights professionally labelled by the BBC and FIFA.
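
    The abstract above proposes a suffix tree to find the longest repetitive substring in the sequence of shot-class labels. The following is a minimal sketch of that idea only, not the thesis's implementation: it swaps the suffix tree for a simpler sorted-suffix comparison, and the single-letter labels (P = play, F = focus, R = replay, B = break) and the example sequence are assumptions made for illustration.

```python
def longest_repeated_substring(labels: str) -> str:
    """Return the longest substring that occurs at least twice (overlaps allowed)."""
    # After sorting, the two suffixes sharing the longest common prefix
    # are always adjacent, so comparing neighbours is sufficient.
    suffixes = sorted(labels[i:] for i in range(len(labels)))
    best = ""
    for a, b in zip(suffixes, suffixes[1:]):
        n = 0
        while n < min(len(a), len(b)) and a[n] == b[n]:
            n += 1
        if n > len(best):
            best = a[:n]
    return best


# Hypothetical shot-class label sequence: P = play, F = focus, R = replay, B = break.
sequence = "PFRBPPFRBPFB"
print(longest_repeated_substring(sequence))
```

    Sorting full suffixes costs more time and memory than a real suffix tree, which is acceptable for a sequence of a few hundred shots but would not scale to very long label sequences.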

    How to teach digital reading?

    This paper discusses the knowledge, skills, and awareness involved in digital reading. Reading, in this paper, is used in the broader sense to include deriving meaning from media on a digital screen. This paper synthesises key ideas from existing studies and presents a taxonomy for the teaching of digital reading. The taxonomy includes the development of: 1) the knowledge of linear and deep reading strategies; 2) basic and critical information skills; and 3) a multimodal semiotic awareness. The goal of this paper is to unpack the specific knowledge and skills for digital reading, to help educators, including classroom teachers and librarians, identify the aspects to attend to as students engage in digital reading. This paper argues that, in addition to equipping students with the knowledge of reading strategies and information skills, an awareness of how the various semiotic modes make meaning is fundamental to effective digital reading.

    Successful approaches to mental practice: A case study of four pianists

    Musicians often use mental practice for enhancing performance, but individuals may have different preferences and skills in their characteristic, individually successful ways of carrying out such practice. In this study, we focus on the approaches to mental practice of four pianists who, according to the ratings of a panel of expert judges, showed outstanding improvement in their performances following their mental practice of a new piece in at least one of the two conditions: silent reading of the score or reading the score while simultaneously listening to the music. The four pianists’ approaches to mental practice were studied through self-reports in post-task interviews that were compared with eye-tracking data gathered during the actual mental practice. In successful mental practice, the pianists relied on their experience and the skills they had practised in audiation, use of recordings, imaginary rehearsal, and structural analysis. The results encourage musicians to explore their characteristic approaches to mental practice, and to deliberately practise and develop versatile mental practice skills in order to apply them flexibly in different musical situations. Eye tracking was found to be a useful tool for validating and supplementing musicians’ subjective self-descriptions and for revealing covert mental processes in the context of music reading.

    Uncertainty-aware video visual analytics of tracked moving objects

    Vast amounts of video data make manual video analysis impractical, while recent automatic video analytics techniques still suffer from insufficient performance. To alleviate these issues, we present a scalable and reliable approach that exploits the visual analytics methodology. It involves the user in an iterative process of exploration, hypothesis generation, and verification. Scalability is achieved through interactive filter definitions on trajectory features extracted by the automatic computer vision stage. We establish the interface between user and machine by adopting the VideoPerpetuoGram (VPG) for visualization, and enable users to provide filter-based relevance feedback. Additionally, users are supported in deriving hypotheses by context-sensitive statistical graphics. To allow for reliable decision making, we gather the uncertainties introduced by the computer vision step, communicate this information to users through uncertainty visualization, and allow fuzzy hypothesis formulation when interacting with the machine. Finally, we demonstrate the effectiveness of our approach on the video analysis mini challenge, which was part of the IEEE Symposium on Visual Analytics Science and Technology 2009.

    Surveillance video summarization based on trajectory rarity measure

    The dynamic summarization of surveillance videos has several critical applications, mainly due to the wide availability of digital cameras in environments such as airports, train and bus stations, shopping centers, stadiums, buildings, schools, hospitals, and roads. This study presents an approach for generating dynamic summaries in the surveillance video domain based on human trajectories. It emphasizes trajectory descriptors in conjunction with an unsupervised clustering method. Our approach contributes to the existing literature through its combination of methods and objectives. We hypothesize that clustering trajectories makes it possible to identify rare trajectories based on their morphology. The clustering outputs numerous subsets of trajectories, or clusters, and the number of elements in a specific cluster is used to determine its rarity. Subsets with few elements are rare, while those with many elements are considered ordinary. Our study therefore shows that it is possible to use unsupervised clustering to automatically detect rare trajectories based on their morphology and to segment videos with this information. We experimented with different sets of trajectories, segmenting the rare videos from our ground truth. Research work.
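
    The rarity rule in the abstract above (clusters with few members are rare) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the cluster labels are taken as already computed by some unsupervised clustering step, and the function name `rare_trajectories` and the threshold `max_cluster_size` are hypothetical.

```python
from collections import Counter


def rare_trajectories(cluster_labels, max_cluster_size=2):
    """Return indices of trajectories whose cluster has at most max_cluster_size members."""
    # Count how many trajectories fall into each cluster.
    sizes = Counter(cluster_labels)
    # A trajectory is flagged as rare when its cluster is small.
    return [i for i, c in enumerate(cluster_labels) if sizes[c] <= max_cluster_size]


# Hypothetical cluster assignment for nine trajectories.
labels = [0, 0, 0, 0, 1, 0, 2, 2, 0]
print(rare_trajectories(labels))
```

    The threshold is the main design choice here: a fixed count is the simplest option, while a fraction of the total number of trajectories would adapt better to collections of different sizes.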

    Automatic non-linear video editing for home video collections

    The video editing process consists of deciding what elements to retain, delete, or combine from various video sources so that they come together in an organized, logical, and visually pleasing manner. Before the digital era, non-linear editing involved the arduous process of physically cutting and splicing video tapes, and was restricted to the movie industry and a few video enthusiasts. Today, when digital cameras and camcorders have made large personal video collections commonplace, non-linear video editing has gained renewed importance and relevance. Almost all video editing systems available today depend on considerable user interaction to produce coherent edited videos. In this work, we describe an automatic non-linear video editing system for generating coherent movies from a collection of unedited personal videos. Our thesis is that computing image-level visual similarity in an appropriate manner forms a good basis for automatic non-linear video editing. To our knowledge, this is a novel approach to solving this problem. The content of the output video is guided by one or more input keyframes from the user. The output video is generated so that it is non-repetitive and follows the dynamics of the input videos. When no input keyframes are provided, our system generates "video textures" with the content of the output chosen at random. Our system demonstrates promising results on large video collections and is a first step towards increased automation in non-linear video editing.
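
    The abstract above does not say how image-level visual similarity is computed. As a rough illustration of keyframe-guided selection only, the sketch below ranks frames by cosine similarity to a user keyframe and returns each frame at most once. The feature vectors, the choice of cosine similarity, and the names `cosine` and `rank_by_similarity` are all assumptions, not the system described.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_by_similarity(keyframe, frames, top_k=3):
    """Return indices of the top_k frames most similar to the keyframe, best first."""
    order = sorted(range(len(frames)),
                   key=lambda i: cosine(keyframe, frames[i]),
                   reverse=True)
    # Each index appears once, so the selection never repeats a frame.
    return order[:top_k]


# Hypothetical 3-dimensional features for one keyframe and four candidate frames.
keyframe = [1.0, 0.0, 0.0]
frames = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0], [1.0, 0.0, 0.2]]
print(rank_by_similarity(keyframe, frames))
```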