4,582 research outputs found

    Video Shot Clustering using Spectral Methods

    The automatic segmentation and structuring of videos present technical challenges due to the large variation of content, spatial layout, and possible lack of storyline. In this paper, we propose a spectral method to group video shots into scenes based on their visual similarity and temporal relations. Spectral methods have been shown to be effective in capturing perceptual organization features. In particular, we investigate the problem of automatic model selection, which is currently an open research issue for spectral methods, and propose measures to assess the validity of a grouping result. The methodology is used to group shots from home videos and soccer games. The results indicate the validity of the proposed approach, both compared to existing techniques as well as to human performance

    Assessing Scene Structuring in Consumer Videos

    Scene structuring is a video analysis task for which no common evaluation procedures have been fully adopted. In this paper, we present a methodology to evaluate this task in home videos, which takes into account human judgement, and includes a representative corpus, a set of objective performance measures, and an evaluation protocol. The components of our approach are detailed as follows. First, we describe the generation of a set of home video scene structures produced by multiple people. Second, we define similarity measures that model variations with respect to two factors: human perceptual organization and level of structure granularity. Third, we describe a protocol for evaluation of automatic algorithms based on their comparison to human performance. We illustrate our methodology by assessing the performance of two recently proposed methods: probabilistic hierarchical clustering and spectral clustering.

    Hierarchical Hidden Markov Model in Detecting Activities of Daily Living in Wearable Videos for Studies of Dementia

    This paper presents a method for indexing activities of daily living in videos obtained from wearable cameras. In the context of dementia diagnosis by doctors, the videos are recorded at patients' houses and later visualized by the medical practitioners. The videos may last up to two hours, therefore a tool for efficient navigation in terms of activities of interest is crucial for the doctors. The specific recording mode provides video data which are really difficult, being a single sequence shot where strong motion and sharp lighting changes often appear. Our work introduces an automatic motion-based segmentation of the video and a video structuring approach in terms of activities by a hierarchical two-level Hidden Markov Model. We define our description space over motion and visual characteristics of video and audio channels. Experiments on real data obtained from the recording at home of several patients show the difficulty of the task and the promising results of our approach.

    An Overview of Video Shot Clustering and Summarization Techniques for Mobile Applications

    The problem of content characterization of video programmes is of great interest because video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in literature for content characterization of video programmes, including movies and sports, that could be helpful for mobile media consumption. In particular we focus our analysis on shot clustering methods and effective video summarization techniques since, in the current video analysis scenario, they facilitate the access to the content and help in quick understanding of the associated semantics. First we consider the shot clustering techniques based on low-level features, using visual, audio and motion information, even combined in a multi-modal fashion. Then we concentrate on summarization techniques, such as static storyboards, dynamic video skimming and the extraction of sport highlights. Discussed summarization methods can be employed in the development of tools that would be greatly useful to most mobile users: in fact these algorithms automatically shorten the original video while preserving most events by highlighting only the important content. The effectiveness of each approach has been analyzed, showing that it mainly depends on the kind of video programme it relates to, and the type of summary or highlights we are focusing on

    Black box

    Fear and paranoia are steadily on the rise throughout the world as a result, in part, of media's presentation of violent and traumatic imagery. The dissemination and reception of these types of images are consequential for a viewing public, including an increasing desensitization to violence through over-exposure; the potential for aggressive behavior by people of all ages; and the loss of a viewer's accountability as witness to a disturbing event. Black Box is an aesthetic investigation of the reception of traumatic images by a viewing public. In order to trace this reception, the image of the American crow (Corvus brachyrhynchos), removed from its natural context, is transformed via moving imagery into literal, violent recreations of events and images present within today's media-soaked culture. The crow functions as a metaphor of the ways in which images are first read and then subsequently shape contemporary viewership. The use of video identifies the disseminating power of 24-hour media, with its telltale marks of time and sequence, recording and broadcasting. Moving imagery, sound production, and the metaphorical presentation of the crow combine to create a visual metonym for conflict and suggest an ominous threat of trauma.

    Machine vision applications in UAVs for autonomous aerial refueling and runway detection

    This research focuses on the application of Machine Vision (MV) techniques and algorithms to the problems of Autonomous Aerial Refueling (AAR) and Runway Detection. In particular, real laboratory-based hardware was used in a simulated environment to emulate real-life conditions for AAR. It was shown that the K-Means Clustering Algorithm solution to the Marker Detection problem could be executed at a frame rate of 30 Hz and it averaged a tracking error of less than one pixel while utilizing only 0.16% of the image. It was also shown that the solution to the Runway Detection problem could be executed at a frame rate of 20 Hz, which is acceptable for use in a UAV performing reconnaissance work. Data from these tests suggest that both software schemes are suitable for applications in moving vehicles and that the accuracy of the measurements produced by the schemes makes them suitable for UAV applications.

    Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

    In the era when the market segment of Internet of Things (IoT) tops the chart in various business reports, it is apparently envisioned that the field of medicine expects to gain a large benefit from the explosion of wearables and internet-connected sensors that surround us to acquire and communicate unprecedented data on symptoms, medication, food intake, and daily-life activities impacting one's health and wellness. However, IoT-driven healthcare would have to overcome many barriers, such as: 1) There is an increasing demand for data storage on cloud servers where the analysis of the medical big data becomes increasingly complex, 2) The data, when communicated, are vulnerable to security and privacy issues, 3) The communication of the continuously collected data is not only costly but also energy hungry, 4) Operating and maintaining the sensors directly from the cloud servers are non-trivial tasks. This book chapter defined Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing the interfaces between the sensors and cloud servers for facilitating connectivity, data transfer, and queryable local database. The centerpiece of Fog computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage and communication of the various medical data such as pathological speech data of individuals with speech disorders, Phonocardiogram (PCG) signal for heart rate estimation, and Electrocardiogram (ECG)-based Q, R, S detection.
    Comment: 29 pages, 30 figures, 5 tables.
    Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices, Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springe

    Algorithms for Video Structuring

    Video structuring aims at automatically finding structure in a video sequence. Occupying a key position within video analysis, it is a fundamental step for quality indexing and browsing. As a low-level video analysis, video structuring can be seen as a serial process which includes (i) shot boundary detection, (ii) video shot feature extraction and (iii) video shot clustering. The resulting analysis serves as the base for higher-level processing such as content-based image retrieval or semantic indexing. In this study, the whole process is examined and implemented. Two shot boundary detectors based on motion estimation and color distribution analysis are designed. Based on recent advances in machine learning, a novel technique for video shot clustering is presented. Typical approaches for segmenting and clustering shots use graph analysis, with split and merge algorithms for finding subgraphs corresponding to different scenes. In this work, the clustering algorithm is based on a spectral method which has proven its efficiency in still-image segmentation. This technique clusters points (in our case features extracted from video shots) using eigenvectors of matrices derived from data. Relevant data depends on the quality of feature extraction. After stating the main problems of video structuring, solutions are proposed defining a heuristic distance metric for similarity between shots. We combine color visual features with time constraints. The entire process of video structuring is tested on a ten-hour home video database.
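
The clustering step described in this abstract (grouping shots using eigenvectors of a matrix derived from color features and time constraints) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian affinity, the parameters `sigma_f` and `sigma_t`, the function names, and the two-way split on the Fiedler vector of the normalized graph Laplacian are all assumptions chosen for illustration.

```python
import numpy as np


def shot_affinity(features, times, sigma_f=1.0, sigma_t=5.0):
    """Assumed affinity: Gaussian in feature distance times Gaussian in time gap."""
    f = np.asarray(features, dtype=float)
    t = np.asarray(times, dtype=float)
    # Pairwise Euclidean distance between shot feature vectors.
    d_f = np.linalg.norm(f[:, None, :] - f[None, :, :], axis=2)
    # Pairwise absolute difference between shot time stamps.
    d_t = np.abs(t[:, None] - t[None, :])
    return np.exp(-(d_f / sigma_f) ** 2) * np.exp(-(d_t / sigma_t) ** 2)


def spectral_bipartition(affinity):
    """Split shots into two groups by the sign of the Fiedler vector."""
    w = np.asarray(affinity, dtype=float)
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(lap)
    fiedler = eigvecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)


# Hypothetical toy data: six shots, two visually distinct groups
# recorded at two different times (features stand in for color histograms).
features = [[0.9, 0.1], [1.0, 0.0], [0.95, 0.05],
            [0.1, 0.9], [0.0, 1.0], [0.05, 0.95]]
times = [0, 1, 2, 6, 7, 8]
labels = spectral_bipartition(shot_affinity(features, times))
print(labels)
```

Which group receives label 0 or 1 is arbitrary, because the sign of an eigenvector is not fixed. Grouping into more than two scenes, and choosing the number of groups automatically, would need further steps that this sketch does not cover.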

    Diasporic Archives and Hauntological Accretions

    Centering on two recent participatory archive projects, Jacqueline Hoàng Nguyễn’s The Making of An Archive (2014-present), and Regent Park Film Festival’s Home Made Visible (2017-2019), this essay examines how diasporic archives “densify” authoritative records, and allow us to think generatively about archival movements and accretions. Both projects gathered and digitised archives from members of diasporic and racialised communities. Through public calls and workshops soliciting amateur archivists’ personal and familial still and moving image troves, these projects prioritised excavating and inscribing quotidian and ephemeral records as a response to Canadian multiculturalism’s imposed silences. The essay approaches diaspora – and diasporic archives – not (just) through rubrics of loss and obsolescence, but through the concept of hauntological thickening, arguing that these two projects intervene on authoritative and singular archival narratives by densifying the latter with occluded histories, affects, and textural traces of transfer. It also examines how quotidian visual records offer hauntological refractions of official narratives, and become vehicles for imbrications of personal, familial, and national histories and discourses. Finally, the essay concludes with an exploration of how the archives engage audiences through affective and sensorial registers.