Video Shot Clustering using Spectral Methods
The automatic segmentation and structuring of videos present technical challenges due to the large variation of content, spatial layout, and possible lack of storyline. In this paper, we propose a spectral method to group video shots into scenes based on their visual similarity and temporal relations. Spectral methods have been shown to be effective in capturing perceptual organization features. In particular, we investigate the problem of automatic model selection, which is currently an open research issue for spectral methods, and propose measures to assess the validity of a grouping result. The methodology is used to group shots from home videos and soccer games. The results indicate the validity of the proposed approach, compared both to existing techniques and to human performance.
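As a rough illustration of grouping shots by visual similarity and temporal relations, the sketch below builds a pairwise affinity from colour histograms and shot times, then splits the shots in two using the second eigenvector of the normalized affinity matrix. It is not the authors' method: the Gaussian affinity, the parameters `sigma_c` and `sigma_t`, the function names, and the restriction to a two-way split are all assumptions made for the example, and the model-selection step discussed in the abstract is not covered.

```python
import math

def shot_affinity(hists, times, sigma_c=0.3, sigma_t=5.0):
    """Pairwise affinity: Gaussian in colour-histogram distance times Gaussian in time gap.
    sigma_c and sigma_t are illustrative values, not taken from the paper."""
    n = len(hists)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(hists[i], hists[j]))
            dt = times[i] - times[j]
            W[i][j] = math.exp(-d2 / (2 * sigma_c ** 2)) * math.exp(-dt ** 2 / (2 * sigma_t ** 2))
    return W

def spectral_bipartition(W, iters=500):
    """Two-way split from the second eigenvector of D^{-1/2} W D^{-1/2},
    found by power iteration with the leading eigenvector projected out."""
    n = len(W)
    deg = [sum(row) for row in W]
    # Shift by the identity so every eigenvalue is non-negative and power iteration is safe.
    M = [[W[i][j] / math.sqrt(deg[i] * deg[j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    total = math.sqrt(sum(deg))
    u = [math.sqrt(d) / total for d in deg]      # known leading eigenvector
    v = [float(i) for i in range(n)]             # deterministic starting vector
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(u, v))
        v = [a - dot * b for a, b in zip(v, u)]  # remove the leading component
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # The sign of each (degree-rescaled) entry gives the group label.
    return [1 if v[i] / math.sqrt(deg[i]) > 0 else 0 for i in range(n)]

# Toy data: three reddish shots followed by three bluish shots.
hists = [[0.80, 0.10, 0.10], [0.75, 0.15, 0.10], [0.80, 0.12, 0.08],
         [0.10, 0.10, 0.80], [0.10, 0.15, 0.75], [0.12, 0.08, 0.80]]
times = [0, 1, 2, 3, 4, 5]
labels = spectral_bipartition(shot_affinity(hists, times))
```

Which group receives label 0 or 1 depends on the sign of the eigenvector, so only the grouping itself is meaningful.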
Assessing Scene Structuring in Consumer Videos
Scene structuring is a video analysis task for which no common evaluation procedures have been fully adopted. In this paper, we present a methodology to evaluate this task in home videos, which takes into account human judgement, and includes a representative corpus, a set of objective performance measures, and an evaluation protocol. The components of our approach are detailed as follows. First, we describe the generation of a set of home video scene structures produced by multiple people. Second, we define similarity measures that model variations with respect to two factors: human perceptual organization and level of structure granularity. Third, we describe a protocol for evaluation of automatic algorithms based on their comparison to human performance. We illustrate our methodology by assessing the performance of two recently proposed methods: probabilistic hierarchical clustering and spectral clustering.
Hierarchical Hidden Markov Model in Detecting Activities of Daily Living in Wearable Videos for Studies of Dementia
This paper presents a method for indexing activities of daily living in videos obtained from wearable cameras. In the context of dementia diagnosis by doctors, the videos are recorded at patients' houses and later viewed by the medical practitioners. The videos may last up to two hours, so a tool for efficient navigation in terms of activities of interest is crucial for the doctors. The specific recording mode produces video data that are difficult to analyze: a single continuous sequence in which strong motion and sharp lighting changes often appear. Our work introduces an automatic motion-based segmentation of the video and a video structuring approach in terms of activities by a hierarchical two-level Hidden Markov Model. We define our description space over motion and visual characteristics of video and audio channels. Experiments on real data obtained from recordings at the homes of several patients show the difficulty of the task and the promising results of our approach.
An Overview of Video Shot Clustering and Summarization Techniques for Mobile Applications
The problem of content characterization of video programmes is of great interest because video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in the literature for content characterization of video programmes, including movies and sports, that could be helpful for mobile media consumption. In particular we focus our analysis on shot clustering methods and effective video summarization techniques since, in the current video analysis scenario, they facilitate access to the content and help in quick understanding of the associated semantics. First we consider the shot clustering techniques based on low-level features, using visual, audio and motion information, even combined in a multi-modal fashion. Then we concentrate on summarization techniques, such as static storyboards, dynamic video skimming and the extraction of sport highlights. The summarization methods discussed can be employed in the development of tools that would be useful to most mobile users: these algorithms automatically shorten the original video while preserving most events by highlighting only the important content. The effectiveness of each approach has been analyzed, showing that it mainly depends on the kind of video programme it relates to and on the type of summary or highlights being sought.
Black box
Fear and paranoia are steadily on the rise throughout the world as a result, in part, of media's presentation of violent and traumatic imagery. The dissemination and reception of these types of images are consequential for a viewing public, including an increasing desensitization to violence through over-exposure; the potential for aggressive behavior by people of all ages; and the loss of a viewer's accountability as witness to a disturbing event. Black Box is an aesthetic investigation of the reception of traumatic images by a viewing public. In order to trace this reception, the image of the American crow (Corvus brachyrhynchos), removed from its natural context, is transformed via moving imagery into literal, violent recreations of events and images present within today's media-soaked culture. The crow functions as a metaphor of the ways in which images are first read and then subsequently shape contemporary viewership. The use of video identifies the disseminating power of 24-hour media, with its telltale marks of time and sequence, recording and broadcasting. Moving imagery, sound production, and the metaphorical presentation of the crow combine to create a visual metonym for conflict and suggest an ominous threat of trauma.
Machine vision applications in UAVs for autonomous aerial refueling and runway detection
This research focuses on the application of Machine Vision (MV) techniques and algorithms to the problems of Autonomous Aerial Refueling (AAR) and Runway Detection. In particular, real laboratory-based hardware was used in a simulated environment to emulate real-life conditions for AAR. It was shown that the K-Means Clustering Algorithm solution to the Marker Detection problem could be executed at a frame rate of 30 Hz and that it averaged a tracking error of less than one pixel while utilizing only 0.16% of the image. It was also shown that the solution to the Runway Detection problem could be executed at a frame rate of 20 Hz, which is acceptable for use in a UAV performing reconnaissance work. Data from these tests suggest that both software schemes are suitable for applications in moving vehicles and that the accuracy of the measurements they produce makes them suitable for UAV applications.
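The abstract does not describe how K-Means is applied to marker detection, so the following is only a minimal sketch under an assumption: markers show up as small clumps of bright pixels, and clustering the pixel coordinates yields one centre per marker. The function names, the farthest-point initialisation, and the toy coordinates are all hypothetical.

```python
def dist2(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from every centre chosen so far.
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[nearest].append(p)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers

# Hypothetical coordinates of bright pixels belonging to two markers.
bright_pixels = [(9, 10), (10, 10), (11, 10), (10, 9), (10, 11),
                 (49, 40), (50, 40), (51, 40), (50, 39), (50, 41)]
marker_centers = sorted(kmeans(bright_pixels, k=2))
```

Clustering only the thresholded pixels, rather than the full frame, is one way a detector could touch a small fraction of the image, though whether this matches the 0.16% figure in the abstract is not something the source states.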
Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications
In an era when the Internet of Things (IoT) market segment tops the charts in various business reports, the field of medicine is expected to benefit greatly from the explosion of wearables and internet-connected sensors that surround us, which acquire and communicate unprecedented data on symptoms, medication, food intake, and daily-life activities impacting one's health and wellness. However, IoT-driven healthcare will have to overcome many barriers: 1) there is an increasing demand for data storage on cloud servers, where the analysis of medical big data becomes increasingly complex; 2) the data, when communicated, are vulnerable to security and privacy issues; 3) the communication of the continuously collected data is not only costly but also energy hungry; 4) operating and maintaining the sensors directly from the cloud servers are non-trivial tasks. This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing interfaces between the sensors and cloud servers to facilitate connectivity, data transfer, and a queryable local database. The centerpiece of Fog Computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage and communication of various medical data such as pathological speech data of individuals with speech disorders, Phonocardiogram (PCG) signals for heart rate estimation, and Electrocardiogram (ECG)-based Q, R, S detection.
Comment: 29 pages, 30 figures, 5 tables. Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices, Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springe
Algorithms for Video Structuring
Video structuring aims at automatically finding structure in a video sequence. Occupying a key position within video analysis, it is a fundamental step for quality indexing and browsing. As a low-level video analysis, video structuring can be seen as a serial process which includes (i) shot boundary detection, (ii) video shot feature extraction and (iii) video shot clustering. The resulting analysis serves as the base for higher-level processing such as content-based image retrieval or semantic indexing. In this study, the whole process is examined and implemented. Two shot boundary detectors based on motion estimation and color distribution analysis are designed. Based on recent advances in machine learning, a novel technique for video shot clustering is presented. Typical approaches for segmenting and clustering shots use graph analysis, with split and merge algorithms for finding subgraphs corresponding to different scenes. In this work, the clustering algorithm is based on a spectral method which has proven its efficiency in still-image segmentation. This technique clusters points (in our case features extracted from video shots) using eigenvectors of matrices derived from the data. The relevance of the data depends on the quality of feature extraction. After stating the main problems of video structuring, solutions are proposed that define a heuristic distance metric for similarity between shots. We combine color visual features with time constraints. The entire process of video structuring is tested on a ten-hour home video database.
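Step (i) of the serial process, shot boundary detection from colour distribution, can be sketched roughly as follows. This is an illustrative assumption rather than the detector designed in the study: frames are taken to be flat lists of 8-bit grayscale intensities, the histogram uses only four bins, and the `threshold` value is arbitrary.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for px in frame:
        counts[px * bins // max_value] += 1
    return [c / len(frame) for c in counts]

def detect_shot_boundaries(frames, threshold=0.5):
    """Return the indices of frames that start a new shot.
    A boundary is declared when the histogram difference between
    consecutive frames exceeds the (illustrative) threshold."""
    if not frames:
        return []
    boundaries = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        # Half the L1 distance between histograms lies in [0, 1].
        diff = 0.5 * sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries

# Toy sequence: two dark frames, two bright frames, one mid-grey frame.
frames = [[10] * 16, [12] * 16, [200] * 16, [205] * 16, [90] * 16]
cuts = detect_shot_boundaries(frames)
```

A fixed threshold handles abrupt cuts like these but would miss gradual transitions, which is one reason a second, motion-based detector can be useful alongside it.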
Diasporic Archives and Hauntological Accretions
Centering on two recent participatory archive projects, Jacqueline Hoàng Nguyễn’s The Making of An Archive (2014-present), and Regent Park Film Festival’s Home Made Visible (2017-2019), this essay examines how diasporic archives “densify” authoritative records, and allow us to think generatively about archival movements and accretions. Both projects gathered and digitised archives from members of diasporic and racialised communities. Through public calls and workshops soliciting amateur archivists’ personal and familial still and moving image troves, these projects prioritised excavating and inscribing quotidian and ephemeral records as a response to Canadian multiculturalism’s imposed silences. The essay approaches diaspora – and diasporic archives – not (just) through rubrics of loss and obsolescence, but through the concept of hauntological thickening, arguing that these two projects intervene on authoritative and singular archival narratives by densifying the latter with occluded histories, affects, and textural traces of transfer. It also examines how quotidian visual records offer hauntological refractions of official narratives, and become vehicles for imbrications of personal, familial, and national histories and discourses. Finally, the essay concludes with an exploration of how the archives engage audiences through affective and sensorial registers.