753 research outputs found
Framework for Clique-based Fusion of Graph Streams in Multi-function System Testing
The paper describes a framework for multi-function system testing.
Multi-function system testing is considered as fusion (or revelation) of
clique-like structures. The following sets are considered: (i) subsystems
(system parts or units / components / modules), (ii) system functions and a
subset of system components for each system function, and (iii) function
clusters (some groups of system functions which are used jointly). Test
procedures (as units testing) are used for each subsystem. The procedures lead
to an ordinal result (states, colors) for each component, e.g., [1,2,3,4]
(where 1 corresponds to 'out of service', 2 corresponds to 'major faults', 3
corresponds to 'minor faults', 4 corresponds to 'trouble free service'). Thus,
for each system function a graph over corresponding system components is
examined while taking into account ordinal estimates/colors of the components.
Further, an integrated graph (i.e., colored graph) for each function cluster is
considered (this graph integrates the graphs for corresponding system
functions). For the integrated graph (for each function cluster) structure
revelation problems are under examination (revelation of some subgraphs which
can lead to system faults): (1) revelation of clique and quasi-clique (by
vertices at level 1, 2, etc.; by edges/interconnection existence) and (2)
dynamical problems (when vertex colors are functions of time) are studied as
well: existence of a time interval when clique or quasi-clique can exist.
Numerical examples illustrate the approach and problems.Comment: 6 pages, 13 figure
Online Mutual Foreground Segmentation for Multispectral Stereo Videos
The segmentation of video sequences into foreground and background regions is
a low-level process commonly used in video content analysis and smart
surveillance applications. Using a multispectral camera setup can improve this
process by providing more diverse data to help identify objects despite adverse
imaging conditions. The registration of several data sources is however not
trivial if the appearance of objects produced by each sensor differs
substantially. This problem is further complicated when parallax effects cannot
be ignored when using close-range stereo pairs. In this work, we present a new
method to simultaneously tackle multispectral segmentation and stereo
registration. Using an iterative procedure, we estimate the labeling result for
one problem using the provisional result of the other. Our approach is based on
the alternating minimization of two energy functions that are linked through
the use of dynamic priors. We rely on the integration of shape and appearance
cues to find proper multispectral correspondences, and to properly segment
objects in low contrast regions. We also formulate our model as a frame
processing pipeline using higher order terms to improve the temporal coherence
of our results. Our method is evaluated under different configurations on
multiple multispectral datasets, and our implementation is available online.Comment: Preprint accepted for publication in IJCV (December 2018
Automatic Synchronization of Multi-User Photo Galleries
In this paper we address the issue of photo galleries synchronization, where
pictures related to the same event are collected by different users. Existing
solutions to address the problem are usually based on unrealistic assumptions,
like time consistency across photo galleries, and often heavily rely on
heuristics, limiting therefore the applicability to real-world scenarios. We
propose a solution that achieves better generalization performance for the
synchronization task compared to the available literature. The method is
characterized by three stages: at first, deep convolutional neural network
features are used to assess the visual similarity among the photos; then, pairs
of similar photos are detected across different galleries and used to construct
a graph; eventually, a probabilistic graphical model is used to estimate the
temporal offset of each pair of galleries, by traversing the minimum spanning
tree extracted from this graph. The experimental evaluation is conducted on
four publicly available datasets covering different types of events,
demonstrating the strength of our proposed method. A thorough discussion of the
obtained results is provided for a critical assessment of the quality in
synchronization.Comment: ACCEPTED to IEEE Transactions on Multimedi
Audio-visual football video analysis, from structure detection to attention analysis
Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academic fields. A sports video is characterised by repetitive temporal structures, relatively plain contents, and strong spatio-temporal variations, such as quick camera switches and swift local motions. It is necessary to develop specific techniques for content-based sports video analysis to utilise these characteristics.
For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are key stories for sports videos; (2) what incurs viewer’s interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approached these questions from two different perspectives and in turn three research contributions are presented, namely, replay detection, attack temporal structure decomposition, and attention-based highlight identification.
Replay segments convey the most important contents in sports videos. It is an efficient approach to collect game highlights by detecting replay segments. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of replay is complex, which includes logo transitions, slow motions, viewpoint switches and normal speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement of replay detection. A two-pass system was developed, including a five-layer adaboost classifier and a logo template matching throughout an entire video. The five-layer adaboost utilises shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions, to filter out logo transition candidates. Subsequently, a logo template is constructed and employed to find all transition logo sequences. The precision and recall of this system in replay detection is 100% in a five-game evaluation collection.
An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as other sports videos. We review the literature of content-based temporal structures, such as play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely, play, focus, replay and break were identified by low level visual features. A four-state hidden Markov model was trained to simulate transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. These occurrences of this substring are regarded as a kernel of an attack hidden Markov process. Therefore, the decomposition of attack structure becomes a boundary likelihood comparison between two Markov chains.
Highlights are what attract notice. Attention is a psychological measurement of “notice ”. A brief survey of attention psychological background, attention estimation from vision and auditory, and multiple modality attention fusion is presented. We propose two attention models for sports video analysis, namely, the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the perception structure during watching video. This model removes reflection bias among modality salient signals and combines these signals by reflectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are filled with noise. This framework tries to estimate a noise-less signal from these coarse noisy observations by a multiple resolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real time event detection. The experiment shows that these attention-based approach can find goal events at a high precision. Moreover, results of MAR-based highlight detection on the final game of FIFA 2002 and 2006 are highly similar to professionally labelled highlights by BBC and FIFA
- …