1,304 research outputs found
Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos
Wearable cameras stand out as one of the most promising devices for the
upcoming years, and as a consequence, the demand of computer algorithms to
automatically understand the videos recorded with them is increasing quickly.
An automatic understanding of these videos is not an easy task, and its mobile
nature implies important challenges to be faced, such as the changing light
conditions and the unrestricted locations recorded. This paper proposes an
unsupervised strategy based on global features and manifold learning to endow
wearable cameras with contextual information regarding the light conditions and
the location captured. Results show that non-linear manifold methods can
capture contextual patterns from global features without compromising large
computational resources. The proposed strategy is used, as an application case,
as a switching mechanism to improve the hand-detection problem in egocentric
videos.Comment: Submitted for publicatio
Periphery Plots for Contextualizing Heterogeneous Time-Based Charts
Patterns in temporal data can often be found across different scales, such as
days, weeks, and months, making effective visualization of time-based data
challenging. Here we propose a new approach for providing focus and context in
time-based charts to enable interpretation of patterns across time scales. Our
approach employs a focus zone with a time and a second axis, that can either
represent quantities or categories, as well as a set of adjacent periphery
plots that can aggregate data along the time, value, or both dimensions. We
present a framework for periphery plots and describe two use cases that
demonstrate the utility of our approach.Comment: To Appear in IEEE VIS 2019 Short Papers. Open source software and
other materials available on github:
https://github.com/PrecisionVISSTA/PeripheryPlots Video figure available on
Vimeo: https://vimeo.com/34967814
Similarity search and data mining techniques for advanced database systems.
Modern automated methods for measurement, collection, and analysis of data in industry and science are providing more and more data with drastically increasing structure complexity. On the one hand, this growing complexity is justified by the need for a richer and more precise description of real-world objects, on the other hand it is justified by the rapid progress in measurement and analysis techniques that allow the user a versatile exploration of objects. In order to manage the huge volume of such complex data, advanced database systems are employed. In contrast to conventional database systems that support exact match queries, the user of these advanced database systems focuses on applying similarity search and data mining techniques.
Based on an analysis of typical advanced database systems — such as biometrical, biological, multimedia, moving, and CAD-object database systems — the following three challenging characteristics of complexity are detected: uncertainty (probabilistic feature vectors), multiple instances (a set of homogeneous feature vectors), and multiple representations (a set of heterogeneous feature vectors). Therefore, the goal of this thesis is to develop similarity search and data mining techniques that are capable of handling uncertain, multi-instance, and multi-represented objects.
The first part of this thesis deals with similarity search techniques. Object identification is a similarity search technique that is typically used for the recognition of objects from image, video, or audio data. Thus, we develop a novel probabilistic model for object identification. Based on it, two novel types of identification queries are defined. In order to process the novel query types efficiently, we introduce an index structure called Gauss-tree. In addition, we specify further probabilistic models and query types for uncertain multi-instance objects and uncertain spatial objects. Based on the index structure, we develop algorithms for an efficient processing of these query types. Practical benefits of using probabilistic feature vectors are demonstrated on a real-world application for video similarity search. Furthermore, a similarity search technique is presented that is based on aggregated multi-instance objects, and that is suitable for video similarity search. This technique takes multiple representations into account in order to achieve better effectiveness.
The second part of this thesis deals with two major data mining techniques: clustering and classification. Since privacy preservation is a very important demand of distributed advanced applications, we propose using uncertainty for data obfuscation in order to provide privacy preservation during clustering. Furthermore, a model-based and a density-based clustering method for multi-instance objects are developed. Afterwards, original extensions and enhancements of the density-based clustering algorithms DBSCAN and OPTICS for handling multi-represented objects are introduced. Since several advanced database systems like biological or multimedia database systems handle predefined, very large class systems, two novel classification techniques for large class sets that benefit from using multiple representations are defined. The first classification method is based on the idea of a k-nearest-neighbor classifier. It employs a novel density-based technique to reduce training instances and exploits the entropy impurity of the local neighborhood in order to weight a given representation. The second technique addresses hierarchically-organized class systems. It uses a novel hierarchical, supervised method for the reduction of large multi-instance objects, e.g. audio or video, and applies support vector machines for efficient hierarchical classification of multi-represented objects. User benefits of this technique are demonstrated by a prototype that performs a classification of large music collections.
The effectiveness and efficiency of all proposed techniques are discussed and verified by comparison with conventional approaches in versatile experimental evaluations on real-world datasets
Similarity search and data mining techniques for advanced database systems.
Modern automated methods for measurement, collection, and analysis of data in industry and science are providing more and more data with drastically increasing structure complexity. On the one hand, this growing complexity is justified by the need for a richer and more precise description of real-world objects, on the other hand it is justified by the rapid progress in measurement and analysis techniques that allow the user a versatile exploration of objects. In order to manage the huge volume of such complex data, advanced database systems are employed. In contrast to conventional database systems that support exact match queries, the user of these advanced database systems focuses on applying similarity search and data mining techniques.
Based on an analysis of typical advanced database systems — such as biometrical, biological, multimedia, moving, and CAD-object database systems — the following three challenging characteristics of complexity are detected: uncertainty (probabilistic feature vectors), multiple instances (a set of homogeneous feature vectors), and multiple representations (a set of heterogeneous feature vectors). Therefore, the goal of this thesis is to develop similarity search and data mining techniques that are capable of handling uncertain, multi-instance, and multi-represented objects.
The first part of this thesis deals with similarity search techniques. Object identification is a similarity search technique that is typically used for the recognition of objects from image, video, or audio data. Thus, we develop a novel probabilistic model for object identification. Based on it, two novel types of identification queries are defined. In order to process the novel query types efficiently, we introduce an index structure called Gauss-tree. In addition, we specify further probabilistic models and query types for uncertain multi-instance objects and uncertain spatial objects. Based on the index structure, we develop algorithms for an efficient processing of these query types. Practical benefits of using probabilistic feature vectors are demonstrated on a real-world application for video similarity search. Furthermore, a similarity search technique is presented that is based on aggregated multi-instance objects, and that is suitable for video similarity search. This technique takes multiple representations into account in order to achieve better effectiveness.
The second part of this thesis deals with two major data mining techniques: clustering and classification. Since privacy preservation is a very important demand of distributed advanced applications, we propose using uncertainty for data obfuscation in order to provide privacy preservation during clustering. Furthermore, a model-based and a density-based clustering method for multi-instance objects are developed. Afterwards, original extensions and enhancements of the density-based clustering algorithms DBSCAN and OPTICS for handling multi-represented objects are introduced. Since several advanced database systems like biological or multimedia database systems handle predefined, very large class systems, two novel classification techniques for large class sets that benefit from using multiple representations are defined. The first classification method is based on the idea of a k-nearest-neighbor classifier. It employs a novel density-based technique to reduce training instances and exploits the entropy impurity of the local neighborhood in order to weight a given representation. The second technique addresses hierarchically-organized class systems. It uses a novel hierarchical, supervised method for the reduction of large multi-instance objects, e.g. audio or video, and applies support vector machines for efficient hierarchical classification of multi-represented objects. User benefits of this technique are demonstrated by a prototype that performs a classification of large music collections.
The effectiveness and efficiency of all proposed techniques are discussed and verified by comparison with conventional approaches in versatile experimental evaluations on real-world datasets
A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval
Existing long video retrieval systems are trained and tested in the
paragraph-to-video retrieval regime, where every long video is described by a
single long paragraph. This neglects the richness and variety of possible valid
descriptions of a video, which could be described in moment-by-moment detail,
or in a single phrase summary, or anything in between. To provide a more
thorough evaluation of the capabilities of long video retrieval systems, we
propose a pipeline that leverages state-of-the-art large language models to
carefully generate a diverse set of synthetic captions for long videos. We
validate this pipeline's fidelity via rigorous human inspection. We then
benchmark a representative set of video language models on these synthetic
captions using a few long video datasets, showing that they struggle with the
transformed data, especially the shortest captions. We also propose a
lightweight fine-tuning method, where we use a contrastive loss to learn a
hierarchical embedding loss based on the differing levels of information among
the various captions. Our method improves performance both on the downstream
paragraph-to-video retrieval task (+1.1% R@1 on ActivityNet), as well as for
the various long video retrieval metrics we compute using our synthetic data
(+3.6% R@1 for short descriptions on ActivityNet). For data access and other
details, please refer to our project website at
https://mgwillia.github.io/10k-words.Comment: 13 pages, 15 tables, 5 figure
Secure Surveillance Framework for IoT Systems Using Probabilistic Image Encryption
[EN] This paper proposes a secure surveillance framework for Internet of things (IoT) systems by intelligent integration of video summarization and image encryption. First, an efficient video summarization method is used to extract the informative frames using the processing capabilities of visual sensors. When an event is detected from keyframes, an alert is sent to the concerned authority autonomously. As the final decision about an event mainly depends on the extracted keyframes, their modification during transmission by attackers can result in severe losses. To tackle this issue, we propose a fast probabilistic and lightweight algorithm for the encryption of keyframes prior to transmission, considering the memory and processing requirements of constrained devices that increase its suitability for IoT systems. Our experimental results verify the effectiveness of the proposed method in terms of robustness, execution time, and security compared to other image encryption algorithms. Furthermore, our framework can reduce the bandwidth, storage, transmission cost, and the time required for analysts to browse large volumes of surveillance data and make decisions about abnormal events, such as suspicious activity detection and fire detection in surveillance applications.This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2016R1A2B4011712). Paper no. TII-17-2066.Muhammad, K.; Hamza, R.; Ahmad, J.; Lloret, J.; Wang, H.; Baik, SW. (2018). Secure Surveillance Framework for IoT Systems Using Probabilistic Image Encryption. IEEE Transactions on Industrial Informatics. 14(8):3679-3689. https://doi.org/10.1109/TII.2018.2791944S3679368914
Quality Control Tools for Video Preservation
To aid in the nation's efforts to preserve its video history, the Bay Area Video Coalition (BAVC) requests $350,000 over two years to develop an open source and freely available software "toolkit" to help perform sophisticated quality control on video digitization workflows. BAVC, in partnership with the Dance Heritage Coalition (DHC) and independent consultant Dave Rice, will create Quality Control Tools for Video Preservation (QC Tools), a suite of open source software tools that will ensure accurate and efficient assessment of media integrity throughout the archival digitization process
- …