1,304 research outputs found

Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos

Author: Barakova Emilia
Betancourt Alejandro
Díaz-Rodríguez Natalia
Marcenaro Lucio
Rauterberg Matthias
Regazzoni Carlo
Publication date: 01/01/2017

Wearable cameras stand out as one of the most promising devices for the upcoming years, and as a consequence, the demand for computer algorithms that automatically understand the videos recorded with them is increasing quickly. Automatic understanding of these videos is not an easy task, and their mobile nature poses important challenges, such as changing light conditions and unrestricted recording locations. This paper proposes an unsupervised strategy based on global features and manifold learning to endow wearable cameras with contextual information regarding the light conditions and the location captured. Results show that non-linear manifold methods can capture contextual patterns from global features without requiring large computational resources. The proposed strategy is used, as an application case, as a switching mechanism to improve hand detection in egocentric videos. Comment: Submitted for publication

arXiv.org e-Print Archive

Repository TU/e

Crossref

Pure OAI Repository

Repositorio Institucional Universidad de Granada

Archivio istituzionale della ricerca - Università di Genova

Periphery Plots for Contextualizing Heterogeneous Time-Based Charts

Author: Chung Arlene E.
Gehlenborg Nils
Gotz David
Manz Trevor
Morrow Bryce
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/08/2019

Patterns in temporal data can often be found across different scales, such as days, weeks, and months, making effective visualization of time-based data challenging. Here we propose a new approach for providing focus and context in time-based charts to enable interpretation of patterns across time scales. Our approach employs a focus zone with a time axis and a second axis, which can represent either quantities or categories, as well as a set of adjacent periphery plots that can aggregate data along the time dimension, the value dimension, or both. We present a framework for periphery plots and describe two use cases that demonstrate the utility of our approach. Comment: To appear in IEEE VIS 2019 Short Papers. Open source software and other materials available on GitHub: https://github.com/PrecisionVISSTA/PeripheryPlots Video figure available on Vimeo: https://vimeo.com/34967814
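The idea of aggregating periphery data along the time dimension can be illustrated with a minimal sketch. This is not the PeripheryPlots implementation; the function name `aggregate_periphery`, the equal-width binning, and the sample data are assumptions made purely for illustration.

```python
from statistics import mean

def aggregate_periphery(points, start, end, n_bins):
    """Average (time, value) points into n_bins equal-width time bins over [start, end).

    Hypothetical helper: the paper does not prescribe this particular aggregation.
    """
    width = (end - start) / n_bins
    bins = [[] for _ in range(n_bins)]
    for t, v in points:
        if start <= t < end:
            # Clamp the index so rounding never pushes a point past the last bin.
            idx = min(int((t - start) / width), n_bins - 1)
            bins[idx].append(v)
    # An empty bin has nothing to summarize, so it is reported as None.
    return [mean(b) if b else None for b in bins]

# Hypothetical daily readings: (day index, value).
series = [(d, float(d % 7)) for d in range(28)]

# Focus zone: days 21-27 kept at full resolution.
focus = [(t, v) for t, v in series if 21 <= t < 28]

# Left periphery: days 0-20 summarized as three weekly averages.
left_periphery = aggregate_periphery(series, 0, 21, 3)
```

Here the focus zone keeps every daily point, while the periphery reduces three weeks of context to one value per week.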

arXiv.org e-Print Archive

Crossref

Similarity search and data mining techniques for advanced database systems.

Author: Pryakhin Alexey
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 21/12/2006

Modern automated methods for measurement, collection, and analysis of data in industry and science are providing more and more data with drastically increasing structure complexity. On the one hand, this growing complexity is justified by the need for a richer and more precise description of real-world objects; on the other hand, it is justified by the rapid progress in measurement and analysis techniques that allow the user a versatile exploration of objects. In order to manage the huge volume of such complex data, advanced database systems are employed. In contrast to conventional database systems that support exact match queries, the user of these advanced database systems focuses on applying similarity search and data mining techniques. Based on an analysis of typical advanced database systems (such as biometrical, biological, multimedia, moving, and CAD-object database systems), the following three challenging characteristics of complexity are detected: uncertainty (probabilistic feature vectors), multiple instances (a set of homogeneous feature vectors), and multiple representations (a set of heterogeneous feature vectors). Therefore, the goal of this thesis is to develop similarity search and data mining techniques that are capable of handling uncertain, multi-instance, and multi-represented objects.

The first part of this thesis deals with similarity search techniques. Object identification is a similarity search technique that is typically used for the recognition of objects from image, video, or audio data. Thus, we develop a novel probabilistic model for object identification. Based on it, two novel types of identification queries are defined. In order to process the novel query types efficiently, we introduce an index structure called Gauss-tree. In addition, we specify further probabilistic models and query types for uncertain multi-instance objects and uncertain spatial objects. Based on the index structure, we develop algorithms for an efficient processing of these query types. Practical benefits of using probabilistic feature vectors are demonstrated on a real-world application for video similarity search. Furthermore, a similarity search technique is presented that is based on aggregated multi-instance objects, and that is suitable for video similarity search. This technique takes multiple representations into account in order to achieve better effectiveness.

The second part of this thesis deals with two major data mining techniques: clustering and classification. Since privacy preservation is a very important demand of distributed advanced applications, we propose using uncertainty for data obfuscation in order to provide privacy preservation during clustering. Furthermore, a model-based and a density-based clustering method for multi-instance objects are developed. Afterwards, original extensions and enhancements of the density-based clustering algorithms DBSCAN and OPTICS for handling multi-represented objects are introduced. Since several advanced database systems like biological or multimedia database systems handle predefined, very large class systems, two novel classification techniques for large class sets that benefit from using multiple representations are defined. The first classification method is based on the idea of a k-nearest-neighbor classifier. It employs a novel density-based technique to reduce training instances and exploits the entropy impurity of the local neighborhood in order to weight a given representation. The second technique addresses hierarchically-organized class systems. It uses a novel hierarchical, supervised method for the reduction of large multi-instance objects, e.g. audio or video, and applies support vector machines for efficient hierarchical classification of multi-represented objects. User benefits of this technique are demonstrated by a prototype that performs a classification of large music collections. The effectiveness and efficiency of all proposed techniques are discussed and verified by comparison with conventional approaches in versatile experimental evaluations on real-world datasets.
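As a rough illustration of the entropy impurity of a local neighborhood mentioned for the k-nearest-neighbor classifier, the following sketch computes the Shannon entropy of the class labels among a query's k nearest training points. It is not the thesis's method: the function names, the Euclidean distance, and the toy data are assumptions, and the thesis's density-based instance reduction and its actual weighting scheme are not reproduced here.

```python
import math
from collections import Counter

def entropy_impurity(labels):
    """Shannon entropy (in bits) of a list of class labels; 0 means a pure neighborhood."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def knn_labels(query, training, k):
    """Labels of the k training points closest to query (Euclidean distance, an assumption)."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    return [label for _, label in nearest]

# Hypothetical training set in one representation: (feature vector, class label).
training = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((1.0, 1.0), "b"),
    ((1.1, 1.0), "b"),
]

pure = entropy_impurity(knn_labels((0.0, 0.1), training, 2))   # neighbors are both "a"
mixed = entropy_impurity(knn_labels((0.5, 0.5), training, 4))  # two "a" and two "b"
```

A representation whose neighborhood around the query is purer (lower entropy) would plausibly receive a larger weight, although the exact mapping from impurity to weight is left open in this sketch.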

Digitale Hochschulschriften der LMU

A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval

Author: Cogswell Michael
Divakaran Ajay
Gwilliam Matthew
Shrivastava Abhinav
Sikka Karan
Ye Meng
Publication date: 30/11/2023

Existing long video retrieval systems are trained and tested in the paragraph-to-video retrieval regime, where every long video is described by a single long paragraph. This neglects the richness and variety of possible valid descriptions of a video, which could be described in moment-by-moment detail, or in a single phrase summary, or anything in between. To provide a more thorough evaluation of the capabilities of long video retrieval systems, we propose a pipeline that leverages state-of-the-art large language models to carefully generate a diverse set of synthetic captions for long videos. We validate this pipeline's fidelity via rigorous human inspection. We then benchmark a representative set of video language models on these synthetic captions using a few long video datasets, showing that they struggle with the transformed data, especially the shortest captions. We also propose a lightweight fine-tuning method, where we use a contrastive loss to learn a hierarchical embedding loss based on the differing levels of information among the various captions. Our method improves performance both on the downstream paragraph-to-video retrieval task (+1.1% R@1 on ActivityNet) and on the various long video retrieval metrics we compute using our synthetic data (+3.6% R@1 for short descriptions on ActivityNet). For data access and other details, please refer to our project website at https://mgwillia.github.io/10k-words. Comment: 13 pages, 15 tables, 5 figures
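The R@1 figures quoted above refer to recall at rank 1, a standard retrieval metric. A minimal sketch of how recall at k is commonly computed follows; the function name, the convention that caption i matches video i, and the similarity scores are assumptions for illustration and are not taken from the paper's evaluation code.

```python
def recall_at_k(similarity, k=1):
    """Fraction of captions whose matching video appears among the top-k ranked videos.

    similarity[i][j] is the score between caption i and video j; this sketch
    assumes the ground-truth video for caption i is video i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Rank video indices by descending similarity to caption i.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)

# Hypothetical caption-to-video similarity scores (3 captions, 3 videos).
scores = [
    [0.9, 0.1, 0.0],  # caption 0 ranks its own video first
    [0.2, 0.3, 0.8],  # caption 1 ranks video 2 above its own video
    [0.1, 0.2, 0.7],  # caption 2 ranks its own video first
]

r_at_1 = recall_at_k(scores, k=1)
r_at_2 = recall_at_k(scores, k=2)
```

In this toy case two of three captions retrieve their video at rank 1, and all three do within the top two.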

arXiv.org e-Print Archive

Secure Surveillance Framework for IoT Systems Using Probabilistic Image Encryption

Author: Ahmad Jamil
Baik Sung Wook
Hamza Rafik
Lloret Jaime
Muhammad Khan
Wang Haoxiang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2018

This paper proposes a secure surveillance framework for Internet of Things (IoT) systems by intelligent integration of video summarization and image encryption. First, an efficient video summarization method is used to extract the informative frames using the processing capabilities of visual sensors. When an event is detected from keyframes, an alert is sent to the concerned authority autonomously. As the final decision about an event mainly depends on the extracted keyframes, their modification during transmission by attackers can result in severe losses. To tackle this issue, we propose a fast probabilistic and lightweight algorithm for the encryption of keyframes prior to transmission, considering the memory and processing requirements of constrained devices that increase its suitability for IoT systems. Our experimental results verify the effectiveness of the proposed method in terms of robustness, execution time, and security compared to other image encryption algorithms. Furthermore, our framework can reduce the bandwidth, storage, transmission cost, and the time required for analysts to browse large volumes of surveillance data and make decisions about abnormal events, such as suspicious activity detection and fire detection in surveillance applications.

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2016R1A2B4011712). Paper no. TII-17-2066.

Muhammad, K.; Hamza, R.; Ahmad, J.; Lloret, J.; Wang, H.; Baik, SW. (2018). Secure Surveillance Framework for IoT Systems Using Probabilistic Image Encryption. IEEE Transactions on Industrial Informatics. 14(8):3679-3689. https://doi.org/10.1109/TII.2018.2791944

Crossref

RiuNet

Quality Control Tools for Video Preservation

Author: Benjamin Turkus
Moriah Ulinskas
Publication venue: 'Modern Language Association'
Publication date: 01/01/2015

To aid in the nation's efforts to preserve its video history, the Bay Area Video Coalition (BAVC) requests $350,000 over two years to develop an open source and freely available software "toolkit" to help perform sophisticated quality control on video digitization workflows. BAVC, in partnership with the Dance Heritage Coalition (DHC) and independent consultant Dave Rice, will create Quality Control Tools for Video Preservation (QC Tools), a suite of open source software tools that will ensure accurate and efficient assessment of media integrity throughout the archival digitization process.

Humanities Commons