528 research outputs found

    Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze

    Full text link
    Unsupervised segmentation of action segments in egocentric videos is a desirable feature in tasks such as activity recognition and content-based video retrieval. Reducing the search space to a finite set of action segments facilitates faster and less noisy matching. However, there exists a substantial gap in machine understanding of natural temporal cuts during a continuous human activity. This work reports on a novel gaze-based approach for segmenting action segments in videos captured with an egocentric camera. Gaze is used to locate the region of interest inside a frame. By tracking two simple motion-based parameters inside successive regions of interest, we discover a finite set of temporal cuts. We present several results using combinations of the two parameters on a dataset, BRISGAZE-ACTIONS, which contains egocentric videos depicting several daily-living activities. The quality of the temporal cuts is further improved by implementing two entropy measures. Comment: To appear in the 2017 IEEE International Conference on Signal and Image Processing Applications
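
    The abstract describes the approach only at a high level. As a rough, hypothetical sketch (not the authors' code), the snippet below tracks one simple motion parameter, the mean frame difference inside a gaze-centered region of interest, and flags a temporal cut whenever that parameter jumps sharply; the ROI size, threshold, and gaze-point format are all assumptions.

```python
# Hypothetical sketch of gaze-guided temporal-cut detection, loosely
# following the idea in the abstract (not the authors' implementation).
# Assumes one gaze point (x, y) per frame and a fixed-size square ROI.
import cv2
import numpy as np

ROI_HALF = 64          # half-width/height of the gaze-centered ROI, in pixels (assumed)
CUT_THRESHOLD = 12.0   # jump in mean ROI motion treated as a temporal cut (assumed)

def gaze_roi(frame, gaze_xy):
    """Crop a grayscale square ROI around the gaze point, clipped to the frame."""
    h, w = frame.shape[:2]
    x, y = int(gaze_xy[0]), int(gaze_xy[1])
    x0, x1 = max(0, x - ROI_HALF), min(w, x + ROI_HALF)
    y0, y1 = max(0, y - ROI_HALF), min(h, y + ROI_HALF)
    return cv2.cvtColor(frame[y0:y1, x0:x1], cv2.COLOR_BGR2GRAY)

def find_temporal_cuts(video_path, gaze_points):
    """Return frame indices where the ROI motion parameter changes sharply."""
    cap = cv2.VideoCapture(video_path)
    cuts, prev_roi, prev_motion = [], None, 0.0
    for idx, gaze in enumerate(gaze_points):
        ok, frame = cap.read()
        if not ok:
            break
        cur = gaze_roi(frame, gaze)
        if prev_roi is not None and cur.shape == prev_roi.shape:
            motion = float(np.mean(cv2.absdiff(cur, prev_roi)))  # mean frame difference in ROI
            if abs(motion - prev_motion) > CUT_THRESHOLD:
                cuts.append(idx)
            prev_motion = motion
        prev_roi = cur
    cap.release()
    return cuts
```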

    Recent Trends in Computational Intelligence

    Get PDF
    Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically-inspired techniques such as swarm intelligence, as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book discusses the use of CI for the optimal solving of various applications, demonstrating its wide reach and relevance. Coupling optimization methods with data mining strategies makes for a strong and reliable prediction tool for handling real-life applications.

    An Analyst's Assistant for the Interpretation of Vehicle Track Data

    Get PDF
    This report describes the Analyst's Assistant, a software system for language-interactive, collaborative user-system interpretation of events, specifically targeting vehicle events that can be recognized on the basis of vehicle track data. The Analyst's Assistant uses language not only as a means of interaction, but also as a basis for the internal representation of scene information, background knowledge, and results of interpretation. Building on this basis, the system demonstrates emerging intelligent-systems techniques related to event recognition, summarization of events, partitioning of subtasks between user and system, and handling of language and graphical references to scene entities during interactive analysis.
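
    The report describes event recognition from vehicle track data only in general terms. As a minimal, hypothetical illustration of that kind of recognition (not the Analyst's Assistant's actual representation or rules), the sketch below flags a "stop" event whenever the speed computed from consecutive track points stays below a threshold for a minimum duration; the TrackPoint fields and both thresholds are assumptions.

```python
# Hypothetical illustration of recognizing a simple vehicle event ("stop")
# from track data; the Analyst's Assistant's actual event rules and
# language-based representation are not shown in the abstract.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TrackPoint:
    t: float   # timestamp, seconds
    x: float   # position east, metres
    y: float   # position north, metres

def detect_stops(track: List[TrackPoint],
                 speed_thresh: float = 0.5,     # m/s, assumed
                 min_duration: float = 10.0     # seconds, assumed
                 ) -> List[Tuple[float, float]]:
    """Return (start, end) times where speed stays below speed_thresh."""
    stops, start = [], None
    for a, b in zip(track, track[1:]):
        dt = b.t - a.t
        speed = ((b.x - a.x) ** 2 + (b.y - a.y) ** 2) ** 0.5 / dt if dt > 0 else 0.0
        if speed < speed_thresh:
            start = a.t if start is None else start   # stop interval begins/continues
        else:
            if start is not None and a.t - start >= min_duration:
                stops.append((start, a.t))
            start = None
    if start is not None and track[-1].t - start >= min_duration:
        stops.append((start, track[-1].t))
    return stops
```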

    Sketch-Based Annotation and Visualization in Video Authoring

    Full text link

    Detection and Generalization of Spatio-temporal Trajectories for Motion Imagery

    Get PDF
    In today's world of vast information availability, users often confront large unorganized amounts of data with limited tools for managing them. Motion imagery (MI) datasets have become an increasingly popular means for exposing and disseminating information. Commonly, moving objects are of primary interest in modeling such datasets. Users may require different levels of detail, mainly for visualization and further processing purposes, according to the application at hand. In this thesis we exploit the geometric attributes of objects for dataset summarization by using a series of image processing and neural network tools. In order to form data summaries we select representative time instances through the segmentation of an object's spatio-temporal trajectory lines. Instances of high movement variation are selected through a new hybrid self-organizing map (SOM) technique to describe a single spatio-temporal trajectory. Multiple objects move in diverse yet classifiable patterns. In order to group corresponding trajectories we utilize an abstraction mechanism that investigates a vague moving relevance between the data in space and time. Thus, we introduce the spatio-temporal neighborhood unit as a variable generalization surface; by altering the unit's dimensions, scaled generalization is accomplished. Common complications in tracking applications, including occlusion, noise, information gaps, and unconnected segments of data sequences, are addressed through the hybrid-SOM analysis. Nevertheless, entangled data sequences, in which there is no information on which data entry belongs to which trajectory, are frequently encountered. A multidimensional classification technique that combines a geometric and a backpropagation neural network implementation is used to distinguish between trajectory data. Furthermore, modeling and summarization of two-dimensional phenomena evolving in time brings forward the novel concept of spatio-temporal helixes as compact event representations. The phenomena models are composed of SOM movement nodes (spines) and cardinality shape-change descriptors (prongs). While we focus on the analysis of MI datasets, the framework can be generalized to work with other types of spatio-temporal datasets. Multiple-scale generalization is allowed via a dynamic, significance-based scale rather than a constant one. The constructed summaries are not just a visualization product; they support further processing for metadata creation, indexing, and querying. Experimentation, comparisons, and error estimations for each technique support the analyses discussed.
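
    The hybrid SOM and its movement-variation criterion are described above only at an abstract level. The following sketch is a simplified, hypothetical stand-in: a plain one-dimensional self-organizing map that condenses a spatio-temporal trajectory of (x, y, t) samples into a handful of representative nodes; the node count, learning-rate schedule, and neighborhood schedule are assumptions, not the thesis' parameters.

```python
# Hypothetical sketch: summarize a spatio-temporal trajectory with a small
# one-dimensional self-organizing map (SOM). The thesis' hybrid SOM is
# more elaborate; this only illustrates the basic summarization idea.
import numpy as np

def som_summarize(trajectory, n_nodes=10, iters=2000, lr0=0.5, sigma0=2.0):
    """trajectory: (N, 3) array of (x, y, t) samples.
    Returns n_nodes representative points ordered along the 1-D map."""
    rng = np.random.default_rng(0)
    # initialize the map nodes on evenly spaced trajectory samples
    idx = np.linspace(0, len(trajectory) - 1, n_nodes).astype(int)
    nodes = trajectory[idx].astype(float).copy()
    positions = np.arange(n_nodes)
    for it in range(iters):
        lr = lr0 * (1.0 - it / iters)                     # decaying learning rate
        sigma = max(sigma0 * (1.0 - it / iters), 0.5)     # shrinking neighborhood
        sample = trajectory[rng.integers(len(trajectory))]
        bmu = np.argmin(np.linalg.norm(nodes - sample, axis=1))   # best-matching unit
        influence = np.exp(-((positions - bmu) ** 2) / (2 * sigma ** 2))
        nodes += lr * influence[:, None] * (sample - nodes)       # pull neighbors toward sample
    return nodes
```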

    Extraction of Key-Frames from an Unstable Video Feed

    Get PDF
    The APOLI project deals with Automated Power Line Inspection using Highly-automated Unmanned Aerial Systems. Besides the real-time damage assessment by on-board high-resolution image data exploitation, post-processing of the video data is necessary. This Master's thesis deals with the implementation of an isolator detector framework and a workflow in the Automotive Data and Time-triggered Framework (ADTF) that loads a video directly from a camera or from storage and extracts the key frames which contain objects of interest. This is done by implementing an object detection system in C++ and creating ADTF filters that detect the objects of interest and extract the key frames using a supervised learning platform. The use case is the extraction of frames from video samples that contain images of isolators from power transmission lines.
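
    The thesis implements this pipeline as C++ ADTF filters; purely as an illustration of the key-frame idea (kept in Python for brevity, with a placeholder detect_isolators standing in for the trained detector), the sketch below keeps only those frames in which the detector reports an object of interest.

```python
# Hypothetical sketch of key-frame extraction: keep only frames in which a
# detector reports an object of interest. The thesis realizes this with
# C++ ADTF filters and a supervised detector; detect_isolators here is a
# placeholder for whatever trained detector is plugged in.
import cv2

def detect_isolators(frame):
    """Placeholder detector: return a list of bounding boxes (x, y, w, h)."""
    return []  # replace with a trained detector's inference call

def extract_key_frames(video_path, out_pattern="keyframe_{:05d}.png"):
    cap = cv2.VideoCapture(video_path)
    saved, idx = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if detect_isolators(frame):              # frame contains an object of interest
            cv2.imwrite(out_pattern.format(idx), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved
```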