133 research outputs found

    Survey on video anomaly detection in dynamic scenes with moving cameras

    The increasing popularity of compact and inexpensive cameras, e.g., dash cameras, body cameras, and cameras mounted on robots, has sparked a growing interest in detecting anomalies within dynamic scenes recorded by moving cameras. However, existing reviews primarily concentrate on Video Anomaly Detection (VAD) methods that assume static cameras. The VAD literature on moving cameras remains fragmented and has lacked a comprehensive review to date. To address this gap, we present the first comprehensive survey on Moving Camera Video Anomaly Detection (MC-VAD). We examine the research papers related to MC-VAD, critically assessing their limitations and highlighting associated challenges. Our exploration encompasses three application domains: security, urban transportation, and marine environments, which in turn cover six specific tasks. We compile an extensive list of 25 publicly available datasets spanning four distinct environments: underwater, water surface, ground, and aerial. We summarize the types of anomalies these datasets correspond to or contain, and present five main categories of approaches for detecting such anomalies. Lastly, we identify future research directions and discuss novel contributions that could advance the field of MC-VAD. With this survey, we aim to offer a valuable reference for researchers and practitioners striving to develop and advance state-of-the-art MC-VAD methods. Comment: Under review

    Learning Latent Image Representations with Prior Knowledge

    Deep learning has become a dominant tool in many computer vision applications due to its superior performance in extracting low-dimensional latent representations from images. However, although prior knowledge already exists for many applications, most existing methods learn image representations from large-scale training data in a black-box way, which hinders interpretability and controllability. This thesis explores approaches that integrate different types of prior knowledge into deep neural networks. Instead of learning image representations from scratch, leveraging prior knowledge in the latent space can softly regularize the training and yield more controllable representations. The models presented in the thesis mainly address three different problems: (i) How to encode epipolar geometry in deep learning architectures for multi-view stereo. The key to multi-view stereo is finding matched correspondences across images. In this thesis, a learning-based method inspired by the classical plane sweep algorithm is studied. The method aims to improve correspondence matching in two ways: obtaining better potential correspondence candidates with a novel plane sampling strategy, and learning multiplane representations instead of using hand-crafted cost metrics. (ii) How to capture the correlations of input data in the latent space. Multiple methods that introduce Gaussian processes in the latent space to encode view priors are explored in the thesis. Depending on the availability of the relative motion of frames, a hierarchy of three covariance functions is presented as Gaussian process priors, and the correlated latent representations can be obtained via latent nonparametric fusion. Experimental results show that the correlated representations lead to more temporally consistent predictions for depth estimation, and that they can also be applied to generative models to synthesize images in new views. (iii) How to use the known factors of variation to learn disentangled representations. Equivariant representations and factorized representations are studied for novel view synthesis and interactive fashion retrieval, respectively. In summary, this thesis presents three different types of solutions that use prior domain knowledge to learn more powerful image representations. For depth estimation, the presented methods integrate multi-view geometry into the deep neural network. For image sequences, the correlated representations obtained from inter-frame reasoning make more consistent and stable predictions. The disentangled representations provide explicit, flexible control over specific known factors of variation.

    Spatio-temporal research data infrastructure in the context of autonomous driving

    In this paper, we present an implementation of a research data management system that features structured data storage for spatio-temporal experimental data (environmental perception and navigation in the framework of autonomous driving), including metadata management and interfaces for visualization and parallel processing. The demands of the research environment, the design of the system, the organization of the data storage, and computational hardware as well as structures and processes related to data collection, preparation, annotation, and storage are described in detail. We provide examples for the handling of datasets, explaining the required data preparation steps for data storage as well as benefits when using the data in the context of scientific tasks. © 2020 by the authors

    Collaborative mobile industrial manipulator : a review of system architecture and applications

    This paper provides a comprehensive review of the development of the Collaborative Mobile Industrial Manipulator (CMIM), which is currently in high demand. Such a review is necessary for an overall understanding of advanced CMIM technology. This is the first review to combine system architecture and applications, both of which are needed to gain a full understanding of the system. The classical framework of CMIM is discussed first, including hardware and software. Subsystems typically involved in the hardware, such as the mobile platform, manipulator, end-effector, and sensors, are presented. With regard to software, the planner, controller, perception, interaction, and other components are also described. Following this, the common applications in industry (logistics, manufacturing, and assembly) are surveyed. Finally, trends are predicted and open issues are identified as references for CMIM researchers. Specifically, more research is needed in the areas of interaction, fully autonomous control, coordination, and standards. In addition, more experiments in real environments are expected, and novel collaborative robotic systems are likely to be proposed in the future. Advanced technology from other areas is also expected to be applied to the system. Overall, the system is expected to become more intelligent, collaborative, and autonomous.

    Artificial Vision Algorithms for Socially Assistive Robot Applications: A Review of the Literature

    Today, computer vision algorithms are very important for different fields and applications, such as closed-circuit television security, health status monitoring, recognizing a specific person or object, and robotics. The present paper reviews the recent literature on computer vision algorithms (recognition and tracking of faces, bodies, and objects) oriented towards socially assistive robot applications. The performance, frames per second (FPS) processing speed, and hardware used to run the algorithms are highlighted by comparing the available solutions. Moreover, this paper provides general information for researchers interested in knowing which vision algorithms are available, enabling them to select the one most suitable for their robotic system applications. Conacyt doctoral scholarship, CVU No. 64683

    Explain what you see: argumentation-based learning and robotic vision

    In this thesis, we have introduced new techniques for the problems of open-ended learning, online incremental learning, and explainable learning. These methods have applications in the classification of tabular data, 3D object category recognition, and 3D object parts segmentation. We have utilized argumentation theory and probability theory to develop these methods. The first proposed open-ended online incremental learning approach is Argumentation-Based online incremental Learning (ABL). ABL works with tabular data and can learn from a small number of learning instances using an abstract argumentation framework and a bipolar argumentation framework. It has a higher learning speed than state-of-the-art online incremental techniques. However, it has high computational complexity. We have addressed this problem by introducing Accelerated Argumentation-Based Learning (AABL). AABL uses only an abstract argumentation framework and employs two strategies to accelerate the learning process and reduce the complexity. The second proposed open-ended online incremental learning approach is the Local Hierarchical Dirichlet Process (Local-HDP). Local-HDP aims at addressing two problems: open-ended category recognition of 3D objects and segmentation of 3D object parts. We have utilized Local-HDP for the task of object part segmentation in combination with AABL to obtain an interpretable model that explains why a certain 3D object belongs to a certain category. The explanations of this model tell a user that a certain object has specific object parts that look like a set of the typical parts of certain categories. Moreover, integrating AABL and Local-HDP leads to a model that can handle a high degree of occlusion.

    Context-awareness for adaptive information retrieval systems

    Philosophiae Doctor - PhD. This research study investigates how information retrieval systems (IRS) can be optimized to rank results by relevance to individual information needs. The research addresses the development of algorithms that optimize the ranking of documents retrieved from IRS. In this thesis, we present two aspects of context-awareness in IR. The first is the design of the context of information. The context of a query determines the relevance of the retrieved information; thus, executing the same query in diverse contexts often leads to diverse result rankings. Secondly, the relevant context aspects should be incorporated in a way that supports the knowledge domain representing users’ interests. In this thesis, evolutionary algorithms are used to improve the effectiveness of IRS. A context-based information retrieval system is developed, and its retrieval effectiveness is evaluated using precision and recall metrics. The results demonstrate how to use attributes from user interaction behaviour to improve IR effectiveness.

    Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

    The existence of representative datasets is a prerequisite for many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcoming the limitations of purely data-driven approaches, and eventually to increasing the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction, and conformity. Special attention is given to applications in the field of autonomous driving.