443 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum

    Asymptotics of stochastic learning in structured networks

    Get PDF

    Asymptotics of stochastic learning in structured networks

    Get PDF

    Spatiotemporal Event Graphs for Dynamic Scene Understanding

    Get PDF
    Dynamic scene understanding is the ability of a computer system to interpret and make sense of the visual information present in a video of a real-world scene. In this thesis, we present a series of frameworks for dynamic scene understanding starting from road event detection from an autonomous driving perspective to complex video activity detection, followed by continual learning approaches for the life-long learning of the models. Firstly, we introduce the ROad event Awareness Dataset (ROAD) for Autonomous Driving, to our knowledge the first of its kind. ROAD is designed to test an autonomous vehicle’s ability to detect road events, defined as triplets composed by an active agent, the action(s) it performs and the corresponding scene locations. Due to the lack of datasets equipped with formally specified logical requirements, we also introduce the ROad event Awareness Dataset with logical Requirements (ROAD-R), the first publicly available dataset for autonomous driving with requirements expressed as logical constraints, as a tool for driving neurosymbolic research in the area. Next, we extend event detection to holistic scene understanding by proposing two complex activity detection methods. In the first method, we present a deformable, spatiotemporal scene graph approach, consisting of three main building blocks: action tube detection, a 3D deformable RoI pooling layer designed for learning the flexible, deformable geometry of the constituent action tubes, and a scene graph constructed by considering all parts as nodes and connecting them based on different semantics. In a second approach evolving from the first, we propose a hybrid graph neural network that combines attention applied to a graph encoding of the local (short-term) dynamic scene with a temporal graph modelling the overall long-duration activity. Our contribution is threefold: i) a feature extraction technique; ii) a method for constructing a local scene graph followed by graph attention, and iii) a graph for temporally connecting all the local dynamic scene graphs. Finally, the last part of the thesis is about presenting a new continual semi-supervised learning (CSSL) paradigm, proposed to the attention of the machine learning community. We also propose to formulate the continual semi-supervised learning problem as a latent-variable

    Point Cloud Processing for Environmental Analysis in Autonomous Driving using Deep Learning

    Get PDF
    Autonomous self-driving cars need a very precise perception system of their environment, working for every conceivable scenario. Therefore, different kinds of sensor types, such as lidar scanners, are in use. This thesis contributes highly efficient algorithms for 3D object recognition to the scientific community. It provides a Deep Neural Network with specific layers and a novel loss to safely localize and estimate the orientation of objects from point clouds originating from lidar sensors. First, a single-shot 3D object detector is developed that outputs dense predictions in only one forward pass. Next, this detector is refined by fusing complementary semantic features from cameras and joint probabilistic tracking to stabilize predictions and filter outliers. The last part presents an evaluation of data from automotive-grade lidar scanners. A Generative Adversarial Network is also being developed as an alternative for target-specific artificial data generation.One of the main objectives of leading automotive companies is autonomous self-driving cars. They need a very precise perception system of their environment, working for every conceivable scenario. Therefore, different kinds of sensor types are in use. Besides cameras, lidar scanners became very important. The development in that field is significant for future applications and system integration because lidar offers a more accurate depth representation, independent from environmental illumination. Especially algorithms and machine learning approaches, including Deep Learning and Artificial Intelligence based on raw laser scanner data, are very important due to the long range and three-dimensional resolution of the measured point clouds. Consequently, a broad field of research with many challenges and unsolved tasks has been established. This thesis aims to address this deficit and contribute highly efficient algorithms for 3D object recognition to the scientific community. It provides a Deep Neural Network with specific layers and a novel loss to safely localize and estimate the orientation of objects from point clouds. First, a single shot 3D object detector is developed that outputs dense predictions in only one forward pass. Next, this detector is refined by fusing complementary semantic features from cameras and a joint probabilistic tracking to stabilize predictions and filter outliers. In the last part, a concept for deployment into an existing test vehicle focuses on the semi-automated generation of a suitable dataset. Subsequently, an evaluation of data from automotive-grade lidar scanners is presented. A Generative Adversarial Network is also being developed as an alternative for target-specific artificial data generation. Experiments on the acquired application-specific and benchmark datasets show that the presented methods compete with a variety of state-of-the-art algorithms while being trimmed down to efficiency for use in self-driving cars. Furthermore, they include an extensive set of standard evaluation metrics and results to form a solid baseline for future research.Eines der Hauptziele führender Automobilhersteller sind autonome Fahrzeuge. Sie benötigen ein sehr präzises System für die Wahrnehmung der Umgebung, dass für jedes denkbare Szenario überall auf der Welt funktioniert. Daher sind verschiedene Arten von Sensoren im Einsatz, sodass neben Kameras u. a. auch Lidar Sensoren ein wichtiger Bestandteil sind. Die Entwicklung auf diesem Gebiet ist für künftige Anwendungen von höchster Bedeutung, da Lidare eine genauere, von der Umgebungsbeleuchtung unabhängige, Tiefendarstellung bieten. Insbesondere Algorithmen und maschinelle Lernansätze wie Deep Learning, die Rohdaten über Lernzprozesse direkt verarbeiten können, sind aufgrund der großen Reichweite und der dreidimensionalen Auflösung der gemessenen Punktwolken sehr wichtig. Somit hat sich ein weites Forschungsfeld mit vielen Herausforderungen und ungelösten Problemen etabliert. Diese Arbeit zielt darauf ab, dieses Defizit zu verringern und effiziente Algorithmen zur 3D-Objekterkennung zu entwickeln. Sie stellt ein tiefes Neuronales Netzwerk mit spezifischen Schichten und einer neuartigen Fehlerfunktion zur sicheren Lokalisierung und Schätzung der Orientierung von Objekten aus Punktwolken bereit. Zunächst wird ein 3D-Detektor entwickelt, der in nur einem Vorwärtsdurchlauf aus einer Punktwolke alle Objekte detektiert. Anschließend wird dieser Detektor durch die Fusion von komplementären semantischen Merkmalen aus Kamerabildern und einem gemeinsamen probabilistischen Tracking verfeinert, um die Detektionen zu stabilisieren und Ausreißer zu filtern. Im letzten Teil wird ein Konzept für den Einsatz in einem bestehenden Testfahrzeug vorgestellt, das sich auf die halbautomatische Generierung eines geeigneten Datensatzes konzentriert. Hierbei wird eine Auswertung auf Daten von Automotive-Lidaren vorgestellt. Als Alternative zur zielgerichteten künstlichen Datengenerierung wird ein weiteres generatives Neuronales Netzwerk untersucht. Experimente mit den erzeugten anwendungsspezifischen- und Benchmark-Datensätzen zeigen, dass sich die vorgestellten Methoden mit dem Stand der Technik messen können und gleichzeitig auf Effizienz für den Einsatz in selbstfahrenden Autos optimiert sind. Darüber hinaus enthalten sie einen umfangreichen Satz an Evaluierungsmetriken und -ergebnissen, die eine solide Grundlage für die zukünftige Forschung bilden

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volum

    Transformed Path Integral Based Approaches for Stochastic Dynamical Systems: Prediction, Filtering, and Optimal Control

    Get PDF
    The study of stochastic systems, their characterization, prediction, and control are of great importance to many fields in science and engineering. This often involves obtaining accurate estimates of quantities of interest such as the system state distribution and/or the expected cost in nonlinear dynamical systems subjected to random forces. The computational prediction and control of such systems are often challenging (and involve large computational costs) due to the presence of nonlinearities, model and measurement uncertainties. Novel path integral–based frameworks for efficient solutions to problems in prediction, nonlinear filtering, and optimal control of stochastic dynamical systems are presented to address several key challenges. The presented frameworks are as follows: (1) the transformed path integral (TPI) approach for solution of the Fokker-Planck equation in stochastic dynamical systems with a full rank diffusion coefficient matrix, (2) the generalized transformed path integral (GTPI) approach—a non-trivial extension of the TPI to stochastic dynamical systems with rank deficient diffusion coefficient matrices, (3) the generalized transformed path integral filter (GTPIF) for solution of nonlinear filtering problems, and (4) the generalized transformed path integral control (GTPIC) for solution of a large class of stochastic optimal control problems are presented. The proposed frameworks are based on the underlying short-time propagators and dynamic transformations of the state variables that ensure the appropriate distributions in the transformed space (state distributions in TPI and GTPI; and corresponding conditional distributions in GTPIF and GTPIC) always have zero mean and identity covariance. In systems where the dynamics are linear with respect to the state variables and initial distribution is Gaussian, the appropriate distributions in the transformed space remain invariant with a standard normal distribution as expected. The frameworks thus allow for the underlying distributions necessary for evaluating the quantities of interest to be accurately represented and evolved in a transformed computational domain. Compared to conventional fixed grid approaches and Monte-Carlo simulations, the challenges in dynamical systems with large drift, diffusion, and concentration of PDF can be addressed more efficiently using the proposed frameworks. In addition, straightforward error bounds for the underlying distributions in the transformed space can be established via Chebyshev's inequality

    Control and Analysis for Sequential Information based on Machine Learning

    Get PDF
    Sequential information is crucial for real-world applications that are related to time, which is same with time-series being described by sequence data followed by temporal order and regular intervals. In this thesis, we consider four major tasks of sequential information that include sequential trend prediction, control strategy optimisation, visual-temporal interpolation and visual-semantic sequential alignment. We develop machine learning theories and provide state-of-the-art models for various real-world applications that involve sequential processes, including the industrial batch process, sequential video inpainting, and sequential visual-semantic image captioning. The ultimate goal is about designing a hybrid framework that can unify diverse sequential information analysis and control systems For industrial process, control algorithms rely on simulations to find the optimal control strategy. However, few machine learning techniques can control the process using raw data, although some works use ML to predict trends. Most control methods rely on amounts of previous experiences, and cannot execute future information to optimize the control strategy. To improve the effectiveness of the industrial process, we propose improved reinforcement learning approaches that can modify the control strategy. We also propose a hybrid reinforcement virtual learning approach to optimise the long-term control strategy. This approach creates a virtual space that interacts with reinforcement learning to predict a virtual strategy without conducting any real experiments, thereby improving and optimising control efficiency. For sequential visual information analysis, we propose a dual-fusion transformer model to tackle the sequential visual-temporal encoding in video inpainting tasks. Our framework includes a flow-guided transformer with dual attention fusion, and we observe that the sequential information is effectively processed, resulting in promising inpainting videos. Finally, we propose a cycle-based captioning model for the analysis of sequential visual-semantic information. This model augments data from two views to optimise caption generation from an image, overcoming new few-shot and zero-shot settings. The proposed model can generate more accurate and informative captions by leveraging sequential visual-semantic information. Overall, the thesis contributes to analysing and manipulating sequential information in multi-modal real-world applications. Our flexible framework design provides a unified theoretical foundation to deploy sequential information systems in distinctive application domains. Considering the diversity of challenges addressed in this thesis, we believe our technique paves the pathway towards versatile AI in the new era
    • …
    corecore