323 research outputs found

    Probabilistic Parametric Curves for Sequence Modeling

    Get PDF
    This work proposes a probabilistic extension to Bézier curves as a basis for effectively modeling stochastic processes with a bounded index set. The proposed stochastic process model is based on Mixture Density Networks and Bézier curves with Gaussian random variables as control points. A key advantage of this model is the ability to generate multi-mode predictions in a single inference step, thus avoiding the need for Monte Carlo simulation.
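    As a worked sketch of why no Monte Carlo step is needed (standard Bézier notation; the symbols here may differ from the paper's): a Bézier curve is a linear combination of its control points, so taking the control points as independent Gaussians makes every curve point Gaussian in closed form,

$$
X(t) \;=\; \sum_{i=0}^{n} b_{i,n}(t)\,P_i \;\sim\; \mathcal{N}\!\left(\sum_{i=0}^{n} b_{i,n}(t)\,\mu_i,\;\; \sum_{i=0}^{n} b_{i,n}(t)^2\,\Sigma_i\right),
\qquad
b_{i,n}(t) = \binom{n}{i}\, t^i (1-t)^{n-i},
$$

    with $P_i \sim \mathcal{N}(\mu_i, \Sigma_i)$ and $t \in [0,1]$ as the bounded index set. Mean and covariance at every $t$ thus follow directly from the control-point parameters, and a mixture of such curves yields multi-mode predictions in one pass.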

    Probabilistic Parametric Curves for Sequence Modeling

    Get PDF
    Representations of sequential data are usually based on the assumption that observed sequences are realizations of an unknown underlying stochastic process. Determining such a representation is commonly framed as a learning problem and yields a sequence model. In this context, the model must be able to capture the multi-modal nature of the data without mixing individual modes. To model an underlying stochastic process, commonly used neural-network-based approaches learn either to parameterize a probability distribution or an implicit representation using stochastic inputs or neurons. In doing so, these models usually integrate Monte Carlo methods or other approximate solutions to enable parameter estimation and probabilistic inference. This holds even for regression-based approaches built on Mixture Density Networks, which likewise require Monte Carlo simulation for multi-modal inference. This leaves a research gap for fully regression-based approaches to parameter estimation and probabilistic inference. Consequently, this thesis presents a probabilistic extension of Bézier curves ($\mathcal{N}$-curves) as a basis for modeling continuous-time stochastic processes with a bounded index set. The proposed model, termed the $\mathcal{N}$-curve model, is based on Mixture Density Networks (MDN) and Bézier curves whose control points are assumed to be normally distributed. Using an MDN-based approach is in line with current attempts to frame uncertainty estimation as a regression problem, and yields a generic model that can serve broadly as a base model for probabilistic sequence modeling. A key advantage of the model is, among other things, the ability to generate smooth, multi-modal predictions in a single inference step without requiring Monte Carlo simulation. Moreover, by building on Bézier curves, the model can in theory be used for arbitrarily high data dimensions by embedding the control points in a high-dimensional space. To lift the theoretical restrictions arising from the focus on bounded index sets, a conceptual extension of the $\mathcal{N}$-curve model is additionally presented with which infinite stochastic processes can be modeled. Essential properties of the proposed model and its extension are demonstrated on several sequence synthesis examples. As the $\mathcal{N}$-curve model is sufficiently applicable to most use cases, its suitability is evaluated extensively on different multi-step prediction tasks using real-world data. First, the model is evaluated against commonly used probabilistic sequence models in the context of pedestrian trajectory prediction, where it outperforms all competing models. A qualitative evaluation examines the behavior of the model in a prediction context, and difficulties in evaluating probabilistic sequence models in a multi-modal setting are discussed. Furthermore, the model is applied to human motion prediction in order to assess its intended scalability to higher-dimensional data. On this task, the model outperforms commonly used simple and neural-network-based baseline models and is on par with various state-of-the-art models in several situations, demonstrating its applicability in this higher-dimensional example. Finally, difficulties in covariance estimation and the smoothing properties of the $\mathcal{N}$-curve model are discussed.
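    A minimal numpy sketch of this closed-form evaluation (the independent-control-point, single-component setup and all variable names are illustrative assumptions, not the thesis code):

```python
import numpy as np
from math import comb

def bernstein_weights(n, t):
    """Bernstein basis values b_{i,n}(t) for i = 0..n."""
    return np.array([comb(n, i) * t**i * (1 - t)**(n - i) for i in range(n + 1)])

def n_curve_point(t, means, covs):
    """Gaussian distribution of the curve point at index t in [0, 1].

    means: (n+1, d) control-point means; covs: (n+1, d, d) covariances.
    """
    w = bernstein_weights(len(means) - 1, t)
    mu = w @ means                               # Bernstein-weighted mean
    sigma = np.einsum('i,ijk->jk', w**2, covs)   # weights enter squared
    return mu, sigma

# Example: a 2-D curve with three Gaussian control points, queried at t = 0.5.
means = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]])
covs = np.stack([0.1 * np.eye(2)] * 3)
mu, sigma = n_curve_point(0.5, means, covs)
```

    In an MDN setting, a network would output several such control-point sets with mixture weights, one per mode.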

    Provably Safe Motion Planning for Autonomous Vehicles through Real-Time Verification

    Get PDF
    This thesis introduces fail-safe motion planning as the first approach to guarantee legal safety of autonomous vehicles in arbitrary traffic situations. The proposed safety layer verifies whether intended trajectories comply with legal safety and provides fail-safe trajectories when intended trajectories result in safety-critical situations. The presented results indicate that the use of fail-safe motion planning can drastically reduce the number of traffic accidents.
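    A hedged sketch of the decision logic such a safety layer might implement. The toy safe-distance check, the state fields, and all names here are illustrative assumptions; the thesis' actual online verification of legal safety is far richer than this gap test.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class State:
    s: float   # position along the lane [m]
    v: float   # speed [m/s]

@dataclass
class Trajectory:
    states: List[State]

def is_legally_safe(ego: Trajectory, lead: Trajectory, t_react: float = 1.0) -> bool:
    """Toy stand-in for verification: require a safe gap at every step."""
    return all(l.s - e.s > e.v * t_react      # crude safe-distance rule
               for e, l in zip(ego.states, lead.states))

def safety_layer(intended: Trajectory, fail_safe: Trajectory, lead: Trajectory) -> Trajectory:
    # Execute the intended trajectory only if it verifies as safe;
    # otherwise fall back to the pre-computed fail-safe trajectory.
    return intended if is_legally_safe(intended, lead) else fail_safe
```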

    Improving Visual Embeddings using Attention and Geometry Constraints

    Get PDF
    Learning a non-linear function that embeds raw data (i.e., images, video, or language) into a discriminative feature embedding space is considered a fundamental problem in the learning community. In such embedding spaces, data with similar semantic meaning are clustered, while data with dissimilar semantic meaning are separated. A number of practical applications can benefit from a good feature embedding, e.g., machine translation, classification/recognition, retrieval, and any-shot learning. In this Thesis, we aim to improve visual embeddings using attention and geometry constraints. In the first part of the Thesis, we develop two neural attention modules which can automatically localize the informative regions within the feature map, thereby generating a discriminative feature representation for the image. An Attention in Attention (AiA) mechanism is first proposed to align the feature map along the deep network by modeling the interaction of inner and outer attention modules. Intuitively, the AiA mechanism can be understood as one attention module inside another, with the inner one determining where the outer attention module should focus. Further, we employ explicit non-linear mappings in Reproducing Kernel Hilbert Spaces (RKHSs) to generate attention values, giving the channel descriptor of the feature map the representational power of second-order polynomial and Gaussian kernels. In addition, the Channel Recurrent Attention (CRA) module is proposed to build a global receptive field over the feature map. Existing attention mechanisms focus on either the channel pattern or the spatial pattern of the feature map and therefore cannot make full use of the information it contains. The CRA module can jointly learn the channel and spatial patterns of the feature map and produce an attention value for every element of the input feature map. This is achieved by feeding the spatial vectors sequentially to a recurrent neural network (RNN), such that the RNN can create a global view of the feature map; see the sketch after this abstract. In the second part, we investigate the benefit of geometry constraints for embedding learning. We first study the geometry of sets as embeddings for video clips. Usually, a video embedding is optimized using a triplet loss in which the distance is calculated between clip features, so that frame features cannot be optimized directly. To this end, we model the video clip as a set and employ the distance between sets in the triplet loss. Tailored to this set-aware triplet loss, a new set distance metric is also proposed to measure the hard frames in a triplet. Optimizing the set-aware triplet loss leads to a compact clip feature embedding, improving the discriminative power of the video representation. Beyond flat Euclidean embedding spaces, we further study curved spaces, i.e., hyperbolic spaces, as image embedding spaces. In contrast to Euclidean embeddings, hyperbolic embeddings can encode the hierarchical structure of data, as the volume of hyperbolic space grows exponentially. However, performing basic operations for comparison in hyperbolic spaces is complex and time-consuming; for example, the similarity measure is not well-defined in hyperbolic spaces. To mitigate this issue, we introduce positive definite (pd) kernels for hyperbolic embeddings. Specifically, we propose four pd kernels in hyperbolic spaces, in conjunction with a theoretical analysis: the hyperbolic tangent kernel, the hyperbolic RBF kernel, the hyperbolic Laplace kernel, and the hyperbolic binomial kernel. We demonstrate the effectiveness of the proposed methods on image- and video-based person re-identification tasks. We also evaluate the generalization of the hyperbolic kernels on few-shot learning, zero-shot learning, and knowledge distillation tasks.
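    One concrete reading of the CRA idea as described above, as a hedged PyTorch sketch: the H*W spatial vectors (each of dimension C) are scanned by an RNN so the gate for every element can depend on a global view of the map. The LSTM choice, layer sizes, and sigmoid gating are assumptions for illustration, not the thesis' exact module design.

```python
import torch
import torch.nn as nn

class ChannelRecurrentAttention(nn.Module):
    """Sketch: recurrent scan over spatial vectors -> per-element attention."""

    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        self.rnn = nn.LSTM(channels, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, channels)   # back to per-channel scores

    def forward(self, x):                         # x: (B, C, H, W)
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)        # (B, H*W, C) spatial vectors
        out, _ = self.rnn(seq)                    # order-dependent global view
        attn = torch.sigmoid(self.proj(out))      # (B, H*W, C) gates in [0, 1]
        attn = attn.transpose(1, 2).reshape(b, c, h, w)
        return x * attn                           # jointly channel- and spatially-aware

cra = ChannelRecurrentAttention(channels=64)
y = cra(torch.randn(2, 64, 8, 4))                 # same shape as the input map
```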

    Behavior-specific proprioception models for robotic force estimation: a machine learning approach

    Get PDF
    Robots that support humans in physically demanding tasks require accurate force sensing capabilities. A common way to achieve this is by monitoring the interaction with the environment directly with dedicated force sensors. Major drawbacks of such special-purpose sensors are the increased costs and the reduced payload of the robot platform. Instead, this thesis investigates how the functionality of such sensors can be approximated by utilizing force estimation approaches. Most of today's robots are equipped with rich proprioceptive sensing capabilities, where even a robotic arm, e.g., the UR5, provides access to more than a hundred sensor readings. Following this trend, it is becoming feasible to utilize a wide variety of sensors for force estimation purposes. Human proprioception allows estimating forces, such as the weight of an object, from prior experience about sensory-motor patterns. Applying a similar approach to robots enables them to learn from previous demonstrations without the need for dedicated force sensors. This thesis introduces Behavior-Specific Proprioception Models (BSPMs), a novel concept for enhancing robotic behavior with estimates of the expected proprioceptive feedback. A main methodological contribution is the operationalization of the BSPM approach using data-driven machine learning techniques. During a training phase, the behavior is continuously executed while recording proprioceptive sensor readings. The training data acquired from these demonstrations represents ground truth about behavior-specific sensory-motor experiences, i.e., the influence of performed actions and environmental conditions on the proprioceptive feedback. This data acquisition procedure does not require expert knowledge about the particular robot platform, e.g., kinematic chains or mass distribution, which is a major advantage over analytical approaches. The training data is then used to learn BSPMs, e.g., using lazy learning techniques or artificial neural networks. At runtime, the BSPMs provide estimates of the proprioceptive feedback that can be compared to actual sensations. The BSPM approach thus extends classical programming-by-demonstration methods, where only movement data is learned, and enables robots to accurately estimate forces during behavior execution.
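    A hedged sketch of the BSPM idea with a lazy learner: regress the expected proprioceptive feedback from the recorded behavior state, then read force cues from the residual at runtime. The k-NN regressor stands in for the "lazy learning" mentioned above; array names, shapes, and the synthetic data are illustrative, not the thesis' pipeline.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))             # recorded behavior states (demonstrations)
y_train = X_train @ rng.normal(size=(8, 6))     # proprioceptive readings (toy ground truth)

# Lazy learning: the model memorizes demonstrations and interpolates at query time.
bspm = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)

def estimate_disturbance(state: np.ndarray, measured: np.ndarray) -> np.ndarray:
    """Deviation of actual sensations from the behavior-specific expectation."""
    expected = bspm.predict(state[None])[0]
    return measured - expected                  # residual ~ external force cue
```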

    Fuzzy Logic

    Get PDF
    The capability of Fuzzy Logic in the development of emerging technologies is introduced in this book. The book consists of sixteen chapters showing various applications in the fields of Bioinformatics, Health, Security, Communications, Transportation, Financial Management, and Energy and Environment Systems. This book is a major reference source for all those concerned with applied intelligent systems. The intended readers are researchers, engineers, medical practitioners, and graduate students interested in fuzzy logic systems.

    Unveiling the frontiers of deep learning: innovations shaping diverse domains

    Full text link
    Deep learning (DL) enables the development of computer models that are capable of learning, visualizing, optimizing, refining, and predicting data. In recent years, DL has been applied in a range of fields, including audio-visual data processing, agriculture, transportation prediction, natural language, biomedicine, disaster management, bioinformatics, drug design, genomics, face recognition, and ecology. To explore the current state of deep learning, it is necessary to investigate its latest developments and applications in these disciplines. However, the literature is lacking in exploring the applications of deep learning across all potential sectors. This paper therefore extensively investigates the potential applications of deep learning across all major fields of study, as well as the associated benefits and challenges. As evidenced in the literature, DL exhibits accuracy in prediction and analysis, which makes it a powerful computational tool, and it has the ability to articulate itself and optimize, making it effective in processing data without prior training. At the same time, deep learning necessitates massive amounts of data for effective analysis and processing. To handle the challenge of compiling huge amounts of medical, scientific, healthcare, and environmental data for use in deep learning, gated architectures like LSTMs and GRUs can be utilized; a small sketch follows below. For multimodal learning, shared neurons in the neural network for all activities and specialized neurons for particular tasks are necessary.
    Comment: 64 pages, 3 figures, 3 tables
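    A minimal sketch of the kind of gated recurrent model the survey refers to, here a GRU sequence classifier. All dimensions and the record-classification framing are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class GatedSequenceModel(nn.Module):
    def __init__(self, n_features: int = 16, hidden: int = 32, n_classes: int = 2):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):            # x: (batch, time, n_features)
        _, h = self.gru(x)           # gates decide what to keep at each step
        return self.head(h[-1])      # classify from the final hidden state

model = GatedSequenceModel()
logits = model(torch.randn(4, 100, 16))   # e.g., 100 time steps per record
```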

    Deep Reinforcement Learning Models for Real-Time Traffic Signal Optimization with Big Traffic Data

    Get PDF
    One of the most significant changes that the globe has faced in recent years is the changes brought about by the COVID-19 pandemic. While this research was started before the pandemic began, the pandemic has exposed the value that data and information can have in modern society. During the pandemic, traffic volumes changed substantially, leaving the inefficiencies of existing methods exposed. This research has focused on exploring two key ideas that will become increasingly relevant as societies adapt to these changes: Big Data and Artificial Intelligence. For many municipalities, traffic signals are still re-timed using traditional approaches, and there is still significant reliance on static timing plans designed with data collected from static field studies. This research explored the possibility of using travel-time data obtained from Bluetooth and WiFi sniffing. Bluetooth and WiFi sniffing is an emerging Big Data approach that takes advantage of the ability to track and monitor unique devices as they move from location to location. An approach to re-time signals using an adaptive system was developed, analysed, and tested under varying conditions. The results of this work showed that this data could be used to improve delays by as much as 10% when compared to traditional approaches. More importantly, this approach demonstrated that it is possible to re-time signals using a readily available and dynamic data source without the need for field volume studies. In addition to Big Data technologies, Artificial Intelligence (AI) is increasingly playing an important role in modern technologies. AI is already being used to make complex decisions, categorise images, and can best humans in complex strategy games. While AI shows promise, applications to Traffic Engineering have been limited. This research has advanced the state-of-the-art by conducting a systematic sensitivity study on an AI technique, Deep Reinforcement Learning. This thesis investigated and identified optimal settings for key parameters such as the discount factor, learning rate, and reward functions. This thesis also developed and tested a complete framework that could potentially be applied to evaluate AI techniques in field settings. This includes applications of AI techniques such as transfer learning to reduce training times. Finally, this thesis also examined framings for multi-intersection control, including comparisons to existing state-of-the-art approaches such as SCOOT.
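    To make the role of the studied parameters concrete, here is a hedged tabular Q-learning sketch showing where the discount factor, learning rate, and reward function enter; a deep RL agent replaces the table with a neural network. The state/action coding and the negative-delay reward are illustrative assumptions, not the thesis' setup.

```python
import numpy as np

n_states, n_actions = 64, 2          # e.g., discretized queue lengths; keep vs. switch phase
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.95             # learning rate and discount factor under study

def update(s: int, a: int, delay: float, s_next: int) -> None:
    """One temporal-difference update after observing a transition."""
    reward = -delay                              # reward function: penalize delay
    td_target = reward + gamma * Q[s_next].max() # discounted best future value
    Q[s, a] += alpha * (td_target - Q[s, a])     # step toward the TD target
```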