Hierarchical Graph Neural Networks for Proprioceptive 6D Pose Estimation of In-hand Objects
Robotic manipulation, in particular in-hand object manipulation, often
requires an accurate estimate of the object's 6D pose. To improve the accuracy
of the estimated pose, state-of-the-art approaches in 6D object pose estimation
use observational data from one or more modalities, e.g., RGB images, depth,
and tactile readings. However, existing approaches make limited use of the
underlying geometric structure of the object captured by these modalities,
thereby increasing their reliance on visual features. This results in poor
performance when presented with objects that lack such visual features or when
visual features are simply occluded. Furthermore, current approaches do not
take advantage of the proprioceptive information embedded in the position of
the fingers. To address these limitations, in this paper: (1) we introduce a
hierarchical graph neural network architecture for combining multimodal (vision
and touch) data that allows for a geometrically informed 6D object pose
estimation, (2) we introduce a hierarchical message passing operation that
flows the information within and across modalities to learn a graph-based
object representation, and (3) we introduce a method that accounts for the
proprioceptive information for in-hand object representation. We evaluate our
model on a diverse subset of objects from the YCB Object and Model Set, and
show that our method substantially outperforms existing state-of-the-art work
in accuracy and robustness to occlusion. We also deploy our proposed framework
on a real robot and qualitatively demonstrate successful transfer to real
settings.
Requirements for Explainability and Acceptance of Artificial Intelligence in Collaborative Work
The increasing prevalence of Artificial Intelligence (AI) in safety-critical
contexts such as air-traffic control calls for systems that are practical and
efficient and, to some extent, explainable to humans so that they can be
trusted and accepted.
The present structured literature analysis examines n = 236 articles on the
requirements for the explainability and acceptance of AI. Results include a
comprehensive review of n = 48 articles on information people need to perceive
an AI as explainable, the information needed to accept an AI, and
representation and interaction methods promoting trust in an AI. Results
indicate that the two main groups of users are developers who require
information about the internal operations of the model and end users who
require information about AI results or behavior. Users' information needs vary
in specificity, complexity, and urgency and must consider context, domain
knowledge, and the user's cognitive resources. The acceptance of AI systems
depends on information about the system's functions and performance, privacy
and ethical considerations, as well as goal-supporting information tailored to
individual preferences and information to establish trust in the system.
Information about the system's limitations and potential failures can increase
acceptance and trust. Trusted interaction methods are human-like, including
natural language, speech, text, and visual representations such as graphs,
charts, and animations. Our results have significant implications for future
human-centric AI systems being developed. Thus, they are suitable as input for
further application-specific investigations of user needs.
CARLA+: An Evolution of the CARLA Simulator for Complex Environment Using a Probabilistic Graphical Model
In an uncontrolled urban environment, mixed traffic of autonomous vehicles, classical vehicles, and vulnerable road users such as pedestrians, together with unprecedented dynamic events, makes it challenging for a classical autonomous vehicle to navigate safely. Collaborative autonomous driving therefore has the potential to improve road safety and traffic efficiency. However, an obvious challenge in this regard is how to define, model, and simulate an environment that captures the dynamics of a complex urban setting. In this research, we first define the dynamics of the envisioned environment, capturing those relevant to the complex urban environment and highlighting the challenges that remain unaddressed and fall within the scope of collaborative autonomous driving. To this end, we model the dynamic urban environment using a probabilistic graphical model (PGM). Developing the proposed solution requires a realistic simulation environment. Among the available simulators, CARLA (Car Learning to Act) is one of the most prominent and provides rich features and environments; however, it still falls short on a few fronts: for example, it cannot fully capture the complexity of an urban environment. Moreover, classical CARLA relies mainly on manual code and multiple conditional statements, and it provides no pre-defined way to automate actor behavior based on the dynamic simulation environment. Hence, there is an urgent need to extend off-the-shelf CARLA with more sophisticated settings that can model the required dynamics. We therefore design, develop, and implement an extension of classical CARLA, referred to as CARLA+, for complex environments by integrating the PGM framework. It provides a unified framework for automating the behavior of different actors using PGMs.
Instead of manually catering to each condition, CARLA+ enables the user to automate the modeling of different dynamics of the environment. To validate the proposed CARLA+, experiments with different settings are designed and conducted. The experimental results demonstrate that CARLA+ is flexible enough to allow users to model various scenarios, ranging from simple controlled models to complex models learned directly from real-world data. In the future, we plan to extend CARLA+ by allowing for more configurable parameters and more flexibility in the type of probabilistic networks and models one can choose. The open-source code of CARLA+ is made publicly available for researchers.
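The idea of driving actor behavior from a PGM rather than hand-written conditionals can be illustrated with a small sketch. The abstract does not describe the CARLA+ interface, so nothing below uses the CARLA API: the network structure, the variable names (`signal`, `weather`, `crosses`), and all probabilities are hypothetical and chosen only to show ancestral sampling from a tiny Bayesian network.

```python
import random

# Hypothetical Bayesian network: signal -> crosses <- weather.
# Names and probabilities are illustrative assumptions, not taken from CARLA+.
P_SIGNAL = {"walk": 0.4, "dont_walk": 0.6}
P_WEATHER = {"clear": 0.7, "rain": 0.3}
P_CROSS = {  # P(pedestrian crosses | signal, weather)
    ("walk", "clear"): 0.90,
    ("walk", "rain"): 0.80,
    ("dont_walk", "clear"): 0.10,
    ("dont_walk", "rain"): 0.03,
}


def sample_categorical(dist, rng):
    """Draw one key from a {value: probability} table."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values], k=1)[0]


def sample_scene(rng):
    """Ancestral sampling: draw the parents first, then the child given them."""
    signal = sample_categorical(P_SIGNAL, rng)
    weather = sample_categorical(P_WEATHER, rng)
    crosses = rng.random() < P_CROSS[(signal, weather)]
    return {"signal": signal, "weather": weather, "crosses": crosses}


def crossing_rate(scenes, signal):
    """Empirical P(crosses | signal) over a list of sampled scenes."""
    matching = [s for s in scenes if s["signal"] == signal]
    return sum(s["crosses"] for s in matching) / len(matching)


rng = random.Random(0)
scenes = [sample_scene(rng) for _ in range(5000)]
print(crossing_rate(scenes, "walk"), crossing_rate(scenes, "dont_walk"))
```

In a sketch like this, changing a scenario means editing a probability table instead of adding another branch of conditional code, which is the kind of automation the abstract attributes to the PGM integration.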
A LIGHTWEIGHT MULTI-PERSON POSE ESTIMATION SCHEME BASED ON JETSON NANO
As the basic technology underlying human action recognition, pose estimation is attracting increasing attention from researchers, while edge application scenarios pose additional challenges. This paper proposes a lightweight multi-person pose estimation scheme to meet the needs of real-time human action recognition at the edge. The scheme uses AlphaPose to extract human skeleton nodes and adds ResNet and Dense Upsampling Convolution to improve its accuracy. Meanwhile, we use YOLO to enhance AlphaPose’s support for multi-person pose estimation and optimize the proposed model with TensorRT. In addition, this paper uses Jetson Nano as the edge AI deployment device for the proposed model and successfully migrates the model to the edge. The experimental results show that the optimized object detection model reaches 20 FPS and the optimized multi-person pose estimation model reaches 10 FPS. At an image resolution of 320×240, the model’s accuracy is 73.2%, which can meet the real-time requirements. In short, our scheme can provide a basis for lightweight multi-person action recognition schemes at the edge.
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
A compelling use case of offline reinforcement learning (RL) is to obtain a
policy initialization from existing datasets followed by fast online
fine-tuning with limited interaction. However, existing offline RL methods tend
to behave poorly during fine-tuning. In this paper, we study the fine-tuning
problem in the context of conservative offline RL methods and we devise an
approach for learning an effective initialization from offline data that also
enables fast online fine-tuning capabilities. Our approach, calibrated
Q-learning (Cal-QL), accomplishes this by learning a conservative value
function initialization that underestimates the value of the learned policy
from offline data, while also ensuring that the learned Q-values are at a
reasonable scale. We refer to this property as calibration, and define it
formally as providing a lower bound on the true value function of the learned
policy and an upper bound on the value of some other (suboptimal) reference
policy, which may simply be the behavior policy. We show that a conservative
offline RL algorithm that also learns a calibrated value function leads to
effective online fine-tuning, enabling us to take the benefits of offline
initializations in online fine-tuning. In practice, Cal-QL can be implemented
on top of conservative Q-learning (CQL) for offline RL with a one-line
code change. Empirically, Cal-QL outperforms state-of-the-art methods on 9/11
fine-tuning benchmark tasks that we study in this paper. Code and video are
available at https://nakamotoo.github.io/projects/Cal-QL
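The abstract does not spell out the one-line change, so the following is only a minimal sketch under an assumption: that the change floors the Q-values of policy actions at the value of the reference policy inside a CQL-style penalty. The function names, the simplified mean-based penalty (instead of a log-sum-exp), and the toy numbers are all hypothetical.

```python
from statistics import fmean


def conservative_penalty(q_policy, q_data):
    """Simplified CQL-style gap: mean Q on policy actions minus mean Q on dataset actions."""
    return fmean(q_policy) - fmean(q_data)


def calibrated_penalty(q_policy, q_data, v_ref):
    """Same gap, with each policy-action Q-value floored at the reference value."""
    floored = [max(q, v) for q, v in zip(q_policy, v_ref)]  # the assumed one-line change
    return fmean(floored) - fmean(q_data)


# Toy values (hypothetical): the first Q-value already lies below the reference value.
q_policy = [1.0, 5.0]
q_data = [2.0, 2.0]
v_ref = [3.0, 3.0]
print(conservative_penalty(q_policy, q_data))       # 1.0
print(calibrated_penalty(q_policy, q_data, v_ref))  # 2.0
```

Under this assumption, a Q-value that already sits below the reference value no longer affects the penalty, so minimizing it stops pushing that estimate further down. That matches the abstract's description of calibrated values as remaining at a reasonable scale.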
Beam scanning by liquid-crystal biasing in a modified SIW structure
A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment requires full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing: the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW) modified to work as a Groove Gap Waveguide, with radiating slots etched on the upper broad wall, that radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs. At the same time, the RF field remains laterally confined, making it possible to lay several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium.
Using machine learning to predict pathogenicity of genomic variants throughout the human genome
More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. It is necessary to investigate all of these processes in order to evaluate which variant may be causal for the deleterious phenotype. Variant effect scores are a great help in this regard. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity.
Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment for the model's application. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants.
The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that scores SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data based on variants selected by allele frequency.
In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org
Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)
The book attempts to provide a gentle introduction to the field of Facial
Micro Expressions Recognition (FMER) using color and depth images, with the aid
of the MATLAB programming environment. FMER is a subset of image processing and
a multidisciplinary topic to analyze. It therefore requires familiarity with
other topics of Artificial Intelligence (AI) such as machine learning, digital
image processing, psychology, and more. This makes it a great opportunity to
write a book that covers all of these topics for beginner to professional
readers in the field of AI, and even for readers without a background in AI.
Our goal is to provide a standalone introduction to FMER analysis in the form
of theoretical descriptions for readers with no background in image processing,
with reproducible MATLAB practical examples. We also describe the basic
definitions for FMER analysis and the MATLAB libraries used in the text, which
helps the reader apply the experiments in real-world applications. We
believe that this book is suitable for students, researchers, and professionals
alike who need to develop practical skills along with a basic understanding
of the field. We expect that, after reading this book, the reader will feel
comfortable with key stages such as color and depth image processing,
color and depth image representation, classification, machine learning, facial
micro-expressions recognition, feature extraction, and dimensionality reduction.
A Survey of Quantum-Cognitively Inspired Sentiment Analysis Models
Quantum theory, originally proposed as a physical theory to describe the motions of microscopic particles, has been applied to various non-physics domains involving human cognition and decision-making that are inherently uncertain and exhibit certain non-classical, quantum-like characteristics. Sentiment analysis is a typical example of such domains. In the last few years, by leveraging the modeling power of quantum probability (a non-classical probability stemming from quantum mechanics methodology) and deep neural networks, a range of novel quantum-cognitively inspired models for sentiment analysis have emerged and performed well. This survey presents a timely overview of the latest developments in this fascinating cross-disciplinary area. We first provide a background of quantum probability and quantum cognition at a theoretical level, analyzing their advantages over classical theories in modeling the cognitive aspects of sentiment analysis. Then, recent quantum-cognitively inspired models are introduced and discussed in detail, focusing on how they approach the key challenges of the sentiment analysis task. Finally, we discuss the limitations of the current research and highlight future research directions