A space-variant visual pathway model for data efficient deep learning
We present an investigation into adopting a model of the retino-cortical mapping, found in biological visual systems, to improve the efficiency of image analysis using Deep Convolutional Neural Nets (DCNNs) in the context of robot vision and egocentric perception systems. This work has enabled DCNNs to process input images approaching one million pixels in size, in real time and in a single pass of the DCNN, using only consumer-grade graphics processor (GPU) hardware.
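The abstract above does not state which retino-cortical model is used. As a rough illustration of the general idea only, the sketch below uses the classic log-polar transform, a common simplified stand-in for space-variant sampling, in which cells are small near the fixation point and grow with eccentricity. The function name, parameters, and grid sizes are assumptions made for this example and are not taken from the work.

```python
import math

def log_polar_index(x, y, cx, cy, r_min, r_max, n_rings, n_wedges):
    """Map pixel (x, y) to a (ring, wedge) cell of a log-polar grid.

    (cx, cy) is the fixation point. Pixels closer than r_min (the fovea,
    which would need separate uniform sampling) or at r_max and beyond
    return None. All parameters are illustrative assumptions.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r < r_min or r >= r_max:
        return None
    # Ring index grows with the logarithm of eccentricity, so outer
    # rings cover many more input pixels than inner rings.
    ring = int(n_rings * math.log(r / r_min) / math.log(r_max / r_min))
    # Wedge index is a uniform quantisation of the polar angle.
    angle = math.atan2(dy, dx) % (2.0 * math.pi)
    wedge = int(n_wedges * angle / (2.0 * math.pi)) % n_wedges
    return min(ring, n_rings - 1), wedge
```

With a grid of `n_rings * n_wedges` cells, the number of samples passed to a network no longer depends on the pixel count of the input image, which is the kind of data reduction the abstract refers to.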
Combined Admittance Control With Type II Singularity Evasion for Parallel Robots Using Dynamic Movement Primitives
This article addresses a new way of generating compliant trajectories for control using movement primitives to allow physical human-robot interaction where parallel robots (PRs) are involved. PRs are suitable for tasks requiring precision and performance because of their robust behavior. However, two fundamental issues must be resolved to ensure safe operation: first, the force exerted on the human must be controlled and limited, and second, Type II singularities should be avoided to keep complete control of the robot. We offer a unified solution under the dynamic movement primitives (DMP) framework to tackle both tasks simultaneously. DMPs are used to obtain an abstract representation for movement generation and are applied in broad areas, such as imitation learning and movement recognition. For force control, we design an admittance controller intrinsically defined within the DMP structure, and subsequently, the Type II singularity evasion layer is added to the system. Both the admittance controller and the evader exploit the dynamic behavior of the DMP and its properties related to invariance and temporal coupling, and the whole system is deployed in a real PR meant for knee rehabilitation. The results show the capability of the system to perform safe rehabilitation exercises.
This work was supported in part by the Fondo Europeo de Desarrollo Regional under Grant PID2021-125694OB-I00, in part by the Vicerrectorado de Investigacion de la Universitat Politecnica de Valencia under Grant PAID-11-21, and in part by the Ministerio de Universidades, Gobierno de Espana under Grant FPU18/05105.
Escarabajal-Sánchez, RJ.; Pulloquinga-Zapata, J.; Valera Fernández, Á.; Mata Amela, V.; Vallés Miquel, M.; Castillo-García, FJ. (2023). Combined Admittance Control With Type II Singularity Evasion for Parallel Robots Using Dynamic Movement Primitives. IEEE Transactions on Robotics. 39(3):2224-2239. https://doi.org/10.1109/TRO.2023.3238136
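For readers unfamiliar with the DMP framework named in this abstract, the following is a minimal sketch of a standard one-dimensional discrete DMP, integrated with explicit Euler steps. It does not reproduce the article's admittance controller or singularity evasion layer. The gains, the basis-function width heuristic, and all names (`rollout_dmp`, `weights`, `duration`) are assumptions chosen for illustration.

```python
import math

def rollout_dmp(y0, goal, weights, tau=1.0, duration=1.5, dt=0.001,
                alpha=25.0, alpha_x=3.0):
    """Integrate a 1-D discrete DMP and return the list of positions.

    Transformation system: tau*dz = alpha*(beta*(goal - y) - z) + f(x)
                           tau*dy = z
    Canonical system:      tau*dx = -alpha_x * x
    """
    beta = alpha / 4.0  # critically damped spring-damper
    n = len(weights)
    # Gaussian basis centres spread along the phase variable x in (0, 1].
    centres = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    widths = [n ** 1.5 / c for c in centres]  # heuristic widths (assumption)
    y, z, x = y0, 0.0, 1.0
    trace = [y]
    for _ in range(int(round(duration / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centres, widths)]
        total = sum(psi)
        forcing = 0.0
        if total > 1e-12:
            # The forcing term is gated by x, so it fades as the phase decays.
            forcing = (sum(p * w for p, w in zip(psi, weights)) / total
                       * x * (goal - y0))
        dz = (alpha * (beta * (goal - y) - z) + forcing) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        trace.append(y)
    return trace
```

Calling `rollout_dmp(0.0, 2.0, [0.0, 0.0, 0.0])` yields a trajectory that settles at the goal. Increasing `tau` slows the whole motion without changing its shape, which is the temporal-scaling property the abstract alludes to.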
State of the Art in Face Recognition
Despite tremendous effort to solve the face recognition problem, it is not yet possible to design a face recognition system that approaches human performance. New computer vision and pattern recognition approaches need to be investigated, and knowledge and perspectives from other fields, such as psychology and neuroscience, must be incorporated into face recognition research to design a robust system. Much more effort is required to arrive at a human-like face recognition system. This book aims to narrow the gap between the previous state of face recognition research and its future state.
Programming by Demonstration on Riemannian Manifolds
This thesis presents a Riemannian approach to Programming by Demonstration (PbD).
It generalizes an existing PbD method from Euclidean manifolds to Riemannian manifolds.
In this abstract, we review the objectives, methods and contributions of the presented
approach.
OBJECTIVES
PbD aims at providing a user-friendly method for skill transfer between human and
robot. It enables a user to teach a robot new tasks using few demonstrations. In order
to surpass simple record-and-replay, methods for PbD need to ‘understand’ what to
imitate; they need to extract the functional goals of a task from the demonstration data.
This is typically achieved through the application of statistical methods.
The variety of data encountered in robotics is large. Typical manipulation tasks involve
position, orientation, stiffness, force and torque data. These data are not solely
Euclidean. Instead, they originate from a variety of manifolds, curved spaces that are
only locally Euclidean. Elementary operations, such as summation, are not defined on
manifolds. Consequently, standard statistical methods are not well suited to analyze
demonstration data that originate from non-Euclidean manifolds. In order to effectively
extract what-to-imitate, methods for PbD should take into account the underlying geometry
of the demonstration manifold; they should be geometry-aware.
Successful task execution does not solely depend on the control of individual task
variables. By controlling variables individually, a task might fail when one is perturbed
and the others do not respond. Task execution also relies on couplings among task variables.
These couplings describe functional relations which are often called synergies. In
order to understand what-to-imitate, PbD methods should be able to extract and encode
synergies; they should be synergetic.
In unstructured environments, it is unlikely that tasks are found in the same scenario
twice. The circumstances under which a task is executed (the task context) are more
likely to differ each time it is executed. Task context does not only vary during task execution,
it also varies while learning and recognizing tasks. To be effective, a robot should
be able to learn, recognize and synthesize skills in a variety of familiar and unfamiliar
contexts; this can be achieved when its skill representation is context-adaptive.
THE RIEMANNIAN APPROACH
In this thesis, we present a skill representation that is geometry-aware, synergetic and
context-adaptive. The presented method is probabilistic; it assumes that demonstrations
are samples from an unknown probability distribution. This distribution is approximated
using a Riemannian Gaussian Mixture Model (GMM).
Instead of using the ‘standard’ Euclidean Gaussian, we rely on the Riemannian Gaussian,
a distribution akin to the Gaussian, but defined on a Riemannian manifold. A Riemannian
manifold is a manifold (a curved space which is locally Euclidean) that provides
a notion of distance. This notion is essential for statistical methods as such methods
rely on a distance measure. Examples of Riemannian manifolds in robotics are: the
Euclidean space, which is used for spatial data, forces or torques; the spherical manifolds,
which can be used for orientation data defined as unit quaternions; and Symmetric Positive
Definite (SPD) manifolds, which can be used to represent stiffness and manipulability.
The Riemannian Gaussian is intrinsically geometry-aware. Its definition is based on
the geometry of the manifold, and therefore takes into account the manifold curvature.
In robotics, the manifold structure is often known beforehand. In the case of PbD, it follows
from the structure of the demonstration data. Like the Gaussian distribution, the
Riemannian Gaussian is defined by a mean and covariance. The covariance describes
the variance and correlation among the state variables. These can be interpreted as local
functional couplings among state variables: synergies. This makes the Riemannian
Gaussian synergetic. Furthermore, information encoded in multiple Riemannian Gaussians
can be fused using the Riemannian product of Gaussians. This feature allows us to
construct a probabilistic context-adaptive task representation.
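As an illustration of what geometry-aware statistics involve, the sketch below computes the mean of points on a unit sphere by repeatedly mapping the data into the tangent space at the current estimate (logarithmic map), averaging there, and mapping the result back (exponential map). This is a commonly used iterative scheme and is offered as an assumption-laden example, not as the thesis's exact algorithm. The function names are hypothetical, antipodal points are not handled, and the sign ambiguity of unit quaternions is ignored.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _norm(a):
    return math.sqrt(_dot(a, a))

def log_map(base, point):
    """Tangent vector at `base` pointing toward `point`; its length is the
    geodesic distance. Undefined for antipodal points (returns zeros)."""
    c = max(-1.0, min(1.0, _dot(base, point)))
    ortho = [p - c * b for p, b in zip(point, base)]  # part orthogonal to base
    length = _norm(ortho)
    if length < 1e-12:
        return [0.0] * len(base)
    theta = math.acos(c)
    return [theta * o / length for o in ortho]

def exp_map(base, tangent):
    """Follow the geodesic from `base` along `tangent` for its full length."""
    theta = _norm(tangent)
    if theta < 1e-12:
        return list(base)
    return [math.cos(theta) * b + math.sin(theta) * t / theta
            for b, t in zip(base, tangent)]

def riemannian_mean(points, iterations=20):
    """Iteratively average in the tangent space of the current estimate."""
    mean = list(points[0])
    for _ in range(iterations):
        logs = [log_map(mean, p) for p in points]
        avg = [sum(col) / len(points) for col in zip(*logs)]
        mean = exp_map(mean, avg)
        scale = _norm(mean)
        mean = [m / scale for m in mean]  # guard against numerical drift
    return mean
```

Averaging the raw coordinates instead would generally give a point inside the sphere rather than on it, which is why summation-based Euclidean statistics are a poor fit for such data.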
CONTRIBUTIONS
In particular, this thesis presents a generalization of existing methods of PbD, namely
GMM-GMR and TP-GMM. This generalization involves the definition of Maximum Likelihood
Estimate (MLE), Gaussian conditioning and Gaussian product for the Riemannian
Gaussian, and the definition of Expectation Maximization (EM) and Gaussian Mixture
Regression (GMR) for the Riemannian GMM. In this generalization, we contributed
by proposing to use parallel transport for Gaussian conditioning. Furthermore, we presented
a unified approach to solve the aforementioned operations using a Gauss-Newton
algorithm. We demonstrated how synergies, encoded in a Riemannian Gaussian, can be
transformed into synergetic control policies using standard methods for Linear Quadratic
Regulator (LQR). This is achieved by formulating the LQR problem in a (Euclidean) tangent
space of the Riemannian manifold. Finally, we demonstrated how the context-adaptive
Task-Parameterized Gaussian Mixture Model (TP-GMM) can be used for context
inference, the ability to extract context from demonstration data of known tasks.
Our approach is the first attempt at context inference in the light of TP-GMM. Although
effective, we showed that it requires further improvements in terms of speed and reliability.
The efficacy of the Riemannian approach is demonstrated in a variety of scenarios.
In shared control, the Riemannian Gaussian is used to represent control intentions of a
human operator and an assistive system. Doing so, the properties of the Gaussian can
be employed to mix their control intentions. This yields shared-control systems that
continuously re-evaluate and assign control authority based on input confidence. The
context-adaptive TP-GMM is demonstrated in a Pick & Place task with changing pick and
place locations, a box-taping task with changing box sizes, and a trajectory tracking task
typically found in industr
Advanced Strategies for Robot Manipulators
Among robotic systems, robot manipulators have proven to be of increasing importance and are widely adopted to substitute for humans in repetitive and/or hazardous tasks. Modern manipulators have complex designs and must perform more precise and critical tasks. Simple traditional control methods are therefore no longer efficient, and advanced control strategies that take special constraints into account need to be established. Although groundbreaking research has been carried out in this area, many novel aspects remain to be explored.
Divergent Criticality – A Mechanism of Neural Function for Perception and Learning
The natural world presents opportunities to all organisms as they compete for the biological-value afforded to them through their ecological engagement. This presents two fundamental requirements for perceiving such opportunities: recognising value and learning how to access new value. Though many theoretical accounts of how we might achieve such selectionist ends have been explored – how ‘perception’ and ‘learning’ resonate with life’s challenges and opportunities, to date, no explanation has yet been able to naturalise such perception adequately in the Universal laws that govern our existence – not only for explaining the human experience of the world, but in exploring the true nature of our perception. This thesis explores our perceptions of engaging with the world and seeks to explain how the demands of our experiences resonate with the efficient functioning of our brain. It proposes that in a world of challenge and opportunity, rather than the efficient functioning of our neural resources, it is, instead, the optimising of ‘learning’ that is selected for, as an evolutionary priority. Building on existing literature in the fields of Phenomenology, Free Energy and Neuroscience, this thesis considers perception and learning as synonymous with the cognitive constructs of an ‘attention’ tuned for learning optimisation, and explores the processes of learning in neural function. It addresses the philosophical issues of how an individual’s perception of subjective experiences might provide some empirical objectivity in proposing a ‘Tolerance’ hypothesis. This is a relative definition able to coordinate a ‘perception of experience’ in terms of a learning function, grounded in free-energy theory (the laws of physics) and the ecological dynamics of a spontaneous or ‘self-organising’ mechanism – Divergent Criticality. The methodology incorporated three studies: Pilot, Developmental and Exploratory.
Over the three studies, Divergent Criticality was tested by developing a functional Affordance measure to address the Research Question – are perceptions as affective-cognitions made aware as reflecting the agential mediation of a self-regulating, optimal learning mechanism? Perception questionnaires of Situational Interest and Self-concept were used in Study One and Study Two to investigate their suitability in addressing the Research Question. Here, Factor Analysis and Structural Equation Modelling assessed the validity and reliability of these measures, developing robust questionnaires and a research design for testing Divergent Criticality. In Study Three, the Divergent Criticality hypothesis was found to be significant, supporting that a Divergent Criticality mechanism is in operation: When individuals are engaging with dynamic ecological challenges, perception is affective in accordance with Tolerance Optimisation, demonstrating that a Divergent Criticality mechanism is driving individuals to the limits of their Effectivity – an optimal learning state which is fundamental to life and naturalised in Universal laws
Change blindness: eradication of gestalt strategies
Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task.
Matching and compressing sequences of visual hulls
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 61-63).
In this thesis, we implement the polyhedral visual hull (PVH) algorithm in a modular software system to reconstruct 3D meshes from 2D images and camera poses. We also introduce the new idea of visual hull graphs. For data, using an eight-camera synchronous system after multi-camera calibration, we collect video sequences to study the pose and motion of people. For efficiency in VH processing, we compress 2D input contours to reduce the number of triangles in the output mesh and demonstrate how subdivision surfaces smoothly approximate the irregular output mesh in 3D. After generating sequences of visual hulls from source video, to define a visual hull graph, we use a simple distance metric for pose by calculating Chamfer distances between 2D shape contours. At each frame of our graph, we store a view-independent 3D pose and calculate the transition probability to any other frame based on similarity of pose. To test our approach, we synthesize new realistic motion by walking through cycles in the graph. Our results are new videos of arbitrary length and viewing direction based on a sample source video.
by Naveen Goela. M.Eng.
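The Chamfer distance mentioned in this abstract can be illustrated with a brute-force sketch over contours given as lists of 2D points. The symmetric averaging used here is one common variant and is an assumption, as are the function names. The thesis may use a different normalisation or a distance-transform implementation.

```python
import math

def directed_chamfer(source, target):
    """Mean distance from each source point to its nearest target point."""
    return sum(min(math.dist(p, q) for q in target) for p in source) / len(source)

def chamfer_distance(contour_a, contour_b):
    """Symmetric Chamfer distance: average of the two directed distances."""
    return 0.5 * (directed_chamfer(contour_a, contour_b)
                  + directed_chamfer(contour_b, contour_a))
```

For example, comparing a unit square's four corners with the same corners shifted by 0.1 along x gives a distance of about 0.1, while identical contours give 0. Small values indicate similar poses, which is the property a pose-similarity graph needs.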
2D-3D Pose Tracking of Rigid Instruments in Minimally Invasive Surgery
Instrument localization and tracking is an important challenge for advanced computer assisted techniques in minimally invasive surgery and image-based solutions to instrument localization can provide a non-invasive, low cost solution. In this study, we present a novel algorithm capable of recovering the 3D pose of laparoscopic surgical instruments combining constraints from a classification algorithm, multiple point features, stereo views (when available) and a linear motion model to robustly track the tool in surgical videos. We demonstrate the improved robustness and performance of our algorithm with optically tracked ground truth and additionally qualitatively demonstrate its performance on in vivo images. © 2014 Springer International Publishing Switzerland
Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations
Artificial intelligence, particularly the subfield of machine learning, has
seen a paradigm shift towards data-driven models that learn from and adapt to
data. This has resulted in unprecedented advancements in various domains such
as natural language processing and computer vision, largely attributed to deep
learning, a special class of machine learning models. Deep learning arguably
surpasses traditional approaches by learning the relevant features from raw
data through a series of computational layers.
This thesis explores the theoretical foundations of deep learning by studying
the relationship between the architecture of these models and the inherent
structures found within the data they process. In particular, we ask: what
drives the efficacy of deep learning algorithms and allows them to beat the
so-called curse of dimensionality, i.e. the difficulty of generally learning
functions in high dimensions due to the exponentially increasing need for data
points with increased dimensionality? Is it their ability to learn relevant
representations of the data by exploiting their structure? How do different
architectures exploit different data structures? In order to address these
questions, we push forward the idea that the structure of the data can be
effectively characterized by its invariances, i.e. aspects that are irrelevant
for the task at hand.
Our methodology takes an empirical approach to deep learning, combining
experimental studies with physics-inspired toy models. These simplified models
allow us to investigate and interpret the complex behaviors we observe in deep
learning systems, offering insights into their inner workings, with the
far-reaching goal of bridging the gap between theory and practice.
Comment: PhD Thesis @ EPF