
    A space-variant visual pathway model for data efficient deep learning

    We present an investigation into adopting a model of the retino-cortical mapping, found in biological visual systems, to improve the efficiency of image analysis using Deep Convolutional Neural Nets (DCNNs) in the context of robot vision and egocentric perception systems. This work has enabled DCNNs to process input images approaching one million pixels in size, in real time and in a single pass of the DCNN, using only consumer-grade graphics processor (GPU) hardware.

    Combined Admittance Control With Type II Singularity Evasion for Parallel Robots Using Dynamic Movement Primitives

    This article addresses a new way of generating compliant trajectories for control using movement primitives to allow physical human-robot interaction where parallel robots (PRs) are involved. PRs are suitable for tasks requiring precision and performance because of their robust behavior. However, two fundamental issues must be resolved to ensure safe operation: first, the force exerted on the human must be controlled and limited, and second, Type II singularities should be avoided to keep complete control of the robot. We offer a unified solution under the dynamic movement primitives (DMP) framework to tackle both tasks simultaneously. DMPs are used to get an abstract representation for movement generation and are involved in broad areas, such as imitation learning and movement recognition. For force control, we design an admittance controller intrinsically defined within the DMP structure, and subsequently, the Type II singularity evasion layer is added to the system. Both the admittance controller and the evader exploit the dynamic behavior of the DMP and its properties related to invariance and temporal coupling, and the whole system is deployed in a real PR meant for knee rehabilitation. The results show the capability of the system to perform safe rehabilitation exercises. This work was supported in part by the Fondo Europeo de Desarrollo Regional under Grant PID2021-125694OB-I00, in part by the Vicerrectorado de Investigacion de la Universitat Politecnica de Valencia under Grant PAID-11-21, and in part by the Ministerio de Universidades, Gobierno de Espana under Grant FPU18/05105. Escarabajal-Sánchez, RJ.; Pulloquinga-Zapata, J.; Valera Fernández, Á.; Mata Amela, V.; Vallés Miquel, M.; Castillo-García, FJ. (2023). Combined Admittance Control With Type II Singularity Evasion for Parallel Robots Using Dynamic Movement Primitives. IEEE Transactions on Robotics. 39(3):2224-2239. https://doi.org/10.1109/TRO.2023.3238136
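
As a rough illustration of the DMP framework named in the abstract above, the sketch below integrates the commonly used one-dimensional DMP transformation system toward a goal. It is a minimal sketch under stated assumptions, not the authors' controller: the function name `integrate_dmp`, the gain values, and the zero default forcing term are illustrative choices, and neither the admittance controller nor the singularity evasion layer is modeled.

```python
def integrate_dmp(y0, goal, tau=1.0, alpha=25.0, beta=6.25,
                  alpha_x=4.0, dt=0.001, duration=3.0,
                  forcing=lambda x: 0.0):
    """Integrate a 1-D DMP with explicit Euler steps.

    Transformation system:  tau * dz/dt = alpha * (beta * (goal - y) - z) + f(x)
                            tau * dy/dt = z
    Canonical system:       tau * dx/dt = -alpha_x * x
    All default values are assumptions chosen for illustration
    (beta = alpha / 4 gives a critically damped response).
    """
    y, z, x = y0, 0.0, 1.0
    trajectory = [y]
    for _ in range(int(round(duration / dt))):
        f = forcing(x)                       # learned shape term; zero by default
        dz = (alpha * (beta * (goal - y) - z) + f) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt                         # phase decays from 1 toward 0
        trajectory.append(y)
    return trajectory


if __name__ == "__main__":
    path = integrate_dmp(y0=0.0, goal=1.0)
    print(f"final position: {path[-1]:.6f}")
```

Because the phase variable `x` decays to zero, any forcing term that is scaled by `x` fades out and the system settles at the goal. Changing `tau` rescales the duration of the motion without changing its shape, which is one form of the temporal invariance the abstract refers to.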

    State of the Art in Face Recognition

    Notwithstanding the tremendous effort to solve the face recognition problem, it is not yet possible to design a face recognition system with a potential close to human performance. New computer vision and pattern recognition approaches need to be investigated. New knowledge and perspectives from other fields, such as psychology and neuroscience, must also be incorporated into the current field of face recognition to design a robust face recognition system. Indeed, much more effort is required to arrive at a human-like face recognition system. This book aims to narrow the gap between the previous state of face recognition research and its future state.

    Programming by Demonstration on Riemannian Manifolds

    This thesis presents a Riemannian approach to Programming by Demonstration (PbD). It generalizes an existing PbD method from Euclidean manifolds to Riemannian manifolds. In this abstract, we review the objectives, methods and contributions of the presented approach. OBJECTIVES PbD aims at providing a user-friendly method for skill transfer between human and robot. It enables a user to teach a robot new tasks using few demonstrations. In order to surpass simple record-and-replay, methods for PbD need to ‘understand’ what to imitate; they need to extract the functional goals of a task from the demonstration data. This is typically achieved through the application of statistical methods. The variety of data encountered in robotics is large. Typical manipulation tasks involve position, orientation, stiffness, force and torque data. These data are not solely Euclidean. Instead, they originate from a variety of manifolds, curved spaces that are only locally Euclidean. Elementary operations, such as summation, are not defined on manifolds. Consequently, standard statistical methods are not well suited to analyze demonstration data that originate from non-Euclidean manifolds. In order to effectively extract what-to-imitate, methods for PbD should take into account the underlying geometry of the demonstration manifold; they should be geometry-aware. Successful task execution does not solely depend on the control of individual task variables. By controlling variables individually, a task might fail when one is perturbed and the others do not respond. Task execution also relies on couplings among task variables. These couplings describe functional relations which are often called synergies. In order to understand what-to-imitate, PbD methods should be able to extract and encode synergies; they should be synergetic. In unstructured environments, it is unlikely that tasks are found in the same scenario twice. 
The circumstances under which a task is executed, the task context, are more likely to differ each time it is executed. Task context does not only vary during task execution, it also varies while learning and recognizing tasks. To be effective, a robot should be able to learn, recognize and synthesize skills in a variety of familiar and unfamiliar contexts; this can be achieved when its skill representation is context-adaptive. THE RIEMANNIAN APPROACH In this thesis, we present a skill representation that is geometry-aware, synergetic and context-adaptive. The presented method is probabilistic; it assumes that demonstrations are samples from an unknown probability distribution. This distribution is approximated using a Riemannian Gaussian Mixture Model (GMM). Instead of using the ‘standard’ Euclidean Gaussian, we rely on the Riemannian Gaussian, a distribution akin to the Gaussian, but defined on a Riemannian manifold. A Riemannian manifold is a manifold (a curved space which is locally Euclidean) that provides a notion of distance. This notion is essential for statistical methods as such methods rely on a distance measure. Examples of Riemannian manifolds in robotics are: the Euclidean space, which is used for spatial data, forces or torques; the spherical manifolds, which can be used for orientation data defined as unit quaternions; and Symmetric Positive Definite (SPD) manifolds, which can be used to represent stiffness and manipulability. The Riemannian Gaussian is intrinsically geometry-aware. Its definition is based on the geometry of the manifold, and therefore takes into account the manifold curvature. In robotics, the manifold structure is often known beforehand. In the case of PbD, it follows from the structure of the demonstration data. Like the Gaussian distribution, the Riemannian Gaussian is defined by a mean and covariance. The covariance describes the variance and correlation among the state variables. 
These can be interpreted as local functional couplings among state variables: synergies. This makes the Riemannian Gaussian synergetic. Furthermore, information encoded in multiple Riemannian Gaussians can be fused using the Riemannian product of Gaussians. This feature allows us to construct a probabilistic context-adaptive task representation. CONTRIBUTIONS In particular, this thesis presents a generalization of existing methods of PbD, namely GMM-GMR and TP-GMM. This generalization involves the definition of Maximum Likelihood Estimate (MLE), Gaussian conditioning and Gaussian product for the Riemannian Gaussian, and the definition of Expectation Maximization (EM) and Gaussian Mixture Regression (GMR) for the Riemannian GMM. In this generalization, we contributed by proposing to use parallel transport for Gaussian conditioning. Furthermore, we presented a unified approach to solve the aforementioned operations using a Gauss-Newton algorithm. We demonstrated how synergies, encoded in a Riemannian Gaussian, can be transformed into synergetic control policies using standard methods for Linear Quadratic Regulator (LQR). This is achieved by formulating the LQR problem in a (Euclidean) tangent space of the Riemannian manifold. Finally, we demonstrated how the context-adaptive Task-Parameterized Gaussian Mixture Model (TP-GMM) can be used for context inference, the ability to extract context from demonstration data of known tasks. Our approach is the first attempt at context inference in the light of TP-GMM. Although effective, we showed that it requires further improvements in terms of speed and reliability. The efficacy of the Riemannian approach is demonstrated in a variety of scenarios. In shared control, the Riemannian Gaussian is used to represent control intentions of a human operator and an assistive system. Doing so, the properties of the Gaussian can be employed to mix their control intentions. 
This yields shared-control systems that continuously re-evaluate and assign control authority based on input confidence. The context-adaptive TP-GMM is demonstrated in a Pick & Place task with changing pick and place locations, a box-taping task with changing box sizes, and a trajectory tracking task typically found in industr
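
To make the geometry-aware statistics described in this abstract more concrete, the sketch below estimates a mean on the unit sphere by repeatedly mapping the data into the tangent space at the current estimate (logarithmic map), averaging there, and mapping the result back onto the manifold (exponential map). This is a minimal illustration of the general idea, not the thesis's implementation: the function names, iteration count, and tolerance are assumptions, and the covariance estimate is omitted.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def log_map(base, point):
    """Tangent vector at `base` that points along the geodesic to `point`."""
    c = max(-1.0, min(1.0, dot(base, point)))
    theta = math.acos(c)                      # geodesic distance on the unit sphere
    ortho = [p - c * b for p, b in zip(point, base)]
    n = norm(ortho)
    if n < 1e-12:                             # identical (or antipodal) points
        return [0.0] * len(base)
    return [theta * o / n for o in ortho]


def exp_map(base, tangent):
    """Point reached by following the geodesic from `base` along `tangent`."""
    n = norm(tangent)
    if n < 1e-12:
        return list(base)
    return [math.cos(n) * b + math.sin(n) * t / n for b, t in zip(base, tangent)]


def riemannian_mean(points, iters=50, tol=1e-10):
    """Iteratively refine the mean; stops once the tangent-space update is tiny."""
    mean = list(points[0])
    for _ in range(iters):
        logs = [log_map(mean, p) for p in points]
        step = [sum(col) / len(points) for col in zip(*logs)]
        mean = exp_map(mean, step)
        if norm(step) < tol:
            break
    return mean


if __name__ == "__main__":
    a = 0.3
    samples = [
        (math.sin(a), 0.0, math.cos(a)),
        (-math.sin(a), 0.0, math.cos(a)),
        (0.0, math.sin(a), math.cos(a)),
        (0.0, -math.sin(a), math.cos(a)),
    ]
    print(riemannian_mean(samples))
```

Unlike an arithmetic average of the coordinates, the result always stays on the sphere, which is why this kind of update suits unit-quaternion orientation data. The sketch assumes the samples lie reasonably close together; widely spread or antipodal samples need more care.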

    Advanced Strategies for Robot Manipulators

    Among robotic systems, robot manipulators have proven to be of increasing importance and are widely adopted to substitute for humans in repetitive and/or hazardous tasks. Modern manipulators have complex designs and must perform more precise, crucial and critical tasks. Simple traditional control methods are therefore not efficient enough, and advanced control strategies that take special constraints into account need to be established. Although groundbreaking research has been carried out in this realm, there are still many novel aspects to be explored.

    Divergent Criticality – A Mechanism of Neural Function for Perception and Learning

    The natural world presents opportunities to all organisms as they compete for the biological-value afforded to them through their ecological engagement. This presents two fundamental requirements for perceiving such opportunities: to be able to recognise value and to learn how to access new value. Though many theoretical accounts of how we might achieve such selectionist ends have been explored – how ‘perception’ and ‘learning’ resonate with life’s challenges and opportunities, to date, no explanation has yet been able to naturalise such perception adequately in the Universal laws that govern our existence – not only for explaining the human experience of the world, but in exploring the true nature of our perception. This thesis explores our perceptions of engaging with the world and seeks to explain how the demands of our experiences resonate with the efficient functioning of our brain. It proposes that, in a world of challenge and opportunity, rather than the efficient functioning of our neural resources, it is, instead, the optimising of ‘learning’ that is selected for, as an evolutionary priority. Building on existing literature in the fields of Phenomenology, Free Energy and Neuroscience, this thesis considers perception and learning as synonymous with the cognitive constructs of an ‘attention’ tuned for learning optimisation, and explores the processes of learning in neural function. It addresses the philosophical issues of how an individual’s perception of subjective experiences might provide some empirical objectivity in proposing a ‘Tolerance’ hypothesis. This is a relative definition able to coordinate a ‘perception of experience’ in terms of a learning function, grounded in free-energy theory (the laws of physics) and the ecological dynamics of a spontaneous or ‘self-organising’ mechanism – Divergent Criticality. The methodology incorporated three studies: Pilot, Developmental and Exploratory. 
Over the three studies, Divergent Criticality was tested by developing a functional Affordance measure to address the Research Question – are perceptions as affective-cognitions made aware as reflecting the agential mediation of a self-regulating, optimal learning mechanism? Perception questionnaires of Situational Interest and Self-concept were used in Study One and Study Two to investigate their suitability in addressing the Research Question. Here, Factor Analysis and Structural Equation Modelling assessed the validity and reliability of these measures, developing robust questionnaires and a research design for testing Divergent Criticality. In Study Three, the Divergent Criticality hypothesis was found to be significant, supporting the claim that a Divergent Criticality mechanism is in operation: when individuals are engaging with dynamic ecological challenges, perception is affective in accordance with Tolerance Optimisation, demonstrating that a Divergent Criticality mechanism is driving individuals to the limits of their Effectivity – an optimal learning state which is fundamental to life and naturalised in Universal laws.

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation [Landman et al, 2003 Vision Research 43 149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task.

    Matching and compressing sequences of visual hulls

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 61-63). In this thesis, we implement the polyhedral visual hull (PVH) algorithm in a modular software system to reconstruct 3D meshes from 2D images and camera poses. We also introduce the new idea of visual hull graphs. For data, using an eight-camera synchronous system after multi-camera calibration, we collect video sequences to study the pose and motion of people. For efficiency in VH processing, we compress 2D input contours to reduce the number of triangles in the output mesh and demonstrate how subdivision surfaces smoothly approximate the irregular output mesh in 3D. After generating sequences of visual hulls from source video, to define a visual hull graph, we use a simple distance metric for pose by calculating Chamfer distances between 2D shape contours. At each frame of our graph, we store a view-independent 3D pose and calculate the transition probability to any other frame based on similarity of pose. To test our approach, we synthesize new realistic motion by walking through cycles in the graph. Our results are new videos of arbitrary length and viewing direction based on a sample source video. By Naveen Goela. M.Eng.
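
The Chamfer distance mentioned in this abstract can be sketched as follows for two contours given as lists of 2D points. This is a minimal brute-force illustration, not the thesis's implementation: the symmetric averaging convention and the function names are assumptions, and practical systems often use a distance transform instead of the nested loop shown here.

```python
import math


def directed_chamfer(src, dst):
    """Mean distance from each point in `src` to its nearest point in `dst`."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: average of the two directed distances."""
    return 0.5 * (directed_chamfer(a, b) + directed_chamfer(b, a))


if __name__ == "__main__":
    contour_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    contour_b = [(x + 0.5, y) for x, y in contour_a]   # same shape, shifted right
    print(chamfer_distance(contour_a, contour_b))
```

A small distance indicates similar silhouettes, so a score of this kind can serve as the pose-similarity measure from which transition probabilities between frames are derived. Note that `math.dist` requires Python 3.8 or later.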

    2D-3D Pose Tracking of Rigid Instruments in Minimally Invasive Surgery

    Instrument localization and tracking is an important challenge for advanced computer-assisted techniques in minimally invasive surgery, and image-based solutions to instrument localization can provide a non-invasive, low-cost solution. In this study, we present a novel algorithm capable of recovering the 3D pose of laparoscopic surgical instruments by combining constraints from a classification algorithm, multiple point features, stereo views (when available) and a linear motion model to robustly track the tool in surgical videos. We demonstrate the improved robustness and performance of our algorithm with optically tracked ground truth and additionally demonstrate its performance qualitatively on in vivo images. © 2014 Springer International Publishing Switzerland

    Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

    Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language processing and computer vision, largely attributed to deep learning, a special class of machine learning models. Deep learning arguably surpasses traditional approaches by learning the relevant features from raw data through a series of computational layers. This thesis explores the theoretical foundations of deep learning by studying the relationship between the architecture of these models and the inherent structures found within the data they process. In particular, we ask: what drives the efficacy of deep learning algorithms and allows them to beat the so-called curse of dimensionality, i.e. the difficulty of generally learning functions in high dimensions due to the exponentially increasing need for data points with increased dimensionality? Is it their ability to learn relevant representations of the data by exploiting their structure? How do different architectures exploit different data structures? In order to address these questions, we push forward the idea that the structure of the data can be effectively characterized by its invariances, i.e. aspects that are irrelevant for the task at hand. Our methodology takes an empirical approach to deep learning, combining experimental studies with physics-inspired toy models. These simplified models allow us to investigate and interpret the complex behaviors we observe in deep learning systems, offering insights into their inner workings, with the far-reaching goal of bridging the gap between theory and practice. Comment: PhD Thesis @ EPF