114 research outputs found

    Solution intervals for variables in spatial RCRCR linkages

    © 2019. Elsevier. An analytic method to compute the solution intervals for the input variables of spatial RCRCR linkages and their inversions is presented. The input-output equation is formulated as the intersection of a single ellipse with a parameterized family of ellipses, both related to the possible values that certain dual angles determined by the configuration of the mechanism can take. Bounds for the angles of the input pairs of the RCRCR and RRCRC inversions are found by imposing the tangency of two ellipses, which reduces to analyzing the discriminant of a fourth-degree polynomial. The bounds for the input pair of the RCRRC inversion are found as the intersection of a single ellipse with the envelope of the parameterized family of ellipses. The method provides the bounds of each of the assembly modes of the mechanism as well as the local extrema that may exist for the input variable. Peer reviewed. Postprint (author's final draft).

    Online EM with weight-based forgetting

    In the online version of the EM algorithm introduced by Sato and Ishii (2000), a time-dependent discount factor is used to forget the effect of old estimated values obtained with an earlier, inaccurate estimator. In their approach, forgetting is applied uniformly to the estimators of each mixture component depending exclusively on time, irrespective of the weight attributed to each unit for the observed sample. This causes excessive forgetting in the less frequently sampled regions. To address this problem, we propose a modification of the algorithm that involves weight-dependent forgetting, different for each mixture component, in which old observations are forgotten according to the actual weight of the new samples used to replace older values. A comparison of the time-dependent versus the weight-dependent approach shows that the latter improves the accuracy of the approximation and exhibits much greater stability. Peer reviewed.
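    The contrast described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the accumulator names `W` and `S`, the constant factor `lam`, and the `lam ** w` decay rule are all assumptions chosen only to show why purely time-based discounting erodes a component that currently receives no weight, while a weight-dependent rule leaves it intact.

```python
def update_time_based(W, S, x, w, lam=0.99):
    """Discount both accumulators by lam at every step, regardless of w."""
    return lam * W + w, lam * S + w * x


def update_weight_based(W, S, x, w, lam=0.99):
    """Assumed rule: discount by lam ** w, so a sample with w = 0 forgets nothing."""
    decay = lam ** w
    return decay * W + w, decay * S + w * x


def run(update, steps=500):
    """Feed a component `steps` samples for which it has zero responsibility."""
    W, S = 10.0, 20.0  # accumulated weight and weighted sum (mean S / W = 2.0)
    for _ in range(steps):
        W, S = update(W, S, x=5.0, w=0.0)
    return W, S


if __name__ == "__main__":
    print("time-based:  ", run(update_time_based))
    print("weight-based:", run(update_weight_based))
```

    Under these assumptions, the time-based accumulators shrink toward zero even though the component saw no relevant data, which makes its estimate fragile when a sample finally arrives; the weight-based accumulators are unchanged.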

    Online EM with weight-based forgetting

    Abstract of the work presented at the seminar held at the Instituto de Robótica e Informática Industrial (IRII-CSIC-UPC) on 20 November 2014. Peer reviewed.

    Competitive function approximation for reinforcement learning

    The application of reinforcement learning to problems with continuous domains requires representing the value function by means of function approximation. We identify two aspects of reinforcement learning that make the function approximation process hard: non-stationarity of the target function and biased sampling. Non-stationarity is the result of the bootstrapping nature of dynamic programming, where the value function is estimated using its current approximation. Biased sampling occurs when some regions of the state space are visited too often, causing repeated updates with similar values that fade out the occasional updates of infrequently sampled regions. We propose a competitive approach for function approximation in which many different local approximators are available at a given input and the one expected to give the best approximation is selected by means of a relevance function. The local nature of the approximators allows their fast adaptation to non-stationary changes and mitigates the biased sampling problem. The coexistence of multiple approximators updated and tried in parallel yields a good estimation much faster than would be possible with a single approximator. Experiments on different benchmark problems show that the competitive strategy provides faster and more stable learning than non-competitive approaches. Preprint.

    Second order collocation

    Technical report. Collocation methods for optimal control commonly assume that the system dynamics are expressed as a first-order ODE of the form dx/dt = f(x, u, t), where x is the state and u the control vector. However, in many cases, the dynamics involve the second-order derivatives of the coordinates: d^2q/dt^2 = g(q, dq/dt, u, t), so that, to preserve the first-order form, the usual procedure is to introduce one velocity variable for each coordinate and define the state as x = [q, v]^T, where q and v are treated as independent variables. As a consequence, the resulting trajectories do not fulfill the mandatory relationship v = dq/dt except at the collocation points, where it is explicitly imposed. We propose a formulation for Trapezoidal and Hermite-Simpson collocation methods adapted to deal directly with second-order dynamics without the need to introduce v as independent from q, and guaranteeing the consistency of the trajectories for q and v. Preprint.
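    The inconsistency mentioned in this abstract can be seen in the standard trapezoidal scheme applied to the first-order form. The block below shows only that textbook scheme, not the report's proposed formulation; the knot index k, the step h = t_{k+1} - t_k, and the shorthand g_k = g(q_k, v_k, u_k, t_k) are notation assumed here for illustration.

```latex
% Standard trapezoidal collocation on the first-order form, with x = [q, v]^T:
\begin{align}
q_{k+1} - q_k &= \tfrac{h}{2}\,\bigl(v_k + v_{k+1}\bigr), \\
v_{k+1} - v_k &= \tfrac{h}{2}\,\bigl(g_k + g_{k+1}\bigr).
\end{align}
% Between knots, q(t) and v(t) are each interpolated as a quadratic:
% dq/dt is then the linear interpolant of v_k and v_{k+1}, while v(t) itself
% is quadratic, so dq/dt = v(t) is guaranteed only at the knots t_k, t_{k+1}.
```

    In this scheme the two interpolants coincide on a whole interval only when g_k = g_{k+1}, which is the kind of mismatch a formulation working directly on the second-order dynamics would avoid.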

    Reinforcement learning for robot control using probability density estimations

    Presented at ICINCO 2010, held in Funchal (Portugal), 15 to 18 June. The successful application of Reinforcement Learning (RL) techniques to robot control is limited by the fact that, in most robotic tasks, the state and action spaces are continuous, multidimensional, and, in essence, too large for conventional RL algorithms to work. The well-known curse of dimensionality makes it infeasible to use a tabular representation of the value function, which is the classical approach that provides convergence guarantees. When a function approximation technique is used to generalize among similar states, the convergence of the algorithm is compromised, since updates unavoidably affect an extended region of the domain; that is, some situations are modified in a way that has not really been experienced, and the update may degrade the approximation. We propose an RL algorithm that uses a probability density estimation in the joint space of states, actions and Q-values as a means of function approximation. This allows us to devise an updating approach that, taking into account the local sampling density, avoids an excessive modification of the approximation far from the observed sample. This work was supported by the project 'CONSOLIDER-INGENIO 2010 Multimodal interaction in pattern recognition and computer vision' (V-00069). This research was partially supported by Consolider Ingenio 2010, project CSD2007-00018. Peer reviewed.

    Natural landmark detection for visually-guided robot navigation

    The main difficulty in attaining fully autonomous outdoor robot navigation is the fast detection of reliable visual references and their subsequent characterization as landmarks for immediate and unambiguous recognition. Aiming at speed, our strategy has been to track salient regions along image streams by performing only on-line pixel sampling. Persistent regions are considered good candidates for landmarks, which are then characterized by a set of subregions with given color and normalized shape. They are stored in a database for later recognition during the navigation process. Some experimental results showing landmark-based navigation of the legged robot Lauron III in an outdoor setting are provided. Peer reviewed.

    Body and leg coordination for omni-directional walking in rough terrain

    International Conference on Climbing and Walking Robots (CLAWAR), 2000, Madrid (Spain). In this paper, we address the problem of moving a legged robot in an unknown environment according to externally provided driving commands, assuming an arbitrary initial position of the legs. This general problem is decomposed into three subproblems: first, decide when to lift a given leg; second, choose where to move lifted legs; and third, coordinate body and leg movements to make the robot advance in the desired direction. To solve this third problem, we extend the posture control mechanism introduced in (1) so that body movements are limited to the desired trajectory. The resulting controller performs smooth transitions between different trajectories and can cope with irregular terrain where valid footholds are scarce. This work was supported by the project 'Navegación basada en visión de robots autónomos en entornos no estructurados.' (070-724). Peer reviewed.

    A competitive strategy for function approximation in Q-learning

    In this work we propose an approach for generalization in continuous-domain Reinforcement Learning that, instead of using a single function approximator, tries many different function approximators in parallel, each one defined in a different region of the domain. Associated with each approximator is a relevance function that locally quantifies the quality of its approximation, so that, at each input point, the approximator with the highest relevance can be selected. The relevance function is defined using parametric estimations of the variance of the q-values and the density of samples in the input space, which are used to quantify the accuracy of and the confidence in the approximation, respectively. These parametric estimations are obtained from a probability density distribution represented as a Gaussian Mixture Model embedded in the input-output space of each approximator. In our experiments, the proposed approach required fewer experiences for learning and produced more stable convergence profiles than a single function approximator. Peer reviewed. Preprint.