
    Realization of multi-input/multi-output switched linear systems from Markov parameters

    This paper presents a four-stage algorithm for the realization of multi-input/multi-output (MIMO) switched linear systems (SLSs) from Markov parameters. In the first stage, a linear time-varying (LTV) realization that is topologically equivalent to the true SLS is derived from the Markov parameters, assuming that the submodels have a common McMillan degree and that a mild condition on their dwell times holds. In the second stage, zero sets of LTV Hankel matrices, on which the realized system has a linear time-invariant (LTI) pulse response matching that of the original SLS, are exploited to extract the submodels, up to arbitrary similarity transformations, by a clustering algorithm that uses a statistic invariant to similarity transformations. Recovery is shown to be complete if the dwell times are sufficiently long and some mild identifiability conditions are met. In the third stage, the switching sequence is estimated by three schemes. The first scheme is based on forward/backward corrections and works on short segments. The second scheme matches Markov parameter estimates to the true parameters for LTV systems and works on medium-to-long segments. The third scheme also matches Markov parameters, but for LTI systems only, and works on very short segments. In the fourth stage, the submodels estimated in Stage 2 are brought to a common basis by a novel basis transformation method, which is necessary before predicting outputs for given inputs. A numerical example illustrates the properties of the realization algorithm. A key role in this algorithm is played by time-dependent switching sequences that partition the state space according to time, unlike many other works in the literature in which the partitioning is state- and/or input-dependent.
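
    The LTI ingredient underlying a realization step like Stage 1 is the classical Ho-Kalman construction: stack Markov parameters into a block-Hankel matrix and factor it to recover (A, B, C) up to similarity. The numpy sketch below illustrates that construction for a single LTI submodel; the function name, Hankel dimensions, and the test system are illustrative assumptions, not the paper's code.

```python
import numpy as np

def ho_kalman(markov, n, p, m, rows=5, cols=5):
    """Realize (A, B, C) up to similarity from Markov parameters.

    markov[k] is the p-by-m Markov parameter C A^k B; n is the assumed
    McMillan degree of the submodel.
    """
    # Block-Hankel matrix H[i, j] = markov[i + j] and its one-step shift.
    H = np.block([[markov[i + j] for j in range(cols)] for i in range(rows)])
    Hs = np.block([[markov[i + j + 1] for j in range(cols)] for i in range(rows)])

    # Rank-n factorization H = O @ R via a truncated SVD.
    U, s, Vt = np.linalg.svd(H)
    sq = np.sqrt(s[:n])
    O = U[:, :n] * sq                 # extended observability matrix
    R = (Vt[:n, :].T * sq).T          # extended reachability matrix

    A = np.linalg.pinv(O) @ Hs @ np.linalg.pinv(R)
    return A, R[:, :m], O[:p, :]      # A, B, C

# Sanity check: recover a 2-state SISO model from its Markov parameters.
A0 = np.array([[0.5, 0.2], [0.0, -0.3]])
B0 = np.array([[1.0], [1.0]]); C0 = np.array([[1.0, -1.0]])
markov = [C0 @ np.linalg.matrix_power(A0, k) @ B0 for k in range(10)]
A, B, C = ho_kalman(markov, n=2, p=1, m=1)
assert np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(A0)))
```

    The recovered (A, B, C) agree with the original system only up to a similarity transformation, which is exactly why the paper's Stage 4 basis-alignment step is needed before the submodels can be used together for prediction.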

    Robust learning of probabilistic hybrid models

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2008. Includes bibliographical references (p. 125-127).

    Autonomy has advanced immensely in the fields of control, estimation, and diagnosis, as seen in spacecraft that navigate toward pinpoint landings or speech recognition enabled on hand-held devices. Arguably the most important step toward controlling and improving a system is to understand that system. For this reason, accurate models are essential for continued advancement in the field of autonomy. Hybrid stochastic models, such as jump Markov linear systems (JMLS) and linear probabilistic hybrid automata (LPHA), can accurately represent a broad scope of problems. The goal of this thesis is to develop a robust method for learning accurate hybrid models automatically from data. A robust method should learn a set of model parameters, but it should also avoid convergence to locally optimal solutions that reduce accuracy, and it should be less sensitive to sparse or poor-quality observation data. These three goals are the focus of this thesis. We present the HML-LPHA algorithm, which uses approximate EM to learn maximum-likelihood model parameters of LPHA given a sequence of control inputs {u}_0^T and outputs {y}_1^{T+1}. We implement the algorithm in a scenario that simulates the mechanical wheel failure of the MER Spirit rover and demonstrate empirical convergence of the algorithm. Local convergence is a limitation of many optimization approaches for multimodal functions, including EM. For model learning, this can mean a severe compromise in accuracy. We present the kMeans-EM algorithm, which iteratively learns the locations and shapes of explored local maxima of our model likelihood function and focuses the search away from these areas of the solution space, toward undiscovered maxima that are promising a priori. We find that the kMeans-EM algorithm demonstrates iteratively increasing improvement over a random-restarts method, both in learning sets of model parameters with higher likelihood values and in reducing Euclidean distance to the true set of model parameters. Lastly, the AHML-LPHA algorithm is an active hybrid model learning approach that augments sparse and/or very noisy training data with limited queries of the discrete state. We use an active approach for adding data to our training set, querying at the points that obtain the greatest reduction in uncertainty of the distribution over hybrid state trajectories. Empirical evidence indicates that querying only 6% of the time reduces continuous-state squared error and the MAP mode estimate error of the discrete state. We also find that when the passive learner, HML-LPHA, diverges due to poor initialization or training data, the AHML-LPHA algorithm is capable of convergence; at times, just one query allows for convergence, demonstrating a vast improvement in learning capacity with a very limited amount of data augmentation.

    by Stephanie Gil. S.M.
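
    The restart strategy at the heart of kMeans-EM can be sketched independently of LPHA. In the toy sketch below, hill climbing on a two-bump stand-in likelihood plays the role of one EM run, and a simple distance-based rejection rule stands in for the thesis's clustering of explored maxima; every name and constant is an illustrative assumption, not the thesis's code.

```python
import numpy as np

def local_ascent(f, x0, step=0.05, iters=300, eps=1e-4):
    """Hill climbing via finite-difference gradients; stands in for one EM run."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        grad = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                         for e in np.eye(x.size)])
        x += step * grad
    return x

def guided_restarts(f, lo, hi, n_proposals=100, radius=0.7, seed=0):
    """Restart local ascent, skipping starts near already-discovered maxima."""
    rng = np.random.default_rng(seed)
    maxima = []
    for _ in range(n_proposals):
        x0 = rng.uniform(lo, hi, size=2)
        # Focus the search away from explored regions of the solution space.
        if any(np.linalg.norm(x0 - m) < radius for m in maxima):
            continue
        maxima.append(local_ascent(f, x0))
    return max(maxima, key=f)

# A two-bump stand-in likelihood: the better maximum is at (2, 2).
f = lambda x: np.exp(-np.sum(x ** 2)) + 2.0 * np.exp(-np.sum((x - 2.0) ** 2))
print(guided_restarts(f, lo=-1.0, hi=3.0))   # converges near (2, 2)
```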

    Risk-minimizing program execution in robotic domains

    Thesis (Sc.D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 153-161).

    In this thesis, we argue that autonomous robots operating in hostile and uncertain environments can improve robustness by computing and reasoning explicitly about risk. Autonomous robots with a keen sensitivity to risk can be trusted with critical missions, such as exploring deep space and assisting on the battlefield. We introduce a novel, risk-minimizing approach to program execution that utilizes program flexibility and estimation of risk in order to make runtime decisions that minimize the probability of program failure. Our risk-minimizing executive, called Murphy, utilizes two forms of program flexibility, 1) flexible scheduling of activity timing and 2) redundant choice between subprocedures, in order to minimize two forms of program risk, 1) exceptions arising from activity failures and 2) exceptions arising from timing-constraint violations in a program. Murphy takes two inputs, a program written in a nondeterministic variant of the Reactive Model-based Programming Language (RMPL) and a set of stochastic activity failure models (one for each activity in a program), and computes two outputs, a risk-minimizing decision policy and a value function. The decision policy informs Murphy which decisions to make at runtime in order to minimize risk, while the value function quantifies risk. In order to execute with low latency, Murphy computes the decision policy and value function offline, as a compilation step prior to program execution. In this thesis, we develop three approaches to RMPL program execution. First, we develop an approach that is guaranteed to minimize risk. For this approach, we reason probabilistically about risk by framing program execution as a Markov decision process (MDP). Next, we develop an approach that avoids risk altogether. For this approach, we frame program execution as a novel form of constraint-based temporal reasoning. Finally, we develop an execution approach that trades optimality in risk avoidance for tractability. For this approach, we leverage prior work in hierarchical decomposition of MDPs in order to mitigate complexity. We benchmark the tractability of each approach on a set of representative RMPL programs, and we demonstrate the applicability of the approach on a humanoid robot simulator.

    by Robert T. Effinger, IV. Sc.D.
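
    Under the MDP framing, the probability of failure obeys a Bellman recursion: the risk at a state is the minimum over available decisions of the expected risk of the successors. The toy sketch below (states, actions, and transition probabilities are all invented; this is not Murphy's actual model) computes a risk-minimizing decision policy and the associated value function by value iteration.

```python
import numpy as np

# Toy MDP with states 0..3; state 2 is the goal and state 3 is "failed"
# (both absorbing). P[a][s, s'] is the transition probability under
# decision a.
P = [np.array([[0.0, 0.9, 0.0, 0.1],
               [0.0, 0.0, 0.8, 0.2],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]]),
     np.array([[0.0, 0.0, 0.7, 0.3],
               [0.0, 0.0, 0.95, 0.05],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]])]
GOAL, FAIL = 2, 3

# Value iteration on risk(s) = min_a sum_s' P(s'|s,a) * risk(s').
risk = np.zeros(4)
risk[FAIL] = 1.0
for _ in range(100):
    q = np.stack([Pa @ risk for Pa in P])   # q[a, s]: risk of taking a in s
    new = q.min(axis=0)
    new[GOAL], new[FAIL] = 0.0, 1.0         # pin the terminal states
    if np.allclose(new, risk):
        break
    risk = new

policy = np.stack([Pa @ risk for Pa in P]).argmin(axis=0)
print("value function (risk):", risk)       # [0.145, 0.05, 0.0, 1.0]
print("decision policy:", policy)           # action 0 in s0, action 1 in s1
```

    As in the thesis, the loop above runs entirely offline; at runtime an executive would only index into `policy`, which is what keeps execution latency low.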

    Model learning for switching linear systems with autonomous mode transitions

    We present a novel method for model learning in hybrid discrete-continuous systems. The approach uses approximate Expectation-Maximization to learn the maximum-likelihood parameters of a switching linear system. It extends previous work by 1) considering autonomous mode transitions, where the discrete transitions are conditioned on the continuous state, and 2) learning the effects of control inputs on the system. We demonstrate the approach in simulation.
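
    The distinguishing feature here is that the discrete transition distribution depends on the continuous state. One standard way to express that dependence, offered purely as an assumed illustration rather than this paper's parameterization, is a softmax gate over successor modes:

```python
import numpy as np

def mode_transition_probs(x, W, b):
    """P(next mode | current mode, continuous state x) via a softmax gate.

    W is (n_modes, n_x) and b is (n_modes,) for the current mode; the
    continuous state x shifts the logits, making the switch autonomous.
    """
    logits = W @ x + b
    logits -= logits.max()            # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Example: from mode 0, switching to mode 1 becomes likely once x > ~1.
W = np.array([[0.0], [3.0]])
b = np.array([3.0, 0.0])
print(mode_transition_probs(np.array([0.0]), W, b))  # ~[0.95, 0.05]: stay
print(mode_transition_probs(np.array([2.0]), W, b))  # ~[0.05, 0.95]: switch
```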

    Compact parametric models for efficient sequential decision making in high-dimensional, uncertain domains

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 137-144).

    Within artificial intelligence and robotics there is considerable interest in how a single agent can autonomously make sequential decisions in large, high-dimensional, uncertain domains. This thesis presents decision-making algorithms for maximizing the expected sum of future rewards in two types of large, high-dimensional, uncertain situations: when the agent knows its current state but does not have a model of the world dynamics, within a Markov decision process (MDP) framework; and in partially observable Markov decision processes (POMDPs), when the agent knows the dynamics and reward models but only receives information about its state through its potentially noisy sensors. One of the key challenges in the sequential decision-making field is the tradeoff between optimality and tractability. To handle high-dimensional (many variables), large (many potential values per variable) domains, an algorithm must have a computational complexity that scales gracefully with the number of dimensions. However, many prior approaches achieve such scalability through heuristic methods with limited or no guarantees on how close to optimal the resulting decisions are, or under what circumstances. Algorithms that do provide rigorous optimality bounds often do so at the expense of tractability. This thesis proposes that the use of parametric models of the world dynamics, rewards, and observations can enable efficient, provably close-to-optimal decision making in large, high-dimensional, uncertain environments. In support of this, we present a reinforcement learning (RL) algorithm where the use of a parametric model allows the algorithm to make close-to-optimal decisions on all but a number of samples that scales polynomially with the dimension, a significant improvement over most prior provably approximately optimal RL algorithms. We also show that parametric models can be used to reduce the computational complexity from an exponential to a polynomial dependence on the state dimension in forward-search POMDP planning. Under mild conditions our new forward-search POMDP planner maintains prior optimality guarantees on the resulting decisions. We present experimental results on an RL task involving robot navigation over varying terrain and on a large global-driving POMDP planning simulation.

    by Emma Patricia Brunskill. Ph.D.
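
    To make the parametric-model idea concrete: if the dynamics are assumed to have, say, a linear-Gaussian form (an assumption for illustration; the thesis treats a broader model class), the unknown parameters can be fit by least squares from a number of samples that grows polynomially with the state dimension:

```python
import numpy as np

def fit_linear_gaussian(X, U, X_next):
    """Least-squares fit of x' = A x + B u + noise from transition samples.

    The parameter count grows as O(n^2) in the state dimension n, so the
    number of samples needed grows polynomially, not with the (possibly
    exponential) number of distinct states.
    """
    Z = np.hstack([X, U])                              # regressors [x, u]
    Theta, *_ = np.linalg.lstsq(Z, X_next, rcond=None)
    A, B = Theta[:X.shape[1]].T, Theta[X.shape[1]:].T
    resid = X_next - Z @ Theta
    Sigma = resid.T @ resid / max(len(X) - Z.shape[1], 1)
    return A, B, Sigma

# Recover known dynamics from 50 noisy transitions.
rng = np.random.default_rng(1)
A0, B0 = np.array([[1.0, 0.1], [0.0, 0.9]]), np.array([[0.0], [0.5]])
X, U = rng.standard_normal((50, 2)), rng.standard_normal((50, 1))
X_next = X @ A0.T + U @ B0.T + 0.01 * rng.standard_normal((50, 2))
A, B, _ = fit_linear_gaussian(X, U, X_next)
print(np.round(A, 2))   # close to A0
print(np.round(B, 2))   # close to B0
```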

    Efficient planning under uncertainty with macro-actions

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 163-168).

    Planning in large, partially observable domains is challenging, especially when good performance requires considering situations far in the future. Existing planners typically construct a policy by performing fully conditional planning, where each future action is conditioned on a set of possible observations that could be obtained at every timestep. Unfortunately, fully conditional planning can be computationally expensive, and state-of-the-art solvers are either limited in the size of problems that can be solved or can only plan out to a limited horizon. We propose that for a large class of real-world planning-under-uncertainty problems it is necessary to perform far-lookahead decision making, but unnecessary to construct policies that condition all actions on observations obtained at the previous timestep. Instead, these problems can be solved by performing semi-conditional planning, where the constructed policy only conditions actions on observations at certain key points. Between these key points, the policy assumes that a macro-action, a temporally extended, fixed-length, open-loop sequence of primitive actions, is executed. These macro-actions are evaluated within a forward-search framework, which only considers beliefs that are reachable from the agent's current belief under different actions and observations; a belief summarizes an agent's past history of actions and observations. Together, semi-conditional planning and forward search restrict the policy space in exchange for conditional planning out to a longer horizon. Two technical challenges must be overcome to perform semi-conditional planning efficiently: automatically generating the macro-actions, and efficiently incorporating the macro-actions into the forward-search framework. We propose an algorithm that automatically constructs the macro-actions evaluated within a forward-search planning framework, iteratively refining the macro-actions as more computation time is made available for planning. In addition, we show that for a subset of problem domains it is possible to analytically compute the distribution over posterior beliefs that results from a single macro-action. This ability to directly compute a distribution over posterior beliefs yields computational savings when performing macro-action forward search. We present performance and computational analysis for the algorithms proposed in this thesis, as well as simulation experiments that demonstrate superior performance relative to existing state-of-the-art solvers on large planning-under-uncertainty domains. We also demonstrate our planning-under-uncertainty algorithms on target-tracking applications for an actual autonomous helicopter, highlighting the practical potential for planning in real-world, long-horizon, partially observable domains.

    by Ruijie He. Ph.D.
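
    Forward search with macro-actions can be sketched as a recursion: roll the belief forward open-loop through each candidate macro-action, branch on observations only at the macro-action boundary, and recurse to the chosen depth. The sketch below uses a tiny invented 2-state POMDP so it runs end to end, and hand-built macro-actions in place of the thesis's automatic construction; all names and numbers are illustrative assumptions.

```python
import numpy as np

# Toy 2-state POMDP: T[a, s, s'], Z[s, o], R[a, s].
T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.2, 0.8], [0.8, 0.2]]])
Z = np.array([[0.85, 0.15], [0.15, 0.85]])
R = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def plan(b, macros, depth):
    """Semi-conditional forward search: branch on observations only at
    macro-action boundaries, never inside a macro-action."""
    if depth == 0:
        return 0.0, None
    best_value, best_macro = float("-inf"), None
    for macro in macros:
        value, bm = 0.0, b
        for a in macro:                       # open loop inside the macro
            value += bm @ R[a]                # expected immediate reward
            bm = bm @ T[a]                    # belief prediction, no branching
        for o in range(Z.shape[1]):           # branch at the boundary
            p_o = bm @ Z[:, o]
            posterior = bm * Z[:, o] / p_o
            future, _ = plan(posterior, macros, depth - 1)
            value += p_o * future
        if value > best_value:
            best_value, best_macro = value, macro
    return best_value, best_macro

macros = [(0, 0), (1, 1), (0, 1)]             # hand-built macro-actions
print(plan(np.array([0.5, 0.5]), macros, depth=2))
```

    Because observations are only branched on between macro-actions, the search tree grows with the number of macro-actions and key points rather than with every primitive action and observation, which is the tractability trade the abstract describes.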