112 research outputs found

    Bayesian nonparametric learning of complex dynamical phenomena

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 257-270). The complexity of many dynamical phenomena precludes the use of linear models for which exact analytic techniques are available. However, inference on standard nonlinear models quickly becomes intractable. In some cases, Markov switching processes, with switches between a set of simpler models, are employed to describe the observed dynamics. Such models typically rely on pre-specifying the number of Markov modes. In this thesis, we instead take a Bayesian nonparametric approach in defining a prior on the model parameters that allows for flexibility in the complexity of the learned model and for development of efficient inference algorithms. We start by considering dynamical phenomena that can be well-modeled as a hidden discrete Markov process, but in which there is uncertainty about the cardinality of the state space. The standard finite state hidden Markov model (HMM) has been widely applied in speech recognition, digital communications, and bioinformatics, amongst other fields. Through the use of the hierarchical Dirichlet process (HDP), one can examine an HMM with an unbounded number of possible states. We revisit this HDP-HMM and develop a generalization of the model, the sticky HDP-HMM, that allows more robust learning of smoothly varying state dynamics through a learned bias towards self-transitions. We show that this sticky HDP-HMM not only better segments data according to the underlying state sequence, but also improves the predictive performance of the learned model. Additionally, the sticky HDP-HMM enables learning more complex, multimodal emission distributions.
We demonstrate the utility of the sticky HDP-HMM on the NIST speaker diarization database, segmenting audio files into speaker labels while simultaneously identifying the number of speakers present. Although the HDP-HMM and its sticky extension are very flexible time series models, they make a strong Markovian assumption that observations are conditionally independent given the discrete HMM state. This assumption is often insufficient for capturing the temporal dependencies of the observations in real data. To address this issue, we develop extensions of the sticky HDP-HMM for learning two classes of switching dynamical processes: the switching linear dynamical system (SLDS) and the switching vector autoregressive (SVAR) process. These conditionally linear dynamical models can describe a wide range of complex dynamical phenomena from the stochastic volatility of financial time series to the dance of honey bees, two examples we use to show the power and flexibility of our Bayesian nonparametric approach. For all of the presented models, we develop efficient Gibbs sampling algorithms employing a truncated approximation to the HDP that allows incorporation of dynamic programming techniques, greatly improving mixing rates. In many applications, one would like to discover and model dynamical behaviors which are shared among several related time series. By jointly modeling such sequences, we may more robustly estimate representative dynamic models, and also uncover interesting relationships among activities. In the latter part of this thesis, we consider a Bayesian nonparametric approach to this problem by harnessing the beta process to allow each time series to have infinitely many potential behaviors, while encouraging sharing of behaviors amongst the time series. For this model, we develop an efficient and exact Markov chain Monte Carlo (MCMC) inference algorithm. 
In particular, we exploit the finite dynamical system induced by a fixed set of behaviors to efficiently compute acceptance probabilities, and reversible jump birth and death proposals to explore new behaviors. We present results on unsupervised segmentation of data from the CMU motion capture database. By Emily B. Fox. Ph.D.
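
The self-transition bias and the truncated approximation described in this abstract can be illustrated with a small sketch. It is a minimal illustration under stated assumptions, not the thesis's inference code: it assumes a finite truncation to `n_states` modes, draws global weights `beta` from a symmetric Dirichlet, and draws each transition row from a Dirichlet whose concentration adds `kappa` to the self-transition entry. The function names and the hyperparameter values (`gamma`, `alpha`, `kappa`) are hypothetical choices made for the example.

```python
import random

def sample_dirichlet(rng, concentrations):
    """Draw one Dirichlet sample by normalizing independent gamma draws."""
    while True:
        # gammavariate requires a strictly positive shape, so floor tiny values.
        draws = [rng.gammavariate(max(c, 1e-9), 1.0) for c in concentrations]
        total = sum(draws)
        if total > 0.0:  # resample in the unlikely case every draw underflows
            return [d / total for d in draws]

def sample_sticky_transitions(rng, n_states, gamma, alpha, kappa):
    """Sample a transition matrix under a finite truncation (an assumption here).

    beta  ~ Dir(gamma / L, ..., gamma / L)                 shared global weights
    pi_j  ~ Dir(alpha * beta + kappa * e_j)                row j, extra mass on j
    """
    beta = sample_dirichlet(rng, [gamma / n_states] * n_states)
    rows = []
    for j in range(n_states):
        concentration = [alpha * b for b in beta]
        concentration[j] += kappa  # the "sticky" self-transition bias
        rows.append(sample_dirichlet(rng, concentration))
    return rows

def mean_self_transition(rows):
    """Average probability of staying in the same state."""
    return sum(rows[j][j] for j in range(len(rows))) / len(rows)

rng = random.Random(0)
sticky = sample_sticky_transitions(rng, n_states=10, gamma=5.0, alpha=1.0, kappa=50.0)
plain = sample_sticky_transitions(rng, n_states=10, gamma=5.0, alpha=1.0, kappa=0.0)
print("sticky:", mean_self_transition(sticky))
print("plain: ", mean_self_transition(plain))
```

Setting `kappa` to zero recovers the non-sticky prior in this sketch, so comparing the two printed averages shows how the bias favors state persistence.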

    Machine Learning for Robot Grasping and Manipulation

    Robotics as a technology has an incredible potential for improving our everyday lives. Robots could perform household chores, such as cleaning, cooking, and gardening, in order to give us more time for other pursuits. Robots could also be used to perform tasks in hazardous environments, such as turning off a valve in an emergency or safely sorting our more dangerous trash. However, all of these applications would require the robot to perform manipulation tasks with various objects. Today's robots are used primarily for performing specialized tasks in controlled scenarios, such as manufacturing. The robots that are used in today's applications are typically designed for a single purpose and they have been preprogrammed with all of the necessary task information. In contrast, a robot working in a more general environment will often be confronted with new objects and scenarios. Therefore, in order to reach their full potential as autonomous physical agents, robots must be capable of learning versatile manipulation skills for different objects and situations. Hence, we have worked on a variety of manipulation skills to improve those capabilities of robots, and the results have led to several new approaches, which are presented in this thesis. Learning manipulation skills is, however, an open problem with many challenges that still need to be overcome. The first challenge is to acquire and improve manipulation skills with little to no human supervision. Rather than being preprogrammed, the robot should be able to learn from human demonstrations and through physical interactions with objects. Learning to improve skills through trial and error learning is a particularly important ability for an autonomous robot, as it allows the robot to handle new situations. This ability also removes the burden from the human demonstrator to teach a skill perfectly, as a robot is allowed to make mistakes if it can learn from them. 
In order to address this challenge, we present a continuum-armed bandits approach for learning to grasp objects. The robot learns to predict the performances of different grasps, as well as how certain it is of this prediction, and selects grasps accordingly. As the robot tries more grasps, its predictions become more accurate, and its grasps improve accordingly. A robot can master a manipulation skill by learning from different objects in various scenarios. Another fundamental challenge is therefore to efficiently generalize manipulations between different scenarios. Rather than relearning from scratch, the robot should find similarities between the current situation and previous scenarios in order to reuse manipulation skills and task information. For example, the robot can learn to adapt manipulation skills to new objects by finding similarities between them and known objects. However, only some similarities between objects will be relevant for a given manipulation. The robot must therefore also learn which similarities are important for adapting the manipulation skill. We present two object representations for generalizing between different situations. Contacts between objects are important for many manipulations, but it is difficult to define general features for representing sets of contacts. Instead, we define a kernel function for comparing contact distributions, which allows the robot to use kernel methods for learning manipulations. The second approach is to use warped parameters to define more abstract features, such as areas and volumes. These features are defined as functions of known object models. The robot can compute these parameters for novel objects by warping the shape of the known object to match the unknown object. Learning about objects also requires the robot to reconcile information from multiple sensor modalities, including touch, hearing, and vision. 
While some object properties will only be observed by specific sensor modalities, other object properties can be determined from multiple sensor modalities. For example, while color can only be determined by vision, the shape of an object can be observed using vision or touch. The robot should use information from all of its senses in order to quickly learn about objects. We explain how the robot can learn low-dimensional representations of tactile data by incorporating cues from vision data. As touching an object usually occludes the surface, the proposed method was designed to work with weak pairings between the data in the two sensor modalities. The robot can also learn more efficiently if it reuses skills between different tasks. Rather than relearn a skill for each new task, the robot should learn manipulation skills that can be reused for multiple tasks. For an autonomous robot, this would require the robot to divide tasks into smaller steps. Dividing tasks into smaller parts makes it easier to learn the corresponding skills. If a step is a part of many tasks, then the robot will have more opportunities to practice the associated skill, and more tasks will benefit from the resulting performance improvement. In order to learn a set of useful subtasks, we propose a probabilistic model for dividing manipulations into phases. This model captures the conditions for transitioning between different phases, which represent subgoals and constraints of the overall tasks. The robot can use the model together with model-based reinforcement learning in order to learn skills for moving between phases. When confronted with a new task, the robot will have to select a suitable sequence of skills to execute. The robot must therefore also learn to select which manipulation to execute in the current scenario. 
Selecting sequences of motor primitives is difficult, as the robot must take into consideration the current task, state, and future actions when selecting the next motor skill to execute. We therefore present a value function method for selecting skills in an optimal manner. The robot learns the value function for the continuous state space using a flexible non-parametric model-based approach. Learning manipulation skills also poses certain challenges for learning methods. The robot will not have thousands of samples when learning a new manipulation skill, and must instead actively collect new samples or use data from similar scenarios. The learning methods presented in this thesis are, therefore, designed to work with relatively small amounts of data, and can generally be used during the learning process. Manipulation tasks also present a spectrum of different problem types. Hence, we present supervised, unsupervised, and reinforcement learning approaches in order to address the diverse challenges of learning manipulation skills.
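
The grasp-selection idea in this abstract (predict the performance of each grasp, track how certain the prediction is, and choose accordingly) can be sketched with a simple upper-confidence rule. This is a swapped-in simplification, not the thesis's continuum-armed bandit method: it assumes a one-dimensional grasp parameter in [0, 1] discretized into candidates, a Gaussian kernel-weighted average as the predictor, and an uncertainty term that shrinks as nearby trials accumulate. All names, the kernel width, and the bonus weight are hypothetical.

```python
import math

def kernel(x, y, width=0.1):
    """Gaussian similarity between two grasp parameters."""
    return math.exp(-0.5 * ((x - y) / width) ** 2)

def ucb_scores(candidates, tried, rewards, bonus=1.0):
    """Score each candidate as predicted reward plus an uncertainty bonus."""
    scores = []
    for c in candidates:
        weights = [kernel(c, x) for x in tried]
        density = sum(weights)  # how much nearby experience exists
        if density > 0.0:
            mean = sum(w * r for w, r in zip(weights, rewards)) / density
        else:
            mean = 0.0  # no data yet: neutral prediction
        uncertainty = 1.0 / math.sqrt(1.0 + density)
        scores.append(mean + bonus * uncertainty)
    return scores

def learn_grasp(reward_fn, n_trials=40, n_candidates=21):
    """Repeatedly try the highest-scoring grasp and record the outcome."""
    candidates = [i / (n_candidates - 1) for i in range(n_candidates)]
    tried, rewards = [], []
    for _ in range(n_trials):
        scores = ucb_scores(candidates, tried, rewards)
        best_index = max(range(len(candidates)), key=scores.__getitem__)
        grasp = candidates[best_index]
        tried.append(grasp)
        rewards.append(reward_fn(grasp))
    best_trial = max(range(len(tried)), key=rewards.__getitem__)
    return tried[best_trial], rewards[best_trial], tried

# Hypothetical reward: grasp quality peaks at parameter 0.7.
def toy_reward(grasp):
    return math.exp(-((grasp - 0.7) / 0.2) ** 2)

best_grasp, best_reward, history = learn_grasp(toy_reward)
print(best_grasp, best_reward)
```

The bonus term drives the robot toward untried grasps early on, while the kernel average pulls it back toward regions that have paid off; the balance between the two is set by `bonus`.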

    Representing and Inferring Visual Perceptual Skills in Dermatological Image Understanding

    Experts have a remarkable capability of locating, perceptually organizing, identifying, and categorizing objects in images specific to their domains of expertise. Eliciting and representing their visual strategies and some aspects of domain knowledge will benefit a wide range of studies and applications. For example, image understanding may be improved through active learning frameworks by transferring human domain knowledge into image-based computational procedures, intelligent user interfaces enhanced by inferring dynamic informational needs in real time, and cognitive processing analyzed via unveiling the engaged underlying cognitive processes. An eye tracking experiment was conducted to collect both eye movement and verbal narrative data from three groups of subjects with different medical training levels or no medical training in order to study perceptual skill. Each subject examined and described 50 photographical dermatological images. One group comprised 11 board-certified dermatologists (attendings), another group was 4 dermatologists in training (residents), and the third group 13 novices (undergraduate students with no medical training). We develop a novel hierarchical probabilistic framework to discover the stereotypical and idiosyncratic viewing behaviors exhibited by the three expertise-specific groups. A hidden Markov model is used to describe each subject's eye movement sequence combined with hierarchical stochastic processes to capture and differentiate the discovered eye movement patterns shared by multiple subjects' eye movement sequences within and among the three expertise-specific groups. Through these patterned eye movement behaviors we are able to elicit some aspects of the domain-specific knowledge and perceptual skill from the subjects whose eye movements are recorded during diagnostic reasoning processes on medical images. 
Analyzing experts' eye movement patterns provides us insight into cognitive strategies exploited to solve complex perceptual reasoning tasks. Independent experts' annotations of diagnostic conceptual units of thought in the transcribed verbal narratives are time-aligned with discovered eye movement patterns to help interpret the patterns' meanings. By mapping eye movement patterns to thought units, we uncover the relationships between visual and linguistic elements of their reasoning and perceptual processes, and show the manner in which these subjects varied their behaviors while parsing the images.

    Connected Attribute Filtering Based on Contour Smoothness


    Statistical and image analysis methods and applications


    A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium

    When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.
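
For orientation, the CAR side of this comparison can be sketched numerically. The abstract does not state a formula, so this example assumes the commonly used proper CAR specification, in which the latent spatial effect is Gaussian with precision matrix Q = τ(D − ρW), where W is a binary adjacency matrix and D is the diagonal matrix of neighbor counts. The symbol τ, the function name, and the four-site path graph are illustrative assumptions, and the DAGAR construction is not shown.

```python
def car_precision(adjacency, rho, tau):
    """Build the proper CAR precision matrix Q = tau * (D - rho * W).

    adjacency: symmetric 0/1 matrix W given as a list of lists
    rho:       spatial dependence parameter (|rho| < 1 keeps Q positive definite)
    tau:       overall precision scale (an assumed name)
    """
    n = len(adjacency)
    precision = [[0.0] * n for _ in range(n)]
    for i in range(n):
        degree = sum(adjacency[i])  # number of neighbors of site i
        for j in range(n):
            if i == j:
                precision[i][j] = tau * degree
            elif adjacency[i][j]:
                precision[i][j] = -tau * rho * adjacency[i][j]
    return precision

# Hypothetical map: four sites on a line, 0 - 1 - 2 - 3.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
Q = car_precision(W, rho=0.9, tau=1.0)
for row in Q:
    print(row)
```

In this parameterization ρ enters only through the off-diagonal precision entries, which is one way to see why it does not translate directly into a neighbor-pair correlation, the property the abstract attributes to DAGAR instead.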

    A Statistical Approach to the Alignment of fMRI Data

    Multi-subject functional magnetic resonance imaging (fMRI) studies are critical. Because anatomical and functional structure varies across subjects, image alignment is necessary. We define a probabilistic model to describe functional alignment. By imposing a prior distribution, such as the matrix Fisher-von Mises distribution, on the orthogonal transformation parameter, anatomical information is embedded in the estimation of the parameters, i.e., by penalizing the combination of spatially distant voxels. Real applications show an improvement in the classification and interpretability of the results compared to various functional alignment methods.