Generating Long-term Trajectories Using Deep Hierarchical Networks
We study the problem of modeling spatiotemporal trajectories over long time
horizons using expert demonstrations. For instance, in sports, agents often
choose action sequences with long-term goals in mind, such as achieving a
certain strategic position. Conventional policy learning approaches, such as
those based on Markov decision processes, generally fail at learning cohesive
long-term behavior in such high-dimensional state spaces, and are only
effective when myopic modeling leads to the desired behavior. The key difficulty
is that conventional approaches are "shallow" models that only learn a single
state-action policy. We instead propose a hierarchical policy class that
automatically reasons about both long-term and short-term goals, which we
instantiate as a hierarchical neural network. We showcase our approach in a
case study on learning to imitate demonstrated basketball trajectories, and
show that it generates significantly more realistic trajectories compared to
non-hierarchical baselines, as judged by professional sports analysts.
Comment: Published in NIPS 201
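As a rough illustration of the two-level idea in this abstract, the PyTorch sketch below pairs a macro-policy that proposes a long-term goal embedding with a micro-policy that predicts the next step conditioned on that goal. The module names (HierarchicalPolicy), dimensions, and the simple concatenation of state and goal are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of a two-level hierarchical policy: a macro-policy proposes a
# long-term goal and a micro-policy predicts the next step conditioned on it.
import torch
import torch.nn as nn

class HierarchicalPolicy(nn.Module):
    def __init__(self, state_dim=2, goal_dim=8, hidden=64):
        super().__init__()
        self.macro_rnn = nn.GRU(state_dim, hidden, batch_first=True)
        self.macro_head = nn.Linear(hidden, goal_dim)          # long-term goal embedding
        self.micro_rnn = nn.GRU(state_dim + goal_dim, hidden, batch_first=True)
        self.micro_head = nn.Linear(hidden, state_dim)          # next-step displacement

    def forward(self, states):
        # states: (batch, time, state_dim) trajectory prefix, e.g. court positions
        macro_h, _ = self.macro_rnn(states)
        goal = self.macro_head(macro_h)                         # one goal per time step
        micro_h, _ = self.micro_rnn(torch.cat([states, goal], dim=-1))
        return states + self.micro_head(micro_h)                # predicted next positions

# Usage: imitation learning by regressing each predicted step onto the expert's next step.
policy = HierarchicalPolicy()
demo = torch.randn(16, 50, 2)                                   # toy expert trajectories
loss = nn.functional.mse_loss(policy(demo)[:, :-1], demo[:, 1:])
loss.backward()
```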
Multi-resolution Tensor Learning for Large-Scale Spatial Data
High-dimensional tensor models are notoriously computationally expensive to
train. We present a meta-learning algorithm, MMT, that can significantly speed
up the process for spatial tensor models. MMT leverages the property that
spatial data can be viewed at multiple resolutions, which are related by
coarsening and fine-graining from one resolution to another. Using this
property, MMT learns a tensor model by starting from a coarse resolution and
iteratively increasing the model complexity. To avoid "over-training" on
coarse-resolution models, we investigate an information-theoretic fine-graining
criterion to decide when to transition into higher-resolution models. We
provide both theoretical and empirical evidence for the advantages of this
approach. When applied to two real-world large-scale spatial datasets for
basketball player and animal behavior modeling, our approach demonstrates three key
benefits: 1) it efficiently captures higher-order interactions (i.e., tensor
latent factors), 2) it is orders of magnitude faster than fixed resolution
learning and scales to very fine-grained spatial resolutions, and 3) it
reliably yields accurate and interpretable models.
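A toy NumPy sketch of the coarse-to-fine schedule described above: fit a spatial weight grid at a coarse resolution, fine-grain (upsample) it, and continue training at the next resolution. The simple least-squares model, nearest-neighbour 2x upsampling, and fixed resolution schedule are stand-ins; the actual method learns tensor latent factors and uses an information-theoretic criterion to decide when to fine-grain.

```python
import numpy as np

def fit_at_resolution(W, features, targets, lr=0.1, steps=200):
    # features: (n_samples, H, H) spatial inputs; targets: (n_samples,)
    for _ in range(steps):
        pred = (features * W).sum(axis=(1, 2))
        grad = (features * (pred - targets)[:, None, None]).mean(axis=0)
        W -= lr * grad
    pred = (features * W).sum(axis=(1, 2))
    return W, np.mean((pred - targets) ** 2)

def fine_grain(W):
    # Double the resolution: repeat each cell, rescaled so predictions on
    # block-averaged features are preserved at the finer grid.
    return np.repeat(np.repeat(W, 2, axis=0), 2, axis=1) / 4.0

# Toy data at the finest resolution (16x16); coarser views are block averages.
rng = np.random.default_rng(0)
X_fine = rng.normal(size=(500, 16, 16))
y = X_fine[:, 4, 4] + 0.5 * X_fine[:, 10, 12]

W = np.zeros((4, 4))                                  # start coarse
for H in (4, 8, 16):
    X = X_fine.reshape(500, H, 16 // H, H, 16 // H).mean(axis=(2, 4))
    W, err = fit_at_resolution(W, X, y)
    print(f"resolution {H}x{H}: train MSE {err:.3f}")
    if H < 16:
        W = fine_grain(W)                             # transition to the next resolution
```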
PTPN22 Silencing in the NOD Model Indicates the Type 1 Diabetes–Associated Allele Is Not a Loss-of-Function Variant
PTPN22 encodes the lymphoid tyrosine phosphatase (LYP) and is the second strongest non-HLA genetic risk factor for type 1 diabetes. The PTPN22 susceptibility allele generates an LYP variant with an arginine-to-tryptophan substitution at position 620 (R620W) that has been reported by several studies to impart a gain of function. However, a recent report investigating both human cells and a knockin mouse model containing the R620W homolog suggested that this variation causes faster protein degradation. Whether LYP R620W is a gain- or loss-of-function variant, therefore, remains controversial. To address this issue, we generated transgenic nonobese diabetic (NOD) mice in which Ptpn22 can be inducibly silenced by RNA interference. We found that Ptpn22 silencing in the NOD model replicated many of the phenotypes observed in C57BL/6 Ptpn22 knockout mice, including an increase in regulatory T cells. Notably, loss of Ptpn22 led to phenotypic changes in B cells opposite to those reported for the human susceptibility allele. Furthermore, Ptpn22 knockdown did not increase the risk of autoimmune diabetes but, rather, conferred protection from disease. Overall, to our knowledge, this is the first functional study of Ptpn22 within a model of type 1 diabetes, and the data do not support a loss of function for the PTPN22 disease variant.
Long-term Forecasting using Tensor-Train RNNs
We present Tensor-Train RNN (TT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics. Long-term forecasting in such systems is highly challenging, since there exist long-term temporal dependencies, higher-order correlations, and sensitivity to error propagation. Our proposed tensor recurrent architecture addresses these issues by learning the nonlinear dynamics directly using higher-order moments and higher-order state transition functions. Furthermore, we decompose the higher-order structure using the tensor-train (TT) decomposition to reduce the number of parameters while preserving model performance. We theoretically establish the approximation properties of Tensor-Train RNNs for general sequence inputs; such guarantees are not available for standard RNNs. We also demonstrate significant long-term prediction improvements over standard RNN and LSTM architectures on a range of simulated environments with nonlinear dynamics, as well as on real-world climate and traffic data.
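To make the parameter-saving argument concrete, the NumPy sketch below stores a 3-way transition weight tensor in tensor-train (TT) format and contracts it twice against the state to produce a second-order (quadratic) pre-activation. The state size, TT rank, random cores, and tanh readout are illustrative assumptions, not the TT-RNN's exact parameterisation.

```python
import numpy as np

d, r = 32, 4                                   # state size and TT rank (assumed)
rng = np.random.default_rng(0)

# TT cores for a tensor of shape (d, d, d): G1 (1, d, r), G2 (r, d, r), G3 (r, d, 1)
G1 = rng.normal(size=(1, d, r)) / np.sqrt(d)
G2 = rng.normal(size=(r, d, r)) / np.sqrt(d)
G3 = rng.normal(size=(r, d, 1)) / np.sqrt(d)

def tt_transition(s):
    # Contract the TT cores against the state twice (the higher-order part) and keep
    # the remaining mode free as the pre-activation of the next hidden state.
    v3 = np.einsum('adb,d->ab', G3, s)         # (r, 1): contract last mode with s
    v2 = np.einsum('adb,d,bc->ac', G2, s, v3)  # (r, 1): contract middle mode with s
    out = np.einsum('adb,bc->dc', G1, v2)      # (d, 1): output mode stays free
    return np.tanh(out[:, 0])

s = rng.normal(size=d)
print(tt_transition(s).shape)                  # (32,)

dense_params = d ** 3                          # 32768 for an explicit 3-way tensor
tt_params = G1.size + G2.size + G3.size        # 128 + 512 + 128 = 768
print(dense_params, tt_params)
```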
MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning
We study how a principal can efficiently and effectively intervene on the
rewards of a previously unseen learning agent in order to induce desirable
outcomes. This is relevant to many real-world settings like auctions or
taxation, where the principal may know neither the learning behavior nor the
rewards of real people. Moreover, the principal should be few-shot adaptable
and minimize the number of interventions, because interventions are often
costly. We introduce MERMAIDE, a model-based meta-learning framework to train a
principal that can quickly adapt to out-of-distribution agents with different
learning strategies and reward functions. We validate this approach
step-by-step. First, in a Stackelberg setting with a best-response agent, we
show that meta-learning enables quick convergence to the theoretically known
Stackelberg equilibrium at test time, although noisy observations severely
increase the sample complexity. We then show that our model-based meta-learning
approach is cost-effective in intervening on bandit agents with unseen
explore-exploit strategies. Finally, we outperform baselines that use either
meta-learning or agent behavior modeling, in both 0-shot and K-shot
settings with partial agent information.
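For readers unfamiliar with the setting, the toy NumPy sketch below shows the kind of principal-agent interaction described above: an epsilon-greedy bandit agent updates its reward estimates while a principal with a limited budget adds bonuses to steer it toward a target arm. The hand-coded heuristic principal, arm values, and budget here are purely illustrative; MERMAIDE instead meta-trains a model-based principal that adapts to unseen agents.

```python
import numpy as np

rng = np.random.default_rng(0)
true_rewards = np.array([0.3, 0.7, 0.5])       # agent's reward means (hidden from the principal)
target_arm, bonus, budget = 2, 0.6, 20         # principal prefers arm 2, can pay `bonus` up to `budget` times

q = np.zeros(3)                                # agent's reward estimates
counts = np.zeros(3)
spent = 0
for t in range(500):
    # Epsilon-greedy agent: explore with probability 0.1, otherwise exploit.
    arm = rng.integers(3) if rng.random() < 0.1 else int(np.argmax(q))
    r = rng.normal(true_rewards[arm], 0.1)
    if arm == target_arm and spent < budget:   # heuristic intervention: subsidise the target arm early
        r += bonus
        spent += 1
    counts[arm] += 1
    q[arm] += (r - q[arm]) / counts[arm]       # incremental mean update

print("agent's final estimates:", np.round(q, 2))
print("fraction of pulls on target arm:", counts[target_arm] / counts.sum())
```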