63 research outputs found

    Adaptive and Robust Multi-task Learning

    Full text link
    We study the multi-task learning problem that aims to simultaneously analyze multiple datasets collected from different sources and learn one model for each of them. We propose a family of adaptive methods that automatically utilize possible similarities among those tasks while carefully handling their differences. We derive sharp statistical guarantees for the methods and prove their robustness against outlier tasks. Numerical experiments on synthetic and real datasets demonstrate the efficacy of our new methods. Comment: 69 pages, 2 figures
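    A minimal sketch of the shrink-toward-a-shared-center idea the abstract describes — this is a generic illustration, not the paper's actual estimator: each task gets its own linear model, penalized toward a common center so that similar tasks borrow strength from each other. The function name and the alternating ridge scheme are assumptions for illustration.

```python
import numpy as np

def adaptive_multitask_ridge(tasks, lam=1.0, n_iters=50):
    """Jointly fit one linear model per task, shrinking each task's
    coefficients toward a shared center (illustrative sketch only).

    tasks: list of (X, y) pairs, one per data source.
    lam:   strength of the similarity penalty ||beta_t - center||^2.
    """
    d = tasks[0][0].shape[1]
    betas = [np.zeros(d) for _ in tasks]
    center = np.zeros(d)
    for _ in range(n_iters):
        # Per-task ridge regression, regularized toward the current center.
        for t, (X, y) in enumerate(tasks):
            A = X.T @ X + lam * np.eye(d)
            betas[t] = np.linalg.solve(A, X.T @ y + lam * center)
        # Update the shared center as the mean of the task coefficients.
        center = np.mean(betas, axis=0)
    return betas, center
```

    With a large `lam` all tasks collapse to one pooled model; with `lam = 0` each task is fit independently. The paper's adaptive methods choose this trade-off automatically and guard against outlier tasks, which this uniform-mean sketch does not.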

    Policy evaluation from a single path: Multi-step methods, mixing and mis-specification

    Full text link
    We study non-parametric estimation of the value function of an infinite-horizon γ-discounted Markov reward process (MRP) using observations from a single trajectory. We provide non-asymptotic guarantees for a general family of kernel-based multi-step temporal difference (TD) estimates, including canonical K-step look-ahead TD for K = 1, 2, … and the TD(λ) family for λ ∈ [0, 1) as special cases. Our bounds capture the estimator's dependence on Bellman fluctuations, the mixing time of the Markov chain, any mis-specification in the model, as well as the choice of weight function defining the estimator itself, and reveal some delicate interactions between mixing time and model mis-specification. For a given TD method applied to a well-specified model, its statistical error under trajectory data is similar to that of i.i.d. sample transition pairs, whereas under mis-specification, temporal dependence in the data inflates the statistical error. However, any such deterioration can be mitigated by increased look-ahead. We complement our upper bounds by proving minimax lower bounds that establish the optimality of TD-based methods with appropriately chosen look-ahead and weighting, and reveal some fundamental differences between value function estimation and ordinary non-parametric regression.
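    The K-step look-ahead TD update mentioned above can be sketched in the tabular case — a finite-state stand-in for the paper's kernel-based estimator, with function and parameter names chosen here for illustration. The target for state s_t sums K discounted rewards along the single trajectory and then bootstraps with the current value estimate at s_{t+K}:

```python
import numpy as np

def k_step_td(states, rewards, n_states, K=3, gamma=0.9,
              alpha=0.1, n_sweeps=200):
    """Tabular K-step look-ahead TD from a single trajectory.

    states:  visited states s_0, ..., s_T (length T + 1).
    rewards: observed rewards r_0, ..., r_{T-1} (length T).
    """
    V = np.zeros(n_states)
    T = len(rewards)
    for _ in range(n_sweeps):
        for t in range(T):
            k = min(K, T - t)  # truncate look-ahead at trajectory end
            # K-step target: discounted rewards plus a bootstrapped
            # value at the state reached k steps ahead.
            G = sum(gamma**i * rewards[t + i] for i in range(k))
            if t + k < len(states):
                G += gamma**k * V[states[t + k]]
            V[states[t]] += alpha * (G - V[states[t]])
    return V
```

    Larger K leans more on observed rewards and less on the bootstrap, which is the mechanism behind the abstract's claim that increased look-ahead mitigates the error inflation caused by temporal dependence under mis-specification.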

    PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows

    Full text link
    Point cloud upsampling aims to generate dense point clouds from given sparse ones, which is a challenging task due to the irregular and unordered nature of point sets. To address this issue, we present a novel deep learning-based model, called PU-Flow, which incorporates normalizing flows and weight prediction techniques to produce dense points uniformly distributed on the underlying surface. Specifically, we exploit the invertible characteristics of normalizing flows to transform points between Euclidean and latent spaces and formulate the upsampling process as an ensemble of neighbouring points in a latent space, where the ensemble weights are adaptively learned from local geometric context. Extensive experiments show that our method is competitive and, in most test cases, it outperforms state-of-the-art methods in terms of reconstruction quality, proximity-to-surface accuracy, and computational efficiency. The source code will be publicly available at https://github.com/unknownue/pu-flow
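    A toy sketch of the upsample-as-neighbour-ensemble idea, with the learned normalizing flow replaced by the identity map and the learned ensemble weights replaced by uniform averaging — purely illustrative, nothing here reflects PU-Flow's actual architecture:

```python
import numpy as np

def upsample_points(points, ratio=2, k=3):
    """Densify a point set by interpolating each point with an
    ensemble of its nearest neighbours (illustrative sketch only;
    PU-Flow performs this ensembling in a learned latent space)."""
    out = [points]
    for _ in range(1, ratio):
        new_pts = []
        for p in points:
            # Find the k nearest neighbours of p (excluding p itself).
            d = np.linalg.norm(points - p, axis=1)
            idx = np.argsort(d)[1:k + 1]
            # Ensemble step: move p halfway toward its neighbour centroid.
            new_pts.append((p + points[idx].mean(axis=0)) / 2)
        out.append(np.array(new_pts))
    return np.concatenate(out, axis=0)
```

    In PU-Flow the interpolation happens in the flow's latent space with weights predicted from local geometry, so the inverse flow maps the ensembled codes back onto the underlying surface rather than into its convex hull as this Euclidean sketch does.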

    Engraftment of engineered ES cell–derived cardiomyocytes but not BM cells restores contractile function to the infarcted myocardium

    Get PDF
    Cellular cardiomyoplasty is an attractive option for the treatment of severe heart failure. It is, however, still unclear and controversial which is the most promising cell source. Therefore, we investigated the fate and functional impact of bone marrow (BM) cells and embryonic stem cell (ES cell)–derived cardiomyocytes after transplantation into the infarcted mouse heart. This proved particularly challenging for the ES cells, as their enrichment into cardiomyocytes and their long-term engraftment and tumorigenicity are still poorly understood. We generated transgenic ES cells expressing puromycin resistance and enhanced green fluorescent protein cassettes under control of a cardiac-specific promoter. Puromycin selection resulted in a highly purified (>99%) cardiomyocyte population, and the yield of cardiomyocytes increased 6–10-fold because of induction of proliferation upon purification. Long-term engraftment (4–5 months) was observed when co-transplanting selected ES cell–derived cardiomyocytes and fibroblasts into the injured heart of syngeneic mice, and no teratoma formation was found (n = 60). Although transplantation of ES cell–derived cardiomyocytes improved heart function, BM cells had no positive effects. Furthermore, no contribution of BM cells to cardiac, endothelial, or smooth muscle neogenesis was detected. Hence, our results demonstrate that ES-based cell therapy is a promising approach for the treatment of impaired myocardial function and provides better results than BM-derived cells.

    Policy Evaluation in Batch Reinforcement Learning

    No full text
    Policy evaluation is a central problem in batch reinforcement learning. It refers to the assessment of a given decision policy using logged data. Practical methods for policy evaluation typically involve some form of function approximation so as to relieve issues caused by the enormous scales of real-world systems and the length of planning horizons. The thesis is primarily concerned with statistical analysis of policy evaluation and shows how function approximation improves the efficacy of methods. In the first part of the thesis, we consider off-policy evaluation with linear function approximation. We show the equivalence of a regression-based fitted Q-iteration method, marginalized importance sampling methods, and a model-based method that estimates a conditional mean embedding of the transition operator. Moreover, our theory reveals that the hardness of off-policy evaluation is determined by the mismatch between data and target distributions, which is reflected by a projected chi-square-divergence in error bounds. We prove that the estimators are minimax optimal in terms of sample size, planning horizon and the mismatch term. In the second part of the thesis, we study on-policy evaluation with kernel methods. In particular, we are interested in a regularized form of the kernel least-squares temporal-difference (LSTD) estimate. We use empirical process theory techniques to derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator, as well as the instance-dependent variance of the Bellman residual error. In addition, we prove minimax lower bounds over sub-classes of MRPs, which show that our rate is optimal in terms of the sample size and the effective horizon. Our analysis sheds light on how we could tune the complexity of the function class to favorably strike a balance between the curse of dimension and the curse of horizon in policy evaluation.
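    The regularized LSTD estimate mentioned in the second part can be sketched in explicit finite-dimensional features — a stand-in for the kernel version, with the function name and ridge formulation chosen here for illustration. Given transitions with features φ(s_t), next-state features φ(s_{t+1}), and rewards r_t, it solves the regularized projected Bellman equation:

```python
import numpy as np

def regularized_lstd(phi, phi_next, rewards, gamma=0.9, lam=1e-3):
    """Ridge-regularized LSTD with explicit features (a finite-
    dimensional sketch of the kernel LSTD estimator).

    Solves (Phi^T (Phi - gamma * Phi') + lam * I) theta = Phi^T r,
    so that V(s) is approximated by phi(s) @ theta.
    """
    d = phi.shape[1]
    A = phi.T @ (phi - gamma * phi_next) + lam * np.eye(d)
    b = phi.T @ rewards
    return np.linalg.solve(A, b)
```

    The regularization weight `lam` plays the role of the thesis's complexity tuning: it trades estimation variance (curse of dimension) against the bias accumulated over the effective horizon 1/(1 − γ).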

    Investigate liquid perception practice of flood mitigation in Pearl River Delta, Guangzhou

    No full text
    Thesis (Undergrad) -- University of Melbourne, Faculty of Architecture, Building and Planning