Adaptive and Robust Multi-task Learning
We study the multi-task learning problem that aims to simultaneously analyze
multiple datasets collected from different sources and learn one model for each
of them. We propose a family of adaptive methods that automatically utilize
possible similarities among those tasks while carefully handling their
differences. We derive sharp statistical guarantees for the methods and prove
their robustness against outlier tasks. Numerical experiments on synthetic and
real datasets demonstrate the efficacy of our new methods. Comment: 69 pages, 2 figures
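The idea of exploiting similarity while guarding against outlier tasks can be made concrete with a toy estimator. This is an illustrative sketch, not the paper's method: each task's local mean is shrunk toward a robust pooled center only when the two are close, and a distant (possibly outlier) task keeps its purely local estimate. The function name, the threshold rule, and the weighting scheme are all assumptions made for this example.

```python
import numpy as np

def adaptive_mtl_means(tasks, tau):
    """Toy adaptive multi-task mean estimation.

    tasks : list of (n_t, d) arrays, one dataset per task
    tau   : closeness threshold; a task whose local mean lies farther
            than tau from the pooled center is treated as a potential
            outlier task and left at its local estimate
    """
    local = [x.mean(axis=0) for x in tasks]          # per-task estimates
    pooled = np.median(np.stack(local), axis=0)      # robust pooled center
    out = []
    for m, x in zip(local, tasks):
        if np.linalg.norm(m - pooled) <= tau:
            # similar task: blend local and pooled estimates
            n = len(x)
            w = n / (n + 1.0)  # simple sample-size-based weight (illustrative)
            out.append(w * m + (1.0 - w) * pooled)
        else:
            # dissimilar task: fall back to the local estimate
            out.append(m)
    return out
```

The coordinate-wise median makes the pooled center insensitive to a small number of outlier tasks, which is the robustness property the abstract emphasizes.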
Policy evaluation from a single path: Multi-step methods, mixing and mis-specification
We study non-parametric estimation of the value function of an
infinite-horizon γ-discounted Markov reward process (MRP) using
observations from a single trajectory. We provide non-asymptotic guarantees for
a general family of kernel-based multi-step temporal difference (TD) estimates,
including canonical k-step look-ahead TD for k = 1, 2, ... and the
TD(λ) family for λ ∈ [0, 1) as special cases. Our bounds
capture their dependence on Bellman fluctuations, the mixing time of the Markov
chain, any mis-specification in the model, as well as the choice of weight
function defining the estimator itself, and reveal some delicate interactions
between mixing time and model mis-specification. For a given TD method applied
to a well-specified model, its statistical error under trajectory data is
similar to that of i.i.d. sample transition pairs, whereas under
mis-specification, temporal dependence in data inflates the statistical error.
However, any such deterioration can be mitigated by increased look-ahead. We
complement our upper bounds by proving minimax lower bounds that establish
optimality of TD-based methods with appropriately chosen look-ahead and
weighting, and reveal some fundamental differences between value function
estimation and ordinary non-parametric regression.
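What "k-step look-ahead" buys can be illustrated with a tabular analogue of the estimator (a toy sketch with an assumed interface, not the kernel-based method the paper analyzes): each update bootstraps from a state k steps ahead, using k discounted rewards along the single trajectory.

```python
import numpy as np

def k_step_td(states, rewards, num_states, k, gamma, alpha, passes=10):
    """Tabular k-step look-ahead TD on a single trajectory.

    states[t] is the state at time t; rewards[t] is the reward received
    on leaving states[t].  Larger k leans more on observed rewards and
    less on the bootstrapped value, which is the mechanism by which
    increased look-ahead mitigates mis-specification error.
    """
    V = np.zeros(num_states)
    T = len(rewards)
    for _ in range(passes):
        for t in range(T - k):
            # k-step return: k discounted rewards plus a bootstrapped tail
            G = sum(gamma**i * rewards[t + i] for i in range(k))
            G += gamma**k * V[states[t + k]]
            V[states[t]] += alpha * (G - V[states[t]])
    return V
```

On a single-state MRP with constant reward r and discount γ, the fixed point of this update is r/(1 − γ), the true value, for any choice of k.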
PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows
Point cloud upsampling aims to generate dense point clouds from given sparse
ones, which is a challenging task due to the irregular and unordered nature of
point sets. To address this issue, we present a novel deep learning-based
model, called PU-Flow, which incorporates normalizing flows and weight
prediction techniques to produce dense points uniformly distributed on the
underlying surface. Specifically, we exploit the invertible characteristics of
normalizing flows to transform points between Euclidean and latent spaces and
formulate the upsampling process as an ensemble of neighbouring points in a latent
space, where the ensemble weights are adaptively learned from local geometric
context. Extensive experiments show that our method is competitive and, in most
test cases, it outperforms state-of-the-art methods in terms of reconstruction
quality, proximity-to-surface accuracy, and computation efficiency. The source
code will be publicly available at https://github.com/unknownue/pu-flow
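The pipeline shape described above can be sketched in a few lines. This toy version replaces the learned components with stand-ins: a fixed invertible affine map plays the role of the normalizing flow, and random convex weights play the role of the adaptively learned ensemble weights. Everything here is an illustrative assumption, not PU-Flow's implementation.

```python
import numpy as np

def upsample_by_latent_ensemble(points, ratio, k=4, scale=2.0):
    """Toy latent-ensemble upsampling of an (n, 3) point cloud.

    Map points through an invertible transform, synthesize new latent
    points as convex combinations of k nearest latent neighbours, then
    invert the transform back to Euclidean space.
    """
    fwd = lambda x: x * scale           # stand-in invertible "flow"
    inv = lambda z: z / scale           # its exact inverse
    z = fwd(points)                     # latent representation
    n = len(z)
    rng = np.random.default_rng(0)
    new = []
    for _ in range(n * (ratio - 1)):
        i = rng.integers(n)
        d = np.linalg.norm(z - z[i], axis=1)
        nbr = np.argsort(d)[:k]         # k nearest latent neighbours
        w = rng.dirichlet(np.ones(k))   # convex weights (learned in PU-Flow)
        new.append(w @ z[nbr])          # ensemble of neighbours in latent space
    z_up = np.vstack([z, np.array(new)])
    return inv(z_up)
```

Because the synthesized latent points are convex combinations of existing ones, the upsampled cloud stays inside the convex hull of the input; the learned flow and weight prediction in PU-Flow are what make the output adapt to the underlying surface rather than the hull.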
Engraftment of engineered ES cell–derived cardiomyocytes but not BM cells restores contractile function to the infarcted myocardium
Cellular cardiomyoplasty is an attractive option for the treatment of severe heart failure. It is, however, still unclear and controversial which cell source is the most promising. We therefore examined the fate and functional impact of bone marrow (BM) cells and embryonic stem cell (ES cell)–derived cardiomyocytes after transplantation into the infarcted mouse heart. This proved particularly challenging for the ES cells, as their enrichment into cardiomyocytes and their long-term engraftment and tumorigenicity are still poorly understood. We generated transgenic ES cells expressing puromycin resistance and enhanced green fluorescent protein cassettes under the control of a cardiac-specific promoter. Puromycin selection resulted in a highly purified (>99%) cardiomyocyte population, and the yield of cardiomyocytes increased 6–10-fold owing to the induction of proliferation upon purification. Long-term engraftment (4–5 months) was observed when selected ES cell–derived cardiomyocytes and fibroblasts were co-transplanted into the injured hearts of syngeneic mice, and no teratoma formation was found (n = 60). Although transplantation of ES cell–derived cardiomyocytes improved heart function, BM cells had no positive effects. Furthermore, no contribution of BM cells to cardiac, endothelial, or smooth muscle neogenesis was detected. Hence, our results demonstrate that ES cell–based therapy is a promising approach for the treatment of impaired myocardial function and provides better results than BM-derived cells.
Policy Evaluation in Batch Reinforcement Learning
Policy evaluation is a central problem in batch reinforcement learning. It refers to the assessment of a given decision policy using logged data. Practical methods for policy evaluation typically involve some form of function approximation so as to relieve issues caused by the enormous scale of real-world systems and the length of planning horizons. This thesis is primarily concerned with the statistical analysis of policy evaluation and shows how function approximation improves the efficacy of these methods.
In the first part of the thesis, we consider off-policy evaluation with linear function approximation. We show the equivalence of a regression-based fitted Q-iteration method, marginalized importance sampling methods, and a model-based method that estimates a conditional mean embedding of the transition operator. Moreover, our theory reveals that the hardness of off-policy evaluation is determined by the mismatch between the data and target distributions, which is reflected by a projected chi-square divergence in the error bounds. We prove that the estimators are minimax optimal in terms of the sample size, the planning horizon, and the mismatch term.
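A minimal sketch of regression-based fitted Q-iteration with linear features may help fix ideas. The function and its interface are illustrative assumptions, not the thesis's code: each iteration regresses the one-step Bellman backup onto the feature space.

```python
import numpy as np

def linear_fqi(phi, phi_next, rewards, gamma, iters=200):
    """Regression-based fitted value iteration with linear features.

    phi      : (n, d) features of the logged states
    phi_next : (n, d) features of the successor states under the
               target policy
    rewards  : (n,) observed one-step rewards
    """
    d = phi.shape[1]
    theta = np.zeros(d)
    # precompute the least-squares projection onto the feature space
    P = np.linalg.pinv(phi.T @ phi) @ phi.T
    for _ in range(iters):
        targets = rewards + gamma * (phi_next @ theta)  # Bellman backup
        theta = P @ targets                             # regression step
    return theta
```

At its fixed point this recursion solves the projected Bellman equation, which is one way to see the equivalence with the other two estimators discussed above.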
In the second part of the thesis, we study on-policy evaluation with kernel methods. In particular, we are interested in a regularized form of the kernel least-squares temporal-difference (LSTD) estimate. We use techniques from empirical process theory to derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator, as well as the instance-dependent variance of the Bellman residual error. In addition, we prove minimax lower bounds over sub-classes of MRPs, which show that our rate is optimal in terms of the sample size and the effective horizon. Our analysis sheds light on how to tune the complexity of the function class to strike a favorable balance between the curse of dimension and the curse of horizon in policy evaluation.
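The flavour of a regularized kernel TD estimate can be sketched with a ridge-regularized Bellman-residual least-squares solve. This is one simple formulation assumed for illustration; the thesis's estimator and its analysis are more refined. By the representer theorem, the value estimate is a kernel expansion over the observed states.

```python
import numpy as np

def kernel_td(s, s_next, r, gamma, lam, kernel):
    """Ridge-regularized kernel TD estimator for scalar states.

    Solves min_alpha ||(K - gamma*Kn) alpha - r||^2 + lam ||alpha||^2,
    where K[i, j] = kernel(s_i, s_j) and Kn[i, j] = kernel(s'_i, s_j),
    and returns a callable x -> estimated value V(x).
    """
    K = kernel(s[:, None], s[None, :])         # k(s_i, s_j)
    Kn = kernel(s_next[:, None], s[None, :])   # k(s'_i, s_j)
    A = K - gamma * Kn                         # TD design matrix
    alpha = np.linalg.solve(A.T @ A + lam * np.eye(len(s)), A.T @ r)
    return lambda x: kernel(np.atleast_1d(x)[:, None], s[None, :]) @ alpha
```

The regularization weight lam plays the role of the ridge parameter whose choice, in the kernel LSTD analysis, trades off against the eigenvalue decay of the kernel operator.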
Investigating liquid perception: flood mitigation practice in the Pearl River Delta, Guangzhou
Thesis (Undergrad) -- University of Melbourne, Faculty of Architecture, Building and Planning.