Adaptive control for time-varying systems: congelation and interconnection
This thesis investigates the adaptive control problem for systems with time-varying parameters. Two concepts are developed and exploited throughout the thesis: the congelation of variables and the notion of active nodes.
The thesis first revisits the classical adaptive schemes and explains the challenges brought by the presence of time-varying parameters. Then, the concept of congelation of variables is introduced and its use in combination with passivity-based, immersion-and-invariance, and identification-based adaptive schemes is discussed. As the congelation of variables method introduces additional interconnections in the closed-loop system, a framework for small-gain-like control synthesis for interconnected systems is needed.
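As a rough illustration of the congelation idea (the notation below is assumed for illustration, not taken from the thesis): for a scalar system with a time-varying parameter, the parameter is split into a constant value and a residual perturbation,

```latex
\dot{x} = \theta(t)\,x + u, \qquad \theta(t) = \ell_\theta + \delta_\theta(t),
```

so the adaptive law only needs to estimate the constant $\ell_\theta$, while the effect of the residual $\delta_\theta(t)$ is dominated by extra damping injected through $u$. It is this residual term that creates the additional interconnection in the closed loop and motivates a small-gain-like synthesis framework.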
To this end, the thesis proceeds by introducing the notion of active nodes. This notion is instrumental in showing that, as long as a class of node systems possessing adjustable damping parameters, namely the active nodes, satisfies certain graph-theoretic conditions, the desired small-gain-like property for the overall system can be enforced by tuning these adjustable parameters. Such conditions for interconnected systems with quadratic, nonlinear, and linearly parametrized supply rates, respectively, are elaborated from the analysis and control synthesis perspectives. The placement and the computation/adaptation of the damping parameters are also discussed.
Following the introduction of these two fundamental tools, the thesis proceeds by discussing state-feedback designs for a class of lower-triangular nonlinear systems. The backstepping technique and the congelation of variables method are combined for passivity-based, immersion-and-invariance, and identification-based schemes. The notion of active nodes is exploited to yield simple and systematic proofs.
Based on the results established for lower-triangular systems, the thesis continues to investigate output-feedback adaptive control problems. An immersion-and-invariance scheme for single-input single-output linear systems and a passivity-based scheme for nonlinear systems in observer form are proposed. The proof and interpretation of these results are also based on the notion of active nodes. The simulation results show that the adaptive control schemes proposed in the thesis have superior performance when compared with the classical schemes in the presence of time-varying parameters.
Finally, the thesis studies two applications of the theoretical results: the servo control problem for series elastic actuators, and the disease control problem for interconnected settlements. The discussions show that these problems can be solved efficiently using the framework provided by the thesis.
Open Access
Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs
Diffusion models have exhibited excellent performance in various domains. The
probability flow ordinary differential equation (ODE) of diffusion models
(i.e., diffusion ODEs) is a particular case of continuous normalizing flows
(CNFs), which enables deterministic inference and exact likelihood evaluation.
However, the likelihood estimation results by diffusion ODEs are still far from
those of the state-of-the-art likelihood-based generative models. In this work,
we propose several improved techniques for maximum likelihood estimation for
diffusion ODEs, including both training and evaluation perspectives. For
training, we propose velocity parameterization and explore variance reduction
techniques for faster convergence. We also derive an error-bounded high-order
flow matching objective for finetuning, which improves the ODE likelihood and
smooths its trajectory. For evaluation, we propose a novel training-free
truncated-normal dequantization to fill the training-evaluation gap commonly
existing in diffusion ODEs. Building upon these techniques, we achieve
state-of-the-art likelihood estimation results on image datasets (2.56 bits/dim on
CIFAR-10, 3.43/3.69 bits/dim on ImageNet-32) without variational dequantization or
data augmentation.
Comment: Accepted at ICML 2023
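The truncated-normal dequantization mentioned above is training-free: discrete pixel values are lifted to continuous ones by adding noise drawn from a normal distribution truncated so each sample stays inside its quantization bin. Below is a minimal NumPy sketch; the noise scale (a third of the half-bin width) and the rejection-sampling approach are assumptions for illustration, not the schedule-matched construction from the paper.

```python
import numpy as np

def truncated_normal_dequantize(x_int, n_bits=8, rng=None):
    """Lift integer pixel values in {0, ..., 2^n_bits - 1} to continuous
    values in (0, 1) by adding zero-mean Gaussian noise truncated to half a
    quantization bin, so every dequantized value stays inside its bin.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n_levels = 2 ** n_bits
    half_bin = 0.5 / n_levels                 # half-width of one bin in [0, 1]
    sigma = half_bin / 3.0                    # assumed scale; ~0.3% rejections
    x = (np.asarray(x_int, dtype=np.float64) + 0.5) / n_levels  # bin centers
    noise = rng.normal(0.0, sigma, size=x.shape)
    bad = np.abs(noise) >= half_bin           # resample the rare outliers
    while bad.any():
        noise[bad] = rng.normal(0.0, sigma, size=int(bad.sum()))
        bad = np.abs(noise) >= half_bin
    return x + noise
```

Because every dequantized value remains in its original bin, the map from continuous back to discrete data is lossless, which is what makes the likelihood comparison against discrete-data models well defined.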
DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics
Diffusion probabilistic models (DPMs) have exhibited excellent performance
for high-fidelity image generation while suffering from inefficient sampling.
Recent works accelerate the sampling procedure by proposing fast ODE solvers
that leverage the specific ODE form of DPMs. However, they rely heavily on a
specific parameterization during inference (such as noise or data prediction),
which might not be the optimal choice. In this work, we propose a novel
formulation towards the optimal parameterization during sampling that minimizes
the first-order discretization error of the ODE solution. Based on such
formulation, we propose DPM-Solver-v3, a new fast ODE solver for DPMs by
introducing several coefficients efficiently computed on the pretrained model,
which we call empirical model statistics. We further incorporate multistep
methods and a predictor-corrector framework, and propose some techniques for
improving sample quality at small numbers of function evaluations (NFE) or
large guidance scales. Experiments show that DPM-Solver-v3 achieves
consistently better or comparable performance in both unconditional and
conditional sampling with both pixel-space and latent-space DPMs, especially in
5~10 NFEs. We achieve FIDs of 12.21 (5 NFE), 2.51 (10 NFE) on
unconditional CIFAR10, and MSE of 0.55 (5 NFE, 7.5 guidance scale) on Stable
Diffusion, bringing a speed-up of 15%~30% compared to previous
state-of-the-art training-free methods. Code is available at
https://github.com/thu-ml/DPM-Solver-v3.
Comment: Accepted at NeurIPS 2023
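For context on what "parameterization during inference" means here: a first-order solver in the noise-prediction parameterization integrates the linear drift of the diffusion ODE exactly and treats the predicted noise as constant over the step. The sketch below is this classical DDIM-style baseline, not the DPM-Solver-v3 update; the schedule coefficients alpha and sigma at the two times are assumed given.

```python
import numpy as np

def first_order_step(x_t, eps_model, alpha_t, sigma_t, alpha_s, sigma_s):
    """One first-order step of the diffusion ODE from time t to time s in the
    noise-prediction parameterization: predict the clean sample from the noise
    estimate, then re-noise it to the next time s (DDIM-style update).
    alpha_*/sigma_* are the noise-schedule coefficients at t and s.
    """
    eps = eps_model(x_t)
    x0_pred = (x_t - sigma_t * eps) / alpha_t   # data prediction from noise
    return alpha_s * x0_pred + sigma_s * eps    # exact for constant eps
```

If `eps_model` returns the exact noise for a sample `x_t = alpha_t*x0 + sigma_t*e`, a single step lands exactly on `alpha_s*x0 + sigma_s*e`; all first-order error therefore comes from the predicted noise varying along the trajectory. Different parameterizations (noise vs. data prediction) freeze different quantities over the step and so incur different discretization error, which is the gap an optimal parameterization targets.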
Towards an Understanding of Large Language Models in Software Engineering Tasks
Large Language Models (LLMs) have drawn widespread attention and research due
to their astounding performance in tasks such as text generation and reasoning.
Derivative products, like ChatGPT, have been extensively deployed and highly
sought after. Meanwhile, the evaluation and optimization of LLMs in software
engineering tasks, such as code generation, have become a research focus.
However, there is still a lack of systematic research on the application and
evaluation of LLMs in the field of software engineering. Therefore, this paper
is the first to comprehensively investigate and collate the research and
products combining LLMs with software engineering, aiming to answer two
questions: (1) What are the current integrations of LLMs with software
engineering? (2) Can LLMs effectively handle software engineering tasks? To
find the answers, we have collected related literature as extensively as
possible from seven mainstream databases, and selected 123 papers for analysis.
We have categorized these papers in detail and reviewed the current research
status of LLMs from the perspective of seven major software engineering tasks,
hoping this will help researchers better grasp the research trends and address
the issues when applying LLMs. Meanwhile, we have also organized and presented
papers with evaluation content to reveal the performance and effectiveness of
LLMs in various software engineering tasks, providing guidance for researchers
and developers to optimize
Positional Information Matters for Invariant In-Context Learning: A Case Study of Simple Function Classes
In-context learning (ICL) refers to the ability of a model to condition on a
few in-context demonstrations (input-output examples of the underlying task) to
generate the answer for a new query input, without updating parameters. Despite
the impressive ICL ability of LLMs, it has also been found that ICL in LLMs is
sensitive to input demonstrations and limited to short context lengths. To
understand the limitations and principles for successful ICL, we conduct an
investigation with ICL linear regression of transformers. We characterize
several Out-of-Distribution (OOD) cases for ICL inspired by realistic LLM ICL
failures and compare transformers with DeepSet, a simple yet powerful
architecture for ICL. Surprisingly, DeepSet outperforms transformers across a
variety of distribution shifts, implying that preserving permutation invariance
symmetry with respect to input demonstrations is crucial for OOD ICL. This
phenomenon points to a fundamental requirement of ICL, which we term ICL
invariance. Nevertheless, the positional encodings in LLMs break ICL invariance. To
this end, we further evaluate transformers with identical positional encodings
and find that preserving ICL invariance in transformers achieves state-of-the-art
performance across various ICL distribution shifts.
Comment: Ongoing work; preliminary version
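The permutation-invariance point can be made concrete with a toy DeepSet: each demonstration is embedded independently, the embeddings are pooled by a sum, and a readout maps the pooled vector plus the query to a prediction. All weights and sizes below are arbitrary assumptions; the point is the symmetry, not the accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny DeepSet: random embedding and readout weights.
W_phi = rng.normal(size=(2, 8))   # embeds each (x, y) demonstration pair
W_rho = rng.normal(size=(9, 1))   # maps pooled embedding + query to output

def deepset_predict(demos, x_query):
    """demos: (n, 2) array of (x, y) pairs; returns a scalar prediction."""
    pooled = np.tanh(demos @ W_phi).sum(axis=0)   # order-independent pooling
    features = np.concatenate([pooled, [x_query]])
    return float((features @ W_rho)[0])

demos = rng.normal(size=(5, 2))
perm = rng.permutation(5)
# Shuffling the demonstrations leaves the prediction unchanged: the model is
# permutation invariant by construction, unlike a transformer with standard
# positional encodings, where reordering the demonstrations changes the input.
assert np.isclose(deepset_predict(demos, 0.7), deepset_predict(demos[perm], 0.7))
```

Giving every demonstration an identical positional encoding, as evaluated above, restores exactly this symmetry inside a transformer.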