
    Real-time ensemble control with reduced-order modeling

    The control of spatially distributed systems is often complicated by significant uncertainty about system inputs, both time-varying exogenous inputs and time-invariant parameters. Spatial variations of uncertain parameters can be particularly problematic in geoscience applications, making it difficult to forecast the impact of proposed controls. One of the most effective ways to deal with uncertainties in control problems is to incorporate periodic measurements of the system’s states into the control process. Stochastic control provides a convenient way to do this, by integrating uncertainty, monitoring, forecasting, and control in a consistent analytical framework. This paper describes an ensemble-based approach to closed-loop stochastic control that relies on a computationally efficient reduced-order model. The use of ensembles of uncertain parameters and states makes it possible to consider a range of probabilistic performance objectives and to derive real-time controls that explicitly account for uncertainty. The process divides naturally into measurement updating, control, and forecasting steps carried out recursively and initialized with a prior ensemble that describes parameter uncertainty. The performance of the ensemble controller is investigated here with a numerical experiment based on a solute transport control problem. This experiment evaluates the performance of open- and closed-loop controllers with full- and reduced-order models, as well as the performance obtained with a controller based on perfect knowledge of the system and the nominal performance obtained with no control. The experimental results show that a closed-loop controller that relies on measurements consistently performs better than an open-loop controller that does not. They also show that a reduced-order forecasting model based on offline simulations gives nearly the same performance as a significantly more computationally demanding full-order model. Finally, the experiment indicates that a moderate penalty on the variance of control cost yields a robust control strategy that reduces uncertainty about system performance with little or no increase in average cost. Taken together, these results confirm that reduced-order ensemble closed-loop control is a flexible and efficient control option for uncertain spatially distributed systems.
    Funder: Shell Oil Company
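The recursive measurement-update, control, and forecast cycle described in the abstract can be sketched on a toy problem. Everything below (a scalar linear system, the ensemble size, the expected-squared-state cost) is an illustrative assumption, not the paper's actual setup; the point is only to show how the three steps chain together:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scalar system x[t+1] = a * x[t] + u[t] with uncertain parameter a.
# A prior ensemble encodes parameter uncertainty; each cycle performs the
# measurement-update / control / forecast steps described in the abstract.
N = 200                                   # ensemble size
a_ens = rng.normal(0.9, 0.05, N)          # uncertain time-invariant parameter
x_ens = np.full(N, 5.0)                   # initial state, known exactly here
a_true, x_true = 0.95, 5.0
obs_std = 0.1

for t in range(10):
    # --- measurement update: ensemble Kalman update of the augmented state ---
    y = x_true + rng.normal(0.0, obs_std)
    z = np.vstack([x_ens, a_ens])                 # augmented [state; parameter]
    P = np.cov(z)                                 # 2x2 sample covariance
    K = P[:, 0] / (P[0, 0] + obs_std**2)          # gain for a direct obs of x
    perturbed = y + rng.normal(0.0, obs_std, N)   # perturbed observations
    z = z + np.outer(K, perturbed - x_ens)
    x_ens, a_ens = z[0], z[1]
    # --- control: minimize the ensemble-expected squared next state ---
    u = -np.mean(a_ens * x_ens)
    # --- forecast: propagate every member and the true system ---
    x_ens = a_ens * x_ens + u
    x_true = a_true * x_true + u

print(abs(x_true))   # the regulated state should end up near zero
```

The controller never sees `a_true`; it acts only on the ensemble, which is exactly what lets the probabilistic objective (here a simple mean) account for parameter uncertainty.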

    Reduced-order modeling for ensemble real-time estimation and control

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2012. Cataloged from the PDF version of the thesis. Includes bibliographical references. Efficient exploitation of subsurface resources requires better understanding of subsurface physical properties as well as optimization of control strategies. Advances in technology have created the possibility of providing real-time measurements of subsurface conditions. These measurements can be used to reduce uncertainty in the description of subsurface conditions, and combining uncertainty quantification with control optimization leads to improved management of subsurface resources through the closed-loop control framework. Ensemble closed-loop control uses an ensemble representation to describe the complex probability distributions of uncertain model parameters. To reduce the computational burden and make it feasible to apply ensemble closed-loop control to large-scale problems, this thesis proposes a robust reduced-order model for subsurface solute transport that is sufficiently accurate for the ensemble closed-loop process. The reduced-order model is based on a second-order expansion of the governing equations, discretized by the mixed finite element method and the upwind finite difference method. As a result, the reduced-order model can incorporate state and parameter changes explicitly, making it possible to perform dimension reduction in both the state and parameter spaces. The high-dimensional state space is reduced by proper orthogonal decomposition, which captures the key features of the states of complex systems, while the high-dimensional parameter space is reduced by the discrete cosine transform, which allows efficient and robust parameterization of physical properties. The efficiency and robustness of the reduced-order model are demonstrated in an uncertainty quantification example using the ensemble Kalman filter. It is shown that predictions by the reduced-order model are sufficiently accurate for updating uncertain model states and parameters. The channelized geological features present in the example are well preserved and captured by the reduced representations of states and parameters. A further example, which combines reduced-order modeling with ensemble closed-loop control, illustrates the possibility of performing robust control of large-scale problems under uncertainty with improved efficiency through reduced-order modeling. By Binghuai Lin. Ph.D.
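The two dimension-reduction ingredients named above, a truncated discrete cosine transform for parameter fields and proper orthogonal decomposition for dynamic states, can be illustrated on synthetic data. The field, the snapshot dynamics, and the truncation sizes below are arbitrary toy choices, not the thesis's test case:

```python
import numpy as np

def dct_matrix(n):
    # orthonormal DCT-II matrix; rows are cosine basis vectors
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    C[0] /= np.sqrt(2.0)
    return C

# --- parameter reduction: keep only low-frequency DCT coefficients ---
nx = 32
xg, yg = np.meshgrid(np.linspace(0, 1, nx), np.linspace(0, 1, nx))
field = np.sin(2 * np.pi * xg) * np.cos(np.pi * yg)   # smooth parameter field
C = dct_matrix(nx)
coeffs = C @ field @ C.T                              # 2-D DCT
trunc = np.zeros_like(coeffs)
trunc[:8, :8] = coeffs[:8, :8]                        # keep 64 of 1024 coefficients
approx = C.T @ trunc @ C                              # inverse 2-D DCT
rel_err_dct = np.linalg.norm(field - approx) / np.linalg.norm(field)

# --- state reduction: POD basis from stacked state snapshots ---
snapshots = np.column_stack(
    [np.sin(2 * np.pi * (xg + 0.05 * t)).ravel() for t in range(20)]
)
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
basis = U[:, :5]                                      # leading left singular vectors
state = snapshots[:, 10]
recon = basis @ (basis.T @ state)                     # project, then lift back
rel_err_pod = np.linalg.norm(state - recon) / np.linalg.norm(state)

print(rel_err_dct, rel_err_pod)
```

Because the field is smooth and the snapshot set is low-rank, both reductions reproduce their inputs closely with a small fraction of the original dimensions, which is the property the thesis exploits.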

    Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation

    Continual relation extraction (CRE) aims to continually learn new relations from a class-incremental data stream. CRE models usually suffer from the catastrophic forgetting problem, i.e., the performance on old relations seriously degrades when the model learns new relations. Most previous work attributes catastrophic forgetting to the corruption of the learned representations as new relations arrive, with an implicit assumption that the CRE models have adequately learned the old relations. In this paper, through empirical studies we argue that this assumption may not hold, and that an important reason for catastrophic forgetting is that the learned representations are not robust to the appearance of analogous relations in the subsequent learning process. To address this issue, we encourage the model to learn more precise and robust representations through a simple yet effective adversarial class augmentation mechanism (ACA), which is easy to implement and model-agnostic. Experimental results show that ACA can consistently improve the performance of state-of-the-art CRE models on two popular benchmarks. Comment: Accepted by EMNLP 202

    Making Large Language Models Better Reasoners with Alignment

    Reasoning is a cognitive process of using evidence to reach a sound conclusion. Reasoning capability is essential for large language models (LLMs) to serve as the brain of an artificial general intelligence agent. Recent studies reveal that fine-tuning LLMs on data with chain-of-thought (COT) reasoning processes can significantly enhance their reasoning capabilities. However, we find that the fine-tuned LLMs suffer from an Assessment Misalignment problem, i.e., they frequently assign higher scores to subpar COTs, leading to potential limitations in their reasoning abilities. To address this problem, we introduce an Alignment Fine-Tuning (AFT) paradigm, which involves three steps: 1) fine-tuning LLMs with COT training data; 2) generating multiple COT responses for each question and categorizing them as positive or negative based on whether they reach the correct answer; 3) calibrating the scores that LLMs assign to positive and negative responses with a novel constraint alignment loss. Specifically, the constraint alignment loss has two objectives: a) alignment, which guarantees that positive scores surpass negative scores to encourage answers with high-quality COTs; b) constraint, which keeps the negative scores confined to a reasonable range to prevent model degradation. Beyond binary positive and negative feedback, the constraint alignment loss can be seamlessly adapted to ranking situations when ranking feedback is available. Furthermore, we delve into recent ranking-based alignment methods, such as DPO, RRHF, and PRO, and discover that the constraint, which has been overlooked by these approaches, is also crucial for their performance. Extensive experiments on four reasoning benchmarks with both binary and ranking feedback demonstrate the effectiveness of AFT. Comment: Large Language Models; Reasoning; Alignment
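The two objectives of the constraint alignment loss can be sketched as a pairwise margin term plus a floor penalty on negative scores. This is a reconstruction from the abstract only; the paper's exact formulation is not given here, so the function name and the `margin` and `floor` hyperparameters are illustrative assumptions:

```python
import numpy as np

def constraint_alignment_loss(pos, neg, margin=1.0, floor=-2.0):
    """Sketch of the two objectives described in the abstract.
    `margin` and `floor` are illustrative, not values from the paper."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    # alignment: every positive score should beat every negative by `margin`
    align = np.maximum(0.0, margin - (pos[:, None] - neg[None, :])).mean()
    # constraint: negative scores must not collapse below `floor`
    constraint = np.maximum(0.0, floor - neg).mean()
    return align + constraint

# positives already clear the margin and negatives sit above the floor,
# so both terms vanish here
print(constraint_alignment_loss([2.0, 1.5], [0.2, -0.5]))
```

The second term is what distinguishes this from a plain ranking loss: it stops the optimizer from improving the margin by driving negative scores toward minus infinity, the degradation mode the abstract warns about.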

    Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization

    Pretrained language models have achieved remarkable success in natural language understanding. However, fine-tuning pretrained models on limited training data tends to overfit and thus diminish performance. This paper presents Bi-Drop, a fine-tuning strategy that selectively updates model parameters using gradients from various sub-nets dynamically generated by dropout. The sub-net estimation of Bi-Drop is performed in an in-batch manner, so it overcomes the hysteresis in sub-net updating that affects previous methods, which perform asynchronous sub-net estimation. In addition, Bi-Drop needs only one mini-batch to estimate the sub-net, so it makes better use of the training data. Experiments on the GLUE benchmark demonstrate that Bi-Drop consistently outperforms previous fine-tuning methods. Furthermore, empirical results show that Bi-Drop exhibits excellent generalization ability and robustness under domain transfer, data imbalance, and low-resource scenarios. Comment: EMNLP 2023 Findings. Camera-ready version. Co-first authors with equal contributions.
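The in-batch idea (estimate several dropout sub-nets from the same mini-batch, then update parameters selectively) can be sketched on a toy linear model. The sign-agreement rule below is a hypothetical stand-in for Bi-Drop's actual selection criterion, which the abstract does not specify:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy linear model trained on a single mini-batch.  Each dropout mask
# defines a "sub-net"; two sub-nets are estimated from the SAME batch,
# and only parameters whose sub-net gradients agree in sign are updated
# (hypothetical selection rule, for illustration only).
X = rng.normal(size=(16, 8))
y = X @ np.ones(8) + 0.1 * rng.normal(size=16)
w = np.zeros(8)

def subnet_grad(w, keep_prob=0.8):
    # an inverted-dropout mask over the parameters defines the sub-net
    mask = (rng.random(w.shape) < keep_prob) / keep_prob
    resid = X @ (w * mask) - y
    return (X.T @ resid) * mask / len(y)

for _ in range(200):
    g1, g2 = subnet_grad(w), subnet_grad(w)   # two sub-nets, one mini-batch
    agree = np.sign(g1) == np.sign(g2)        # selective parameter update
    w -= 0.1 * np.where(agree, 0.5 * (g1 + g2), 0.0)

print(np.mean((X @ w - y) ** 2))   # training loss after selective updates
```

Because both sub-nets are drawn from the current batch, the selection never lags behind the parameters, which is the hysteresis problem the abstract attributes to asynchronous estimation.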

    Soft Language Clustering for Multilingual Model Pre-training

    Multilingual pre-trained language models have demonstrated impressive (zero-shot) cross-lingual transfer abilities; however, their performance is hindered when the target language is typologically distant from the source languages or when pre-training data is limited in size. In this paper, we propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally. Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods. On the XTREME tasks, including text classification, sequence labeling, question answering, and sentence retrieval, both base- and large-size language models pre-trained with our proposed method exhibit consistent performance improvements. Furthermore, the method provides substantial advantages for low-resource languages in unsupervised sentence retrieval and for target languages that differ greatly from the source language in cross-lingual transfer.

    Large Language Models are not Fair Evaluators

    We uncover a systematic bias in the evaluation paradigm of adopting large language models (LLMs), e.g., GPT-4, as referees to score the quality of responses generated by candidate models. We find that the quality ranking of candidate responses can be easily hacked by simply altering their order of appearance in the context. This manipulation allows us to skew the evaluation result, making one model appear considerably superior to the other; e.g., Vicuna could beat ChatGPT on 66 of 80 tested queries. To address this issue, we propose two simple yet effective calibration strategies: 1) Multiple Evidence Calibration, which requires the evaluator model to generate multiple detailed pieces of evidence before assigning ratings; 2) Balanced Position Calibration, which aggregates results across the various orders to determine the final score. Extensive experiments demonstrate that our approach successfully mitigates evaluation bias, resulting in closer alignment with human judgments. To facilitate future research on more robust large language model comparison, we integrate the techniques from the paper into an easy-to-use toolkit, FairEval, along with the human annotations (https://github.com/i-Eval/FairEval). Comment: work in progress.
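Balanced Position Calibration is easy to demonstrate with a mock judge. The `biased_judge` below is a stand-in for an LLM evaluator with a position bias (the quality values and bias bonus are invented for illustration); averaging the scores over both presentation orders cancels the bias exactly when it is position-only:

```python
def biased_judge(first, second):
    """Stand-in for an LLM judge with position bias: it inflates whichever
    response is shown first.  Qualities and bias are hypothetical numbers."""
    quality = {"resp_a": 7.0, "resp_b": 8.0}
    bias = 1.5                                   # bonus for appearing first
    return quality[first] + bias, quality[second]

def balanced_position_calibration(a, b):
    # score each candidate in both orders and average, as described above
    sa1, sb1 = biased_judge(a, b)
    sb2, sa2 = biased_judge(b, a)
    return (sa1 + sa2) / 2, (sb1 + sb2) / 2

sa, sb = balanced_position_calibration("resp_a", "resp_b")
print(sa, sb)   # the position bonus cancels; the truly better response wins
```

A single-order query with `resp_a` first would rank the weaker response higher (7.0 + 1.5 beats 8.0), which is exactly the order-hacking effect the abstract describes.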

    Efficient characterization of uncertain model parameters with a reduced-order ensemble Kalman filter

    Spatially variable model parameters are often highly uncertain and difficult to observe. This has prompted the widespread use of Bayesian characterization methods that can infer parameter values from measurements of related variables, while explicitly accounting for uncertainty. Ensemble versions of Bayesian characterization are particularly convenient when uncertain variables have complex spatial structures that do not conform to Gaussian descriptions. However, ensemble methods can be time-consuming for high-dimensional problems. This paper describes a reduced-order approach to ensemble characterization that is particularly well suited for subsurface flow and transport problems. It uses a truncated discrete cosine transform (DCT) to reduce the dimensionality of spatially variable time-invariant model parameters and a nonlinear extension of proper orthogonal decomposition (POD) to reduce the dimensionality of dynamic model states. The resulting nonlinear reduced-order model can be included in the forecast step of a reduced-order ensemble Kalman filter. These concepts are illustrated in a subsurface solute transport problem using ensembles produced by full- and reduced-order models. These ensembles are very similar when there are no measurement updates. When the forecast ensemble is recursively updated with measurements, the reduced-order Kalman filter does at least as well as the full-order filter in characterizing a dynamic solute plume, even though its augmented state dimension is only 2% of the dimension of the full-order state. This substantial increase in efficiency implies that a reduced-order filter with the same ensemble size as its full-order counterpart can give comparable performance for orders of magnitude less computational effort, or it can use a much larger ensemble for the same computational effort. The possibility of substantial increases in ensemble size could lead to performance improvements through reductions in sampling error and in the rank of the ensemble null space. A reduced-order model similar to the one described here could also be used in ensemble real-time control applications, where it can decrease the effort required for both characterization and control.
    Funder: Shell Oil Company
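The payoff of working in the reduced space is that the ensemble Kalman analysis manipulates tiny covariance matrices. The sketch below runs a perturbed-observation update on a toy reduced augmented state; the dimensions, operators, and data are all invented for illustration, not taken from the paper's experiment:

```python
import numpy as np

rng = np.random.default_rng(3)

# Ensemble Kalman analysis on a reduced augmented state z (think: POD state
# coefficients stacked with DCT parameter coefficients).  Because the update
# runs in the reduced space, the sample covariance is a small 6x6 matrix
# rather than a full-order one.
N, n_z, n_obs = 100, 6, 3
z_ens = rng.normal(size=(n_z, N))                 # forecast ensemble (reduced)
H = rng.normal(size=(n_obs, n_z))                 # reduced observation operator
R = 0.1 * np.eye(n_obs)                           # observation error covariance
y = H @ rng.normal(size=n_z)                      # synthetic observation

spread_before = np.trace(np.cov(z_ens))

# perturbed-observation ensemble Kalman update
A = z_ens - z_ens.mean(axis=1, keepdims=True)     # ensemble anomalies
P = A @ A.T / (N - 1)                             # 6x6 sample covariance
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)      # Kalman gain
Y = y[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, size=N).T
z_ens = z_ens + K @ (Y - H @ z_ens)

spread_after = np.trace(np.cov(z_ens))
print(spread_before, spread_after)
```

Every cost in the update scales with the reduced dimension, which is why the abstract's 2% augmented state translates into orders-of-magnitude savings, or equivalently into room for a much larger ensemble.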
