Optimized Dimensionality Reduction for Moment-based Distributionally Robust Optimization
Moment-based distributionally robust optimization (DRO) provides an
optimization framework to integrate statistical information with traditional
optimization approaches. Under this framework, one assumes that the underlying
joint distribution of random parameters lies in a distributional ambiguity set
constructed by moment information and makes decisions against the worst-case
distribution within the set. Although most moment-based DRO problems can be
reformulated as semidefinite programming (SDP) problems that can be solved in
polynomial time, solving high-dimensional SDPs is still time-consuming. Unlike
existing approximation approaches that first reduce the dimensionality of
random parameters and then solve the approximated SDPs, we propose an optimized
dimensionality reduction (ODR) approach. We first show that the ranks of the
matrices in the SDP reformulations are small, by which we are then motivated to
integrate the dimensionality reduction of random parameters with the subsequent
optimization problems. Such integration enables two outer and one inner
approximations of the original problem, all of which are low-dimensional SDPs
that can be solved efficiently. More importantly, these approximations can
theoretically achieve the optimal value of the original high-dimensional SDPs.
As these approximations are nonconvex SDPs, we develop modified Alternating
Direction Method of Multipliers (ADMM) algorithms to solve them efficiently. We
demonstrate the effectiveness of our proposed ODR approach and algorithm in
solving two practical problems. Numerical results show significant advantages
of our approach in computational time and solution quality over the three
best possible benchmark approaches. Our approach can obtain an optimal or
near-optimal (mostly within 0.1%) solution and reduce the computational time by
up to three orders of magnitude.
Drying mediated orientation and assembly structure of amphiphilic Janus particles
Amphiphilic Janus particles demonstrate unique assembly structures when dried on a hydrophilic substrate. Particle orientations are influenced by amphiphilicity and Janus balance. A three-stage model is developed to describe the process. Simulation further indicates that the dominant force is capillary attraction due to interface pinning at rough Janus boundaries.
Reliable Generation of EHR Time Series via Diffusion Models
Electronic Health Records (EHRs) are rich sources of patient-level data,
including laboratory tests, medications, and diagnoses, offering valuable
resources for medical data analysis. However, concerns about privacy often
restrict access to EHRs, hindering downstream analysis. Researchers have
explored various methods for generating privacy-preserving EHR data. In this
study, we introduce a new method for generating diverse and realistic synthetic
EHR time series data using Denoising Diffusion Probabilistic Models (DDPM). We
conducted experiments on six datasets, comparing our proposed method with eight
existing methods. Our results demonstrate that our approach significantly
outperforms all existing methods in terms of data utility while requiring less
training effort. Our approach also enhances downstream medical data analysis by
providing diverse and realistic synthetic EHR data.
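For readers unfamiliar with DDPMs, the forward (noising) half of such a model can be sketched as below. This is a generic illustration of the standard closed-form DDPM forward process, not the authors' implementation: the linear noise schedule, the step count, the function names `make_alpha_bar` and `q_sample`, and the toy series shape (24 time steps, 3 channels) are all assumptions made for the example.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ N(sqrt(alpha_bar[t]) * x0, (1 - alpha_bar[t]) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()

# Hypothetical toy "EHR" series: 24 time steps, 3 measured channels.
x0 = rng.standard_normal((24, 3))

x_early, _ = q_sample(x0, 0, alpha_bar, rng)    # almost identical to x0
x_late, _ = q_sample(x0, 999, alpha_bar, rng)   # almost pure Gaussian noise
```

A denoising network would then be trained to predict `noise` from `x_t` and `t`; that part is omitted here because the abstract gives no architectural details.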
Cure the headache of Transformers via Collinear Constrained Attention
As practical applications based on Large Language Models progress rapidly,
the importance of extrapolation performance has grown sharply in the
research domain. In our study, we identified a previously overlooked anomalous
behavior in Transformer models that leads to chaos around the closest
tokens, which carry the most important information.
We've coined this discovery the "headache of Transformers". To address this at
its core, we introduced a novel self-attention structure named Collinear
Constrained Attention (CoCA). This structure can be seamlessly integrated with
existing extrapolation and interpolation methods, and with other optimization
strategies designed for traditional Transformer models. We have achieved
excellent extrapolation performance even at 16 to 24 times the sequence
length during inference, without any fine-tuning of our model. We have also
enhanced CoCA's computational and spatial efficiency to ensure its
practicality. We plan to open-source CoCA shortly. In the meantime, we've made
our code available in the appendix for reproducing the experiments.
Comment: 16 pages, 6 figures
Interaction of cellulase with three phenolic acids
The activity of cellulase against filter paper was enhanced by 28.32% and 15.17% after the addition of 0.83 mg/ml of ferulic acid and p-coumaric acid, respectively, and by 10.15% after the addition of salicylic acid at 0.67 mg/ml. The effects of three phenolic acids on the structure of cellulase were investigated via ultraviolet spectrophotometry, fluorescence spectroscopy, and circular dichroism (CD) spectroscopy. Ultraviolet spectroscopic results indicated that the peak absorbance of cellulase significantly increased and exhibited a 4–5 nm redshift after the addition of the three phenolic acids, suggesting that the phenolic acids strongly interacted with the enzyme. Fluorescence investigation of the interaction between the enzyme and the phenolic acids showed that ferulic acid and p-coumaric acid covalently reacted with the aromatic amino acid residues in cellulase, whereas salicylic acid interacted non-covalently with cellulase. CD analysis revealed that the addition of the phenolic acids significantly decreased α-helix content but increased β-sheet and random coil contents. The possible mechanism underlying the effects of these phenolic acids on cellulase activity was also discussed.
China’s carbon capture, utilization and storage (CCUS) policy: A critical review
Carbon capture, utilization and storage (CCUS) has been deemed an essential component for climate change mitigation and is conducive to enabling a low-carbon and sustainable future. Since the 12th Five-year Plan, China has included this technology as part of its future national carbon mitigation strategies. China's policy framework in relation to CCUS has had a strong influencing role in the technology's progress to date. This paper employs the “policy cycle” to analyze China's existing CCUS regulatory framework at the national and provincial level, evaluate its performance and clarify its shortcomings in light of comparisons with policy movements undertaken in other countries. The results indicate that China's CCUS policy is insufficient for further development of the technology and many issues remain to be solved. These include the lack of an enforceable legal framework, insufficient information for the operationalization of projects, weak market stimulus, and a lack of financial subsidies. These factors may be the reason we have seen low participation rates of Chinese companies in CCUS and little public understanding of what the technology offers. To overcome these challenges, suggestions are provided for improving China's CCUS legal and policy framework.
Continual Learning in Predictive Autoscaling
Predictive Autoscaling is used to forecast the workloads of servers and
prepare the resources in advance to ensure service level objectives (SLOs) in
dynamic cloud environments. However, in practice, its prediction task often
suffers from performance degradation under abnormal traffic caused by external
events (such as sales promotional activities and application
reconfigurations), for which a common solution is to re-train the model with
data of a long historical period, but at the expense of high computational and
storage costs. To better address this problem, we propose a replay-based
continual learning method, i.e., Density-based Memory Selection and Hint-based
Network Learning Model (DMSHM), using only a small part of the historical log
to achieve accurate predictions. First, we discover the phenomenon of sample
overlap when applying replay-based continual learning in prediction tasks. In
order to surmount this challenge and effectively integrate the new sample
distribution, we propose a density-based sample selection strategy that
uses kernel density estimation to calculate sample density as a reference
for computing sample weights, and employs weighted sampling to construct a new memory
set. Then we implement hint-based network learning based on hint representation
to optimize the parameters. Finally, we conduct experiments on public and
industrial datasets to demonstrate that our proposed method outperforms
state-of-the-art continual learning methods in terms of memory capacity and
prediction accuracy. Furthermore, we demonstrate the remarkable practicability of
DMSHM in real industrial applications.
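The density-based memory selection described in the abstract above can be illustrated with a small sketch. It is not the DMSHM implementation: the abstract does not say how density is turned into a weight, so the inverse-density weighting used here is an assumption, as are the fixed bandwidth, the function names `gaussian_kde_density` and `select_memory`, and the toy data.

```python
import numpy as np

def gaussian_kde_density(samples, bandwidth=0.5):
    """Gaussian kernel density estimate evaluated at every sample (rows = samples)."""
    diffs = samples[:, None, :] - samples[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)
    dim = samples.shape[1]
    norm = (2.0 * np.pi * bandwidth ** 2) ** (dim / 2.0)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1) / norm

def select_memory(samples, memory_size, rng, bandwidth=0.5):
    """Pick memory_size distinct sample indices by weighted sampling."""
    density = gaussian_kde_density(samples, bandwidth)
    # ASSUMPTION: weight is inversely proportional to density, so samples
    # from sparse regions (e.g. rare traffic patterns) are favoured.
    weights = 1.0 / density
    probs = weights / weights.sum()
    return rng.choice(len(samples), size=memory_size, replace=False, p=probs)

rng = np.random.default_rng(0)

# Toy workload features: 95 "normal" samples and 5 "abnormal" ones.
data = np.vstack([
    rng.normal(0.0, 0.1, size=(95, 2)),
    rng.normal(5.0, 0.1, size=(5, 2)),
])

density = gaussian_kde_density(data)
memory_idx = select_memory(data, memory_size=20, rng=rng)
```

With inverse-density weights, each of the five sparse samples is far more likely to enter the memory set than any single sample from the dense cluster. A different weighting rule would change which samples are retained.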
Uncertainty Quantification for Molecular Property Predictions with Graph Neural Architecture Search
Graph Neural Networks (GNNs) have emerged as a prominent class of data-driven
methods for molecular property prediction. However, a key limitation of typical
GNN models is their inability to quantify uncertainties in the predictions.
This capability is crucial for ensuring the trustworthy use and deployment of
models in downstream tasks. To that end, we introduce AutoGNNUQ, an automated
uncertainty quantification (UQ) approach for molecular property prediction.
AutoGNNUQ leverages architecture search to generate an ensemble of
high-performing GNNs, enabling the estimation of predictive uncertainties. Our
approach employs variance decomposition to separate data (aleatoric) and model
(epistemic) uncertainties, providing valuable insights for reducing them. In
our computational experiments, we demonstrate that AutoGNNUQ outperforms
existing UQ methods in terms of both prediction accuracy and UQ performance on
multiple benchmark datasets. Additionally, we utilize t-SNE visualization to
explore correlations between molecular features and uncertainty, offering
insight for dataset improvement. AutoGNNUQ has broad applicability in domains
such as drug discovery and materials science, where accurate uncertainty
quantification is crucial for decision-making.
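A common way to split ensemble uncertainty into aleatoric and epistemic parts is the law of total variance, sketched below. This is a generic illustration rather than the AutoGNNUQ code. It assumes each ensemble member outputs a predictive mean and a predictive variance per molecule. The function name `decompose_uncertainty` and the toy numbers are hypothetical.

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Split ensemble predictive variance into aleatoric and epistemic parts.

    means, variances: arrays of shape (n_models, n_samples), one row per
    ensemble member (assumed to predict a mean and a variance per sample).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    prediction = means.mean(axis=0)      # ensemble prediction
    aleatoric = variances.mean(axis=0)   # average predicted data noise
    epistemic = means.var(axis=0)        # disagreement between members
    return prediction, aleatoric, epistemic, aleatoric + epistemic

# Two hypothetical ensemble members, two molecules.
means = [[1.0, 2.0],
         [3.0, 2.0]]
variances = [[0.5, 0.1],
             [0.5, 0.3]]

prediction, aleatoric, epistemic, total = decompose_uncertainty(means, variances)
```

In this toy case the members disagree on the first molecule, so its epistemic term is large, and agree on the second, so its uncertainty is purely aleatoric.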