25 research outputs found
Bi-level Physics-Informed Neural Networks for PDE Constrained Optimization using Broyden's Hypergradients
Deep learning based approaches like Physics-informed neural networks (PINNs)
and DeepONets have shown promise on solving PDE constrained optimization
(PDECO) problems. However, existing methods are insufficient to handle those
PDE constraints that have a complicated or nonlinear dependency on optimization
targets. In this paper, we present a novel bi-level optimization framework to
resolve the challenge by decoupling the optimization of the targets and
constraints. For the inner loop optimization, we adopt PINNs to solve the PDE
constraints only. For the outer loop, we design a novel method by using
Broyden's method based on the Implicit Function Theorem (IFT), which is
efficient and accurate for approximating hypergradients. We further present
theoretical explanations and error analysis of the hypergradients computation.
Extensive experiments on multiple large-scale and nonlinear PDE constrained
optimization problems demonstrate that our method achieves state-of-the-art
results compared with strong baselines.
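To make the IFT-based hypergradient idea concrete, here is a minimal sketch on a toy bi-level problem. It is not the paper's method: a quadratic inner objective stands in for the PINN loss, and the linear system is solved exactly with NumPy instead of being approximated with Broyden's method. The names `inner_solution`, `outer_loss`, `hypergradient`, `A`, `B`, and `target` are assumptions made for illustration.

```python
import numpy as np

def inner_solution(A, B, theta):
    """Inner problem: w*(theta) = argmin_w 0.5 w^T A w - w^T B theta, i.e. A w = B theta."""
    return np.linalg.solve(A, B @ theta)

def outer_loss(w, target):
    """Outer objective J(w) = 0.5 * ||w - target||^2 (toy choice)."""
    return 0.5 * np.sum((w - target) ** 2)

def hypergradient(A, B, theta, target):
    """dJ/dtheta through the implicit function theorem.

    With inner Hessian H = A and cross term d2L/(dw dtheta) = -B,
    dw*/dtheta = -H^{-1} (-B) = A^{-1} B, so dJ/dtheta = B^T A^{-T} dJ/dw.
    """
    w = inner_solution(A, B, theta)
    grad_w = w - target                   # dJ/dw at the inner solution
    v = np.linalg.solve(A.T, grad_w)      # exact solve; an approximate solver could go here
    return B.T @ v

# Hypothetical toy data.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.5, 1.0]])
theta = np.array([0.3, -0.7])
target = np.array([1.0, 0.5])
print(hypergradient(A, B, theta, target))
```

The only expensive step is the inverse-Hessian-vector product `v`. In a large problem that exact solve is what an approximate scheme would replace.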
Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk
Though deep reinforcement learning (DRL) has obtained substantial success, it
may encounter catastrophic failures due to the intrinsic uncertainty of both
transition and observation. Most of the existing methods for safe reinforcement
learning can only handle transition disturbance or observation disturbance
since these two kinds of disturbance affect different parts of the agent;
besides, the popular worst-case return may lead to overly pessimistic policies.
To address these issues, we first theoretically prove that the performance
degradation under transition disturbance and observation disturbance depends on
a novel metric of Value Function Range (VFR), which corresponds to the gap in
the value function between the best state and the worst state. Based on the
analysis, we adopt conditional value-at-risk (CVaR) as an assessment of risk
and propose a novel reinforcement learning algorithm of
CVaR-Proximal-Policy-Optimization (CPPO) which formalizes the risk-sensitive
constrained optimization problem by keeping its CVaR under a given threshold.
Experimental results show that CPPO achieves a higher cumulative reward and is
more robust against both observation and transition disturbances on a series of
continuous control tasks in MuJoCo.
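For readers unfamiliar with CVaR, the sketch below computes an empirical lower-tail CVaR of episode returns: the mean of the worst `alpha` fraction of outcomes. This is a generic estimator under an assumed sign convention (low return is bad), not the CPPO algorithm. The function name `empirical_cvar` and the sample returns are hypothetical.

```python
import math
import numpy as np

def empirical_cvar(returns, alpha):
    """Mean of the worst alpha-fraction of returns (lower tail)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    sorted_returns = np.sort(np.asarray(returns, dtype=float))
    k = max(1, math.ceil(alpha * len(sorted_returns)))  # number of tail samples
    return float(sorted_returns[:k].mean())

# Hypothetical episode returns.
returns = [10, 2, 8, -4, 6, 0, 12, 4, -2, 14]
print(empirical_cvar(returns, alpha=0.5))
print(empirical_cvar(returns, alpha=1.0))  # alpha = 1 reduces to the plain mean
```

A risk-sensitive constraint of the kind described above would compare such a tail statistic against a chosen threshold during policy optimization.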
A Unified Hard-Constraint Framework for Solving Geometrically Complex PDEs
We present a unified hard-constraint framework for solving geometrically
complex PDEs with neural networks, where the most commonly used Dirichlet,
Neumann, and Robin boundary conditions (BCs) are considered. Specifically, we
first introduce the "extra fields" from the mixed finite element method to
reformulate the PDEs so as to equivalently transform the three types of BCs
into linear forms. Based on the reformulation, we derive the general solutions
of the BCs analytically, which are employed to construct an ansatz that
automatically satisfies the BCs. With such a framework, we can train the neural
networks without adding extra loss terms and thus efficiently handle
geometrically complex PDEs, alleviating the unbalanced competition between the
loss terms corresponding to the BCs and PDEs. We theoretically demonstrate that
the "extra fields" can stabilize the training process. Experimental results on
real-world geometrically complex PDEs showcase the effectiveness of our method
compared with state-of-the-art baselines. Comment: 10 pages, 6 figures, NeurIPS 202
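The idea of an ansatz that satisfies boundary conditions by construction can be illustrated in one dimension. The sketch below uses the classical multiplicative ansatz for Dirichlet conditions on [0, 1], not the paper's "extra fields" reformulation. The untrained random network and the names `make_network` and `hard_constrained` are assumptions for illustration.

```python
import numpy as np

def make_network(seed=0, hidden=16):
    """A small untrained tanh network standing in for the trainable model."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(1, hidden))
    b1 = rng.normal(size=hidden)
    W2 = rng.normal(size=(hidden, 1))

    def net(x):
        x = np.asarray(x, dtype=float).reshape(-1, 1)
        return (np.tanh(x @ W1 + b1) @ W2).ravel()

    return net

def hard_constrained(net, a, b):
    """Ansatz u(x) = a(1 - x) + b x + x(1 - x) net(x), so u(0) = a and u(1) = b."""
    def u(x):
        x = np.asarray(x, dtype=float).ravel()
        return a * (1 - x) + b * x + x * (1 - x) * net(x)

    return u

u = hard_constrained(make_network(), a=2.0, b=-1.0)
print(u([0.0, 0.5, 1.0]))  # endpoints match a and b regardless of the network weights
```

Because the factor `x * (1 - x)` vanishes at both endpoints, no boundary loss term is needed and training only has to minimize the PDE residual.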
Consistent Attack: Universal Adversarial Perturbation on Embodied Vision Navigation
Embodied agents in vision navigation coupled with deep neural networks have
attracted increasing attention. However, deep neural networks have been shown
vulnerable to malicious adversarial noises, which may potentially cause
catastrophic failures in Embodied Vision Navigation. Among different
adversarial noises, universal adversarial perturbations (UAP), i.e., a constant
image-agnostic perturbation applied on every input frame of the agent, play a
critical role in Embodied Vision Navigation since they are
computation-efficient and application-practical during the attack. However,
existing UAP methods ignore the system dynamics of Embodied Vision Navigation
and might be sub-optimal. In order to extend UAP to the sequential decision
setting, we formulate the disturbed environment under the universal noise
δ as a δ-disturbed Markov Decision Process (δ-MDP). Based
on the formulation, we analyze the properties of the δ-MDP and propose two
novel Consistent Attack methods, named Reward UAP and Trajectory UAP, for
attacking Embodied agents, which consider the dynamics of the MDP and calculate
universal noises by estimating the disturbed distribution and the disturbed Q
function. For various victim models, our Consistent Attack can cause a
significant drop in their performance in the PointGoal task in Habitat with
different datasets and different scenes. Extensive experimental results
indicate that there exist serious potential risks for applying Embodied Vision
Navigation methods to the real world.
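What makes a perturbation "universal" is that one fixed noise tensor is added to every input frame. The sketch below shows only that application step, under an assumed L-infinity budget `epsilon` and pixel values in [0, 1]. It does not implement Reward UAP or Trajectory UAP, and the name `apply_uap` and the random data are hypothetical.

```python
import numpy as np

def apply_uap(frames, delta, epsilon):
    """Add one fixed perturbation to every frame, within an L-infinity budget."""
    frames = np.asarray(frames, dtype=float)
    bounded = np.clip(np.asarray(delta, dtype=float), -epsilon, epsilon)
    # Broadcasting applies the same (H, W, C) noise to each of the T frames.
    return np.clip(frames + bounded, 0.0, 1.0)

rng = np.random.default_rng(0)
frames = rng.uniform(0.2, 0.8, size=(5, 4, 4, 3))  # T x H x W x C, hypothetical
delta = rng.normal(scale=0.5, size=(4, 4, 3))      # a single image-agnostic noise
perturbed = apply_uap(frames, delta, epsilon=0.1)
print(np.max(np.abs(perturbed - frames)))
```

An attack method then differs only in how `delta` is optimized. The application at test time stays this cheap, which is why such perturbations are practical.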
Task Aware Dreamer for Task Generalization in Reinforcement Learning
A long-standing goal of reinforcement learning is to acquire agents that can
learn on training tasks and generalize well on unseen tasks that may share a
similar dynamic but with different reward functions. A general challenge is to
quantitatively measure the similarities between these different tasks, which is
vital for analyzing the task distribution and further designing algorithms with
stronger generalization. To address this, we present a novel metric named Task
Distribution Relevance (TDR) via optimal Q functions of different tasks to
capture the relevance of the task distribution quantitatively. In the case of
tasks with a high TDR, i.e., the tasks differ significantly, we show that the
Markovian policies cannot differentiate them, leading to poor performance.
Based on this insight, we encode all historical information into policies for
distinguishing different tasks and propose Task Aware Dreamer (TAD), which
extends world models into our reward-informed world models to capture invariant
latent features over different tasks. In TAD, we calculate the corresponding
variational lower bound of the data log-likelihood, including a novel term to
distinguish different tasks via states, to optimize reward-informed world
models. Extensive experiments in both image-based control tasks and state-based
control tasks demonstrate that TAD can significantly improve the performance of
handling different tasks simultaneously, especially for those with high TDR,
and demonstrate a strong generalization ability to unseen tasks.
Lattice dynamics and heat transport in zeolitic imidazolate framework glasses
The glassy state of zeolitic imidazolate frameworks (ZIFs) has shown great potential for energy-related applications, including solid electrolytes. However, their thermal conductivity (κ), an essential parameter influencing thermal dissipation, remains largely unexplored. In this work, using a combination of experiments, atomistic simulations, and lattice dynamics calculations, we investigate κ and the underlying heat conduction mechanism in ZIF glasses with varying ratios of imidazolate (Im) to benzimidazolate (bIm) linkers. The substitution of bIm for Im tunes the node-linker couplings but exhibits only a minor impact on the average diffusivity of low-frequency lattice modes. On the other hand, the linker substitution induces significant volume expansion, which, in turn, suppresses the contributions from lattice vibrations to κ, leading to decreased total heat conduction. Furthermore, spatial localization of internal high-frequency linker vibrations is promoted upon substitution, reducing their mode diffusivities. This is ascribed to structural deformations of the bIm units in the glasses. Our work unveils the detailed influences of linker substitution on the dual heat conduction characteristics of ZIF glasses and guides the κ regulation of related hybrid materials in practical applications.
Neoadjuvant Immune Checkpoint Inhibitors in hepatocellular carcinoma: a meta-analysis and systematic review
Background: Neoadjuvant immunotherapy has demonstrated beneficial outcomes in various cancer types; however, standardized protocols for neoadjuvant immunotherapy in hepatocellular carcinoma (HCC) are currently lacking. This systematic review and meta-analysis aims to investigate the reliability of neoadjuvant immunotherapy’s efficacy and safety in the context of HCC. Methods: A systematic search was conducted across PubMed (MEDLINE), EMBASE, the Web of Science, the Cochrane Library, and conference proceedings to identify clinical trials involving resectable HCC and neoadjuvant immunotherapy. Single-arm meta-analyses were employed to compute odds ratios and 95% confidence intervals (CIs). Heterogeneity analysis, data quality assessment, and subgroup analyses based on the type of immunotherapy drugs and combination therapies were performed. This meta-analysis is registered in PROSPERO (identifier CRD42023474276). Results: This meta-analysis included 255 patients from 11 studies. Among resectable HCC patients, neoadjuvant immunotherapy exhibited an overall major pathological response (MPR) rate of 0.47 (95% CI 0.31-0.70) and a pathological complete response (pCR) rate of 0.22 (95% CI 0.14-0.36). The overall objective response rate (ORR) was 0.37 (95% CI 0.20-0.69), with a grade 3-4 treatment-related adverse event (TRAE) incidence rate of 0.35 (95% CI 0.24-0.51). Furthermore, the combined surgical resection rate was 3.08 (95% CI 1.66-5.72). Subgroup analysis shows no significant differences in the efficacy and safety of different single-agent immunotherapies; the efficacy of dual ICIs (Immune Checkpoint Inhibitors) combination therapy is superior to targeted combined immunotherapy and monotherapy, while the reverse is observed in terms of safety. Discussion: Neoadjuvant immunotherapy presents beneficial outcomes in the treatment of resectable HCC. However, large-scale, high-quality experiments are warranted in the future to provide robust data support.
Approximating Ground States by Neural Network Quantum States
Motivated by Carleo’s work [Science, 2017, 355: 602], we focus on finding the neural network quantum state approximation of the unknown ground state of a given Hamiltonian H in terms of the best relative error, and we explore how sums, tensor products, and local unitary transformations of Hamiltonians influence the best relative error. In addition, we illustrate our method with some examples.
Understanding Adversarial Attacks on Observations in Deep Reinforcement Learning
Deep reinforcement learning models are vulnerable to adversarial attacks that
can decrease a victim's cumulative expected reward by manipulating the victim's
observations. Despite the efficiency of previous optimization-based methods for
generating adversarial noise in supervised learning, such methods might not be
able to achieve the lowest cumulative reward since they do not explore the
environmental dynamics in general. In this paper, we provide a framework to
better understand the existing methods by reformulating the problem of
adversarial attacks on reinforcement learning in the function space. Our
reformulation generates an optimal adversary in the function space of the
targeted attacks, repelling them via a generic two-stage framework. In the
first stage, we train a deceptive policy by hacking the environment, and
discover a set of trajectories routing to the lowest reward or the worst-case
performance. Next, the adversary misleads the victim to imitate the deceptive
policy by perturbing the observations. Compared to existing approaches, we
theoretically show that our adversary is stronger under an appropriate noise
level. Extensive experiments demonstrate our method's superiority in terms of
efficiency and effectiveness, achieving the state-of-the-art performance in
both Atari and MuJoCo environments.
Research on the Maturity Evaluation Model of Enterprise Safety Culture
To regulate the safety behavior of employees and improve the occupational safety level of enterprises, this paper took the perspective of safety culture and designed an index system based on four dimensions: safety concept, system, behavior, and physical culture. It explored a new quantitative method for assessing the level of safety culture by introducing the concept of maturity into safety culture evaluation, using the grey fuzzy comprehensive evaluation method. Reflecting the characteristics of enterprise safety culture, safety culture was divided into five levels (original level, starting level, development level, completion level, and leading level), and a maturity model of enterprise safety culture was established. Finally, taking an enterprise to be evaluated as an example, the evaluation steps and the application of the evaluation results were introduced. The results showed that the evaluation model of enterprise safety culture maturity constructed in this paper provides systematic measurement indexes and scientific evaluation methods for evaluating the safety culture maturity of enterprises.
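A plain fuzzy comprehensive evaluation with the weighted-average operator gives a feel for how such a maturity score can be computed. This sketch omits the grey component of the paper's method. The weights and membership matrix are invented for illustration and are not data from the study; only the four dimensions and five level names come from the abstract.

```python
import numpy as np

LEVELS = ["original level", "starting level", "development level",
          "completion level", "leading level"]

def fuzzy_evaluate(weights, membership):
    """Return the evaluation vector B = W . R and the level with maximum membership."""
    weights = np.asarray(weights, dtype=float)
    membership = np.asarray(membership, dtype=float)
    b = weights @ membership              # weighted-average fuzzy operator
    return b, LEVELS[int(np.argmax(b))]

# Hypothetical weights for safety concept, system, behavior, and physical culture.
weights = [0.3, 0.3, 0.2, 0.2]
# Hypothetical membership of each dimension in the five maturity levels (rows sum to 1).
membership = [
    [0.0, 0.1, 0.4, 0.4, 0.1],
    [0.0, 0.2, 0.5, 0.2, 0.1],
    [0.1, 0.3, 0.4, 0.2, 0.0],
    [0.0, 0.2, 0.3, 0.4, 0.1],
]
b, level = fuzzy_evaluate(weights, membership)
print(b, level)
```

With these made-up inputs the largest entry of the evaluation vector falls on the third level, so the maximum-membership rule assigns "development level".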