25 research outputs found

    Bi-level Physics-Informed Neural Networks for PDE Constrained Optimization using Broyden's Hypergradients

    Deep-learning-based approaches such as physics-informed neural networks (PINNs) and DeepONets have shown promise in solving PDE-constrained optimization (PDECO) problems. However, existing methods are insufficient to handle PDE constraints that have a complicated or nonlinear dependency on the optimization targets. In this paper, we present a novel bi-level optimization framework that resolves this challenge by decoupling the optimization of the targets and the constraints. For the inner loop, we adopt PINNs to solve the PDE constraints only. For the outer loop, we design a novel method that uses Broyden's method, based on the Implicit Function Theorem (IFT), which is efficient and accurate for approximating hypergradients. We further present theoretical explanations and an error analysis of the hypergradient computation. Extensive experiments on multiple large-scale and nonlinear PDE-constrained optimization problems demonstrate that our method achieves state-of-the-art results compared with strong baselines.
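As a rough illustration of the quasi-Newton idea named in this abstract, the sketch below implements a generic Broyden root-finder for a small system F(x) = 0. It is not the paper's hypergradient procedure: the function name `broyden_solve`, the identity initial guess for the inverse Jacobian, and the tolerances are all assumptions made for this example.

```python
import numpy as np

def broyden_solve(f, x0, tol=1e-10, max_iter=100):
    """Solve f(x) = 0 with the "good" Broyden method (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    # Assumption: start from the identity as the inverse-Jacobian estimate.
    J_inv = np.eye(x.size)
    for _ in range(max_iter):
        if np.linalg.norm(fx) < tol:
            break
        dx = -J_inv @ fx              # quasi-Newton step
        x_new = x + dx
        f_new = f(x_new)
        df = f_new - fx
        # Rank-one (Sherman-Morrison) update so that J_inv @ df == dx afterwards.
        J_inv_df = J_inv @ df
        denom = dx @ J_inv_df
        if abs(denom) > 1e-14:        # skip the update if it would be ill-defined
            J_inv += np.outer(dx - J_inv_df, dx @ J_inv) / denom
        x, fx = x_new, f_new
    return x

# Hypothetical usage: a small linear system A x = b written as a root problem.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
solution = broyden_solve(lambda x: A @ x - b, np.zeros(2))
```

The appeal of this family of methods is that the inverse Jacobian is never formed or inverted explicitly; it is refined from the steps already taken, which is what makes such updates attractive when exact second-order information is expensive.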

    Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk

    Though deep reinforcement learning (DRL) has obtained substantial success, it may encounter catastrophic failures due to the intrinsic uncertainty of both transition and observation. Most existing methods for safe reinforcement learning can handle only transition disturbance or only observation disturbance, since these two kinds of disturbance affect different parts of the agent; moreover, the popular worst-case return may lead to overly pessimistic policies. To address these issues, we first theoretically prove that the performance degradation under transition disturbance and observation disturbance depends on a novel metric, the Value Function Range (VFR), which corresponds to the gap in the value function between the best state and the worst state. Based on this analysis, we adopt conditional value-at-risk (CVaR) as an assessment of risk and propose a novel reinforcement learning algorithm, CVaR-Proximal-Policy-Optimization (CPPO), which formalizes the risk-sensitive constrained optimization problem by keeping its CVaR under a given threshold. Experimental results show that CPPO achieves a higher cumulative reward and is more robust against both observation and transition disturbances on a series of continuous control tasks in MuJoCo.
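To make the risk measure in this abstract concrete, here is a minimal sketch of an empirical CVaR estimate over sampled losses. It only illustrates the quantity itself, not the CPPO algorithm; the function name `empirical_cvar`, the choice of losses rather than returns, and the default level `alpha` are assumptions for this example.

```python
import numpy as np

def empirical_cvar(losses, alpha=0.9):
    """Return (VaR, CVaR) of sampled losses at level alpha.

    VaR is the alpha-quantile of the losses; CVaR is the mean of the
    losses at or above that quantile (the tail average).
    """
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, alpha)   # value-at-risk (linear interpolation)
    tail = losses[losses >= var]       # worst-case tail of the sample
    return var, tail.mean()

# Hypothetical usage: losses could be negated episode returns from rollouts.
sample_losses = np.arange(1.0, 11.0)   # 1, 2, ..., 10
var_80, cvar_80 = empirical_cvar(sample_losses, alpha=0.8)
```

Because CVaR averages over the whole tail rather than looking only at the single worst outcome, a constraint on it is less pessimistic than a worst-case criterion while still penalizing rare, severe failures.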

    A Unified Hard-Constraint Framework for Solving Geometrically Complex PDEs

    We present a unified hard-constraint framework for solving geometrically complex PDEs with neural networks, where the most commonly used Dirichlet, Neumann, and Robin boundary conditions (BCs) are considered. Specifically, we first introduce the "extra fields" from the mixed finite element method to reformulate the PDEs so as to equivalently transform the three types of BCs into linear forms. Based on the reformulation, we derive the general solutions of the BCs analytically, which are employed to construct an ansatz that automatically satisfies the BCs. With such a framework, we can train the neural networks without adding extra loss terms and thus efficiently handle geometrically complex PDEs, alleviating the unbalanced competition between the loss terms corresponding to the BCs and PDEs. We theoretically demonstrate that the "extra fields" can stabilize the training process. Experimental results on real-world geometrically complex PDEs showcase the effectiveness of our method compared with state-of-the-art baselines. Comment: 10 pages, 6 figures, NeurIPS 202
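The idea of an ansatz that satisfies boundary conditions by construction can be illustrated with a much simpler, textbook case than the one in this abstract: Dirichlet conditions on the interval [0, 1]. The sketch below is not the paper's "extra fields" construction; the wrapper name `hard_constrained`, the 1D domain, and the stand-in `net` function are assumptions for this example.

```python
import numpy as np

def hard_constrained(net, a, b):
    """Wrap `net` so the output satisfies u(0) = a and u(1) = b exactly.

    The linear term a*(1-x) + b*x matches the boundary values, and the
    factor x*(1-x) vanishes at both endpoints, so `net` cannot violate them.
    """
    def u(x):
        x = np.asarray(x, dtype=float)
        return a * (1.0 - x) + b * x + x * (1.0 - x) * net(x)
    return u

# Hypothetical stand-in for a trainable network: any function of x works.
net = lambda x: np.sin(5.0 * x) + 3.0
u = hard_constrained(net, a=2.0, b=-1.0)
boundary_values = u(np.array([0.0, 1.0]))
```

With a wrapper of this kind, only the PDE residual needs to appear in the training loss, which removes the need to balance a separate boundary-condition penalty against it.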

    Consistent Attack: Universal Adversarial Perturbation on Embodied Vision Navigation

    Embodied agents in vision navigation coupled with deep neural networks have attracted increasing attention. However, deep neural networks have been shown to be vulnerable to malicious adversarial noises, which may cause catastrophic failures in Embodied Vision Navigation. Among different adversarial noises, universal adversarial perturbations (UAP), i.e., a constant image-agnostic perturbation applied to every input frame of the agent, play a critical role in Embodied Vision Navigation since they are computationally efficient and practical to apply during the attack. However, existing UAP methods ignore the system dynamics of Embodied Vision Navigation and might be sub-optimal. In order to extend UAP to the sequential decision setting, we formulate the disturbed environment under the universal noise $\delta$ as a $\delta$-disturbed Markov Decision Process ($\delta$-MDP). Based on this formulation, we analyze the properties of the $\delta$-MDP and propose two novel Consistent Attack methods, named Reward UAP and Trajectory UAP, for attacking embodied agents, which consider the dynamics of the MDP and calculate universal noises by estimating the disturbed distribution and the disturbed Q function. For various victim models, our Consistent Attack can cause a significant drop in their performance in the PointGoal task in Habitat with different datasets and different scenes. Extensive experimental results indicate that there exist serious potential risks in applying Embodied Vision Navigation methods to the real world.

    Task Aware Dreamer for Task Generalization in Reinforcement Learning

    A long-standing goal of reinforcement learning is to acquire agents that can learn on training tasks and generalize well to unseen tasks that may share similar dynamics but have different reward functions. A general challenge is to quantitatively measure the similarities between these different tasks, which is vital for analyzing the task distribution and further designing algorithms with stronger generalization. To address this, we present a novel metric named Task Distribution Relevance (TDR), defined via the optimal Q functions of different tasks, to capture the relevance of the task distribution quantitatively. In the case of tasks with a high TDR, i.e., the tasks differ significantly, we show that Markovian policies cannot differentiate them, leading to poor performance. Based on this insight, we encode all historical information into policies for distinguishing different tasks and propose Task Aware Dreamer (TAD), which extends world models into our reward-informed world models to capture invariant latent features over different tasks. In TAD, we calculate the corresponding variational lower bound of the data log-likelihood, including a novel term to distinguish different tasks via states, to optimize reward-informed world models. Extensive experiments in both image-based and state-based control tasks demonstrate that TAD can significantly improve the performance of handling different tasks simultaneously, especially for those with high TDR, and show a strong generalization ability to unseen tasks.

    Lattice dynamics and heat transport in zeolitic imidazolate framework glasses

    The glassy state of zeolitic imidazolate frameworks (ZIFs) has shown great potential for energy-related applications, including solid electrolytes. However, their thermal conductivity (κ), an essential parameter influencing thermal dissipation, remains largely unexplored. In this work, using a combination of experiments, atomistic simulations, and lattice dynamics calculations, we investigate κ and the underlying heat conduction mechanism in ZIF glasses with varying ratios of imidazolate (Im) to benzimidazolate (bIm) linkers. The substitution of bIm for Im tunes the node-linker couplings but exhibits only a minor impact on the average diffusivity of low-frequency lattice modes. On the other hand, the linker substitution induces significant volume expansion, which, in turn, suppresses the contributions from lattice vibrations to κ, leading to decreased total heat conduction. Furthermore, spatial localization of internal high-frequency linker vibrations is promoted upon substitution, reducing their mode diffusivities. This is ascribed to structural deformations of the bIm units in the glasses. Our work unveils the detailed influences of linker substitution on the dual heat conduction characteristics of ZIF glasses and guides the κ regulation of related hybrid materials in practical applications.

    Neoadjuvant Immune Checkpoint Inhibitors in hepatocellular carcinoma: a meta-analysis and systematic review

    Background: Neoadjuvant immunotherapy has demonstrated beneficial outcomes in various cancer types; however, standardized protocols for neoadjuvant immunotherapy in hepatocellular carcinoma (HCC) are currently lacking. This systematic review and meta-analysis aims to investigate the reliability of neoadjuvant immunotherapy’s efficacy and safety in the context of HCC. Methods: A systematic search was conducted across PubMed (MEDLINE), EMBASE, the Web of Science, the Cochrane Library, and conference proceedings to identify clinical trials involving resectable HCC and neoadjuvant immunotherapy. Single-arm meta-analyses were employed to compute odds ratios and 95% confidence intervals (CIs). Heterogeneity analysis, data quality assessment, and subgroup analyses based on the type of immunotherapy drugs and combination therapies were performed. This meta-analysis is registered in PROSPERO (identifier CRD42023474276). Results: This meta-analysis included 255 patients from 11 studies. Among resectable HCC patients, neoadjuvant immunotherapy exhibited an overall major pathological response (MPR) rate of 0.47 (95% CI 0.31-0.70) and a pathological complete response (pCR) rate of 0.22 (95% CI 0.14-0.36). The overall objective response rate (ORR) was 0.37 (95% CI 0.20-0.69), with a grade 3-4 treatment-related adverse event (TRAE) incidence rate of 0.35 (95% CI 0.24-0.51). Furthermore, the combined surgical resection rate was 3.08 (95% CI 1.66-5.72). Subgroup analysis shows no significant differences in the efficacy and safety of different single-agent immunotherapies; the efficacy of dual ICIs (Immune Checkpoint Inhibitors) combination therapy is superior to targeted combined immunotherapy and monotherapy, while the reverse is observed in terms of safety. Discussion: Neoadjuvant immunotherapy presents beneficial outcomes in the treatment of resectable HCC. However, large-scale, high-quality experiments are warranted in the future to provide robust data support.

    Approximating Ground States by Neural Network Quantum States

    Motivated by Carleo's work [Science, 2017, 355: 602], we focus on finding the neural network quantum states approximation of the unknown ground state of a given Hamiltonian H in terms of the best relative error, and we explore the influences of the sum, tensor product, and local unitary of Hamiltonians on the best relative error. In addition, we illustrate our method with some examples.

    Understanding Adversarial Attacks on Observations in Deep Reinforcement Learning

    Deep reinforcement learning models are vulnerable to adversarial attacks that can decrease a victim's cumulative expected reward by manipulating the victim's observations. Despite the efficiency of previous optimization-based methods for generating adversarial noise in supervised learning, such methods might not be able to achieve the lowest cumulative reward since they do not explore the environmental dynamics in general. In this paper, we provide a framework to better understand the existing methods by reformulating the problem of adversarial attacks on reinforcement learning in the function space. Our reformulation generates an optimal adversary in the function space of the targeted attacks, repelling them via a generic two-stage framework. In the first stage, we train a deceptive policy by hacking the environment, and discover a set of trajectories routing to the lowest reward or the worst-case performance. Next, the adversary misleads the victim to imitate the deceptive policy by perturbing the observations. Compared to existing approaches, we theoretically show that our adversary is stronger under an appropriate noise level. Extensive experiments demonstrate our method's superiority in terms of efficiency and effectiveness, achieving state-of-the-art performance in both Atari and MuJoCo environments.

    Research on the Maturity Evaluation Model of Enterprise Safety Culture

    To regulate the safety behavior of employees and improve the occupational safety level of enterprises, this paper takes the perspective of safety culture and designs an index system based on the four dimensions of safety concept, system, behavior, and physical culture. It explores a new quantitative assessment method for the level of safety culture by introducing the concept of maturity into the evaluation of safety culture using the grey fuzzy comprehensive evaluation method. In light of the characteristics of enterprise safety culture, safety culture was divided into five levels (original, starting, development, completion, and leading), and a maturity model of enterprise safety culture was established. Finally, taking an enterprise to be evaluated as an example, the evaluation steps and the application of the evaluation results were introduced. The results showed that the evaluation model of enterprise safety culture maturity constructed in this paper provides systematic measurement indexes and scientific evaluation methods for evaluating the safety culture maturity of enterprises.