1,797 research outputs found

    Defense Against Reward Poisoning Attacks in Reinforcement Learning

    We study defense strategies against reward poisoning attacks in reinforcement learning. As a threat model, we consider attacks that minimally alter rewards to make the attacker's target policy uniquely optimal under the poisoned rewards, with the optimality gap specified by an attack parameter. Our goal is to design agents that are robust against such attacks in terms of the worst-case utility with respect to the true, unpoisoned rewards, while computing their policies under the poisoned rewards. We propose an optimization framework for deriving optimal defense policies, both when the attack parameter is known and when it is unknown. Moreover, we show that defense policies that solve the proposed optimization problems have provable performance guarantees. In particular, we provide the following bounds with respect to the true, unpoisoned rewards: a) lower bounds on the expected return of the defense policies, and b) upper bounds on how suboptimal these defense policies are compared to the attacker's target policy. We conclude the paper by illustrating the intuitions behind our formal results and showing that the derived bounds are non-trivial.

    Nonlinear dynamics of the interface of dielectric liquids in a strong electric field: Reduced equations of motion

    The evolution of the interface between two ideal dielectric liquids in a strong vertical electric field is studied. It is found that a particular flow regime, for which the velocity potential and the electric field potential are linearly dependent functions, is possible if the ratio of the permittivities of the liquids is inversely proportional to the ratio of their densities. The corresponding reduced equations for interface motion are derived. In the limit of small density ratio, these equations coincide with the well-known equations describing Laplacian growth. Comment: 10 pages

    Admissible Policy Teaching through Reward Design

    We study reward design strategies for incentivizing a reinforcement learning agent to adopt a policy from a set of admissible policies. The goal of the reward designer is to modify the underlying reward function cost-efficiently while ensuring that any approximately optimal deterministic policy under the new reward function is admissible and performs well under the original reward function. This problem can be viewed as a dual to the problem of optimal reward poisoning attacks: instead of forcing an agent to adopt a specific policy, the reward designer incentivizes an agent to avoid taking actions that are inadmissible in certain states. Perhaps surprisingly, and in contrast to the problem of optimal reward poisoning attacks, we first show that the reward design problem for admissible policy teaching is computationally challenging: it is NP-hard to find an approximately optimal reward modification. We then formulate a surrogate problem whose optimal solution approximates the optimal solution to the reward design problem in our setting but is more amenable to optimization techniques and analysis. For this surrogate problem, we present characterization results that provide bounds on the value of the optimal solution. Finally, we design a local search algorithm to solve the surrogate problem and showcase its utility using simulation-based experiments.

    Potentially fatal tricuspid valve aspergilloma detected after laparoscopic abdominal surgery

    Fungal endocarditis accounts for 1.3-6% of all cases of infective endocarditis. The most common causative organism is Candida, followed by Aspergillus and other mould fungi. Aspergillus endocarditis is usually associated with high morbidity and mortality. Establishing a definitive and timely diagnosis remains difficult, and there are many reports of undetected aspergillomas leading to fatalities in the perioperative period. We present a case report of a preoperatively undiagnosed large mobile tricuspid valve aspergilloma obstructing the right ventricular inlet, diagnosed incidentally on the second postoperative day after laparoscopic pancreatic abscess drainage. The patient was successfully managed with emergency open-heart surgery and systemic antifungal agents in the postoperative period. Keywords: Infective endocarditis, aspergilloma, tricuspid valve

    Curie temperature engineering in a novel 2D analog of iron ore (hematene) via strain

    As a newly exfoliated magnetic 2D material from hematite, hematene is the most far-reaching ultrathin magnetic indirect bandgap semiconductor. We have carried out a detailed structural analysis of hematene under applied strain by means of first-principles calculations based on density functional theory (DFT). Hematene in its pristine form turns out to be a magnetic semiconductor with a bandgap of 1.0/2.0 eV for the majority/minority spin channel. The dependence of the magnetic anisotropy energy (MAE), the Curie temperature (TC), and the bandgap on compressive and tensile strains has been examined. TC depends strongly on compressive strain, increasing by up to 21.1% at a compressive strain of 6%, whereas it decreases significantly under tensile strain. The MAE is negatively correlated with the tensile and compressive strain. The MAE for all compressive strain cases is larger than that of pristine hematene. These results indicate that the studied 2D hematene has broad application prospects in spintronics, memory-based devices, and valleytronics.

    Extreme timescale core-level spectroscopy with tailored XUV pulses

    A new approach for few-femtosecond time-resolved photoelectron spectroscopy in condensed matter that balances the combined needs for both temporal and energy resolution is demonstrated. Here, the method is designed to investigate a prototypical Mott insulator, tantalum disulphide (1T-TaS2), which transforms from its charge-density-wave ordered Mott insulating state to a conducting state in a matter of femtoseconds. The signature to be observed through the phase transition is a charge-density-wave induced splitting of the Ta 4f core-levels, which can be resolved with sub-eV spectral resolution. Combining this spectral resolution with few-femtosecond time resolution enables the collapse of the charge ordered Mott state to be clocked. Precise knowledge of the sub-20-femtosecond dynamics will provide new insight into the physical mechanism behind the collapse and may reveal Mott physics on the timescale of electronic hopping. Comment: 20 pages, 6 figures

    Risk factors for early mortality in patients with pulmonary tuberculosis admitted to the emergency room.

    Background and objectives: Mortality of patients with pulmonary tuberculosis (TB) admitted to emergency departments is high. This study aimed to analyse the risk factors associated with early mortality and to design a risk score based on simple parameters. Methods: This prospective case-control study enrolled patients admitted to the emergency department of a referral TB hospital. Clinical, radiological, biochemical and microbiological risk factors associated with death were compared between patients dying within one week of admission (cases) and those surviving (controls). Results: Forty-nine of 250 patients (19.6%) experienced early mortality. Multiple logistic regression analysis showed that oxygen saturation (SaO2) ≤90%, severe malnutrition, tachypnoea, tachycardia, hypotension, advanced disease at chest radiography, severe anaemia, hyponatremia, hypoproteinemia and hypercapnia were independently and significantly associated with early mortality. A clinical scoring system was further designed to stratify the risk of death by selecting five simple parameters (SpO2 ≤ 90%, tachypnoea, hypotension, advanced disease at chest radiography and tachycardia). This model predicted early mortality with a positive predictive value of 94.88% and a negative predictive value of 19.90%. Conclusions: The scoring system based on simple parameters may help to refer severely ill patients early to a higher level of care to reduce mortality, improve success rates, minimise the need for pulmonary rehabilitation and prevent post-treatment sequelae.