68 research outputs found

    Decentralized Control for Discrete-time Mean-Field Systems with Multiple Controllers of Delayed Information

    In this paper, the finite-horizon asymmetric-information linear quadratic (LQ) control problem is investigated for a discrete-time mean-field system. Unlike previous works, multiple controllers with different information sets are involved in the mean-field system dynamics. The coupling between the controllers makes it difficult to find the optimal control strategy. By applying Pontryagin's maximum principle, the corresponding finite-horizon decentralized control problem is solved. The contributions of this paper can be summarized as follows: For the first time, based on the solution of a group of mean-field forward and backward stochastic difference equations (MF-FBSDEs), the necessary and sufficient solvability conditions are derived for asymmetric-information LQ control of the mean-field system with multiple controllers. Furthermore, using an innovative orthogonal decomposition approach, the optimal decentralized control strategy is derived, which is based on the solution to a non-symmetric Riccati-type equation.

    State-Wise Safe Reinforcement Learning With Pixel Observations

    In the context of safe exploration, Reinforcement Learning (RL) has long grappled with the challenges of balancing the tradeoff between maximizing rewards and minimizing safety violations, particularly in complex environments with contact-rich or non-smooth dynamics, and when dealing with high-dimensional pixel observations. Furthermore, incorporating state-wise safety constraints in the exploration and learning process, where the agent must avoid unsafe regions without prior knowledge, adds another layer of complexity. In this paper, we propose a novel pixel-observation safe RL algorithm that efficiently encodes state-wise safety constraints with unknown hazard regions through a newly introduced latent barrier-like function learning mechanism. As a joint learning framework, our approach begins by constructing a latent dynamics model with low-dimensional latent spaces derived from pixel observations. We then build and learn a latent barrier-like function on top of the latent dynamics and conduct policy optimization simultaneously, thereby improving both safety and the total expected return. Experimental evaluations on the safety-gym benchmark suite demonstrate that our proposed method significantly reduces safety violations throughout the training process, and demonstrates faster safety convergence compared to existing methods while achieving competitive results in reward return. Comment: 10 pages, 5 figures

    Cardiac magnetic resonance imaging for discrimination of hypertensive heart disease and hypertrophic cardiomyopathy: a systematic review and meta-analysis

    Introduction: Differentiating hypertensive heart disease (HHD) from hypertrophic cardiomyopathy (HCM) is crucial yet challenging due to overlapping clinical and morphological features. Recent studies have explored the use of various cardiac magnetic resonance (CMR) parameters to distinguish between these conditions, but findings have remained inconclusive. This study aims to identify which CMR parameters effectively discriminate between HHD and HCM and to investigate their underlying pathophysiological mechanisms through a meta-analysis. Methods: The researchers conducted a systematic and comprehensive search for all studies that used CMR to discriminate between HHD and HCM and calculated the Hedges' g effect size for each of the included studies. The effect sizes were then pooled using a random-effects model and tested for the effects of potential influencing variables through subgroup and regression analyses. Results: In this review, 26 studies encompassing 1,349 HHD and 1,581 HCM cases were included for meta-analysis. Compared to HCM, HHD showed significantly lower T1 mapping values (g = −0.469, P < 0.001), extracellular volume (g = −0.417, P = 0.024), left ventricular mass index (g = −0.437, P < 0.001), and maximal left ventricular wall thickness (g = −2.076, P < 0.001), alongside significantly higher end-systolic volume index (g = 0.993, P < 0.001) and end-diastolic volume index (g = 0.553, P < 0.001). Conclusion: This study clearly demonstrates that CMR parameters can effectively differentiate between HHD and HCM. HHD is characterized by significantly lower diffuse interstitial fibrosis and myocardial hypertrophy, along with better-preserved diastolic function but lower systolic function, compared to HCM. The findings highlight the need for standardized CMR protocols, considering the significant influence of MRI machine vendors, post-processing software, and study regions on diagnostic parameters. These insights are crucial for improving diagnostic accuracy and optimizing treatment strategies for patients with HHD and HCM. Systematic Review Registration: https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42023470557, PROSPERO (CRD42023470557)
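
The pooling step described in the abstract above (a Hedges' g per study, then a random-effects pooled estimate) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say which between-study variance estimator was used, so the DerSimonian-Laird estimator here is an assumption, and the function names `hedges_g` and `pool_random_effects` are hypothetical.

```python
import math


def hedges_g(m1, s1, n1, m2, s2, n2):
    """Return (g, variance of g) for two independent groups.

    m*, s*, n* are the mean, standard deviation and size of each group.
    """
    df = n1 + n2 - 2
    # Pooled standard deviation of the two groups.
    s_pooled = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    d = (m1 - m2) / s_pooled  # Cohen's d
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def pool_random_effects(effects, variances):
    """Pool effect sizes with a random-effects model.

    Assumption: DerSimonian-Laird estimate of the between-study variance.
    Returns (pooled effect, standard error, tau squared).
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    k = len(effects)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau squared to each study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * gi for wi, gi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2
```

A negative g under this sign convention means the first group (for example HHD) has the lower mean, which matches how the reported effect sizes are signed.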

    Phase I study of the Syk inhibitor sovleplenib in relapsed or refractory mature B-cell tumors

    Sovleplenib (HMPL-523) is a selective spleen tyrosine kinase (Syk) inhibitor with antitumor activity in preclinical models of B-cell malignancy. We conducted a dose-escalation and dose-expansion phase I study of sovleplenib in patients with relapsed/refractory mature B-cell tumors. Dose escalation followed a 3+3 design; patients received oral sovleplenib (200-800 mg once daily [q.d.] or 200 mg twice daily [b.i.d.], 28-day cycles). During dose expansion, patients were enrolled into four cohorts per lymphoma classification and treated at the recommended phase 2 dose (RP2D). Overall, 134 Chinese patients were enrolled (dose escalation, n=27; dose expansion, n=107). Five patients experienced dose-limiting toxicities: one each of amylase increased (200 mg q.d.), febrile neutropenia (800 mg q.d.), renal failure (800 mg q.d.), hyperuricemia and blood creatine phosphokinase increased (200 mg b.i.d.) and blood bilirubin increased and pneumonia (200 mg b.i.d.). RP2D was determined as 600 mg (>65 kg) or 400 mg (≤65 kg) q.d. The primary efficacy end point of independent review committee–assessed objective response rate in indolent B-cell lymphoma was 50.8% (95% CI, 37.5–64.1) in 59 evaluable patients at RP2D (follicular lymphoma: 60.5%, marginal zone lymphoma: 28.6%, lymphoplasmacytic lymphoma/Waldenström macroglobulinemia: 0%). The most common (≥10% patients) grade ≥3 treatment-related adverse events in the dose-expansion phase were decreased neutrophil count (29.9%), pneumonia (12.1%) and decreased white blood cell count (11.2%). Pharmacokinetic exposures increased dose-proportionally with ascending dose levels from 200–800 mg, without observed saturation. Sovleplenib showed antitumor activity in relapsed/refractory B-cell lymphoma with acceptable safety. Further studies are warranted.

    Optimal control for stochastic systems with multiple controllers of different information structures

    In this article, we investigate the optimal linear quadratic control problem for stochastic systems with multiple controllers, where each controller has its own information structure that differs from the others. More specifically, we consider the optimal control problem for systems with multiple controllers of different delayed state information. First, the necessary and sufficient solvability conditions are given in terms of forward and backward stochastic difference equations (FBSDEs). Further, an innovation method is proposed to decouple the FBSDEs, and the optimal control strategies are derived based on a given nonsymmetric Riccati equation. Finally, a numerical example is provided to show the effectiveness of the main results. It is stressed that the proposed methods and results can be seen as an important addition to the optimal control theory with asymmetric-information-structure controllers. Agency for Science, Technology and Research (A*STAR). This work was supported in part by the Agency for Science, Technology and Research of Singapore under Grant A1788a0023, in part by the National Natural Science Foundation of China under Grant 61903210, Grant 61633014, and Grant 61873179, in part by the Natural Science Foundation of Shandong Province under Grant ZR2019BF002, in part by the China Postdoctoral Science Foundation under Grant 2019M652324, and in part by the Qingdao Postdoctoral Application Research Project.