86 research outputs found

    Reinforcement Learning With Reward Machines in Stochastic Games

    We investigate multi-agent reinforcement learning for stochastic games with complex tasks, where the reward functions are non-Markovian. We utilize reward machines to incorporate high-level knowledge of complex tasks. We develop an algorithm called Q-learning with reward machines for stochastic games (QRM-SG), to learn the best-response strategy at Nash equilibrium for each agent. In QRM-SG, we define the Q-function at a Nash equilibrium in augmented state space. The augmented state space integrates the state of the stochastic game and the state of reward machines. Each agent learns the Q-functions of all agents in the system. We prove that Q-functions learned in QRM-SG converge to the Q-functions at a Nash equilibrium if the stage game at each time step during learning has a global optimum point or a saddle point, and the agents update Q-functions based on the best-response strategy at this point. We use the Lemke-Howson method to derive the best-response strategy given current Q-functions. The three case studies show that QRM-SG can learn the best-response strategies effectively. QRM-SG learns the best-response strategies after around 7500 episodes in Case Study I, 1000 episodes in Case Study II, and 1500 episodes in Case Study III, while baseline methods such as Nash Q-learning and MADDPG fail to converge to the Nash equilibrium in all three case studies
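    The augmented state space mentioned in this abstract can be illustrated with a minimal sketch. It is not the QRM-SG algorithm itself: it assumes a single agent, a hypothetical two-step task ("a" then "b"), and a plain max over actions in place of the Nash-equilibrium stage game that the abstract describes. All names (`RM_TRANSITIONS`, `rm_step`, `q_update`) are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical reward machine for the task "observe label a, then label b".
# RM states: 0 (start), 1 (saw "a"), 2 (task complete).
# Maps (rm_state, label) -> (next_rm_state, reward).
RM_TRANSITIONS = {
    (0, "a"): (1, 0.0),
    (1, "b"): (2, 1.0),
}


def rm_step(u, label):
    """Advance the reward machine; labels with no listed transition leave u unchanged."""
    return RM_TRANSITIONS.get((u, label), (u, 0.0))


def q_update(Q, s, u, action, label, s_next, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update on the augmented state (s, u).

    Assumption: single-agent simplification, so the bootstrap target uses
    max over actions rather than a Nash-equilibrium value of a stage game.
    """
    u_next, reward = rm_step(u, label)
    best_next = max(Q[(s_next, u_next, a)] for a in actions)
    Q[(s, u, action)] += alpha * (reward + gamma * best_next - Q[(s, u, action)])
    return u_next


# Q is keyed by (environment state, reward-machine state, action).
Q = defaultdict(float)
actions = ["left", "right"]

u = 0
u = q_update(Q, s=0, u=u, action="right", label="a", s_next=1, actions=actions)
u = q_update(Q, s=1, u=u, action="right", label="b", s_next=2, actions=actions)
```

    Because the reward depends on the reward-machine state `u` as well as the environment state `s`, the reward is Markovian in the pair `(s, u)` even though it is non-Markovian in `s` alone.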

    DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations

    The diffusion-based text-to-image model harbors immense potential in transferring reference style. However, current encoder-based approaches significantly impair the text controllability of text-to-image models while transferring styles. In this paper, we introduce DEADiff to address this issue using the following two strategies: 1) A mechanism to decouple the style and semantics of reference images. The decoupled feature representations are first extracted by Q-Formers which are instructed by different text descriptions. Then they are injected into mutually exclusive subsets of cross-attention layers for better disentanglement. 2) A non-reconstructive learning method. The Q-Formers are trained using paired images rather than the identical target, in which the reference image and the ground-truth image share the same style or semantics. We show that DEADiff attains the best visual stylization results and optimal balance between the text controllability inherent in the text-to-image model and style similarity to the reference image, as demonstrated both quantitatively and qualitatively. Our project page is https://tianhao-qi.github.io/DEADiff/. Comment: Accepted by CVPR 202

    Identification of disulfidptosis related subtypes, characterization of tumor microenvironment infiltration, and development of DRG prognostic prediction model in RCC, in which MSH3 is a key gene during disulfidptosis

    Disulfidptosis is a newly discovered mode of cell death induced by disulfide stress. However, the prognostic value of disulfidptosis-related genes (DRGs) in renal cell carcinoma (RCC) remains to be further elucidated. In this study, consistent cluster analysis was used to classify 571 RCC samples into three DRG-related subtypes based on changes in DRG expression. Through univariate regression analysis and LASSO-Cox regression analysis of differentially expressed genes (DEGs) among the three subtypes, we constructed and validated a DRG risk score to predict the prognosis of patients with RCC, while also identifying three gene subtypes. Analysis of DRG risk score, clinical characteristics, tumor microenvironment (TME), somatic cell mutations, and immunotherapy sensitivity revealed significant correlations between them. A series of studies have shown that MSH3 can be a potential biomarker of RCC, and its low expression is associated with poor prognosis in patients with RCC. Finally, overexpression of MSH3 promotes cell death in two RCC cell lines under glucose starvation conditions, indicating that MSH3 is a key gene in the process of disulfidptosis. In summary, we identify a potential mechanism of RCC progression through DRG-related tumor microenvironment remodeling. In addition, this study established a new disulfidptosis-related gene prediction model and discovered a key gene, MSH3. These may serve as new prognostic biomarkers for RCC patients, provide new insights for the treatment of RCC patients, and may inspire new methods for the diagnosis and treatment of RCC patients

    Loss of circSRY reduces γH2AX level in germ cells and impairs mouse spermatogenesis.

    Sry on the Y chromosome is the master switch of sex determination in mammals. It has been well established that Sry encodes a transcription factor that is transiently expressed in somatic cells of the male gonad, leading to the formation of testes. In the testis of adult mice, Sry is expressed as a circular RNA (circRNA) transcript. However, the physiological function of Sry circRNA (circSRY) has remained unknown since its discovery in 1993. Here we show that circSRY is mainly expressed in the spermatocytes, but not in mature sperm or somatic cells of the testis. Loss of circSRY led to germ cell apoptosis and a reduction of sperm count in the epididymis. The level of γH2AX was decreased, and failure of XY body formation was noted in circSRY KO germ cells. Further study demonstrated that circSRY directly bound to miR-138-5p in spermatocytes, and an in vitro assay suggested that circSRY regulates H2AX mRNA through sponging miR-138-5p. Our study demonstrates that, besides determining sex, Sry also plays an important role in spermatogenesis as a circRNA

    Clinical application of superselective transarterial embolization of renal tumors in zero ischaemia robotic-assisted laparoscopic partial nephrectomy

    Objective: To assess the feasibility and safety of zero ischaemia robotic-assisted laparoscopic partial nephrectomy (RALPN) after preoperative superselective transarterial embolization (STE) of T1 renal cancer. Methods: We retrospectively analyzed the data of 32 patients who underwent zero ischaemia RALPN after STE and 140 patients who received standard robot-assisted laparoscopic partial nephrectomy (S-RALPN). In addition, we selected 35 patients treated with off-clamp RALPN (O-RALPN) from September 2017 to March 2022 for comparison. STE was performed by the same interventional practitioner, and zero ischaemia laparoscopic partial nephrectomy (LPN) was carried out by an experienced surgeon 1-12 hours after STE. The intraoperative data and postoperative complications were recorded. The postoperative renal function, routine urine test, urinary Computed Tomography (CT), and preoperative and postoperative glomerular filtration rate (GFR) data were analyzed. Results: All operations were completed successfully. There were no cases of conversion to open surgery and no deaths. The renal arterial trunk was not blocked. No blood transfusions were needed. The mean operation time was 91.5 ± 34.28 minutes. The mean blood loss was 58.59 ± 54.11 ml. No recurrence or metastasis occurred. Conclusion: For patients with renal tumors, STE of renal tumors in zero ischaemia RALPN can preserve more renal function, and it provides a safe and feasible surgical method

    PwHAP5, a CCAAT-binding transcription factor, interacts with PwFKBP12 and plays a role in pollen tube growth orientation in Picea wilsonii

    The HAP complex occurs in many eukaryotic organisms and is involved in multiple physiological processes. Here it was found that in Picea wilsonii, HAP5 (PwHAP5), a putative CCAAT-binding transcription factor gene, is involved in pollen tube development and control of tube orientation. Quantitative real-time reverse transcription-PCR showed that PwHAP5 transcripts were expressed strongly in germinating pollen and could be induced by Ca2+. Overexpression of PwHAP5 in pollen altered pollen tube orientation, whereas the tube with PwHAP5RNAi showed normal growth without diminishing pollen tube growth. Furthermore, PwFKBP12, which encodes an FK506-binding protein (FKBP), was screened, and a bimolecular fluorescence complementation assay was performed to confirm the interaction of PwHAP5 and PwFKBP12 in vivo. Transient expression of PwFKBP12 in pollen showed normal pollen tube growth, whereas the tube with PwFKBP12RNAi bent. The pollen tube phenotype caused by overexpression of HAP5 was restored by FKBP12. Altogether, our study supports the role of HAP5 in pollen tube development and orientation regulation and identifies FKBP12 as a novel interaction partner of HAP5 involved in the process

    CONTRACT-BASED MULTI-RATE CONTROL DESIGN

    This thesis investigates a contract-based control design framework that aims to solve the constraint partition problem raised in multi-rate control. There are three components in such a framework: the contract coordinator, the high-level motion planner, and the low-level tracking controller. The primary idea of the contract coordinator is to resolve the online constraint partition problem, thereby enhancing coordination between levels and enabling a more adaptable multi-rate control framework. This flexibility in the control framework translates to more choices in selecting the initial state. The contract coordinator is designed based on solving a linear matrix inequality (LMI) problem in real-time, which takes the online information from both levels and outputs the new constraint partition online. Given the input and state constraints, a robust model predictive control with varying constraints is proposed for high-level motion planning. The low level implements the robust control barrier function-based quadratic programming to ensure safety given the constraints from the LMI coordinator. The theoretical guarantees of each level, including feasibility, safety, and stability, are provided. One numerical simulation example is given to show the effectiveness of the proposed method