63 research outputs found

    Vid2Act: Activate Offline Videos for Visual RL

    Pretraining RL models on offline video datasets is a promising way to improve their training efficiency in online tasks, but challenging due to the inherent mismatch in tasks, dynamics, and behaviors across domains. A recent model, APV, sidesteps the accompanying action records in offline datasets and instead focuses on pretraining a task-irrelevant, action-free world model within the source domains. We present Vid2Act, a model-based RL method that learns to transfer valuable action-conditioned dynamics and potentially useful action demonstrations from offline to online settings. The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the domain relevance for both dynamics representation transfer and policy transfer. Specifically, we train the world models to generate a set of time-varying task similarities using a domain-selective knowledge distillation loss. These similarities serve two purposes: (i) adaptively transferring the most useful source knowledge to facilitate dynamics learning, and (ii) learning to replay the most relevant source actions to guide the target policy. We demonstrate the advantages of Vid2Act over the action-free visual RL pretraining method in both Meta-World and DeepMind Control Suite.

    Numerical investigation of multi-nozzle ejector device with inclined nozzles for marine gas turbine

    The high-temperature exhaust gases and the hot surfaces of the ejector device in marine gas turbines generate significant levels of infrared radiation. An appropriate nozzle structure can effectively lower the exhaust gas temperature and reduce the high-temperature radiation surface area, thereby minimizing external infrared radiation. In this study, a numerical simulation of the nozzle structure in the ejector device was conducted using computational fluid dynamics (CFD) methods. By investigating the orthogonal combinations of nozzle inclination angles and the number of nozzles, the temperature distribution and flow characteristics under different operating conditions were analysed. The results showed that as the nozzle inclination angle increased, the entrainment coefficient (Ce) and the temperature ratio at the inlet and outlet (Rt) initially improved but then worsened. Simultaneously, the pressure loss coefficient (Cpl) increased with the inclination angle. The CRITIC weight method was employed to objectively allocate weights to Rt, Ce, and Cpl, determining the optimal solution. The results indicated that Rt and Cpl had significant and similar weights. The optimal solution was found in Case 10 (α = 5°, x = 4), with corresponding evaluation indices of Ce=2.38, Cpl=11.45, and =0.68. This study's findings are of great importance for enhancing the performance of marine gas turbines and reducing external infrared radiation.
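
    The CRITIC weighting step named in the abstract above can be sketched as follows. This is a minimal illustration of the general CRITIC procedure (min-max normalization, then weights proportional to each criterion's standard deviation times its summed conflict 1 - r with the other criteria), not the authors' implementation. The function name `critic_weights`, the benefit/cost flags, and the sample numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt


def critic_weights(matrix, benefit):
    """Return CRITIC weights for a decision matrix.

    matrix  -- list of rows (one per case), each a list of criterion values
    benefit -- list of bools; True if a larger value is better for that criterion
    """
    n_cols = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]

    # Min-max normalize each criterion to [0, 1], flipping cost criteria
    # so that 1 is always the best value.
    norm = []
    for j, col in enumerate(columns):
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            raise ValueError(f"criterion {j} is constant; CRITIC is undefined")
        if benefit[j]:
            norm.append([(v - lo) / span for v in col])
        else:
            norm.append([(hi - v) / span for v in col])

    def mean(xs):
        return sum(xs) / len(xs)

    def sample_std(xs):
        m = mean(xs)
        return sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

    def pearson(a, b):
        ma, mb = mean(a), mean(b)
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        return cov / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

    # Information content: contrast (std) times conflict (sum of 1 - r).
    info = []
    for j in range(n_cols):
        conflict = sum(1 - pearson(norm[j], norm[k]) for k in range(n_cols))
        info.append(sample_std(norm[j]) * conflict)

    total = sum(info)
    return [c / total for c in info]


# Hypothetical data: four cases, three criteria (illustrative values only).
cases = [
    [0.70, 2.1, 10.0],
    [0.68, 2.4, 11.5],
    [0.72, 2.3, 12.8],
    [0.75, 2.0, 14.1],
]
# Assumed directions: criterion 0 and 2 treated as costs, criterion 1 as a benefit.
weights = critic_weights(cases, benefit=[False, True, False])
```

    The resulting weights sum to 1 and can then be combined with the normalized scores to rank the cases; which criteria count as benefits or costs is a modelling choice that the abstract does not specify.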

    Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners

    The emergent few-shot reasoning capabilities of Large Language Models (LLMs) have excited the natural language and machine learning community over recent years. Despite numerous successful applications, the underlying mechanism of such in-context capabilities still remains unclear. In this work, we hypothesize that the learned semantics of language tokens do most of the heavy lifting during the reasoning process. Unlike the human symbolic reasoning process, the semantic representations of LLMs could create strong connections among tokens, thus composing a superficial logical chain. To test our hypothesis, we decouple semantics from the language reasoning process and evaluate three kinds of reasoning abilities, i.e., deduction, induction and abduction. Our findings reveal that semantics play a vital role in LLMs' in-context reasoning: LLMs perform significantly better when semantics are consistent with commonsense but struggle to solve symbolic or counter-commonsense reasoning tasks by leveraging in-context new knowledge. The surprising observations question whether modern LLMs have mastered the inductive, deductive and abductive reasoning abilities as in human intelligence, and motivate research on unveiling the magic existing within the black-box LLMs. On the whole, our analysis provides a novel perspective on the role of semantics in developing and evaluating language models' reasoning abilities. Code is available at https://github.com/XiaojuanTang/ICSR

    JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models

    Achieving human-like planning and control with multimodal observations in an open world is a key milestone for more functional generalist agents. Existing approaches can handle certain long-horizon tasks in an open world. However, they still struggle when the number of open-world tasks could potentially be infinite and lack the capability to progressively enhance task completion as game time progresses. We introduce JARVIS-1, an open-world agent that can perceive multimodal input (visual observations and human instructions), generate sophisticated plans, and perform embodied control, all within the popular yet challenging open-world Minecraft universe. Specifically, we develop JARVIS-1 on top of pre-trained multimodal language models, which map visual observations and textual instructions to plans. The plans are ultimately dispatched to the goal-conditioned controllers. We outfit JARVIS-1 with a multimodal memory, which facilitates planning using both pre-trained knowledge and its actual game survival experiences. JARVIS-1 is the most general existing agent in Minecraft, capable of completing over 200 different tasks using control and observation space similar to humans. These tasks range from short-horizon tasks, e.g., "chopping trees" to long-horizon tasks, e.g., "obtaining a diamond pickaxe". JARVIS-1 performs exceptionally well in short-horizon tasks, achieving nearly perfect performance. In the classic long-term task of ObtainDiamondPickaxe, JARVIS-1 surpasses the reliability of current state-of-the-art agents by 5 times and can successfully complete longer-horizon and more challenging tasks. The project page is available at https://craftjarvis.org/JARVIS-1

    The First Case of Ischemia-Free Kidney Transplantation in Humans

    Background: Ischemia-reperfusion injury (IRI) has been considered an inevitable event in organ transplantation since the first successful kidney transplant was performed in 1954. To avoid IRI, we have established a novel procedure called ischemia-free organ transplantation. Here, we describe the first case of ischemia-free kidney transplantation (IFKT). Materials and Methods: The kidney graft was donated by a 19-year-old brain-dead donor. The recipient was a 47-year-old man with end-stage diabetic nephropathy. The graft was procured, preserved, and implanted without cessation of blood supply using normothermic machine perfusion. Results: The graft appearance, perfusion flow, and urine production suggested that the kidney was functioning well during the whole procedure. The creatinine dropped rapidly to normal range within 3 days post-transplantation. The levels of serum renal injury markers were low post-transplantation. No rejection or vascular or infectious complications occurred. The patient had an uneventful recovery. Conclusion: This paper marks the first case of IFKT in humans. This innovation may offer a unique solution to optimizing transplant outcomes in kidney transplantation.

    A predictive energy management strategy for multi-mode plug-in hybrid electric vehicles based on multi neural networks

    Online optimal energy management of plug-in hybrid electric vehicles has been continually investigated for better fuel economy. This paper proposes a predictive energy management strategy based on multiple neural networks for a multi-mode plug-in hybrid electric vehicle. First, the offline optimal results used for knowledge learning are derived by dynamic programming and Pontryagin's minimum principle. Then, the mode recognition neural network is trained on the optimal results of dynamic programming, and a recurrent neural network is used for the first time to realize online co-state estimation. Consequently, a velocity prediction-based online model predictive control framework is established with co-state correction and slacked constraints to solve for the real-time optimal control sequence. A series of numerical simulation results validate that, with the assistance of the estimated real-time co-state and slacked reference, the optimal performance of the global optimal strategy can be exploited online to achieve a satisfactory cost reduction compared with the equivalent consumption minimization strategy. In addition, the computation time of the proposed algorithm decreases by 23.40% compared with the conventional Pontryagin's minimum principle-based model predictive control scheme, demonstrating its potential for online application.

    Scoliotic posture as the initial symptom in adolescents with lumbar disc herniation: its curve pattern and natural history after lumbar discectomy

    Background: There have been few studies focusing on the curve pattern of scoliosis caused by lumbar disc herniation (LDH) in adolescents and the natural history of scoliosis after discectomy. The current study was carried out to identify the curve pattern of scoliosis and investigate the effect of posterior discectomy on the curve improvement in adolescents with LDH. Methods: This review focused on a group of 26 adolescents with LDH who initially presented to our clinic for evaluation of scoliosis, followed by posterior discectomy between 2000 and 2009. Radiographic measurements included curve pattern, specific curve features, trunk shift, and sagittal profile. The correlation between the side of disc herniation and the direction of lumbosacral curve and the trunk shift was evaluated. Results: A typical curve pattern was initially identified in all of the patients as a short lumbosacral curve accompanied by a long thoracic or thoracolumbar curve toward the opposite side. 23 of 26 patients (88.5%) had a trunk shift more than 2.0 cm away from the midline, showing a poor coronal balance. A relatively straight sagittal profile was noted in all the patients. 84.6% (22/26) of patients had a disc herniation at the convex side of the lumbosacral curve. Similarly, 73.1% (19/26) of patients showed a trunk shift toward the opposite side of disc herniation. All of the patients had a marked curve improvement immediately after discectomy. In the 17 patients with a more than 2-year follow-up, only two had a residual lumbosacral curve greater than or equal to 20 degrees. The mean ODI improved from 21.4% before surgery to 7.3% at the final follow-up. Conclusions: A short lumbosacral curve accompanied by a long thoracic or thoracolumbar curve toward the opposite side, and a relatively straight sagittal profile, were noted in all the patients. The direction of lumbosacral curve and trunk shift was related to the side of disc herniation. A majority of patients had a small curve size that was nonetheless associated with a significant coronal imbalance. Earlier decompression can provide a greater opportunity for spontaneous correction of scoliosis.