62 research outputs found
Reparameterized Policy Learning for Multimodal Trajectory Optimization
We investigate the challenge of parametrizing policies for reinforcement
learning (RL) in high-dimensional continuous action spaces. Our objective is to
develop a multimodal policy that overcomes limitations inherent in the
commonly-used Gaussian parameterization. To achieve this, we propose a
principled framework that models the continuous RL policy as a generative model
of optimal trajectories. By conditioning the policy on a latent variable, we
derive a novel variational bound as the optimization objective, which promotes
exploration of the environment. We then present a practical model-based RL
method, called Reparameterized Policy Gradient (RPG), which leverages the
multimodal policy parameterization and learned world model to achieve strong
exploration capabilities and high data efficiency. Empirical results
demonstrate that our method can help agents evade local optima in tasks with
dense rewards and solve challenging sparse-reward environments by incorporating
an object-centric intrinsic reward. Our method consistently outperforms
previous approaches across a range of tasks. Code and supplementary materials
are available on the project page https://haosulab.github.io/RPG
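To illustrate why conditioning a policy on a latent variable can yield a multimodal action distribution, here is a minimal Python sketch. It is not the RPG implementation: the uniform prior over z, the `modes` offsets, and the `sample_action` helper are assumptions made purely for illustration.

```python
import random


def sample_action(state, rng, modes=(-1.0, 1.0), std=0.1):
    """Sample from a toy latent-conditioned policy pi(a | s, z).

    Hypothetical setup: z is drawn from a uniform prior, and each value of z
    shifts the Gaussian mean to a different mode.
    """
    z = rng.randrange(len(modes))      # latent variable z ~ uniform prior (assumed)
    mean = state + modes[z]            # toy decoder: z selects which mode to use
    return z, rng.gauss(mean, std)     # Gaussian action noise around that mode


rng = random.Random(0)
samples = [sample_action(0.0, rng) for _ in range(1000)]

# Group actions by the latent that generated them.
left = [a for z, a in samples if z == 0]
right = [a for z, a in samples if z == 1]
```

Marginalizing over z gives a mixture with two well-separated modes, which a single Gaussian with one mean cannot represent.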
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability
Large vision-language models have achieved outstanding performance, but their
size and computational requirements make their deployment on
resource-constrained devices and time-sensitive tasks impractical. Model
distillation, the process of creating smaller, faster models that maintain the
performance of larger models, is a promising direction towards the solution.
This paper investigates the distillation of visual representations in large
teacher vision-language models into lightweight student models using a small-
or mid-scale dataset. Notably, this study focuses on open-vocabulary
out-of-distribution (OOD) generalization, a challenging problem that has been
overlooked in previous model distillation literature. We propose two principles
from vision and language modality perspectives to enhance the student's OOD
generalization: (1) by better imitating the teacher's visual representation space,
and carefully promoting better coherence in vision-language alignment with the
teacher; (2) by enriching the teacher's language representations with
informative and fine-grained semantic attributes to effectively distinguish
between different labels. We propose several metrics and conduct extensive
experiments to investigate these techniques. The results demonstrate
significant improvements in zero-shot and few-shot student performance on
open-vocabulary out-of-distribution classification, highlighting the
effectiveness of our proposed approaches. Our code will be released at
https://github.com/xuanlinli17/large_vlm_distillation_oo
Deductive Verification of Chain-of-Thought Reasoning
Large Language Models (LLMs) significantly benefit from Chain-of-Thought
(CoT) prompting in performing various reasoning tasks. While CoT allows models
to produce more comprehensive reasoning processes, its emphasis on intermediate
reasoning steps can inadvertently introduce hallucinations and accumulated
errors, thereby limiting models' ability to solve complex reasoning tasks.
Inspired by how humans engage in careful and meticulous deductive logical
reasoning processes to solve tasks, we seek to enable language models to
perform explicit and rigorous deductive reasoning, and also ensure the
trustworthiness of their reasoning process through self-verification. However,
directly verifying the validity of an entire deductive reasoning process is
challenging, even with advanced models like ChatGPT. In light of this, we
propose to decompose a reasoning verification process into a series of
step-by-step subprocesses, each receiving only its necessary context and
premises. To facilitate this procedure, we propose Natural Program, a natural
language-based deductive reasoning format. Our approach enables models to
generate precise reasoning steps where subsequent steps are more rigorously
grounded on prior steps. It also empowers language models to carry out
reasoning self-verification in a step-by-step manner. By integrating this
verification process into each deductive reasoning stage, we significantly
enhance the rigor and trustworthiness of generated reasoning steps. In the
process, we also improve the answer correctness on complex reasoning tasks.
Code will be released at https://github.com/lz1oceani/verify_cot
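The decomposition described above, in which each reasoning step is verified using only the premises it cites, can be sketched in Python as follows. This is an illustrative sketch, not the Natural Program format or the authors' code: the `Step` structure, the `verify_chain` function, and the `check` callable (a stand-in for a language-model verification call) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    id: int               # label that later steps can cite
    premises: List[int]   # ids of the statements this step depends on
    text: str             # the claim made at this step


def verify_chain(
    context: Dict[int, str],
    steps: List[Step],
    check: Callable[[List[str], str], bool],
) -> bool:
    """Verify steps one at a time, passing only the cited premises to `check`."""
    known = dict(context)
    for step in steps:
        premises = [known[i] for i in step.premises]
        if not check(premises, step.text):
            return False              # reject the chain at the first invalid step
        known[step.id] = step.text    # a verified step becomes a usable premise
    return True


context = {1: "Alice has 3 apples.", 2: "Bob has 4 apples.", 3: "The sky is blue."}
steps = [
    Step(id=4, premises=[1, 2], text="Together they have 3 + 4 = 7 apples."),
    Step(id=5, premises=[4], text="So the total is 7 apples."),
]

seen = []  # records what the verifier was shown at each step


def recording_check(premises: List[str], claim: str) -> bool:
    # Placeholder verifier: accepts every step and logs its inputs.
    seen.append(list(premises))
    return True


all_valid = verify_chain(context, steps, recording_check)
```

The irrelevant statement 3 is never shown to the verifier, which is the point of restricting each check to its necessary context.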
OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding
We introduce OpenShape, a method for learning multi-modal joint
representations of text, image, and point clouds. We adopt the commonly used
multi-modal contrastive learning framework for representation alignment, but
with a specific focus on scaling up 3D representations to enable open-world 3D
shape understanding. To achieve this, we scale up training data by ensembling
multiple 3D datasets and propose several strategies to automatically filter and
enrich noisy text descriptions. We also explore and compare strategies for
scaling 3D backbone networks and introduce a novel hard negative mining module
for more efficient training. We evaluate OpenShape on zero-shot 3D
classification benchmarks and demonstrate its superior capabilities for
open-world recognition. Specifically, OpenShape achieves a zero-shot accuracy
of 46.8% on the 1,156-category Objaverse-LVIS benchmark, compared to less than
10% for existing methods. OpenShape also achieves an accuracy of 85.3% on
ModelNet40, outperforming previous zero-shot baseline methods by 20% and
performing on par with some fully-supervised methods. Furthermore, we show that
our learned embeddings encode a wide range of visual and semantic concepts
(e.g., subcategories, color, shape, style) and facilitate fine-grained text-3D
and image-3D interactions. Due to their alignment with CLIP embeddings, our
learned shape representations can also be integrated with off-the-shelf
CLIP-based models for various applications, such as point cloud captioning and
point cloud-conditioned image generation. Project website: https://colin97.github.io/OpenShape
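Zero-shot classification with aligned embeddings, as evaluated for OpenShape above, is commonly done by comparing a shape embedding against the text embedding of each category name and picking the closest. The Python sketch below assumes that standard cosine-similarity recipe. The three-dimensional vectors, the labels, and the function names are made up for illustration and do not come from OpenShape.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(shape_emb, text_embs):
    """Return the label whose text embedding is most similar to the shape embedding."""
    return max(text_embs, key=lambda label: cosine(shape_emb, text_embs[label]))


# Toy embeddings (hypothetical values; real embeddings would come from trained encoders).
text_embs = {
    "chair": [0.9, 0.1, 0.0],
    "table": [0.1, 0.9, 0.1],
    "lamp": [0.0, 0.2, 0.9],
}
shape_emb = [0.8, 0.3, 0.1]

prediction = zero_shot_classify(shape_emb, text_embs)
```

Because no classifier head is trained, new categories can be added simply by embedding their names.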
Optimize Individualized Energy Delivery for Septic Patients Using Predictive Deep Learning Models: A Real World Study
Background and Objectives: We aim to establish deep learning models to
optimize the individualized energy delivery for septic patients. Methods and
Study Design: We conducted a study of adult septic patients in the Intensive Care
Unit (ICU), collecting 47 indicators for 14 days. After data cleaning and
preprocessing, we used statistical analysis to explore energy delivery in deceased and
surviving patients. We filtered out nutrition-related features and divided the
data into three metabolic phases: acute early, acute late, and rehabilitation.
Models were built using data before September 2020 and validated on the rest.
We then established optimal energy target models for each phase using deep
learning. Results: A total of 277 patients and 3115 data points were included in this
study. The models indicated that the optimal energy targets in the three phases
were 900 kcal/d, 2300 kcal/d, and 2000 kcal/d, respectively. Excessive energy
intake increased mortality rapidly in the early period of the acute phase.
Insufficient energy in the late period of the acute phase significantly raised
the mortality of septic patients. For the rehabilitation phase, both too much and too
little energy delivery were associated with high mortality. Conclusion: Our
study established time-series prediction models for septic patients to optimize
energy delivery in the ICU. This approach indicated the feasibility of
developing nutritional tools for critically ill patients. We recommend
permissive underfeeding only in the early acute phase. Later, increased energy
intake may improve survival and settle energy debts caused by underfeeding.
Comprehensive genomic analysis of Oesophageal Squamous Cell Carcinoma reveals clinical relevance
Oesophageal carcinoma is the fourth leading cause of cancer-related death in China, and more than 90% of these tumours are oesophageal squamous cell carcinoma (ESCC). Although several ESCC genomic sequencing studies have identified mutated somatic genes, the number of samples in each study was relatively small, and the molecular basis of ESCC has not been fully elucidated. Here, we performed an integrated analysis of 490 tumours by combining the genomic data from 7 previous ESCC projects. We identified 18 significantly mutated genes (SMGs). PTEN, DCDC1 and CUL3 were reported as SMGs in ESCC for the first time. Notably, AJUBA mutations and mutational signature 4 were significantly correlated with poorer survival in patients with ESCC. Hierarchical clustering analysis of the copy number alterations (CNAs) of cancer gene census (CGC) genes in ESCC patients revealed three subtypes, and subtype 3 exhibited more CNAs and was associated with a worse prognosis compared with subtype 2. Moreover, database annotation suggested that two genes with significantly different CNAs between subtype 3 and subtype 2 (PIK3CA and FBXW7) may serve as therapeutic drug targets. This study extends our knowledge of the genetic basis of ESCC and sheds some light on its clinical relevance, which would help improve the therapy and prognosis of ESCC patients.
Whole exome sequencing of insulinoma reveals recurrent T372R mutations in YY1
Functional pancreatic neuroendocrine tumours (PNETs) are mainly represented by insulinomas, which secrete insulin independently of glucose and cause hypoglycaemia. The major genetic alterations in sporadic insulinomas are still unknown. Here we identify recurrent somatic T372R mutations in YY1 by whole exome sequencing of 10 sporadic insulinomas. Further screening in 103 additional insulinomas reveals this hotspot mutation in 30% (34/113) of all tumours. The T372R mutation alters the expression of YY1 target genes in insulinomas. Clinically, the T372R mutation is associated with later onset of tumours. Genotyping of YY1, a target of mTOR inhibitors, may contribute to medical treatment of insulinomas. Our findings highlight the importance of YY1 in pancreatic β-cells and may provide therapeutic targets for PNETs.
High coverage of targeted lipidomics revealed lipid changes in the follicular fluid of patients with insulin-resistant polycystic ovary syndrome and a positive correlation between plasmalogens and oocyte quality
Background: Polycystic ovary syndrome with insulin resistance (PCOS-IR) is the most common endocrine and metabolic disease in women of reproductive age, and low fertility in PCOS patients may be associated with oocyte quality; however, the molecular mechanism through which PCOS-IR affects oocyte quality remains unknown. Methods: A total of 22 women with PCOS-IR and 23 women without polycystic ovary syndrome (control) who underwent in vitro fertilization and embryo transfer were recruited, and clinical information pertaining to oocyte quality was analyzed. Lipid components of follicular fluid (FF) were detected using high-coverage targeted lipidomics, which identified 344 lipid species belonging to 19 lipid classes. The exact lipid species associated with oocyte quality were identified. Results: The number (rate) of two pronuclear (2PN) zygotes, the number (rate) of 2PN cleaved embryos, and the number of high-quality embryos were significantly lower in the PCOS-IR group. A total of 19 individual lipid classes and 344 lipid species were identified and quantified. The concentrations of the 19 lipid species in the normal follicular fluid (control) ranged between 10^-3 mol/L and 10^-9 mol/L. In addition, 39 lipid species were significantly reduced in the PCOS-IR group, among which plasmalogens were positively correlated with oocyte quality. Conclusions: This study measured the levels of various lipids in follicular fluid, identified a significantly altered lipid profile in the FF of PCOS-IR patients, and established a correlation between poor oocyte quality and plasmalogens in PCOS-IR patients. These findings have contributed to the development of plasmalogen replacement therapy to enhance oocyte quality and have improved culture medium formulations for oocyte in vitro maturation (IVM).
Genomic Analyses Reveal Mutational Signatures and Frequently Altered Genes in Esophageal Squamous Cell Carcinoma
Esophageal squamous cell carcinoma (ESCC) is one of the most common cancers worldwide and the fourth most lethal cancer in China. However, although genomic studies have identified some mutations associated with ESCC, we know little of the mutational processes responsible. To identify genome-wide mutational signatures, we performed either whole-genome sequencing (WGS) or whole-exome sequencing (WES) on 104 ESCC individuals and combined our data with those of 88 previously reported samples. An APOBEC-mediated mutational signature in 47% of 192 tumors suggests that APOBEC-catalyzed deamination provides a source of DNA damage in ESCC. Moreover, PIK3CA hotspot mutations (c.1624G>A [p.Glu542Lys] and c.1633G>A [p.Glu545Lys]) were enriched in APOBEC-signature tumors, and no smoking-associated signature was observed in ESCC. In the samples analyzed by WGS, we identified focal (<100 kb) amplifications of CBX4 and CBX8. In our combined cohort, we identified frequent inactivating mutations in AJUBA, ZNF750, and PTCH1 and the chromatin-remodeling genes CREBBP and BAP1, in addition to known mutations. Functional analyses suggest roles for several genes (CBX4, CBX8, AJUBA, and ZNF750) in ESCC. Notably, high activity of hedgehog signaling and the PI3K pathway in approximately 60% of 104 ESCC tumors indicates that therapies targeting these pathways might be particularly promising strategies for ESCC. Collectively, our data provide comprehensive insights into the mutational signatures of ESCC and identify markers for early diagnosis and potential therapeutic targets.