99 research outputs found

Domain-Agnostic Molecular Generation with Self-feedback

Author: Chen Huajun
Chen Zhuo
Fan Xiaohui
Fang Yin
Zhang Ningyu
Publication venue
Publication date: 01/09/2023

The generation of molecules with desired properties has gained tremendous popularity, revolutionizing the way scientists design molecular structures and providing valuable support for chemical and drug design. However, despite the potential of language models in molecule generation, they face numerous challenges such as the generation of syntactically or chemically flawed molecules, narrow domain focus, and limitations in creating diverse and directionally feasible molecules due to a dearth of annotated data or external molecular databases. To this end, we introduce MolGen, a pre-trained molecular language model tailored specifically for molecule generation. MolGen acquires intrinsic structural and grammatical insights by reconstructing over 100 million molecular SELFIES, while facilitating knowledge transfer between different domains through domain-agnostic molecular prefix tuning. Moreover, we present a self-feedback paradigm that inspires the pre-trained model to align with the ultimate goal of producing molecules with desirable properties. Extensive experiments on well-known benchmarks confirm MolGen's optimization capabilities, encompassing penalized logP, QED, and molecular docking properties. Further analysis shows that MolGen can accurately capture molecule distributions, implicitly learn their structural characteristics, and efficiently explore chemical space. The pre-trained model, codes, and datasets are publicly available for future research at https://github.com/zjunlp/MolGen. Comment: Work in progress. Add results of binding affinit

arXiv.org e-Print Archive

Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

Author: Chen Huajun
Chen Zhuo
Fan Xiaohui
Fang Yin
Huang Rui
Liang Xiaozhuan
Liu Kangwei
Zhang Ningyu
Publication venue
Publication date: 29/08/2023

Large Language Models (LLMs), with their remarkable task-handling capabilities and innovative outputs, have catalyzed significant advancements across a spectrum of fields. However, their proficiency within specialized domains such as biomolecular studies remains limited. To address this challenge, we introduce Mol-Instructions, a meticulously curated, comprehensive instruction dataset expressly designed for the biomolecular realm. Mol-Instructions is composed of three pivotal components: molecule-oriented instructions, protein-oriented instructions, and biomolecular text instructions, each curated to enhance the understanding and prediction capabilities of LLMs concerning biomolecular features and behaviors. Through extensive instruction tuning experiments on the representative LLM, we underscore the potency of Mol-Instructions to enhance the adaptability and cognitive acuity of large models within the complex sphere of biomolecular studies, thereby promoting advancements in the biomolecular research community. Mol-Instructions is made publicly accessible for future research endeavors and will be subjected to continual updates for enhanced applicability. Comment: Project homepage: https://github.com/zjunlp/Mol-Instructions. Add quantitative evaluation

arXiv.org e-Print Archive

Dissociate lattice oxygen redox reactions from capacity and voltage drops of battery electrodes.

Author: Chuang Yi-de
Dai Kehua
Hussain Zahid
Lebens-Higgins Zachary
Li Qinghao
Liu Gao
Pan Feng
Piper Louis FJ
Rong Xiaohui
Sallis Shawn
Shen Zhi-Xun
Wu Jinpeng
Yang Wanli
Zeng Rong
Zhuo Zengqing
Publication venue: eScholarship, University of California
Publication date: 01/02/2020

The oxygen redox (OR) activity is conventionally considered detrimental to the stability and kinetics of batteries. However, OR reactions are often confused by irreversible oxygen oxidation. Here, based on high-efficiency mapping of resonant inelastic x-ray scattering of both the transition metal and oxygen, we distinguish the lattice OR in Na0.6[Li0.2Mn0.8]O2 and compare it with Na2/3[Mg1/3Mn2/3]O2. Both systems display strong lattice OR activities but with distinct electrochemical stability. The comparison shows that the substantial capacity drop in Na0.6[Li0.2Mn0.8]O2 stems from non-lattice oxygen oxidations, and its voltage decay from an increasing Mn redox contribution upon cycling, contrasting those in Na2/3[Mg1/3Mn2/3]O2. We conclude that lattice OR is not the ringleader of the stability issue. Instead, irreversible oxygen oxidation and the changing cationic reactions lead to the capacity and voltage fade. We argue that lattice OR and other oxygen activities should/could be studied and treated separately to achieve viable OR-based electrodes

eScholarship - University of California

BMP10 preserves cardiac function through its dual activation of SMAD-mediated and STAT3-mediated pathways

Author: Cao Dayan
Chen Hanying
Chen Jinghai
Chen Yuwen
Ji Hongrui
Li Xiaohui
Liu Ying
Liu Zhuo
Qu Xiuxia
Shou Weinian
Xiao Deyong
Zhang Wenjun
Zhu Ping
Publication venue: 'American Society for Biochemistry & Molecular Biology (ASBMB)'
Publication date: 27/12/2019

Bone morphogenetic protein 10 (BMP10) is a cardiac peptide growth factor belonging to the transforming growth factor β superfamily that critically controls cardiovascular development, growth, and maturation. It has been shown that BMP10 elicits its intracellular signaling through a receptor complex of activin receptor-like kinase 1 with morphogenetic protein receptor type II or activin receptor type 2A. Previously, we generated and characterized a transgenic mouse line expressing BMP10 from the α-myosin heavy chain gene promoter and found that these mice have normal cardiac hypertrophic responses to both physiological and pathological stimuli. In this study, we report that these transgenic mice exhibit significantly reduced levels of cardiomyocyte apoptosis and cardiac fibrosis in response to a prolonged administration of the β-adrenoreceptor agonist isoproterenol. We further confirmed this cardioprotective function with a newly generated conditional Bmp10 transgenic mouse line, in which Bmp10 was activated in adult hearts by tamoxifen. Moreover, the intraperitoneal administration of recombinant human BMP10 was found to effectively protect hearts from injury, suggesting potential therapeutic utility of using BMP10 to prevent heart failure. Gene profiling and biochemical analyses indicated that BMP10 activates the SMAD-mediated canonical pathway and, unexpectedly, also the signal transducer and activator of transcription 3 (STAT3)-mediated signaling pathway both in vivo and in vitro. Additional findings further supported the notion that BMP10's cardioprotective function likely is due to its dual activation of SMAD- and STAT3-regulated signaling pathways, promoting cardiomyocyte survival and suppressing cardiac fibrosis

IUPUIScholarWorks

Genotype Profile of Global EYS-Associated Inherited Retinal Dystrophy and Clinical Findings in a Large Chinese Cohort

Author: De-Fu Chen
Haoyu Chang
Hua Gao
Ke Xu
Ren-Juan Shen
Xiao-Fang Wang
Xiaohui Zhang
Yang Li
Yue Xie
Zhuo-Kun Feng
Zi-Bing Jin
Publication venue: 'Frontiers Media SA'
Publication date: 01/06/2021

Purpose: The aim of this study was to probe the global profile of the EYS-associated genotype-phenotype trait in the worldwide reported IRD cases and to build a model for predicting disease progression as a reference for clinical consultation. Methods: This retrospective study of 420 well-documented IRD cases with mutations in the EYS gene included 39 patients from a genotype-phenotype study of inherited retinal dystrophy (IRD) conducted at the Beijing Institute of Ophthalmology and 381 cases retrieved from global reports. All patients underwent ophthalmic evaluation. Mutations were revealed using next-generation sequencing, followed by Sanger DNA sequencing and real-time quantitative PCR analysis. Multiple regression models and statistical analysis were used to assess the genotype and phenotype characteristics and traits in this large cohort. Results: A total of 420 well-defined patients with 841 identified mutations in the EYS gene were successfully obtained. The most common pathogenic variant was a frameshift c.4957dupA (p.S1653Kfs∗2) in exon 26, with an allele frequency of 12.7% (107/841), followed by c.8805C > A (p.Y2935X) in exon 43, with an allele frequency of 5.9% (50/841). Two new hot spots were identified in the Chinese cohort, c.1750G > T (p.E584X) and c.7492G > C (p.A2498P). Several EYS mutation types were identified, with CNV being relatively common. The mean age of onset was 20.54 ± 11.33 (4–46) years. Clinical examinations revealed a typical progression of RPE atrophy from the peripheral area to the macula. Conclusion: This large global cohort of 420 IRD cases, with 262 distinct variants, identified genotype-phenotype correlations and mutation spectra with hotspots in the EYS gene
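The allele frequencies quoted in this abstract follow directly from the reported counts (107 and 50 alleles out of 841 identified mutations). A minimal sketch of that arithmetic is given here; the function and variable names are illustrative and do not come from the study.

```python
def allele_frequency(variant_alleles: int, total_alleles: int) -> float:
    """Return the share of total_alleles carried by one variant, in percent."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return 100.0 * variant_alleles / total_alleles


# Counts taken from the abstract; the dictionary layout is an assumption.
TOTAL_ALLELES = 841
counts = {"c.4957dupA": 107, "c.8805C>A": 50}

# Round to one decimal place, as the abstract reports the values.
frequencies = {
    variant: round(allele_frequency(n, TOTAL_ALLELES), 1)
    for variant, n in counts.items()
}
```

Rounded to one decimal place, the two values match the 12.7% and 5.9% stated in the abstract.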

Directory of Open Access Journals

Reproducible copy number variation patterns among single circulating tumor cells of lung cancer patients

Author: An T.
Bai F.
Bai H.
Chapman Alec Randolph
Duan J.
Gao Yan
Li Z.
Lu Y.
Ma Q.
Ni Xiaohui
Su X.-D.
Su Zhe
Sun Y.
Wang J.
Wang S.
Wang Y.
Wang Z.
Wu M.
Xie Xiaoliang Sunney
Xu L.
Yang X.
Yong Jun
Zhao J.
Zhuo Minglei
Zong Chenghang
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 16/10/2014

Circulating tumor cells (CTCs) enter peripheral blood from primary tumors and seed metastases. The genome sequencing of CTCs could offer noninvasive prognosis or even diagnosis, but has been hampered by low single-cell genome coverage of scarce CTCs. Here, we report the use of the recently developed multiple annealing and looping-based amplification cycles for whole-genome amplification of single CTCs from lung cancer patients. We observed characteristic cancer-associated single-nucleotide variations and insertions/deletions in exomes of CTCs. These mutations provided information needed for individualized therapy, such as drug resistance and phenotypic transition, but were heterogeneous from cell to cell. In contrast, every CTC from an individual patient, regardless of the cancer subtypes, exhibited reproducible copy number variation (CNV) patterns, similar to those of the metastatic tumor of the same patient. Interestingly, different patients with the same lung cancer adenocarcinoma (ADC) shared similar CNV patterns in their CTCs. Even more interestingly, patients of small-cell lung cancer have CNV patterns distinctly different from those of ADC patients. Our finding suggests that CNVs at certain genomic loci are selected for the metastasis of cancer. The reproducibility of cancer-specific CNVs offers potential for CTC-based cancer diagnostics.

Harvard University - DASH

Design and implementation of the START (STem cells for ARDS Treatment) trial, a phase 1/2 trial of human mesenchymal stem/stromal cells for the treatment of moderate-severe acute respiratory distress syndrome

Author: Caballero Lizette
Calfee Carolyn S
Cosgrove Katherine
Delucchi Kevin L
Fang Xiaohui
Gotts Jeffrey E
Kangelaris Kirsten N
Leavitt Andrew D
Lee Jae-Woo
Levitt Joseph E
Liu Kathleen D
Matthay Michael A
McKenna David H
McMillan Melanie L
Rogers Angela J
Thompson B Taylor
Wiener-Kronish Jeanine P
Wilson Jennifer G
Zhuo Hanjing
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Background: Despite advances in supportive care, moderate-severe acute respiratory distress syndrome (ARDS) is associated with high mortality rates, and novel therapies to treat this condition are needed. Compelling pre-clinical data from mouse, rat, sheep and ex vivo perfused human lung models support the use of human mesenchymal stem (stromal) cells (MSCs) as a novel intravenous therapy for the early treatment of ARDS. Methods: This article describes the study design and challenges encountered during the implementation and phase 1 component of the START (STem cells for ARDS Treatment) trial, a phase 1/2 trial of bone marrow-derived human MSCs for moderate-severe ARDS. A trial enrolling 69 subjects is planned (9 subjects in phase 1, 60 subjects in phase 2 treated with MSCs or placebo in a 2:1 ratio). Results: This report describes study design features that are unique to a phase 1 trial in critically ill subjects and the specific challenges of implementation of a cell-based therapy trial in the ICU. Conclusions: Experience gained during the design and implementation of the START study will be useful to investigators planning future phase 1 clinical trials based in the ICU, as well as trials of cell-based therapy for other acute illnesses. Trial registration Clinical Trials Registration: NCT01775774 and NCT02097641

Crossref

Harvard University - DASH

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

QKI is a critical pre-mRNA alternative splicing regulator of cardiac myofibrillogenesis and contractile function

Author: Ba Lina
Cai Chen-Leng
Cao Dayan
Chen Xinyun
Gao Hongyu
Huang Guoying
Huang Jie
Ji Hongrui
Li Xiaohui
Li Xiuya
Liu Ying
Liu Yunlong
Liu Zhuo
Na Jie
Qi Hanping
Sanderson Maria
Sheng Wei
Shou Weinian
Simpson Ed
Sun Ning
Xu Chen
Yamamura Kenichi
Yang Lei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

The RNA-binding protein QKI belongs to the hnRNP K-homology domain protein family, is a well-known regulator of pre-mRNA alternative splicing, and is associated with several neurodevelopmental disorders. Qki is found highly expressed in developing and adult hearts. By employing the human embryonic stem cell (hESC) to cardiomyocyte differentiation system and generating QKI-deficient hESCs (hESCs-QKIdel) using CRISPR/Cas9 gene editing technology, we analyze the physiological role of QKI in cardiomyocyte differentiation, maturation, and contractile function. hESCs-QKIdel largely maintain normal pluripotency and normal differentiation potential for the generation of early cardiogenic progenitors, but they fail to transition into functional cardiomyocytes. In this work, by using a series of transcriptomic, cell and biochemical analyses, and the Qki-deficient mouse model, we demonstrate that QKI is indispensable to cardiac sarcomerogenesis and cardiac function through its regulation of alternative splicing in genes involved in Z-disc formation and contractile physiology, suggesting that QKI is associated with the pathogenesis of certain forms of cardiomyopathies

IUPUIScholarWorks

Directory of Open Access Journals

Cost-effectiveness of a national population-based screening program for type 2 diabetes: the Brazil experience

Author: AF Oliveira
AJ Palmer
Bruce B. Duncan
Carísi A. Polanczyk
CDC Diabetes Cost-effectiveness Group
CL Szwarcwald
CM Toscano
Cristiana M. Toscano
DA Malerbi
DATASUS
DB Rolka
HC Gerstein
J Perk
JA Johnson
JD Lubitz
JJ Barendregt
K Pottie
KM Anderson
Kumiko Imai
L Hansson
LB Nucci
LL Humphrey
M Donk Van den
M Marseille
Maria Inês Schmidt
MI Harris
MI Schmidt
Michael Engelgau
Ministerio Saude da
MM Engelgau
National Institute for Health and Care Excellence (NICE)
P Gaede
P Zhang
Ping Zhang
R Collins
R Kahn
RC Eastman
RK Simmons
RL Sacco
RO Estacio
RR Holman
RS Kuchenbecker
RW Schrier
Screening for type
The Diabetes Control and Complications Trial Research Group
TJ Hoerger
UK Prospective Diabetes Study (UKPDS) Group
UK Prospective Diabetes Study (UKPDS) Group.
UK Prospective Diabetes Study Group
X Zhuo
Xiaohui Zhuo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Background: The cost-effectiveness of screening for type 2 diabetes mellitus (DM2) in developing countries remains unknown. The Brazilian government conducted a nationwide population screening program for type 2 diabetes mellitus (BNDSP) in which 22 million capillary glucose tests were performed in individuals aged 40 years and older. The objective of this study was to evaluate the life-time cost-effectiveness of a national population-based screening program for DM2 conducted in Brazil. Methods: We used a Markov-based cost-effectiveness model to simulate the long-term costs and benefits of screening for DM2, compared to no screening program. The analysis was conducted from a public health care system perspective. Sensitivity analyses were conducted to examine the robustness of results to key model parameters. Results: Brazilian National diabetes screening program will yield a large health benefit and higher costs. Compared with no screening, screen detection of undiagnosed diabetes resulted in US$31,147 per QALY gained. Results from sensitivity analyses found that screening targeted at hypertensive individuals would cost US$22,695/QALY. When benefits from early glycemic control on cardiovascular outcomes were considered, the cost per QALY gained would reduce significantly. Conclusions: In the base case analysis, not considering the intangible benefit of transferring diabetes management to primary care nor the benefit of using statin to treat eligible diabetic patients, CE ratios were not cost-effective considering thresholds proposed by the World Health Organization. However, significant uncertainty was demonstrated in sensitivity analysis. Our results indicate that policy-makers should carefully balance the benefit and cost of the program while considering using a population-based approach to screen for diabetes
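The Methods of this abstract mention a Markov-based cost-effectiveness model comparing screening with no screening. The sketch below illustrates the general technique only: a three-state cohort model with discounting and an incremental cost-effectiveness ratio (cost per QALY gained). The states, transition probabilities, costs, utilities, and the 3% discount rate are hypothetical placeholders chosen for illustration; none of them are taken from the study, and the study's actual model structure is not described here.

```python
def step(dist, transition):
    """Advance the cohort distribution by one cycle (row vector times matrix)."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def run_cohort(transition, costs, utilities, start, cycles, discount=0.03):
    """Return discounted (cost, QALYs) per person over the given number of cycles."""
    dist = list(start)
    total_cost = 0.0
    total_qaly = 0.0
    for t in range(cycles):
        weight = 1.0 / (1.0 + discount) ** t
        total_cost += weight * sum(p * c for p, c in zip(dist, costs))
        total_qaly += weight * sum(p * u for p, u in zip(dist, utilities))
        dist = step(dist, transition)
    return total_cost, total_qaly


def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)


# Hypothetical states: well, complications, dead. All numbers are placeholders.
NO_SCREEN = [[0.88, 0.10, 0.02], [0.00, 0.90, 0.10], [0.00, 0.00, 1.00]]
SCREEN = [[0.90, 0.08, 0.02], [0.00, 0.90, 0.10], [0.00, 0.00, 1.00]]
UTILITIES = [0.85, 0.60, 0.00]
START = [1.0, 0.0, 0.0]

cost_none, qaly_none = run_cohort(NO_SCREEN, [100.0, 1000.0, 0.0], UTILITIES, START, 30)
cost_scr, qaly_scr = run_cohort(SCREEN, [150.0, 1000.0, 0.0], UTILITIES, START, 30)
ratio = icer(cost_scr, qaly_scr, cost_none, qaly_none)
```

In a model of this kind, the resulting ratio is what gets compared against a willingness-to-pay threshold, and sensitivity analysis amounts to rerunning the same calculation with varied inputs.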

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Lume 5.8

PubMed Central