
150 research outputs found

Improving the accuracy of template-based predictions by mixing and matching between initial models

Author: Michal Guerquin
Ram Samudrala
Tianyun Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

BACKGROUND: Comparative modeling is a technique to predict the three-dimensional structure of a given protein sequence based primarily on its alignment to one or more proteins with experimentally determined structures. A major bottleneck of current comparative modeling methods is the lack of methods to accurately refine a starting initial model so that it approaches the resolution of the corresponding experimental structure. We investigate the effectiveness of a graph-theoretic clique-finding approach to solve this problem. RESULTS: Our method takes into account the information presented in multiple templates/alignments at the three-dimensional level by mixing and matching regions between different initial comparative models. This method enables us to obtain an optimized conformation ensemble representing the best combination of secondary structures, resulting in refined models of higher quality. In addition, the process of mixing and matching accumulates near-native conformations, which makes it easier to discriminate the native-like conformation. In the seventh Critical Assessment of Structure Prediction (CASP7) experiment, the refined models produced are more accurate than the starting initial models. CONCLUSION: This novel approach can be applied without any manual intervention to improve the quality of comparative predictions where multiple template/alignment combinations are available for modeling, producing conformational models of higher quality than the starting initial predictions.

Crossref

Springer - Publisher Connector

PubMed Central

Identification of recurring protein structure microenvironments and discovery of novel functional sites around CYS residues

Author: Altman Russ B
Liu Tianyun
Wu Shirley
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The emergence of structural genomics presents significant challenges in the annotation of biologically uncharacterized proteins. Unfortunately, our ability to analyze these proteins is restricted by the limited catalog of known molecular functions and their associated 3D motifs. Results: In order to identify novel 3D motifs that may be associated with molecular functions, we employ an unsupervised, two-phase clustering approach that combines k-means and hierarchical clustering with knowledge-informed cluster selection and annotation methods. We applied the approach to approximately 20,000 cysteine-based protein microenvironments (3D regions 7.5 Å in radius) and identified 70 interesting clusters, some of which represent known motifs (e.g., metal binding and phosphatase activity), and some of which are novel, including several zinc binding sites. Detailed annotation results are available online for all 70 clusters at http://feature.stanford.edu/clustering/cys. Conclusions: The use of microenvironments instead of backbone geometric criteria enables flexible exploration of protein function space, and detection of recurring motifs that are discontinuous in sequence and diverse in structure. Clustering microenvironments may thus help to functionally characterize novel proteins and better understand the protein structure-function relationship.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Predicting drug side-effects by chemical systems biology

Author: Altman Russ B
Liu Tianyun
Tatonetti Nicholas P
Publication venue: BioMed Central
Publication date: 01/01/2009

Chemical systems biology approaches can explain unexpected observations of drug inefficacy or side-effects.

Crossref

PubMed Central

PROTINFO: new algorithms for enhanced protein structure predictions

Author: Hung Ling-Hong
Liu Tianyun
Ngan Shing-Chung
Samudrala Ram
Publication venue: Oxford University Press
Publication date: 01/01/2005

We describe new algorithms and modules for protein structure prediction available as part of the PROTINFO web server. The modules, comparative and de novo modelling, have significantly improved back-end algorithms that were rigorously evaluated at the sixth meeting on the Critical Assessment of Protein Structure Prediction methods. We were one of four server groups invited to make an oral presentation (only the best performing groups are asked to do so). These two modules allow a user to submit a protein sequence and return atomic coordinates representing the tertiary structure of that protein. The PROTINFO server is available at

CiteSeerX

Crossref

PubMed Central

FFT: Towards Harmlessness Evaluation and Analysis for LLMs with Factuality, Fairness, Toxicity

Author: Chen Yilong
Cui Shiyao
Liu Tianyun
Liu Tingwen
Wang Siqi
Zhang Wenyuan
Zhang Zhenyu
Publication date: 30/11/2023

The widespread use of generative artificial intelligence has heightened concerns about the potential harms posed by AI-generated texts, primarily stemming from factoid, unfair, and toxic content. Previous researchers have invested much effort in assessing the harmlessness of generative language models. However, existing benchmarks are struggling in the era of large language models (LLMs), due to their stronger language generation and instruction-following capabilities, as well as their wider applications. In this paper, we propose FFT, a new benchmark with 2116 elaborately designed instances, for LLM harmlessness evaluation with factuality, fairness, and toxicity. To investigate the potential harms of LLMs, we evaluate 9 representative LLMs covering various parameter scales, training stages, and creators. Experiments show that the harmlessness of LLMs is still unsatisfactory, and extensive analysis yields some insightful findings that could inspire future research on harmless LLMs. Comment: Work in progress

arXiv.org e-Print Archive