56 research outputs found

Ab initio modeling of small proteins by iterative TASSER simulations

Author: Skolnick Jeffrey
Wu Sitao
Zhang Yang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/01/2014

Background: Predicting three-dimensional protein structures from amino-acid sequences is an important unsolved problem in computational structural biology. The problem becomes easier if close homologous proteins have been solved, as high-resolution models can be built by aligning target sequences to the solved homologous structures. However, for sequences without similar folds in the Protein Data Bank (PDB) library, the models have to be predicted from scratch. Progress in ab initio structure modeling is slow. The aim of this study was to extend the TASSER (threading/assembly/refinement) method to ab initio modeling and to examine systematically its ability to fold small single-domain proteins. Results: We developed I-TASSER by iteratively implementing the TASSER method, and used it in folding tests on three benchmarks of small proteins. First, data on 16 small proteins (< 90 residues) were used to generate I-TASSER models, which had an average Cα-root mean square deviation (RMSD) of 3.8Å, with 6 of them having a Cα-RMSD < 2.5Å. The overall result was comparable with the all-atomic ROSETTA simulation, but the central processing unit (CPU) time required by I-TASSER was much shorter (5 CPU hours vs. 150 CPU days for ROSETTA). Second, data on 20 small proteins (< 120 residues) were used. I-TASSER folded four of them with a Cα-RMSD < 2.5Å. The average Cα-RMSD of the I-TASSER models was 3.9Å, whereas it was 5.9Å using TOUCHSTONE-II software. Finally, 20 non-homologous small proteins (< 120 residues) were taken from the PDB library. An average Cα-RMSD of 3.9Å was obtained for the third benchmark, with seven cases having a Cα-RMSD < 2.5Å. Conclusion: Our simulation results show that I-TASSER can consistently predict the correct folds and sometimes high-resolution models for small single-domain proteins. Compared with other ab initio modeling methods such as ROSETTA and TOUCHSTONE-II, the average performance of I-TASSER is either much better or similar with a lower computational time. These data, together with the significant performance of the automated I-TASSER server (the Zhang-Server) in the 'free modeling' section of the recent Critical Assessment of Structure Prediction (CASP)7 experiment, demonstrate new progress in automated ab initio model generation. The I-TASSER server is freely available for academic users at http://zhang.bioinformatics.ku.edu/I-TASSER
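
The Cα-RMSD figures quoted in this abstract measure the average distance between matching Cα atoms of a model and the native structure. The following is a minimal sketch of that metric, not the authors' code. It assumes the two structures have already been optimally superimposed (in practice a superposition step such as the Kabsch algorithm comes first), and the function name `ca_rmsd` and the coordinates are hypothetical.

```python
import math


def ca_rmsd(model, native):
    """RMSD (in Å) between paired C-alpha coordinates.

    Assumption: both structures are already superimposed; no rotation or
    translation is fitted here.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (mx - nx) ** 2 + (my - ny) ** 2 + (mz - nz) ** 2
        for (mx, my, mz), (nx, ny, nz) in zip(model, native)
    )
    return math.sqrt(squared / len(model))


# Hypothetical three-residue example: the model is shifted by 1 Å along y.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(ca_rmsd(model, native))  # 1.0
```

Without the superposition step the value is only an upper bound on the RMSD usually reported, so the sketch is meaningful only for pre-aligned coordinates.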

KU ScholarWorks (Univ. of Kansas)

Learning Emotion Representations from Verbal and Nonverbal Communication

Author: Pan Yimu
Wang James Z.
Zhang Sitao
Publication date: 22/05/2023

Emotion understanding is an essential but highly challenging component of artificial general intelligence. The absence of extensively annotated datasets has significantly impeded advancements in this field. We present EmotionCLIP, the first pre-training paradigm to extract visual emotion representations from verbal and nonverbal communication using only uncurated data. Compared to numerical labels or descriptions used in previous methods, communication naturally contains emotion information. Furthermore, acquiring emotion representations from communication is more congruent with the human learning process. We guide EmotionCLIP to attend to nonverbal emotion cues through subject-aware context encoding and to verbal emotion cues using sentiment-guided contrastive learning. Extensive experiments validate the effectiveness and transferability of EmotionCLIP. Using merely a linear-probe evaluation protocol, EmotionCLIP outperforms the state-of-the-art supervised visual emotion recognition methods and rivals many multimodal approaches across various benchmarks. We anticipate that the advent of EmotionCLIP will address the prevailing issue of data scarcity in emotion understanding, thereby fostering progress in related domains. The code and pre-trained models are available at https://github.com/Xeaver/EmotionCLIP. Comment: CVPR 202

arXiv.org e-Print Archive

MicroNAS: Zero-Shot Neural Architecture Search for MCUs

Author: Huang Sitao
Qiao Ye
Xu Haocheng
Zhang Yifan
Publication date: 17/01/2024

Neural Architecture Search (NAS) effectively discovers new Convolutional Neural Network (CNN) architectures, particularly for accuracy optimization. However, prior approaches often require resource-intensive training on super networks or extensive architecture evaluations, limiting practical applications. To address these challenges, we propose MicroNAS, a hardware-aware zero-shot NAS framework designed for microcontroller units (MCUs) in edge computing. MicroNAS considers target hardware optimality during the search, utilizing specialized performance indicators to identify optimal neural architectures without high computational costs. Compared to previous works, MicroNAS achieves up to 1104x improvement in search efficiency and discovers models with over 3.23x faster MCU inference while maintaining similar accurac

arXiv.org e-Print Archive

CARMA: Context-Aware Runtime Reconfiguration for Energy-Efficient Sensor Fusion

Author: Faruque Mohammad Abdullah Al
Huang Sitao
Li Yuhui
Malawade Arnav Vaibhav
Seong DongHwan
Zhang Xiaofang
Zhang Yifan
Publication date: 27/06/2023

Autonomous systems (AS) are systems that can adapt and change their behavior in response to unanticipated events and include systems such as aerial drones, autonomous vehicles, and ground/aquatic robots. AS require a wide array of sensors, deep-learning models, and powerful hardware platforms to perceive and safely operate in real-time. However, in many contexts, some sensing modalities negatively impact perception while increasing the system's overall energy consumption. Since AS are often energy-constrained edge devices, energy-efficient sensor fusion methods have been proposed. However, existing methods either fail to adapt to changing scenario conditions or to optimize energy efficiency system-wide. We propose CARMA: a context-aware sensor fusion approach that uses context to dynamically reconfigure the computation flow on a Field-Programmable Gate Array (FPGA) at runtime. By clock-gating unused sensors and model sub-components, CARMA significantly reduces the energy used by a multi-sensory object detector without compromising performance. We use a Deep-learning Processor Unit (DPU) based reconfiguration approach to minimize the latency of model reconfiguration. We evaluate multiple context-identification strategies, propose a novel system-wide energy-performance joint optimization, and evaluate scenario-specific perception performance. Across challenging real-world sensing contexts, CARMA outperforms state-of-the-art methods with up to 1.3x speedup and 73% lower energy consumption. Comment: Accepted to be published in the 2023 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED 2023

arXiv.org e-Print Archive

Cockayne Syndrome Linked to Elevated R-Loops Induced by Stalled RNA Polymerase II during Transcription Elongation

Author: Fu Xiang-Dong
Hao Yajing
Hu Jing
Qian Hao
Wang Dong
Xu Jun
Zhang Dongyang
Zhang Sitao
Zhang Xuan
Publication venue: eScholarship, University of California
Publication date: 01/01/2024

Mutations in the Cockayne Syndrome group B (CSB) gene cause cancer in mice, but premature aging and severe neurodevelopmental defects in humans. CSB, a member of the SWI/SNF family of chromatin remodelers, plays diverse roles in regulating gene expression and transcription-coupled nucleotide excision repair (TC-NER); however, these functions do not explain the distinct phenotypic differences observed between CSB-deficient mice and humans. While investigating Cockayne Syndrome-associated genome instability, we uncover an intrinsic mechanism in which elongating RNA polymerase II (RNAPII) pauses transiently at internal T-runs, where CSB is required to propel RNAPII forward. Consequently, CSB deficiency retards RNAPII elongation in these regions and, when coupled with G-rich sequences upstream, exacerbates genome instability by promoting R-loop formation. These R-loop prone motifs are notably abundant in relatively long genes related to neuronal functions in the human genome, but less prevalent in the mouse genome. These findings provide mechanistic insights into the differential impacts of CSB deficiency on mice versus humans and suggest that the manifestation of the Cockayne Syndrome phenotype in humans results from the progressive evolution of mammalian genomes

Directory of Open Access Journals

eScholarship - University of California

LOMETS: A local meta-threading-server for protein structure prediction

Author: Sitao Wu
Yang Zhang
Publication venue: Oxford University Press
Publication date: 17/04/2014

We developed LOMETS, a local threading meta-server, for quick and automated predictions of protein tertiary structures and spatial constraints. Nine state-of-the-art threading programs are installed and run in a local computer cluster, which ensures quicker generation of initial threading alignments than traditional remote-server-based meta-servers. Consensus models are generated from the top predictions of the component threading servers, and are at least 7% more accurate than the best individual servers based on TM-score at a t-test significance level of 0.1%. Moreover, side-chain and C-alpha (Cα) contacts of 42 and 61% accuracy respectively, as well as long- and short-range distance maps, are automatically constructed from the threading alignments. These data can be easily used as constraints to guide ab initio procedures such as TASSER for further protein tertiary structure modeling. The LOMETS server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/LOMETS

Crossref

KU ScholarWorks (Univ. of Kansas)

PubMed Central

Identification and on-site application of the main hazard-causing stratum of overlying strata in coal mines

Author: Bing WANG
Fuxing JIANG
Jinhai LIU
Junfeng PAN
Sitao ZHU
Xiufeng ZHANG
Yang CHEN
Yongtao GAO
Yuzhen MA
Publication venue: Editorial Office of Journal of China Coal Society
Publication date: 01/06/2024

In response to the challenging task of accurately identifying the main hazard-causing layer of overlying strata during surface hydraulic fracturing in coal mines, this study focuses on the industrial test of ground hydraulic fracturing at the 401102 working face of the Mengcun Coal Mine. Theoretical analysis, microseismic monitoring, and on-site investigation are used to reveal the dynamic disaster mechanism of mine earthquakes and rock bursts induced by the movement of thick and hard overlying strata in coal mines. The relationship between the movement characteristics of thick and hard overlying strata, based on a three-zone structure loading model of overlying strata, and the induced dynamic disasters is analyzed, and a prediction model for mining seismic energy and an estimation model for equivalent additional stress in mining areas, both based on the movement state of key layers, are established. An identification technology for the main hazard-causing layer of overlying strata in coal mines is proposed based on the K-means clustering algorithm and the elbow rule. The construction layer for hydraulic fracturing is determined, and an industrial test is carried out on-site. The effectiveness is verified based on the microseismic monitoring data and theoretical analysis results, leading to the following conclusions. In the Mengcun Coal Mine's 401102 working face, both rock bursts and mine seismic activity can be traced to the R9 key stratum of the Anding Group, situated 66 meters away from the coal seam. The primary fracturing movement of key stratum R9 imparts an equivalent additional disturbance stress of 7.23 MPa and releases seismic energy of 6.08×10^5 J, indicating a pronounced disaster risk. After fracturing the key layer which induces mining earthquakes and rock bursts, the theoretical value of the mine earthquake energy is reduced by 94%, and the theoretical value of the equivalent disturbance stress of the working face is reduced by 76%. High-energy microseismic events above the working face with an energy of 5×10^3 J show a noticeable upward trend, with an upward movement of approximately 15 m. The frequency ratio of microseismic events with an energy level of 10^3 J or higher significantly decreases from 60.39% to 17.89%, and the maximum microseismic event energy decreases from 6.65×10^5 J to 9.75×10^3 J. The proportion of microseismic events with an energy level of 10^2 J and below significantly increases from 39.61% to 82.11%
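
The abstract names K-means clustering with the elbow rule but does not say which features are clustered or how the elbow is located. The sketch below is therefore only an illustration under stated assumptions: the data are hypothetical one-dimensional values, the function names `kmeans_1d` and `elbow_k` are invented here, and the elbow is taken as the first k at which adding another cluster reduces the within-cluster sum of squares (WCSS) by less than a chosen fraction (`min_gain`, an assumed threshold).

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on 1-D data; returns (centroids, wcss)."""
    data = sorted(values)
    n = len(data)
    # Deterministic initialisation: spread the starting centroids evenly
    # over the sorted data (an illustrative choice, not from the source).
    centroids = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: (v - centroids[j]) ** 2)
            clusters[nearest].append(v)
        # Empty clusters keep their previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    wcss = sum(min((v - c) ** 2 for c in centroids) for v in data)
    return centroids, wcss


def elbow_k(values, k_max, min_gain=0.5):
    """Smallest k for which going to k + 1 cuts WCSS by less than min_gain."""
    previous = kmeans_1d(values, 1)[1]
    for k in range(1, k_max):
        current = kmeans_1d(values, k + 1)[1]
        if previous == 0 or (previous - current) / previous < min_gain:
            return k
        previous = current
    return k_max


# Hypothetical measurements forming three well-separated groups.
samples = [10, 11, 12, 50, 51, 52, 90, 91, 92]
best_k = elbow_k(samples, k_max=5)
print(best_k, kmeans_1d(samples, best_k)[0])
```

A relative-gain threshold is only one of several ways to automate the elbow rule; the result depends on `min_gain`, so the curve of WCSS against k is usually inspected as well.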

Directory of Open Access Journals

ANGLOR: A Composite Machine-Learning Algorithm for Protein Backbone Torsion Angle Prediction

Author: Sitao Wu
Yang Zhang
Publication venue: Public Library of Science
Publication date: 15/10/2008

We developed a composite machine-learning based algorithm, called ANGLOR, to predict real-value protein backbone torsion angles from amino acid sequences. The input features of ANGLOR include sequence profiles, predicted secondary structure and solvent accessibility. In a large-scale benchmarking test, the mean absolute error (MAE) of the phi/psi prediction is 28°/46°, which is ∼10% lower than that generated by software in the literature. The prediction is statistically different from a random predictor (or a purely secondary-structure-based predictor) with p-value <1.0×10^−300 (or <1.0×10^−148) by Wilcoxon signed rank test. For some residues (ILE, LEU, PRO and VAL) and especially the residues in helix and buried regions, the MAE of phi angles is much smaller (10–20°) than that in other environments. Thus, although the average accuracy of the ANGLOR prediction is still low, the portion of the accurately predicted dihedral angles may be useful in assisting protein fold recognition and ab initio 3D structure modeling
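
The mean absolute error quoted for phi/psi can be illustrated with a short sketch. It assumes that angular differences are wrapped to the range 0° to 180°, so that 170° and −170° count as 20° apart; the abstract does not state how periodicity is handled, and the function names and sample angles below are hypothetical.

```python
def angular_error(pred_deg, true_deg):
    """Absolute difference between two angles in degrees, wrapped to [0, 180]."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


def torsion_mae(predicted, observed):
    """Mean absolute error over paired torsion angles (degrees)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("angle lists must be non-empty and of equal length")
    errors = [angular_error(p, t) for p, t in zip(predicted, observed)]
    return sum(errors) / len(errors)


# Hypothetical phi angles: errors of 10 and 20 degrees give an MAE of 15.
predicted_phi = [-60.0, 170.0]
observed_phi = [-50.0, -170.0]
print(torsion_mae(predicted_phi, observed_phi))  # 15.0
```

Without the wrap-around step the second pair would be scored as a 340° error, which would inflate the MAE for angles near ±180°.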

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

KU ScholarWorks (Univ. of Kansas)

PubMed Central