472 research outputs found
Multiomics modeling of the immunome, transcriptome, microbiome, proteome and metabolome adaptations during human pregnancy.
Motivation: Multiple biological clocks govern a healthy pregnancy. These biological mechanisms produce immunologic, metabolomic, proteomic, genomic and microbiomic adaptations during the course of pregnancy. Modeling the chronology of these adaptations during full-term pregnancy provides the frameworks for future studies examining deviations implicated in pregnancy-related pathologies including preterm birth and preeclampsia.
Results: We performed a multiomics analysis of 51 samples from 17 pregnant women, delivering at term. The datasets included measurements from the immunome, transcriptome, microbiome, proteome and metabolome of samples obtained simultaneously from the same patients. Multivariate predictive modeling using the Elastic Net (EN) algorithm was used to measure the ability of each dataset to predict gestational age. Using stacked generalization, these datasets were combined into a single model. This model not only significantly increased predictive power by combining all datasets, but also revealed novel interactions between different biological modalities. Future work includes expansion of the cohort to preterm-enriched populations and in vivo analysis of immune-modulating interventions based on the mechanisms identified.
Availability and implementation: Datasets and scripts for reproduction of results are available through: https://nalab.stanford.edu/multiomics-pregnancy/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Converting Your Thoughts to Texts: Enabling Brain Typing via Deep Feature Learning of EEG Signals
An electroencephalography (EEG) based Brain Computer Interface (BCI) enables
people to communicate with the outside world by interpreting the EEG signals of
their brains to interact with devices such as wheelchairs and intelligent
robots. More specifically, motor imagery EEG (MI-EEG), which reflects a
subject's active intent, is attracting increasing attention for a variety of BCI
applications. Accurate classification of MI-EEG signals, while essential for
effective operation of BCI systems, is challenging due to the significant noise
inherent in the signals and the lack of informative correlation between the
signals and brain activities. In this paper, we propose a novel deep neural
network based learning framework that affords perceptive insights into the
relationship between the MI-EEG data and brain activities. We design a joint
convolutional recurrent neural network that simultaneously learns robust
high-level feature representations through low-dimensional dense embeddings from
raw MI-EEG signals. We also employ an Autoencoder layer to eliminate various
artifacts such as background activities. The proposed approach has been
evaluated extensively on a large-scale public MI-EEG dataset and a limited but
easy-to-deploy dataset collected in our lab. The results show that our approach
outperforms a series of baselines and the competitive state-of-the-art
methods, yielding a classification accuracy of 95.53%. The applicability of our
proposed approach is further demonstrated with a practical BCI system for
typing.
Comment: 10 pages
UBI-XGB: IDENTIFICATION OF UBIQUITIN PROTEINS USING MACHINE LEARNING MODEL
Ubiquitination is a pervasive, proteasome-mediated protein degradation process that controls apoptosis and plays a major role in protein breakdown and in the development of cell disorders; protein turnover and ubiquitination are closely related. Recent developments in proteomic technology have sparked renewed interest in identifying ubiquitination sites in a number of human disorders, which have been studied experimentally and clinically. Because experimental identification of protein ubiquitination sites is typically labor- and time-intensive, it is vital to develop reliable computational predictors. In this paper, we propose Ubipro-XGBoost, a multi-view, feature-based machine learning method for predicting ubiquitination sites. First, we encode protein sequences into feature matrices using the Dipeptide Deviation from Expected Mean (DDE) encoding technique. We also propose a second feature extraction model based on dipeptide composition (DPC). These features are then fed into the extreme gradient boosting (XGBoost) classifier. As more experimentally verified ubiquitination sites become available, the resulting predictive algorithm can locate lysine ubiquitination sites in large-scale proteome data. On 5-fold cross-validation with the DPC model, Ubipro-XGBoost achieved an AUC (area under the Receiver Operating Characteristic curve)/accuracy of 0.914, a sensitivity of 0.836, a specificity of 0.992, and an MCC of 0.839; with the DDE model, it achieved an accuracy of 0.909, a sensitivity of 0.839, a specificity of 0.979, and an MCC of 0.829.
The findings demonstrate that Ubipro-XGBoost outperforms conventional ubiquitination prediction methods and offers fresh guidance for ubiquitination site identification.
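The dipeptide composition (DPC) encoding mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `dipeptide_composition`, the choice to skip pairs containing non-standard residues, and the example sequence are all assumptions made for illustration. It computes the relative frequency of each of the 400 possible amino-acid pairs in a sequence.

```python
from itertools import product

# The 20 standard amino acids; every ordered pair gives one of 400 dipeptides.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a dict mapping each dipeptide to its relative frequency.

    Hypothetical helper: pairs containing non-standard residues are skipped,
    and frequencies are normalised by the number of pairs actually counted.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return {dp: 0.0 for dp in DIPEPTIDES}
    return {dp: n / total for dp, n in counts.items()}


# Illustrative (made-up) peptide: overlapping pairs are MK, KK, KL, LA, AK.
features = dipeptide_composition("MKKLAK")
```

The resulting 400-value vector is the kind of fixed-length input a gradient boosting classifier such as XGBoost could consume, regardless of the original sequence length.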
Co-Attentive Cross-Modal Deep Learning for Medical Evidence Synthesis and Decision Making
Modern medicine requires generalised approaches to the synthesis and
integration of multimodal data, often at different biological scales, that can
be applied to a variety of evidence structures, such as complex disease
analyses and epidemiological models. However, current methods are either slow
and expensive, or ineffective due to the inability to model the complex
relationships between data modes which differ in scale and format. We address
these issues by proposing a cross-modal deep learning architecture and
co-attention mechanism to accurately model the relationships between the
different data modes, while further reducing patient diagnosis time.
Differentiating Parkinson's Disease (PD) patients from healthy patients forms
the basis of the evaluation. The model outperforms the previous
state-of-the-art unimodal analysis by 2.35%, while also being 53% more
parameter efficient than the industry standard cross-modal model. Furthermore,
the evaluation of the attention coefficients allows for qualitative insights to
be obtained. Through the coupling with bioinformatics, a novel link between the
interferon-gamma-mediated pathway, DNA methylation and PD was identified. We
believe that our approach is general and could optimise the process of medical
evidence synthesis and decision making in an actionable way
IDENTIFYING MOLECULAR FUNCTIONS OF DYNEIN MOTOR PROTEINS USING EXTREME GRADIENT BOOSTING ALGORITHM WITH MACHINE LEARNING
The active movement of most cytoplasmic proteins and vesicles is driven primarily by dynein motor proteins, which are also associated with muscle contraction. Moreover, understanding how dyneins are used in cells will rely on structural knowledge. Cytoskeletal motor proteins have different molecular roles and structures, and they belong to three superfamilies: kinesin, dynein and myosin. Loss of function of specific molecular motor proteins has been linked to a number of human diseases, such as Charcot-Marie-Tooth disease and kidney disease. It is crucial to create a precise model to identify dynein motor proteins in order to aid scientists in understanding their molecular role and designing therapeutic targets based on their influence on human disease. An accurate and efficient computational methodology is therefore highly desirable, especially one using cutting-edge machine learning methods. In this article, we propose a machine learning method, based on extreme gradient boosting (XGBoost), for predicting the superfamily of cytoskeletal motor proteins. We obtain the initial feature set by extracting protein features from the sequence and from evolutionary information on the amino acid residues, encoded with BLOSUM62. Our XGBoost model achieved an accuracy of 0.8676, a precision of 0.8768, a sensitivity of 0.760, a specificity of 0.9752, and an MCC of 0.7536. Our method has demonstrated substantial improvements on many of the evaluation parameters compared to other state-of-the-art methods. This study offers an effective model for the classification of dynein proteins and lays a foundation for further research to improve the efficiency of protein functional classification.
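For readers unfamiliar with the evaluation scores quoted above (accuracy, precision, sensitivity, specificity, MCC), the following sketch shows how they are commonly derived from a binary confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification scores from confusion counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Undefined ratios (zero denominators) are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall, true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    # Matthews correlation coefficient, which ranges from -1 to 1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Made-up counts, purely for illustration.
scores = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

Because all five scores are ratios (and MCC a correlation), they naturally fall on a 0 to 1 (or -1 to 1) scale rather than a percentage scale.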
Evaluating Mental Stress Among College Students Using Heart Rate and Hand Acceleration Data Collected from Wearable Sensors
Stress is associated with various mental health disorders, including depression and anxiety,
among college students. Early stress diagnosis and intervention may lower the
risk of developing mental illnesses. We examined a machine learning-based
method for identification of stress using data collected in a naturalistic
study utilizing self-reported stress as ground truth as well as physiological
data such as heart rate and hand acceleration. The study involved 54 college
students from a large campus who used wearable wrist-worn sensors and a mobile
health (mHealth) application continuously for 40 days. The app gathered
physiological data including heart rate and hand acceleration at one hertz
frequency. The application also enabled users to self-report stress by tapping
on the watch face, resulting in a time-stamped record of the self-reported
stress. We created, evaluated, and analyzed machine learning algorithms for
identifying stress episodes among college students using heart rate and
accelerometer data. The XGBoost method was the most reliable model with an AUC
of 0.64 and an accuracy of 84.5%. The standard deviation of hand acceleration,
standard deviation of heart rate, and the minimum heart rate were the most
important features for stress detection. This evidence may support the efficacy
of identifying patterns in physiological reaction to stress using smartwatch
sensors and may inform the design of future tools for real-time detection of
stress
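The three features the abstract reports as most important (standard deviation of hand acceleration, standard deviation of heart rate, and minimum heart rate) can be sketched as simple per-window statistics. The window length, the step, the use of population standard deviation, and the helper names below are assumptions for illustration only; the abstract does not specify how the study segmented its 1 Hz signals.

```python
from statistics import pstdev


def sliding_windows(samples, size, step):
    """Yield consecutive windows of `size` samples, advancing by `step`."""
    for start in range(0, len(samples) - size + 1, step):
        yield samples[start:start + size]


def window_features(heart_rate, accel_magnitude):
    """Summarise one window of 1 Hz samples (hypothetical feature names)."""
    return {
        "hr_std": pstdev(heart_rate),        # variability of heart rate
        "hr_min": min(heart_rate),           # minimum heart rate in the window
        "acc_std": pstdev(accel_magnitude),  # variability of hand acceleration
    }


# Made-up readings: four seconds of heart rate (bpm) and acceleration magnitude.
example = window_features([60, 62, 64, 66], [1.0, 1.0, 1.0, 1.0])
windows = list(sliding_windows(list(range(10)), size=4, step=2))
```

Each feature dictionary would form one row of a training table, with the self-reported stress taps supplying the labels.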
An explainable model of host genetic interactions linked to COVID-19 severity
We employed a multifaceted computational strategy to identify the genetic factors contributing to increased risk of severe COVID-19 infection from a Whole Exome Sequencing (WES) dataset of a cohort of 2000 Italian patients. We coupled a stratified k-fold screening, to rank variants more associated with severity, with the training of multiple supervised classifiers, to predict severity based on screened features. Feature importance analysis from tree-based models allowed us to identify 16 variants with the highest support which, together with age and gender covariates, were found to be most predictive of COVID-19 severity. When tested on a follow-up cohort, our ensemble of models predicted severity with high accuracy (ACC = 81.88%; AUCROC = 96%; MCC = 61.55%). Our model recapitulated a vast literature of emerging molecular mechanisms and genetic factors linked to COVID-19 response and extends previous landmark Genome-Wide Association Studies (GWAS). It revealed a network of interplaying genetic signatures converging on established immune system and inflammatory processes linked to viral infection response. It also identified additional processes cross-talking with immune pathways, such as GPCR signaling, which might offer additional opportunities for therapeutic intervention and patient stratification. Publicly available PheWAS datasets revealed that several variants were significantly associated with phenotypic traits such as "Respiratory or thoracic disease", supporting their link with COVID-19 severity outcome.
A multifaceted computational strategy identifies 16 genetic variants contributing to increased risk of severe COVID-19 infection from a Whole Exome Sequencing dataset of a cohort of Italian patients.
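The stratified k-fold idea mentioned in the abstract can be sketched in a few lines. This shows only the splitting step (each fold preserves the class proportions), not the variant ranking or classifier training, whose details the abstract does not give. The function name `stratified_k_fold`, the seed, and the toy labels are assumptions.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs with class-balanced folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled indices round-robin across the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, test


# Toy labels: 10 "severe" (1) and 20 "non-severe" (0) samples.
labels = [1] * 10 + [0] * 20
splits = list(stratified_k_fold(labels, k=5))
```

In a screening setting, a per-variant association score could be computed on each training split and variants ranked by how consistently they score well across folds; that ranking rule is an assumption here, not a description of the study's procedure.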