Predicting Variant Pathogenicity with Machine Learning
There are roughly 22,000 protein-coding genes in the human genome, many of which play important roles in biological function. The proteins they encode fold in 3D space, and this folding is usually necessary for function. A genetic variant can disrupt the secondary structure of a protein (one aspect of its structure) or eliminate a site important for protein-protein interaction or post-translational modification. The resulting loss of function or deregulation can cause disease. Thus, there is great biomedical interest in identifying disease-causing single-nucleotide variants.
We hypothesize that we can accurately predict variant pathogenicity. We used machine learning to predict the pathogenicity of a set of 28,369 single-nucleotide variants across 10 genes. The data were taken from publicly available saturation mutagenesis data sets, which generate every possible amino acid substitution at every position in a protein. Our approach employs a support vector machine with linear, polynomial, and RBF kernel functions. The problem is framed as binary classification, where a label of 1 indicates a disease-causing variant and a label of 0 indicates a benign variant. The model predicts pathogenicity from amino acid, post-translational modification, and secondary structure information. We cleaned and analyzed the data with custom Python scripts. Our results show average balanced accuracy scores for classifying pathogenicity of approximately 57.9%, 60.3%, and 60.3% for the linear, polynomial, and RBF kernels, respectively. The model therefore outperforms random guessing but leaves room for improvement.
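The abstract does not say how balanced accuracy was computed. A minimal sketch, assuming the standard definition (the mean of recall on the pathogenic and benign classes), shows why 50% is the baseline for an uninformative classifier even when the classes are imbalanced. The function name and the toy labels below are hypothetical and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the pathogenic (1) and benign (0) classes.

    Assumes the standard definition of balanced accuracy; the study's
    own scoring code is not shown in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # pathogenic, called pathogenic
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, called benign
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / pos  # recall on label 1
    specificity = tn / neg  # recall on label 0
    return (sensitivity + specificity) / 2


# Hypothetical toy labels: 2 pathogenic variants, 8 benign variants.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts "benign" has 80% plain accuracy
# on these labels but only 50% balanced accuracy.
always_benign = [0] * len(y_true)
print(balanced_accuracy(y_true, always_benign))  # 0.5
```

On this scale, the reported scores of roughly 58% to 60% sit modestly above the 50% baseline, which is consistent with the abstract's conclusion.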
Systematic Assessment of Protein C-Termini Mutated in Human Disorders
All proteins have a carboxyl terminus, and we previously summarized eight mutations that disrupt binding and trafficking sequence determinants in the C-terminus and cause human diseases. These sequence elements for binding and trafficking sites, as well as post-translational modifications (PTMs), are called minimotifs or short linear motifs. We wanted to determine how frequently mutations in minimotifs in the C-terminus cause disease. We searched specifically for PTMs because mutation of a modified amino acid almost always changes the chemistry of the side chain and can be interpreted as loss-of-function. We analyzed data from ClinVar for disease variants, Minimotif Miner and the C-terminome for PTMs, and RefSeq for protein sequences, yielding 20 potential disease-causing variants. After additional screening, these include six with a previously reported PTM disruption mechanism and nine with new hypotheses for mutated minimotifs in C-termini that may cause disease. These mutations were generally in different genes, with four different PTM types and several different diseases. Our study helps to identify new molecular mechanisms for nine separate variants that cause disease, and this type of analysis could be extended as databases grow and applied to binding and trafficking motifs. We conclude that mutated motifs in C-termini are an infrequent cause of disease.
Chemical and structural defensive external strategies in six sabellid worms (Annelida)
In the marine environment, sessile invertebrates have developed an impressive
array of mechanisms to avoid predation, bacterial exploitation, and epibiotic
overgrowth. In the present study we investigated several defensive strategies
adopted by six sabellids: the hard bottom species Sabella spallanzanii (Gmelin,
1791), Branchiomma luctuosum Grube, 1869, Branchiomma bairdi (McIntosh,
1885), and Sabellastarte spectabilis (Grube, 1878), and the soft bottom species
Myxicola infundibulum (Renier, 1804), and Megalomma lanigera (Grube, 1846),
which have different morphological characteristics and geographical distribution.
We examined and compared some defensive features such as branchial
crown toughness, tube structure and strength, amount of released mucus, and
antibacterial lysozyme-activity in the mucus. The investigated species utilize a
combination of defence and deterrence strategies that seems to be related to
the colonized habitat. Tube strength was higher in the hard bottom species
compared with the soft bottom ones, where the tubes are generally buried and
protected within the sediment. The branchial crown appeared stronger and more resistant
in hard bottom species, except for S. spallanzanii, which is the species showing
the strongest tube. Sabella spallanzanii, M. infundibulum and S. spectabilis
secreted high amounts of mucus with high lysozyme-like activity. By contrast,
B. luctuosum, B. bairdi, and M. lanigera produced low amounts of mucus
exerting lower antibacterial activity.