Insights into Protein Sequence and Structure-Derived Features Mediating 3D Domain Swapping Mechanism using Support Vector Machine Based Approach
3-dimensional domain swapping is a mechanism in which two or more protein molecules form higher-order oligomers by exchanging identical or similar subunits. Recently, this phenomenon has received much attention in the context of prions and neurodegenerative diseases, owing to its role in functional regulation, formation of higher-order oligomers, protein misfolding, aggregation, etc. While 3-dimensional domain swapping can be detected from three-dimensional structures, it remains a formidable challenge to derive common sequence or structural patterns from proteins involved in swapping. We have developed an SVM-based classifier to predict domain swapping events using a set of features derived from sequence and structural data. The SVM classifier was trained on features derived from 150 proteins reported to be involved in 3D domain swapping and 150 proteins not known to adopt a swapped conformation or to be related to proteins involved in the swapping phenomenon. Testing was performed using 63 proteins from the positive dataset and 63 proteins from the negative dataset. We obtained 76.33% accuracy from training and 73.81% accuracy from testing. Given the high diversity in the sequence, structure, and function of proteins involved in domain swapping, the availability of such an algorithm to predict swapping events from sequence and structure-derived features is an initial step towards identifying more putative proteins that may be involved in swapping or in deposition diseases. Further, the top features emerging from our feature selection method may be analysed to understand their roles in the mechanism of domain swapping.
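As a quick consistency check on the figures above, the reported accuracies can be converted back into approximate counts of correctly classified proteins. The sketch below assumes that each accuracy is a simple fraction of correct predictions over the stated set sizes (300 training proteins, 126 test proteins). The abstract does not report these counts, so they are inferred rather than quoted, and the helper names are illustrative.

```python
def implied_correct(accuracy_pct, n_total):
    """Whole number of correct predictions closest to a reported accuracy."""
    return round(accuracy_pct / 100 * n_total)


def accuracy_pct(n_correct, n_total):
    """Accuracy as a percentage of correct predictions."""
    return 100 * n_correct / n_total


# Set sizes taken from the abstract: 150 + 150 for training, 63 + 63 for testing.
train_total = 150 + 150
test_total = 63 + 63

# Inferred counts (assumption: accuracy = correct / total on each set).
train_correct = implied_correct(76.33, train_total)
test_correct = implied_correct(73.81, test_total)

print(train_correct, round(accuracy_pct(train_correct, train_total), 2))  # 229 76.33
print(test_correct, round(accuracy_pct(test_correct, test_total), 2))     # 93 73.81
```

Under that assumption, both reported percentages correspond to whole numbers of proteins (229 of 300 and 93 of 126), which is what one would expect if the accuracies were computed directly on these sets.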
Homozygous deletion of exons 2 and 3 of NPC2 associated with Niemann–Pick disease type C
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/134235/1/ajmga37794.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/134235/2/ajmga37794-sup-0001-SuppData-S1.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/134235/3/ajmga37794_am.pd
At a glance: the largest Niemann-Pick type C1 cohort with 602 patients diagnosed over 15 years
Niemann-Pick type C1 disease (NPC1 [OMIM 257220]) is a rare and severe autosomal recessive disorder, characterized by a multitude of neurovisceral clinical manifestations and a fatal outcome with no effective treatment to date. Aiming to gain insights into the genetic aspects of the disease, clinical, genetic, and biomarker PPCS data from 602 patients referred from 47 countries and diagnosed with NPC1 in our laboratory were analyzed. Patients’ clinical data were dissected using Human Phenotype Ontology (HPO) terms, and genotype–phenotype analysis was performed. The median age at diagnosis was 10.6 years (range 0–64.5 years), with 287 unique pathogenic/likely pathogenic (P/LP) variants identified, expanding NPC1 allelic heterogeneity. Importantly, 73 P/LP variants were previously unpublished. The most frequent variants detected were: c.3019C > G, p.(P1007A), c.3104C > T, p.(A1035V), and c.2861C > T, p.(S954L). Loss of function (LoF) variants were significantly associated with earlier age at diagnosis, highly increased biomarker levels, and a visceral phenotype (abnormal abdomen and liver morphology). On the other hand, the variants p.(P1007A) and p.(S954L) were significantly associated with later age at diagnosis (p < 0.001) and mildly elevated biomarker levels (p ≤ 0.002), consistent with the juvenile/adult form of NPC1. In addition, p.(I1061T), p.(S954L), and p.(A1035V) were associated with abnormality of eye movements (vertical supranuclear gaze palsy, p ≤ 0.05). We describe the largest and most heterogeneous cohort of NPC1 patients published to date. Our results suggest that besides its utility in variant classification, the biomarker PPCS might serve to indicate disease severity/progression. In addition, we establish new genotype–phenotype relationships for “frequent” NPC1 variants.
BLProt: prediction of bioluminescent proteins based on support vector machine and relieff feature selection
Background
Bioluminescence is a process in which light is emitted by a living organism. Most creatures that emit light are sea creatures, but some insects, plants, fungi, etc., also emit light. The biotechnological application of bioluminescence has become routine and is considered essential for many medical and general technological advances. Identification of bioluminescent proteins is challenging due to their poor similarity in sequence. So far, no specific method has been reported to identify bioluminescent proteins from primary sequence.
Results
In this paper, we propose a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained using a dataset consisting of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated on an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches: ReliefF, infogain, and mRMR. We selected five different feature subsets by decreasing the number of features, and the performance of each feature subset was evaluated.
Conclusion
BLProt achieves 80% accuracy from training (5-fold cross-validation) and 80.06% accuracy from testing. The performance of BLProt was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggest that BLProt can be a useful approach to identify bioluminescent proteins from sequence information, irrespective of their sequence similarity. The BLProt software is available at http://www.inb.uni-luebeck.de/tools-demos/bioluminescent%20protein/BLProt
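The 5-fold cross-validation mentioned for the BLProt training set can be illustrated with a small index-splitting sketch. It assumes a plain shuffled k-fold split over the 600 training proteins (300 + 300). The abstract does not say whether the folds were shuffled or stratified by class, so the function name, the seed, and the shuffling step are illustrative assumptions rather than the authors' procedure.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle so that each fold mixes both classes (assumed, not stated in the abstract).
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample when k does not divide n_samples.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Training set size taken from the abstract: 300 positive + 300 negative proteins.
n_proteins = 300 + 300
folds = list(k_fold_indices(n_proteins, k=5))

for i, (train, test) in enumerate(folds, start=1):
    print(f"fold {i}: train={len(train)} test={len(test)}")
```

With 600 proteins and five folds, each round trains on 480 proteins and validates on the remaining 120, and every protein is held out exactly once. A cross-validated accuracy is then the average (or pooled) accuracy over the five held-out folds.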
The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance
Investment in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing in Africa over the past year has led to a major increase in the number of sequences that have been generated and used to track the pandemic on the continent, a number that now exceeds 100,000 genomes. Our results show an increase in the number of African countries that are able to sequence domestically and highlight that local sequencing enables faster turnaround times and more-regular routine surveillance. Despite limitations of low testing proportions, findings from this genomic surveillance study underscore the heterogeneous nature of the pandemic and illuminate the distinct dispersal dynamics of variants of concern (particularly Alpha, Beta, Delta, and Omicron) on the continent. Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve while the continent faces many emerging and reemerging infectious disease threats. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century.
The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance
INTRODUCTION
Investment in Africa over the past year with regard to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing has led to a massive increase in the number of sequences, which, to date, exceeds 100,000 sequences generated to track the pandemic on the continent. These sequences have profoundly affected how public health officials in Africa have navigated the COVID-19 pandemic.
RATIONALE
We demonstrate how the first 100,000 SARS-CoV-2 sequences from Africa have helped monitor the epidemic on the continent, how genomic surveillance expanded over the course of the pandemic, and how we adapted our sequencing methods to deal with an evolving virus. Finally, we also examine how viral lineages have spread across the continent in a phylogeographic framework to gain insights into the underlying temporal and spatial transmission dynamics for several variants of concern (VOCs).
RESULTS
Our results indicate that the number of countries in Africa that can sequence the virus within their own borders is growing and that this is coupled with a shorter turnaround time from the time of sampling to sequence submission. Ongoing evolution necessitated the continual updating of primer sets, and, as a result, eight primer sets were designed in tandem with viral evolution and used to ensure effective sequencing of the virus. The pandemic unfolded through multiple waves of infection that were each driven by distinct genetic lineages, with B.1-like ancestral strains associated with the first pandemic wave of infections in 2020. Successive waves on the continent were fueled by different VOCs, with Alpha and Beta cocirculating in distinct spatial patterns during the second wave and Delta and Omicron affecting the whole continent during the third and fourth waves, respectively. Phylogeographic reconstruction points toward distinct differences in viral importation and exportation patterns associated with the Alpha, Beta, Delta, and Omicron variants and subvariants, when considering both Africa versus the rest of the world and viral dissemination within the continent. Our epidemiological and phylogenetic inferences therefore underscore the heterogeneous nature of the pandemic on the continent and highlight key insights and challenges, for instance, recognizing the limitations of low testing proportions. We also highlight the early warning capacity that genomic surveillance in Africa has had for the rest of the world with the detection of new lineages and variants, the most recent being the characterization of various Omicron subvariants.
CONCLUSION
Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve. This is important not only to help combat SARS-CoV-2 on the continent but also because it can be used as a platform to help address the many emerging and reemerging infectious disease threats in Africa. In particular, capacity building for local sequencing within countries or within the continent should be prioritized because this is generally associated with shorter turnaround times, providing the most benefit to local public health authorities tasked with pandemic response and mitigation and allowing for the fastest reaction to localized outbreaks. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century.
