10 research outputs found

    Insights into Protein Sequence and Structure-Derived Features Mediating 3D Domain Swapping Mechanism using Support Vector Machine Based Approach

    Three-dimensional (3D) domain swapping is a mechanism in which two or more protein molecules form higher-order oligomers by exchanging identical or similar subunits. Recently, this phenomenon has received much attention in the context of prions and neurodegenerative diseases because of its role in functional regulation, formation of higher-order oligomers, protein misfolding, and aggregation. While 3D domain swapping can be detected from three-dimensional structures, it remains a formidable challenge to derive common sequence or structural patterns from proteins involved in swapping. We have developed an SVM-based classifier to predict domain swapping events using a set of features derived from sequence and structural data. The SVM classifier was trained on features derived from 150 proteins reported to be involved in 3D domain swapping and 150 proteins not known to adopt a swapped conformation or to be related to proteins involved in the swapping phenomenon. Testing was performed using 63 proteins from the positive dataset and 63 proteins from the negative dataset. We obtained 76.33% accuracy from training and 73.81% accuracy from testing. Given the high diversity in the sequence, structure and functions of proteins involved in domain swapping, the availability of such an algorithm to predict swapping events from sequence and structure-derived features will be an initial step towards identifying more putative proteins that may be involved in swapping or in deposition diseases. Further, the top features emerging from our feature selection method may be analysed further to understand their roles in the mechanism of domain swapping.

    Homozygous deletion of exons 2 and 3 of NPC2 associated with Niemann–Pick disease type C

    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/134235/1/ajmga37794.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/134235/2/ajmga37794-sup-0001-SuppData-S1.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/134235/3/ajmga37794_am.pd

    At a glance: the largest Niemann-Pick type C1 cohort with 602 patients diagnosed over 15 years

    Niemann-Pick type C1 disease (NPC1 [OMIM 257220]) is a rare and severe autosomal recessive disorder, characterized by a multitude of neurovisceral clinical manifestations and a fatal outcome with no effective treatment to date. Aiming to gain insights into the genetic aspects of the disease, clinical, genetic, and biomarker PPCS data from 602 patients referred from 47 countries and diagnosed with NPC1 in our laboratory were analyzed. Patients’ clinical data were dissected using Human Phenotype Ontology (HPO) terms, and genotype–phenotype analysis was performed. The median age at diagnosis was 10.6 years (range 0–64.5 years), with 287 unique pathogenic/likely pathogenic (P/LP) variants identified, expanding NPC1 allelic heterogeneity. Importantly, 73 P/LP variants were previously unpublished. The most frequent variants detected were: c.3019C>G, p.(P1007A), c.3104C>T, p.(A1035V), and c.2861C>T, p.(S954L). Loss of function (LoF) variants were significantly associated with earlier age at diagnosis, highly increased biomarker levels, and a visceral phenotype (abnormal abdomen and liver morphology). On the other hand, the variants p.(P1007A) and p.(S954L) were significantly associated with later age at diagnosis (p < 0.001) and mildly elevated biomarker levels (p ≤ 0.002), consistent with the juvenile/adult form of NPC1. In addition, p.(I1061T), p.(S954L), and p.(A1035V) were associated with abnormality of eye movements (vertical supranuclear gaze palsy, p ≤ 0.05). We describe the largest and most heterogenous cohort of NPC1 patients published to date. Our results suggest that besides its utility in variant classification, the biomarker PPCS might serve to indicate disease severity/progression. In addition, we establish new genotype–phenotype relationships for “frequent” NPC1 variants.

    BLProt: prediction of bioluminescent proteins based on support vector machine and relieff feature selection

    Background Bioluminescence is a process in which light is emitted by a living organism. Most creatures that emit light are sea creatures, but some insects, plants, fungi, etc., also emit light. The biotechnological application of bioluminescence has become routine and is considered essential for many medical and general technological advances. Identification of bioluminescent proteins is challenging because of their poor sequence similarity. So far, no specific method has been reported to identify bioluminescent proteins from primary sequence. Results In this paper, we propose a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained using a dataset consisting of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated on an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches: ReliefF, infogain, and mRMR. We selected five different feature subsets by decreasing the number of features, and the performance of each feature subset was evaluated. Conclusion BLProt achieves 80% accuracy from training (5-fold cross-validation) and 80.06% accuracy from testing. The performance of BLProt was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggest that BLProt can be a useful approach to identify bioluminescent proteins from sequence information, irrespective of their sequence similarity. The BLProt software is available at http://www.inb.uni-luebeck.de/tools-demos/bioluminescent%20protein/BLProt

    The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance

    INTRODUCTION Investment in Africa over the past year with regard to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing has led to a massive increase in the number of sequences, which, to date, exceeds 100,000 sequences generated to track the pandemic on the continent. These sequences have profoundly affected how public health officials in Africa have navigated the COVID-19 pandemic. RATIONALE We demonstrate how the first 100,000 SARS-CoV-2 sequences from Africa have helped monitor the epidemic on the continent, how genomic surveillance expanded over the course of the pandemic, and how we adapted our sequencing methods to deal with an evolving virus. Finally, we also examine how viral lineages have spread across the continent in a phylogeographic framework to gain insights into the underlying temporal and spatial transmission dynamics for several variants of concern (VOCs). RESULTS Our results indicate that the number of countries in Africa that can sequence the virus within their own borders is growing and that this is coupled with a shorter turnaround time from the time of sampling to sequence submission. Ongoing evolution necessitated the continual updating of primer sets, and, as a result, eight primer sets were designed in tandem with viral evolution and used to ensure effective sequencing of the virus. The pandemic unfolded through multiple waves of infection that were each driven by distinct genetic lineages, with B.1-like ancestral strains associated with the first pandemic wave of infections in 2020. Successive waves on the continent were fueled by different VOCs, with Alpha and Beta cocirculating in distinct spatial patterns during the second wave and Delta and Omicron affecting the whole continent during the third and fourth waves, respectively. 
Phylogeographic reconstruction points toward distinct differences in viral importation and exportation patterns associated with the Alpha, Beta, Delta, and Omicron variants and subvariants, when considering both Africa versus the rest of the world and viral dissemination within the continent. Our epidemiological and phylogenetic inferences therefore underscore the heterogeneous nature of the pandemic on the continent and highlight key insights and challenges, for instance, recognizing the limitations of low testing proportions. We also highlight the early warning capacity that genomic surveillance in Africa has had for the rest of the world with the detection of new lineages and variants, the most recent being the characterization of various Omicron subvariants. CONCLUSION Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve. This is important not only to help combat SARS-CoV-2 on the continent but also because it can be used as a platform to help address the many emerging and reemerging infectious disease threats in Africa. In particular, capacity building for local sequencing within countries or within the continent should be prioritized because this is generally associated with shorter turnaround times, providing the most benefit to local public health authorities tasked with pandemic response and mitigation and allowing for the fastest reaction to localized outbreaks. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century

    Biallelic variants in the transcription factor PAX7 are a new genetic cause of myopathy

    Purpose Skeletal muscle growth and regeneration rely on muscle stem cells, called satellite cells. Specific transcription factors, particularly PAX7, are key regulators of the function of these cells. Knockout of this factor in mice leads to poor postnatal survival; however, the consequences of a lack of PAX7 in humans have not been established. Methods Here, we study five individuals with myopathy of variable severity from four unrelated consanguineous couples. Exome sequencing identified pathogenic variants in the PAX7 gene. Clinical examination, laboratory tests, and muscle biopsies were performed to characterize the disease. Results The disease was characterized by hypotonia, ptosis, muscular atrophy, scoliosis, and mildly dysmorphic facial features. The disease spectrum ranged from mild to severe and appears to be progressive. Muscle biopsies showed the presence of atrophic fibers and fibroadipose tissue replacement, with the absence of myofiber necrosis. A lack of PAX7 expression was associated with satellite cell pool exhaustion; however, the presence of residual myoblasts together with regenerating myofibers suggests that a population of PAX7-independent myogenic cells partially contributes to muscle regeneration. Conclusion These findings show that biallelic variants in the master transcription factor PAX7 cause a new type of myopathy that specifically affects satellite cell survival. German Bundesministerium für Bildung und Forschung through the Juniorverbund in der Systemmedizin “mitOmics” (FKZ01ZX1405C to T.B.H.) and Horizon2020 through the E-Rare project GENOMIT (01GM1603 and 01GM1207 for H.P. and FWFI2741B26 for J.A.M.) and the Deutsche Forschungsgemeinschaft (SCHO754/52 to L.S. and BA2427/22 to P.B.) as well as the Vereinigung zur Förderung Pädiatrischer Forschung und Fortbildung Salzburg, the EU FP7 Mitochondrial European Educational Training Project (317433 to H.P. and J.A.M.), and the EU Horizon2020 Collaborative Research Project SOUND (633974 to H.P.). N.A.D. is supported by grants from the Fonds de recherche du Québec–Santé (35015), Canadian Institutes of Health Research (388296), Rare Disease Foundation (2301), and CHU Sainte-Justine Foundation. N.A.D. acknowledges the support of ThéCell and Stem Cell Netwo
