25 research outputs found

    The reference human nuclear mitochondrial sequences compilation validated and implemented on the UCSC genome browser

    Background: Eukaryotic nuclear genomes contain fragments of mitochondrial DNA called NumtS (Nuclear mitochondrial Sequences), whose mode and time of insertion, as well as their functional/structural role within the genome, are debated issues. Insertion sites match with chromosomal breaks, revealing that micro-deletions usually occurring at non-homologous end joining loci become reduced in the presence of NumtS. Some NumtS are involved in recombination events leading to fragment duplication. Moreover, NumtS are polymorphic, a feature that renders them candidates as population markers. Finally, they are a cause of contamination during human mtDNA sequencing, leading to the generation of false heteroplasmies. Results: Here we present RHNumtS.2, the most exhaustive human NumtSome catalogue annotating 585 NumtS, 97% of which were here validated in a European individual and in HapMap samples. The NumtS complete dataset and related features have been made available at the UCSC Genome Browser. The produced sequences have been submitted to INSDC databases. The implementation of the RHNumtS.2 tracks within the UCSC Genome Browser has been carried out with the aim to facilitate browsing of the NumtS tracks to be exploited in a wide range of research applications. Conclusions: We aimed at providing the scientific community with the most exhaustive overview on the human NumtSome, a resource whose aim is to support several research applications, such as studies concerning human structural variation, diversity, and disease, as well as the detection of false heteroplasmic mtDNA variants. Upon implementation of the NumtS tracks, the application of the BLAT program on the UCSC Genome Browser has now become an additional tool to check for heteroplasmic artefacts, supported by data available through the NumtS tracks.

    Genome Digging: Insight into the Mitochondrial Genome of Homo

    A fraction of the Neanderthal mitochondrial genome sequence has a similarity with a 5,839-bp nuclear DNA sequence of mitochondrial origin (numt) on the human chromosome 1. This fact has never been interpreted. Although this phenomenon may be attributed to contamination and mosaic assembly of Neanderthal mtDNA from short sequencing reads, we explain the mysterious similarity by integration of this numt (mtAncestor-1) into the nuclear genome of the common ancestor of Neanderthals and modern humans not long before their reproductive split. Exploiting bioinformatics, we uncovered an additional numt (mtAncestor-2) with a high similarity to the Neanderthal mtDNA and indicated that both numts represent almost identical replicas of the mtDNA sequences ancestral to the mitochondrial genomes of Neanderthals and modern humans. In the proteins, encoded by mtDNA, the majority of amino acids distinguishing chimpanzees from humans and Neanderthals were acquired by the ancestral hominins. The overall rate of nonsynonymous evolution in Neanderthal mitochondrial protein-coding genes is not higher than in other lineages. The model incorporating the ancestral hominin mtDNA sequences estimates the average divergence age of the mtDNAs of Neanderthals and modern humans to be 450,000-485,000 years. The mtAncestor-1 and mtAncestor-2 sequences were incorporated into the nuclear genome approximately 620,000 years and 2,885,000 years ago, respectively. This study provides the first insight into the evolution of the mitochondrial DNA in hominins ancestral to Neanderthals and humans. We hypothesize that mtAncestor-1 and mtAncestor-2 are likely to be molecular fossils of the mtDNAs of Homo heidelbergensis and a stem Homo lineage. The d(N)/d(S) dynamics suggests that the effective population size of extinct hominins was low. However, the hominin lineage ancestral to humans, Neanderthals and H. heidelbergensis, had a larger effective population size and possessed genetic diversity comparable with those of chimpanzee and gorilla

    A data mining approach to retrieve mitochondrial variability data associated to clinical phenotypes

    The maintenance of biological databases is at present a problem of great interest, since the progress made in many experimental procedures has led to an ever-increasing amount of data. These data need to be structured and stored in databases and made accessible to the biological community in user-friendly ways. Although both the interest in and the need for accessing biological databases are high, the mechanisms to fund their maintenance are unclear. Funding agencies cannot support data annotation in terms of labour costs, and hence the development of new tools based on “data mining” technologies could greatly contribute to keeping biological databases updated. Here we present a new approach aimed at contributing to the annotation in the HmtDB resource (http://www.hmdb.uniba.it/) of variability data associated with clinical phenotypes [1]. These data are prevalently available in the literature, where they are reported in a completely free style. Thus, we suggest the construction of a knowledge base derived from browsing papers on the web, to be used in the retrieval phase. Nevertheless, problems in extracting data from the literature come not only from the heterogeneity of presentation styles but mainly from the unstructured format (i.e. natural language) in which they are represented. In this scenario, the goal is to feed a knowledge base by identifying occurrences of specific biological entities and their features, as well as the particular method and experimental setting of the scientific study adopted in the publication. In this work, we describe some solutions to the problem of structuring information contained in scientific literature in digital (i.e., pdf) or paper format

    Human mtDNA site-specific variability values can act as haplogroup markers

    Sequencing of entire human mtDNA genomes has become rapid and efficient, leading to the production of a great number of complete mtDNA sequences from a wide range of human populations. We introduce here a new statistical approach for classifying mtDNA nucleotide sites, simply by comparing the mean simple deviation (MSD) of their specific variability values estimated on continent-specific dataset sequences, without the need for any reference sequence. Excellent correspondence was observed between sites with the highest MSD values and those marking known mtDNA haplogroups. This in turn supports the classification of 81 sites (23 in Africa, eight in Asia, eight in Europe, 34 in Oceania, and eight in America) as novel markers of 47 mtDNA haplogroups not yet identified by phylogeographic studies. Not only does this approach allow refinement of mtDNA phylogeny, an essential requirement also for mitochondrial disease studies, but may greatly facilitate the discrimination of candidate disease-causing mutations from haplogroup-specific polymorphisms in mtDNA sequences of patients affected by mitochondrial disorders
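The site-ranking idea in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not define "mean simple deviation", so the sketch assumes it is the mean absolute deviation of a site's continent-specific variability values around their mean. The function names, site positions, and variability values below are hypothetical and are not taken from the paper.

```python
from statistics import fmean
from typing import Dict, List, Tuple


def mean_simple_deviation(values: List[float]) -> float:
    """ASSUMPTION: MSD = mean absolute deviation from the mean."""
    centre = fmean(values)
    return fmean([abs(v - centre) for v in values])


def rank_sites(variability: Dict[int, Dict[str, float]]) -> List[Tuple[int, float]]:
    """Rank sites by the MSD of their per-continent variability values.

    `variability` maps a site position to {continent: variability value}.
    Sites whose variability differs most between continents come first.
    """
    scored = [
        (site, mean_simple_deviation(list(per_continent.values())))
        for site, per_continent in variability.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy input: site 200 varies strongly between continents,
# site 100 has the same variability everywhere.
toy = {
    100: {"Africa": 0.25, "Asia": 0.25, "Europe": 0.25, "Oceania": 0.25},
    200: {"Africa": 0.0, "Asia": 0.0, "Europe": 1.0, "Oceania": 1.0},
}
ranking = rank_sites(toy)
```

Under this assumed definition, a site with uniform variability across continents scores zero, while a site that is variable in only some continents scores high, which is the pattern the abstract associates with haplogroup markers.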

    Real-world effectiveness of apremilast in multirefractory mucosal involvement of Behçet’s disease

    Relapsing oral and genital ulcers (OGUs) represent the stigmata of Behçet’s disease (BD) and may be very painful, affecting both quality of life and relationships. A wide number of topical and immunosuppressive drugs can be used to treat ulcers [1], but failures are commonly reported. The efficacy of the phosphodiesterase-4 inhibitor apremilast has been proven in OGUs of BD in two randomized clinical trials (RCT) [2, 3], whereas only two case reports are available to date [4, 5]. We aimed at evaluating the real-world effectiveness of apremilast in BD patients with OGUs refractory to conventional and/or biologic treatments. We retrospectively evaluated patients classified as BD, according to International Criteria for BD [6] and International Study Group [7] criteria, who received apremilast (30 mg twice daily) for multirefractory OGUs from November 2017 to January 2019. The number of OGUs was assessed at baseline and at both 3 and 6 months. Pain from ulcers and BD activity were evaluated via 100-mm visual-analogue scale (VAS) and BD Current Activity Form (BDCAF). We also recorded the number of oral and genital ulcer flares both in the 4 weeks prior to apremilast start and throughout the observation period (Table 1 and Supplementary Table 2). The occurrence of adverse events was also reported. The paired t-test or the Wilcoxon matched-pairs signed-rank test was used for statistical analysis. The off-label use of apremilast was approved by the Hospital Ethics Committee in compliance with the Declaration of Helsinki. All patients provided written informed consent. Thirteen patients (females 9/13) with disease duration (mean ± SD) of 154 ± 167 months were analysed (Table 1). At 3 months (data from 12/13 patients), active OGUs were significantly fewer (p=0.02 for both) than at baseline (Table 2). Three patients stopped the treatment due to diarrhoea. At 6 months, active oral ulcers and oral relapses were still lower than at baseline (p=0.03 for both), whereas only a positive trend (p=0.07) for genital ulcers was seen (data from 8/13 patients) (Table 2). Ulcer VAS pain was 67 ± 16 at baseline, and a prompt amelioration was observed at 3 months (29 ± 32, p=0.002) and confirmed at 6 months (20 ± 19, p=0.005) (Table 2). Likewise, BDCAF dropped from 4.5 ± 2.9 at baseline to 3.2 ± 3.4 at 3 months (p=0.01), and was persistently low up to 6 months (2.3 ± 3.7, p=0.01) (Table 2). Serious adverse events were not observed. Our findings are consistent with a recent RCT on 111 BD patients, which showed the efficacy of apremilast in reducing both number and pain of oral ulcers [2]. Preliminary results from another study confirm the significant decrease of total number of oral ulcers and resolution of genital ulcers over 12 weeks in the apremilast group [3]. Similarly, in our study the mean number of oral relapses during therapy was significantly lower than that in the 4 weeks prior to apremilast. Interestingly, an appreciable reduction of VAS pain and BDCAF was already seen at 3 months and persisted up to 6 months. Of note, the overall beneficial effect of apremilast also on joint symptoms should be highlighted, as emerged from the BDCAF evaluations. Apremilast was safe and no serious adverse events were observed during the time span of our study. The main limitations of our study were the small sample size and the short-term follow-up. In addition, patients had been referred to our tertiary care centres since they were difficult-to-treat or refractory to therapy, configuring a possible selection bias. Nevertheless, we provide evidence that apremilast may induce a meaningful and early benefit in BD patients with multirefractory OGUs also in real-life settings

    Validity of Machine Learning in Predicting Giant Cell Arteritis Flare After Glucocorticoids Tapering

    Background: Inferential statistical methods have failed to identify reliable biomarkers and risk factors for relapsing giant cell arteritis (GCA) after glucocorticoids (GCs) tapering. A machine learning (ML) approach can handle complex non-linear relationships between patient attributes that are hard to model with traditional statistical methods, merging them to output a forecast or a probability for a given outcome. Objective: The objective of the study was to assess whether ML algorithms can predict GCA relapse after GCs tapering. Methods: GCA patients who underwent GCs therapy and regular follow-up visits for at least 12 months were retrospectively analyzed and used for implementing 3 ML algorithms, namely, Logistic Regression (LR), Decision Tree (DT), and Random Forest (RF). The outcome of interest was disease relapse within 3 months during GCs tapering. After an ML variable selection method, based on an XGBoost wrapper, an attribute core set was used to train and test each algorithm using 5-fold cross-validation. The performance of each algorithm in both phases was assessed in terms of accuracy and area under the receiver operating characteristic curve (AUROC). Results: The dataset consisted of 107 GCA patients (73 women, 68.2%) with mean age (± SD) 74.1 (± 8.5) years at presentation. GCA flare occurred in 40/107 patients (37.4%) within 3 months after GCs tapering. As a result of the ML wrapper, the attribute core set with the least number of variables used for algorithm training included presence/absence of diabetes mellitus and concomitant polymyalgia rheumatica as well as erythrocyte sedimentation rate level at GCs baseline. RF showed the best performance, being significantly superior to the other algorithms in accuracy (RF 71.4% vs LR 70.4% vs DT 62.9%). Consistently, RF precision (72.1%) was significantly greater than those of LR (62.6%) and DT (50.8%). Conversely, LR was superior to RF and DT in recall (RF 60% vs LR 62.5% vs DT 47.5%). Moreover, RF AUROC (0.76) was higher than those of LR (0.73) and DT (0.65). Conclusions: The RF algorithm can predict GCA relapse after GCs tapering with sufficient accuracy. To date, this is one of the most accurate predictive models for such an outcome. This ML method represents a reproducible tool, capable of supporting clinicians in GCA patient management
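The evaluation quantities named in this abstract (5-fold cross-validation, accuracy, precision, recall, AUROC) can be sketched with the Python standard library alone. This is a minimal illustration, not the study's pipeline: the abstract does not say how folds were built or which software was used, so the contiguous fold split and all function names below are assumptions, and no patient data are reproduced.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous (train, test) index pairs.

    ASSUMPTION: plain unshuffled folds; the study may have shuffled or stratified.
    """
    folds = []
    base, extra = divmod(n, k)
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        test = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        folds.append((train, test))
        start += size
    return folds


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and recall for binary labels (1 = relapse)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

With 107 records, `k_fold_indices(107, 5)` yields test folds of 22, 22, 21, 21 and 21 records. In a cross-validated comparison, each model would be fitted on the `train` indices of every fold, scored on the matching `test` indices, and the per-fold metrics averaged.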