52 research outputs found

    Detailed estimation of bioinformatics prediction reliability through the Fragmented Prediction Performance Plots

    Background: An important and yet rather neglected question related to bioinformatics predictions is the estimation of the amount of data that is needed to allow reliable predictions. Bioinformatics predictions are usually validated through a series of figures of merit, such as sensitivity and precision, and little attention is paid to the fact that their performance may depend on the amount of data used to make the predictions themselves.
    Results: Here I describe a tool, named Fragmented Prediction Performance Plot (FPPP), which monitors the relationship between the prediction reliability and the amount of information underlying the predictions themselves. Three examples of FPPPs are presented to illustrate their principal features. In one example, the reliability becomes independent, over a certain threshold, of the amount of data used to predict protein features, and the intrinsic reliability of the predictor can be estimated. In the other two cases, on the contrary, the reliability strongly depends on the amount of data used to make the predictions and, thus, the intrinsic reliability of the two predictors cannot be determined. Only in the first example is it thus possible to fully quantify the prediction performance.
    Conclusion: It is thus highly advisable to use FPPPs to determine the performance of any new bioinformatics prediction protocol, in order to fully quantify its prediction power and to allow comparisons between two or more predictors based on different types of data.
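
To make the idea of tracking figures of merit against the amount of data concrete, here is a minimal sketch in Python. It is not the author's FPPP implementation, whose details are not given in the abstract. It only bins hypothetical prediction records by the amount of data behind each prediction and reports sensitivity and precision per bin. The record layout, the function names, and the fixed bin width are all assumptions made for illustration.

```python
from collections import defaultdict


def sensitivity(tp, fn):
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


def precision(tp, fp):
    """Fraction of predicted positives that are actual positives."""
    return tp / (tp + fp) if (tp + fp) else float("nan")


def performance_by_data_amount(records, bin_width):
    """Group predictions by amount of data and score each group.

    records: iterable of (amount_of_data, predicted, actual), where
    predicted and actual are booleans (hypothetical layout).
    Returns {bin_index: (sensitivity, precision)}.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for amount, predicted, actual in records:
        bin_index = int(amount // bin_width)
        # "t"/"f": prediction correct or not; "p"/"n": predicted class.
        key = ("t" if predicted == actual else "f") + ("p" if predicted else "n")
        counts[bin_index][key] += 1
    return {
        b: (sensitivity(c["tp"], c["fn"]), precision(c["tp"], c["fp"]))
        for b, c in sorted(counts.items())
    }
```

If the per-bin values level off above some amount of data, that would correspond to the first case described in the abstract; values that keep changing would correspond to the other two.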

    Identification of epitopes recognised by mucosal CD4+ T-cell populations from cattle experimentally colonised with Escherichia coli O157:H7

    Additional file 5. Sequence alignment of Intimin epitopes against Intimin sequences from non-O157 EHEC serotypes. Alignment of Intimin CD4+ T-cell epitope sequences with representative Intimin sequences from EHEC serotypes O145, O127, O26, O103, O121, O45 and O111. Percentage values indicate % similarity to the EHEC O157:H7 reference sequence

    Selective Enrichment and Sequencing of Whole Mitochondrial Genomes in the Presence of Nuclear Encoded Mitochondrial Pseudogenes (Numts)

    Numts are an integral component of many eukaryote genomes, offering a snapshot of the evolutionary process that led from the incorporation of an α-proteobacterium into a larger eukaryotic cell some 1.8 billion years ago. Although numt sequences can be harnessed as molecular markers, they often remain unidentified and are mistaken for genuine mtDNA, leading to erroneous interpretation of mtDNA data sets. It is therefore indispensable that, during the process of amplifying and sequencing mitochondrial genes, preventive measures are taken to exclude numts and guarantee the recovery of genuine mtDNA. This applies to mtDNA analyses in general but especially to studies where mtDNAs are sequenced de novo as the launch pad for subsequent mtDNA-based research. By using a combination of dilution series and nested rolling circle amplification (RCA), we present a novel strategy to selectively amplify mtDNA and exclude the amplification of numt sequence. We have successfully applied this strategy to de novo sequence the mtDNA of the Black Field Cricket Teleogryllus commodus, a species known to contain numts. Aligning our assembled sequence to the reference genome of Teleogryllus emma (GenBank EU557269.1) led to the identification of a numt sequence in the reference sequence. This unexpected result further highlights the need for a reliable and accessible strategy to eliminate this source of error.

    C-Terminal Extension of the Yeast Mitochondrial DNA Polymerase Determines the Balance between Synthesis and Degradation

    Saccharomyces cerevisiae mitochondrial DNA polymerase (Mip1) contains a C-terminal extension (CTE) of 279 amino acid residues. The CTE is required for mitochondrial DNA maintenance in yeast but is absent in higher eukaryotes. Here we use recombinant Mip1 C-terminal deletion mutants to investigate the functional importance of the CTE. We show that partial removal of the CTE in Mip1Δ216 results in a strong preference for exonucleolytic degradation rather than DNA polymerization. This imbalance between exonuclease and polymerase activities is prominent at suboptimal dNTP concentrations and in the absence of a correctly pairing nucleotide. Mip1Δ216 also displays reduced ability to synthesize DNA through double-stranded regions. Full removal of the CTE in Mip1Δ279 results in complete loss of Mip1 polymerase activity; however, the mutant retains its exonuclease activity. These results allow us to propose that the CTE functions as a part of the Mip1 polymerase domain that stabilizes the substrate primer end at the polymerase active site, and is therefore required for efficient mitochondrial DNA replication in vivo.

    Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework

    Background: Structural alignment of RNAs has become increasingly important since the discovery of functional non-coding RNAs (ncRNAs). Recent studies, mainly based on various approximations of the Sankoff algorithm, have resulted in considerable improvement in the accuracy of pairwise structural alignment. In contrast, for cases with more than two sequences, the practical merit of structural alignment remains unclear as compared to traditional sequence-based methods, although the importance of multiple structural alignment is widely recognized.
    Results: We took a different approach from a straightforward extension of the Sankoff algorithm to multiple alignments, from the viewpoints of accuracy and time complexity. As a new option of the MAFFT alignment program, we developed a multiple RNA alignment framework, X-INS-i, which builds a multiple alignment with an iterative method incorporating structural information through two components: (1) pairwise structural alignments by an external pairwise alignment method such as SCARNA or LaRA and (2) a new objective function, Four-way Consistency, derived from the base-pairing probability of every sub-aligned group at every multiple alignment stage.
    Conclusion: The BRAliBASE benchmark showed that X-INS-i outperforms other methods currently available in the sum-of-pairs score (SPS) criterion. As a basis for predicting common secondary structure, the accuracy of the present method is comparable to, or rather higher than, that of the current leading methods such as RNA Sampler. The X-INS-i framework can be used for building a multiple RNA alignment from any combination of algorithms for pairwise RNA alignment and base-pairing probability. The source code is available at the webpage found in the Availability and requirements section.
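
For readers unfamiliar with the sum-of-pairs score mentioned in the conclusion, the sketch below computes it under its commonly used definition: the fraction of residue pairs aligned in a reference alignment that are also aligned in the test alignment. This is an illustrative assumption, not the benchmark's own scoring code, which may differ in details. The function names and the representation of an alignment as a list of equal-length gapped strings with "-" as the gap character are hypothetical.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    Each residue is identified as (sequence_index, residue_index),
    where residue_index counts non-gap characters from the left.
    """
    pairs = set()
    positions = [0] * len(alignment)
    for column in zip(*alignment):
        residues = []
        for seq_index, char in enumerate(column):
            if char != "-":
                residues.append((seq_index, positions[seq_index]))
                positions[seq_index] += 1
        # Every pair of residues sharing this column counts as aligned.
        pairs.update(combinations(residues, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference-aligned residue pairs recovered by test."""
    reference_pairs = aligned_pairs(reference)
    if not reference_pairs:
        return float("nan")
    return len(reference_pairs & aligned_pairs(test)) / len(reference_pairs)
```

A score of 1.0 means every residue pair aligned in the reference is also aligned in the test alignment. Extra pairs aligned only in the test alignment are not penalised by this measure.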

    B-RAF Mutant Alleles Associated with Langerhans Cell Histiocytosis, a Granulomatous Pediatric Disease

    Langerhans cell histiocytosis (LCH) features inflammatory granuloma characterised by the presence of CD1a+ dendritic cells or 'LCH cells'. Badalian-Very et al. recently reported the presence of a canonical (V600E)B-RAF mutation in 57% of paraffin-embedded biopsies from LCH granuloma. Here we confirm their findings and report the identification of two novel B-RAF mutations detected in LCH patients. Mutations of B-RAF were observed in granuloma samples from 11 out of 16 patients using 'next generation' pyrosequencing. In 9 cases the mutation identified was (V600E)B-RAF. In 2 cases novel polymorphisms were identified. A somatic (600DLAT)B-RAF insertion mimicked the structural and functional consequences of the (V600E)B-RAF mutant. It destabilized the inactive conformation of the B-RAF kinase and resulted in increased ERK activation in 293 T cells. The (600DLAT)B-RAF and (V600E)B-RAF mutations were found enriched in DNA and mRNA from the CD1a+ fraction of granuloma. They were absent from the blood and monocytes of 58 LCH patients, with a lower threshold of sequencing sensitivity of 1%-2% relative mutation abundance. A novel germ line (T599A)B-RAF mutant allele was detected in one patient, at a relative mutation abundance close to 50% in the LCH granuloma, blood monocytes and lymphocytes. However, (T599A)B-RAF did not destabilize the inactive conformation of the B-RAF kinase, and did not induce increased ERK phosphorylation or C-RAF transactivation. Our data confirm the presence of the (V600E)B-RAF mutation in LCH granuloma of some patients, and identify two novel B-RAF mutations. They indicate that (V600E)B-RAF and (600DLAT)B-RAF mutations are somatic mutants enriched in LCH CD1a+ cells and absent from the patient blood. Further studies are needed to assess the functional consequences of the germ-line (T599A)B-RAF allele.

    Improving the Alignment Quality of Consistency Based Aligners with an Evaluation Function Using Synonymous Protein Words

    Most sequence alignment tools can successfully align protein sequences with higher levels of sequence identity. The accuracy of the corresponding structure alignment, however, decreases rapidly when considering distantly related sequences (<20% identity). In this range of identity, alignments optimized so as to maximize sequence similarity are often inaccurate from a structural point of view. Over the last two decades, most multiple protein aligners have been optimized for their capacity to reproduce structure-based alignments while using sequence information. Methods currently available differ essentially in how they measure the similarity between aligned residues: using substitution matrices, Fourier transform, sophisticated profile-profile functions or, more recently, consistency-based approaches.

    The Opportunistic Pathogen Propionibacterium acnes: Insights into Typing, Human Disease, Clonal Diversification and CAMP Factor Evolution

    We previously described a Multilocus Sequence Typing (MLST) scheme based on eight genes that facilitates population genetic and evolutionary analysis of P. acnes. While MLST is a portable method for unambiguous typing of bacteria, it is expensive and labour intensive. Against this background, we now describe a refined version of this scheme based on two housekeeping (aroE; guaA) and two putative virulence (tly; camp2) genes (MLST4) that correctly predicted the phylogroup (IA1, IA2, IB, IC, II, III), clonal complex (CC) and sequence type (ST) (novel or described) status for 91% of isolates (n = 372) via cross-referencing of the four gene allelic profiles to the full eight gene versions available in the MLST database (http://pubmlst.org/pacnes/). Even in the small number of cases where specific STs were not completely resolved, the MLST4 method still correctly determined phylogroup and CC membership. Examination of nucleotide changes within all the MLST loci provides evidence that point mutations generate new alleles approximately 1.5 times as frequently as recombination, although the latter still plays an important role in the bacterium’s evolution. The secreted/cell-associated ‘virulence’ factors tly and camp2 show no clear evidence of episodic or pervasive positive selection and have diversified at a rate similar to housekeeping loci. The co-evolution of these genes with the core genome might also indicate a role in commensal/normal existence constraining their diversity and preventing their loss from the P. acnes population. The possibility that members of the expanded CAMP factor protein family, including camp2, may have been lost from other propionibacteria, but not P. acnes, would further argue for a possible role in niche/host adaptation leading to their retention within the genome. These evolutionary insights may prove important for discussions surrounding camp2 as an immunotherapy target for acne, and the effect such treatments may have on commensal lineages.

    Genome-wide screens identify Toxoplasma gondii determinants of parasite fitness in IFNγ-activated murine macrophages

    Macrophages play an essential role in the early immune response against Toxoplasma and are the cell type preferentially infected by the parasite in vivo. Interferon gamma (IFNγ) elicits a variety of anti-Toxoplasma activities in macrophages. Using a genome-wide CRISPR screen we identify 353 Toxoplasma genes that determine parasite fitness in naïve or IFNγ-activated murine macrophages, seven of which are further confirmed. We show that one of these genes encodes dense granule protein GRA45, which has a chaperone-like domain and is critical for correct localization of GRAs into the PVM and secretion of GRA effectors into the host cytoplasm. Parasites lacking GRA45 are more susceptible to IFNγ-mediated growth inhibition and have reduced virulence in mice. Together, we identify and characterize an important chaperone-like GRA in Toxoplasma and provide a resource for the community to further explore the function of Toxoplasma genes that determine fitness in IFNγ-activated macrophages.