27 research outputs found
Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data
Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.
Improvement of SNP/indel calling accuracy by Coval-Refine in targeted alignment.
The whole chromosomes (All chr), chromosome 10 (Chr10), and a 1 Mb fragment of chromosome 10 (Chr10-1M: positions 1000001 to 2000000 of Chr10) from the simulated rice genome were aligned with 75-bp paired-end reads sequenced from the whole rice genome using BWA. The alignments were filtered (+, bars in dark- and middle-red and in dark- and middle-blue) or not filtered (–, bars in light red and in light blue) with Coval-Refine in the basic mode. Two filtering conditions of Coval-Refine for mismatched reads were applied: the default option, which removes reads with three or more mismatches (middle-red and middle-blue bars), and a stricter option, which also removes the second paired-end mate when the first mate is filtered and removes any read pair containing more than two mismatches in total (dark red and dark blue bars). The mean read depth before and after (in parentheses) the Coval-Refine treatment is indicated under the reference chromosome name. Homozygous SNPs and indels were called as in Figure 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0075402#pone-0075402-g001). TPR and FPR for the called SNPs are shown with red and blue bars, respectively.
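The two mismatch filters described in this caption can be sketched as follows. This is a minimal illustration of one reading of the caption, not Coval-Refine's actual implementation: the `ReadPair` type, the function names, and the per-mate mismatch counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical record: mismatch counts for the two mates of a pair."""
    name: str
    mm1: int  # mismatches in mate 1
    mm2: int  # mismatches in mate 2


def filter_default(pairs, max_mismatches=2):
    """Default condition: drop any single read with three or more mismatches.

    Mates are judged independently, so one mate can survive alone.
    """
    kept = []
    for p in pairs:
        if p.mm1 <= max_mismatches:
            kept.append((p.name, 1))
        if p.mm2 <= max_mismatches:
            kept.append((p.name, 2))
    return kept


def filter_strict(pairs, max_mismatches=2):
    """Stricter condition (assumed reading): drop the whole pair when either
    mate is filtered or when the pair has more than two mismatches in total."""
    kept = []
    for p in pairs:
        either_fails = p.mm1 > max_mismatches or p.mm2 > max_mismatches
        if either_fails or p.mm1 + p.mm2 > max_mismatches:
            continue
        kept.extend([(p.name, 1), (p.name, 2)])
    return kept


pairs = [
    ReadPair("a", 0, 0),
    ReadPair("b", 3, 0),  # mate 1 fails the per-read limit
    ReadPair("c", 2, 1),  # each mate passes, but the pair total is 3
    ReadPair("d", 1, 1),
]
default_kept = filter_default(pairs)
strict_kept = filter_strict(pairs)
```

Under these assumptions the stricter condition removes pairs "b" and "c" entirely, which matches the lower post-filter read depth the caption reports in parentheses.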
Number of mismatches in aligned reads.
a: The percentage of reads with mismatches, out of the total number of aligned reads for each species or simulated read set. Aligned reads are paired-end reads of 100 bp for nematode, 76 bp for mouse, and 75 bp for the others. Artificial reads reflecting the error tendency of the rice reads were generated with dwgsim. The total error rates (%) are indicated in the last line.
Calling accuracy of SNPs from alignment data containing multiple samples.
The experimentally obtained rice reads (60, 30, and 15 million) were mixed with simulated 75 bp paired-end reads (60, 90, and 105 million, respectively) generated by dwgsim with the simulated rice genome as template, yielding 120 million reads in each mixture. The read mixtures were aligned to the simulated rice genome, resulting in alignments with an average read depth of 24×, and each read set (sample) in a mixture was distinguished from the other read set using the RG tag. The SNPs were called using Coval-Call with a maximum of 80 reads covering the called positions; a minimum allele frequency at the called position of 0.2 (for the 50% homozygous sample), 0.1 (for the 50% heterozygous and 25% homozygous samples), or 0.05 (for the 25% heterozygous and 12.5% homozygous samples); and a minimum of three reads (for the 50% homozygous sample) or two reads (for the others) supporting the called allele.
a: Percentage of the experimentally obtained rice read sample in the read mixture.
b: Heterozygosity of the experimentally obtained rice read sample (Homo: 0% heterozygosity, Hetero: 50% heterozygosity).
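The mixing proportions and per-sample calling thresholds quoted above can be written down as a small lookup. This is an illustrative sketch only: the names `THRESHOLDS`, `MAX_DEPTH`, `real_read_percent`, and `passes_thresholds` are hypothetical and do not correspond to Coval-Call options.

```python
# Thresholds quoted in the caption, keyed by
# (percent of real reads in the mixture, zygosity of the real sample).
# Values are (minimum allele frequency, minimum reads supporting the allele).
THRESHOLDS = {
    (50.0, "homo"): (0.2, 3),
    (50.0, "hetero"): (0.1, 2),
    (25.0, "homo"): (0.1, 2),
    (25.0, "hetero"): (0.05, 2),
    (12.5, "homo"): (0.05, 2),
}
MAX_DEPTH = 80  # maximum number of reads covering a called position


def real_read_percent(real_millions, simulated_millions):
    """Share of experimentally obtained reads in a mixture, in percent."""
    return 100.0 * real_millions / (real_millions + simulated_millions)


def passes_thresholds(alt_reads, depth, sample):
    """Apply the depth, allele-frequency, and supporting-read limits."""
    min_af, min_reads = THRESHOLDS[sample]
    if depth <= 0 or depth > MAX_DEPTH:
        return False
    return alt_reads >= min_reads and alt_reads / depth >= min_af


# The three mixtures from the caption: 60+60, 30+90, and 15+105 million reads.
mixture_percents = [
    real_read_percent(60, 60),
    real_read_percent(30, 90),
    real_read_percent(15, 105),
]
```

The allele-frequency floor is lowered as the real sample becomes a smaller share of the mixture, since its alleles then account for a smaller fraction of the reads at each position.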
Improvement by Coval-Refine of SNP/indel calling accuracy of variant calling tools for mouse alignment data.
(A) SNP calling accuracy with or without Coval-Refine. (B) Indel calling accuracy with or without Coval-Refine. A simulated mouse genome was aligned with real mouse read data using BWA. The alignments were filtered (+, striped bars) or not filtered (–, plain bars) with Coval-Refine. Homozygous SNPs and indels were called with the indicated variant callers under the same conditions as in Figure 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0075402#pone-0075402-g001).
Improvement of SNP/indel calling accuracies of various DNA variant callers by Coval-Refine.
(A) SNP calling accuracy with or without Coval-Refine. (B) Indel calling accuracy with or without Coval-Refine. The simulated rice genome was aligned with reads of the real rice genome (experimental reads) using BWA. Alignment data were filtered (+, red striped and blue striped bars) or not filtered (–, light red and light blue bars) with the Coval-Refine component (Coval-Refine, error correction mode), and homozygous SNPs and indels were called using the indicated variant callers. The SNPs and indels extracted by all the callers were further filtered under the same conditions, as described in the text. True positive rate (TPR: the number of successfully called SNPs or indels divided by the number of SNPs or indels introduced into the simulated genome, multiplied by 100) is shown with light red and red striped bars, and false positive rate (FPR: the number of wrongly called SNPs or indels divided by the total number of called SNPs or indels, multiplied by 100) with light blue and blue striped bars. The GATK pipeline was run with (GATK BQSR) or without (GATK) base quality score recalibration. Variant quality score recalibration in the GATK pipeline was omitted because it was unsuitable for our data; it was replaced by simple filtering with a minimum allele frequency of 0.8 and a minimum allelic read depth of 2 (see Materials and Methods for details).
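The TPR and FPR definitions in this caption, together with the simple filter that replaced variant quality score recalibration, can be sketched as below. The function names and the representation of variants as (chromosome, position) tuples are assumptions made for illustration, as is reading "allelic read depth" as the number of reads supporting the called allele. Note that FPR here follows the caption's definition, with the total number of called variants as the denominator.

```python
def calling_accuracy(called, truth):
    """Return (TPR, FPR) in percent, following the caption's definitions.

    TPR = correctly called variants / variants introduced into the genome * 100
    FPR = wrongly called variants / all called variants * 100
    """
    called, truth = set(called), set(truth)
    true_pos = called & truth
    false_pos = called - truth
    tpr = 100.0 * len(true_pos) / len(truth) if truth else 0.0
    fpr = 100.0 * len(false_pos) / len(called) if called else 0.0
    return tpr, fpr


def passes_simple_filter(alt_reads, total_reads, min_af=0.8, min_alt_depth=2):
    """Simple filter: allele frequency >= 0.8 and at least 2 supporting reads."""
    if total_reads <= 0:
        return False
    return alt_reads >= min_alt_depth and alt_reads / total_reads >= min_af


# Toy example: four introduced SNPs, three calls, one of them wrong.
truth = {("chr1", 100), ("chr1", 200), ("chr1", 300), ("chr1", 400)}
called = {("chr1", 100), ("chr1", 200), ("chr1", 999)}
tpr, fpr = calling_accuracy(called, truth)
```

In the toy example two of the four introduced SNPs are recovered and one of the three calls is wrong, so TPR is 50% and FPR is one third, about 33.3%.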
MOESM1 of Grouping of multicopper oxidases in Lentinula edodes by sequence similarities and expression patterns
Additional file 1. Additional information on L. edodes multicopper oxidases, such as signal peptide prediction, enzyme purification, expression patterns, primer design, the data set for phylogenetic analysis, and sequence information.
Additional file 6: Table S5. of Genome analysis of the foxtail millet pathogen Sclerospora graminicola reveals the complex effector repertoire of graminicolous downy mildews
TPM values of DEGs encoding putative secreted proteins and cluster numbers from clustering analyses. (XLSX 68 kb)
Use of TAK1 inhibitors in the prevention and treatment of peritoneal membrane failure
The invention relates to the use of TAK1 inhibitors for the preparation of a drug for the prevention and treatment of peritoneal membrane failure, said use preventing and reversing the epithelial-mesenchymal transition experienced by the mesothelial cells of the peritoneum during peritoneal dialysis treatment. In addition, the invention relates to the use of a TAK1 expression product or of its activity as a biomarker for determining peritoneal fibrosis. The invention further relates to a method for obtaining data that can be used in the diagnosis and/or prognosis of peritoneal fibrosis and to a method for predicting the progression of peritoneal fibrosis. The invention also relates to the use of a kit comprising the sequence that codes for TAK1 or the protein for the diagnosis of peritoneal fibrosis.
Peer reviewed. Centro Nacional de Investigaciones Cardiovasculares; Consejo Superior de Investigaciones Científicas; Centro de Investigación Biomédica en Red: Enfermedades Hepáticas y Digestivas. A2: patent application without a report on the state of the art.
Additional file 16: of Genome analysis of the foxtail millet pathogen Sclerospora graminicola reveals the complex effector repertoire of graminicolous downy mildews
RXLR-like genes predicted in the genome of Sclerospora graminicola and their expression levels during infection. (XLSX 82 kb)