11 research outputs found

    Preferred analysis methods for single genomic regions in RNA sequencing revealed by processing the shape of coverage

    The informational content of RNA sequencing is currently far from being completely explored. Most analyses focus on processing tables of counts or on isoform deconvolution via exon junctions. This article compares several techniques for estimating differential expression of exons or small genomic regions based on the shapes of their coverage functions. The problem is defined as finding differentially expressed exons between two samples, using local normalization of expression profiles and statistical measures that detect differences between two profile shapes. Initial experiments were done using synthetic data and real data modified with synthetically created differential patterns. Then, 160 pipelines (5 types of generator × 4 normalizations × 8 difference measures) are compared. The best analysis pipelines are selected based on the linearity of the differential expression estimate and the area under the ROC curve. These platform-independent techniques have been implemented in the Bioconductor package rnaSeqMap. They identify exons with differential expression or internal splicing even when read counts do not show it. Areas of application include searches for significant differences, splicing identification algorithms, and finding suitable regions for qPCR primers.
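The idea of comparing coverage shapes rather than read counts can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the unit-area normalization and the maximum cumulative difference are simple stand-ins, not necessarily among the four normalizations or eight difference measures evaluated in the article, and the code does not use the rnaSeqMap API. Only the grid size (5 × 4 × 8) comes from the abstract.

```python
from itertools import accumulate, product


def normalize_to_unit_area(coverage):
    """Scale a per-base coverage profile so that its values sum to 1."""
    total = sum(coverage)
    if total == 0:
        return [0.0] * len(coverage)
    return [depth / total for depth in coverage]


def max_cumulative_difference(profile_a, profile_b):
    """Largest absolute gap between the running sums of two profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same region")
    cum_a = accumulate(profile_a)
    cum_b = accumulate(profile_b)
    return max(abs(a - b) for a, b in zip(cum_a, cum_b))


# Placeholder labels: only the counts (5, 4, 8) come from the abstract.
generators = [f"generator_{i}" for i in range(1, 6)]
normalizations = [f"normalization_{i}" for i in range(1, 5)]
measures = [f"measure_{i}" for i in range(1, 9)]
pipelines = list(product(generators, normalizations, measures))

# Two toy exons with the same read total but different coverage shapes.
flat = [10, 10, 10, 10]
skewed = [25, 10, 4, 1]
shape_difference = max_cumulative_difference(
    normalize_to_unit_area(flat), normalize_to_unit_area(skewed)
)
print(len(pipelines), round(shape_difference, 3))
```

Both toy profiles sum to 40, so a count-based comparison would see no difference between them, while the shape-based measure does.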

    Ambiguous genes due to aligners and their impact on RNA-seq data analysis

    The study focuses on ambiguous genes, i.e. genes whose expression is difficult to estimate from the data produced by next-generation sequencing technologies. We focused on RNA sequencing (RNA-Seq) experiments performed on the Illumina platform. It is crucial to identify such genes and understand why they are difficult, as they may be involved in some diseases. By giving misleading results, they could contribute to a misunderstanding of the cause of certain diseases, which could lead to inappropriate treatment. We hypothesized that ambiguous genes would be difficult to map because of their complex structure, so we analyzed RNA-seq data with different mappers to find genes whose measurements differ between aligners. We were able to identify such genes using a generalized linear model with two factors: mappers and groups introduced by the experiment. A large proportion of ambiguous genes are pseudogenes. The high sequence similarity of pseudogenes to functional genes may indicate problems in alignment procedures. In addition, predictive analysis verified the performance of difficult genes in classification. The effectiveness of classifying samples into specific groups was compared using the expression of difficult and non-difficult genes as covariates. In almost all cases considered, ambiguous genes have less predictive power. ISSN: 2045-232

    Genes sharing the protein family domain decrease the performance of classification with RNA-seq genomic signatures

    Background: Our experience with running various types of classification on the CAMDA neuroblastoma dataset has led us to the conclusion that the results are not always obvious and may differ depending on the type of analysis and the selection of genes used for classification. This paper aims to point out several factors that may influence the downstream machine learning analysis. In particular, those factors are: the type of primary analysis, the type of classifier, and increased correlation between genes sharing a protein domain. They influence the analysis directly, but the interplay between them may also be important. We compiled a gene-domain database and used it to examine the differences between genes that share a domain and the rest of the genes in the datasets. Results: The major findings are: pairs of genes that share a domain have increased Spearman’s correlation coefficients of counts; genes sharing a domain are expected to have lower predictive power due to increased correlation, which in most cases is visible as a higher number of misclassified samples; classifier performance may vary depending on the method, but in most cases using genes sharing a domain in the training set results in a higher misclassification rate; increased correlation in genes sharing a domain most often results in worse classifier performance regardless of the primary analysis tools used, even if the primary analysis alignment yield varies. Conclusions: The effect of sharing a domain is likely more a result of real biological co-expression than of sequence similarity and artifacts of mapping and counting alone. Still, this is difficult to conclude and needs further research. The effect is interesting in itself, but we also point out some practical aspects in which it may influence RNA sequencing analysis and RNA biomarker use. In particular, it means that a gene signature biomarker set built from RNA-sequencing results should be depleted of genes sharing common domains, which may improve classification performance.
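The correlation statistic named in the results, Spearman's coefficient, is the Pearson correlation of the ranks of two series. The following is a minimal pure-Python sketch of that definition; the count vectors are made-up values for two hypothetical genes, not data from the paper, and the authors' actual tooling is not stated in the abstract.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical read counts for two genes across four samples.
gene_a_counts = [5, 50, 500, 5000]
gene_b_counts = [1, 2, 3, 4]
rho = spearman(gene_a_counts, gene_b_counts)
print(rho)
```

Because the statistic depends only on ranks, it captures monotonic relationships between counts even when they are far from linear. A constant series has zero rank variance, so this sketch would raise a division error in that case.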

    Effect of Zearalenone and Hormone Regulators on Microspore Embryogenesis in Anther Culture of Wheat

    The purpose of this work was to assess the impact of zearalenone (ZEN) and selected hormone regulators on the effectiveness of microspore embryogenesis in anther culture of wheat. The plant material comprised F1 hybrids of winter and spring wheat. Six combinations of media inducing microspore proliferation and formation of embryogenic structures were investigated: two combinations of growth regulators (D - 2,4-D + dicamba, K - 2,4-D + kinetin), each with three ZEN concentrations (0 mL/L, 0.1 mL/L, 0.2 mL/L). In some genotypes, adding ZEN to the media significantly increased the effectiveness of microspore embryogenesis, both at the induction stage and at the stage of green plant formation. For both combinations of growth regulators, an increased concentration of ZEN resulted in more effective induction of microspore embryogenesis. The most effective induction medium was the D medium supplemented with 0.2 mL/L ZEN. When zearalenone was used together with the two combinations of growth regulators, all genotypes tested produced androgenic structures, which indicates that genotypic recalcitrance was overcome in the analysed hybrids. In addition, green plants were obtained from 18 out of 19 tested hybrids. The addition of ZEN to the medium did not affect the number of regenerated albino plants or the proportion of spontaneous genome doublings.

    The Defense Response of Nicotiana benthamiana to Peanut Stunt Virus Infection in the Presence of Symptom Exacerbating Satellite RNA

    Peanut stunt virus (PSV) is a widespread pathogen infecting legumes. The PSV strains are classified into four subgroups, and some are defined by the association of satellite RNAs (satRNAs). In the case of PSV, the presence of satRNAs alters the symptoms of disease in infected plants. In this study, we elucidated the plant response to the PSV-G strain, which occurs in natural conditions without satRNA. However, it was found that it might easily acquire satRNA, which exacerbated pathogenesis in Nicotiana benthamiana. To explain the mechanisms underlying PSV infection and the symptom exacerbation caused by satRNA, we carried out transcriptome profiling of N. benthamiana challenged by PSV-G and satRNA using species-specific microarrays. Co-infection of plants with PSV-G + satRNA increased the number of identified differentially expressed genes (DEGs) compared with the number identified in PSV-G-infected plants. In both treatments, the majority of up-regulated DEGs were engaged in translation, ribosome biogenesis, RNA metabolism, and response to stimuli, while the down-regulated DEGs were required for photosynthesis. The presence of satRNA in PSV-G-infected plants caused different trends in expression of DEGs associated with phosphorylation, ATP binding, and plasma membrane.

    Transcriptome Remodeling in Gradual Development of Inverse Resistance between Paclitaxel and Cisplatin in Ovarian Cancer Cells

    Resistance to anti-cancer drugs is the main challenge in oncology. In pre-clinical studies, established cancer cell lines are primary tools for deciphering the molecular mechanisms of this phenomenon. In this study, we proposed a new, transcriptome-focused approach, utilizing a model of isogenic cancer cell lines with gradually changing resistance. We analyzed trends in gene expression with the aim of identifying a scaffold of the resistance development process. The ovarian cancer cell line A2780 was treated with stepwise increased concentrations of paclitaxel (PTX) to generate a series of drug-resistant sublines. To monitor transcriptome changes, we subjected them to mRNA sequencing, followed by the identification of differentially expressed genes (DEGs), principal component analysis (PCA), and hierarchical clustering. Functional interactions of proteins encoded by DEGs were analyzed by building protein-protein interaction (PPI) networks. We obtained human ovarian cancer cell lines with gradually developed resistance to PTX and collateral sensitivity to cisplatin (CDDP) (inverse resistance). In their transcriptomes, we identified two groups of DEGs: (1) with fluctuations in expression in the course of resistance acquisition; and (2) with a consistently changed expression at each stage of resistance development, constituting a scaffold of the process. The scaffold PPI network contained the cell cycle regulator polo-like kinase 2 (PLK2), proteins belonging to the tumor necrosis factor (TNF) ligand and receptor family and to the ephrin receptor family, and proteins linked to osteo- and chondrogenesis and nervous system development. Our cellular model of drug resistance made it possible to track trends in gene expression and to study this phenomenon as a process of evolution, reflected by global transcriptome remodeling. This approach enabled us to explore novel candidate genes and surmise that abrogation of the osteomimic phenotype in ovarian cancer cells might occur during the development of inverse resistance between PTX and CDDP.

    CellMixS: quantifying and visualizing batch effects in single-cell RNA-seq data

    A key challenge in single-cell RNA-sequencing (scRNA-seq) data analysis is batch effects that can obscure the biological signal of interest. Although there are various tools and methods to correct for batch effects, their performance can vary. Therefore, it is important to understand how batch effects manifest in order to adjust for them. Here, we systematically explore batch effects across various scRNA-seq datasets according to magnitude, cell type specificity, and complexity. We developed a cell-specific mixing score (cms) that quantifies the mixing of cells from multiple batches. By considering distance distributions, the score is able to detect local batch bias as well as differentiate between unbalanced batches and systematic differences between cells of the same cell type. We compare metrics in scRNA-seq data using real and synthetic datasets, and although these metrics target the same question and are used interchangeably, we find differences in scalability, sensitivity, and ability to handle differentially abundant cell types. We find that cell-specific metrics outperform cell type–specific and global metrics and recommend them for both method benchmarks and batch exploration.

    Multiparametric Evaluation of Post-MI Small Animal Models Using Metabolic ([18F]FDG) and Perfusion-Based (SYN1) Heart Viability Tracers

    Cardiovascular diseases (CVD), with myocardial infarction (MI) being one of the crucial components, wreak havoc in developed countries. Advanced imaging technologies are required to obtain quick and widely available diagnostic data. This paper describes a multimodal approach to in vivo perfusion imaging using the novel SYN1 tracer based on the fluorine-18 isotope. NOD-SCID mice were injected intravenously with SYN1 or [18F] fluorodeoxyglucose ([18F]-FDG) radiotracers after induction of MI. In all studies, the positron emission tomography–computed tomography (PET/CT) technique was used. To obtain hemodynamic data, mice were subjected to magnetic resonance imaging (MRI). Finally, the biodistribution of the SYN1 compound was assessed in a Wistar rat model. SYN1 showed normal accumulation in mouse and rat hearts, and in MI hearts it correctly indicated impaired cardiac segments when compared to [18F]-FDG uptake. In vivo PET/CT and MRI studies showed statistical convergence in terms of the size of the necrotic zone and cardiac function. This was further supported by RNA-seq molecular analyses correlating the expression of candidate functional genes, with the Serpinb1c, Tnc, and Nupr1 and the Trem2 and Aldolase B functional correlations showing statistical significance for both SYN1 and [18F]-FDG. Our manuscript presents a new fluorine-18-based perfusion radiotracer for PET/CT imaging that may have importance in clinical applications. Future research should focus on confirming the data presented here to prepare SYN1 for first-in-human trials.