    Integrating omics datasets with the OmicsPLS package

    Background: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, no such software is available for O2PLS. Results: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and a faster alternative cross-validation method are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. Conclusions: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").

    Inflammatory and tolerogenic myeloid cells determine outcome following human allergen challenge

    Innate mononuclear phagocytic system (MPS) cells preserve mucosal immune homeostasis. We investigated their role at the nasal mucosa following allergen challenge with house dust mite. We combined single-cell proteome and transcriptome profiling of immune cells from nasal biopsies of 30 allergic rhinitis and 27 non-allergic subjects before and after repeated nasal allergen challenge. Biopsies of patients showed infiltrating inflammatory HLA-DRhi/CD14+ and CD16+ monocytes and proallergic transcriptional changes in resident CD1C+/CD1A+ conventional dendritic cells (cDC)2 following challenge. In contrast, non-allergic individuals displayed distinct innate MPS responses to allergen challenge: predominant infiltration of myeloid-derived suppressor cells (MDSC: HLA-DRlow/CD14+ monocytes) and cDC2 expressing inhibitory/tolerogenic transcripts. These divergent patterns were confirmed in ex vivo stimulated MPS nasal biopsy cells. Thus, we not only identified MPS cell clusters involved in airway allergic inflammation but also highlight novel roles for non-inflammatory innate MPS responses by MDSC to allergens in non-allergic individuals. Future therapies should address MDSC activity as treatment for inflammatory airway diseases.

    Gentle Masking of Low-Complexity Sequences Improves Homology Search

    Detection of sequences that are homologous, i.e. descended from a common ancestor, is a fundamental task in computational biology. This task is confounded by low-complexity tracts (such as atatatatatat), which arise frequently and independently, causing strong similarities that are not homologies. There has been much research on identifying low-complexity tracts, but little research on how to treat them during homology search. We propose to find homologies by aligning sequences with “gentle” masking of low-complexity tracts. Gentle masking means that the match score involving a masked letter is min(0, S), where S is the unmasked score. Gentle masking slightly but noticeably improves the sensitivity of homology search (compared to “harsh” masking), without harming specificity. We show examples in three useful homology search problems: detection of NUMTs (nuclear copies of mitochondrial DNA), recruitment of metagenomic DNA reads to reference genomes, and pseudogene detection. Gentle masking is currently the best way to treat low-complexity tracts during homology search.
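
    The gentle-masking rule in this abstract can be illustrated with a small sketch. It assumes the rule is min(0, S) for any letter pair involving a masked letter, that masked letters are written in lowercase, and that the unmasked score is a toy +1/-1 match/mismatch scheme. The function names and the scoring values are illustrative choices, not taken from the paper or from any alignment tool.

```python
def unmasked_score(a: str, b: str, match: int = 1, mismatch: int = -1) -> int:
    """Toy substitution score S for two letters, ignoring case (assumed scheme)."""
    return match if a.upper() == b.upper() else mismatch


def gentle_score(a: str, b: str) -> int:
    """Score a letter pair under gentle masking.

    Assumption: a lowercase letter marks a masked (low-complexity) position.
    """
    s = unmasked_score(a, b)
    if a.islower() or b.islower():
        # A masked match contributes nothing; a masked mismatch still costs.
        return min(0, s)
    return s


def ungapped_score(x: str, y: str) -> int:
    """Sum gentle-masked scores over an ungapped, position-by-position alignment."""
    return sum(gentle_score(a, b) for a, b in zip(x, y))


if __name__ == "__main__":
    # The masked "atat" tract adds nothing to the score of the two flanking matches.
    print(ungapped_score("ACatat", "ACatat"))
```

    Under these assumptions a masked tract can never raise an alignment score, but mismatches inside it are still penalized, so an alignment cannot be extended through a masked region for free.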

    Occupational exposure to gases/fumes and mineral dust affect DNA methylation levels of genes regulating expression

    Many workers are exposed daily to occupational agents like gases/fumes, mineral dust or biological dust, which could induce adverse health effects. Epigenetic mechanisms, such as DNA methylation, have been suggested to play a role. We therefore aimed to identify differentially methylated regions (DMRs) upon occupational exposures in never-smokers and investigated whether these DMRs were associated with gene expression levels. To determine the effects of occupational exposures independent of smoking, 903 never-smokers of the LifeLines cohort study were included. We performed three genome-wide methylation analyses (Illumina 450 K), one per occupational exposure (gases/fumes, mineral dust and biological dust), using robust linear regression adjusted for appropriate confounders. DMRs were identified using comb-p in Python. Results were validated in the Rotterdam Study (233 never-smokers) and methylation-expression associations were assessed using Biobank-based Integrative Omics Study data (n = 2802). Of the total 21 significant DMRs, 14 DMRs were associated with gases/fumes and 7 with mineral dust. Three of these DMRs were associated with both exposures (RPLP1 and LINC02169 (2x)) and 11 DMRs were located within transcript start sites of gene expression regulating genes. We replicated two DMRs with gases/fumes (VTRNA2-1 and GNAS) and one with mineral dust (CCDC144NL). In addition, nine gases/fumes DMRs and six mineral dust DMRs were significantly associated with gene expression levels. Our data suggest that occupational exposures may induce differential methylation of gene expression regulating genes and thereby may induce adverse health effects. Given the millions of workers that are exposed daily to occupational exposures, further studies on this epigenetic mechanism and health outcomes are warranted.

    Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

    We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
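
    The contig and scaffold N50 figures quoted in this abstract follow the usual definition of N50: the largest length L such that sequences of length at least L together cover at least half of the total assembly length. The sketch below computes it for a list of lengths. The function name and the example lengths are made up for illustration and are not data from the assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half the total.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    # Illustrative lengths only: total 20, so the N50 is reached at 6 + 5 = 11.
    print(n50([2, 3, 4, 5, 6]))
```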

    New function of the myostatin/activin type I receptor (ALK4) as a mediator of muscle atrophy and muscle regeneration

    Skeletal muscle fibrosis and impaired muscle regeneration are major contributors to muscle wasting in Duchenne muscular dystrophy (DMD). Muscle growth is negatively regulated by myostatin (MSTN) and activins. Blockage of these pathways may improve muscle quality and function in DMD. Antisense oligonucleotides (AONs) were designed specifically to block the function of ALK4, a key receptor for the MSTN/activin pathway in skeletal muscle. AON-induced exon skipping resulted in specific Alk4 down-regulation, inhibition of MSTN activity, and increased myoblast differentiation in vitro. Unexpectedly, a marked decrease in muscle mass (10%) was found after Alk4 AON treatment in mdx mice. In line with in vitro results, muscle regeneration was stimulated, and muscle fiber size decreased markedly. Notably, when Alk4 was down-regulated in adult wild-type mice, muscle mass decreased even more. RNAseq analysis revealed dysregulated metabolic functions and signs of muscle atrophy. We conclude that ALK4 inhibition increases myogenesis but also regulates the tight balance of protein synthesis and degradation. Therefore, caution must be used when developing therapies that interfere with MSTN/activin pathways.

    Quaking promotes monocyte differentiation into pro-atherogenic macrophages by controlling pre-mRNA splicing and gene expression

    A hallmark of inflammatory diseases is the excessive recruitment and influx of monocytes to sites of tissue damage and their ensuing differentiation into macrophages. Numerous stimuli are known to induce transcriptional changes associated with macrophage phenotype, but posttranscriptional control of human macrophage differentiation is less well understood. Here we show that expression levels of the RNA-binding protein Quaking (QKI) are low in monocytes and early human atherosclerotic lesions, but are abundant in macrophages of advanced plaques. Depletion of QKI protein impairs monocyte adhesion, migration, differentiation into macrophages and foam cell formation in vitro and in vivo. RNA-seq and microarray analysis of human monocyte and macrophage transcriptomes, including those of a unique QKI haploinsufficient patient, reveal striking changes in QKI-dependent messenger RNA levels and splicing of RNA transcripts. The biological importance of these transcripts and requirement for QKI during differentiation illustrates a central role for QKI in posttranscriptionally guiding macrophage identity and function.

    Comprehensive diagnostics of acute myeloid leukemia by whole transcriptome RNA sequencing

    Acute myeloid leukemia (AML) is caused by genetic aberrations that also govern the prognosis of patients and guide risk-adapted and targeted therapy. Genetic aberrations in AML are structurally diverse and currently detected by different diagnostic assays. This study sought to establish whole transcriptome RNA sequencing as a single, comprehensive, and flexible platform for AML diagnostics. We developed HAMLET (Human AML Expedited Transcriptomics) as a bioinformatics pipeline for simultaneous detection of fusion genes, small variants, tandem duplications, and gene expression with all information assembled in an annotated, user-friendly output file. Whole transcriptome RNA sequencing was performed on 100 AML cases and HAMLET results were validated by reference assays and targeted resequencing. The data showed that HAMLET accurately detected all fusion genes and overexpression of EVI1 irrespective of 3q26 aberrations. In addition, small variants in 13 genes that are often mutated in AML were called with 99.2% sensitivity and 100% specificity, and tandem duplications in FLT3 and KMT2A were detected by a novel algorithm based on soft-clipped reads with 100% sensitivity and 97.1% specificity. In conclusion, HAMLET has the potential to provide accurate comprehensive diagnostic information relevant for AML classification, risk assessment and targeted therapy on a single technology platform.
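
    The sensitivity and specificity values reported in this abstract are ratios over a confusion matrix of pipeline calls against reference assays. The sketch below shows the standard definitions. The counts in the example are invented for illustration and are not the study's data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly present variants that were called: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly absent variants that were not called: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    print(f"sensitivity = {sensitivity(true_pos=90, false_neg=10):.3f}")
    print(f"specificity = {specificity(true_neg=95, false_pos=5):.3f}")
```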

    Transcription factor site dependencies in human, mouse and rat genomes

    Background: It is known that transcription factors frequently act together to regulate gene expression in eukaryotes. In this paper we describe a computational analysis of transcription factor site dependencies in human, mouse and rat genomes. Results: Our approach for quantifying tendencies of transcription factor binding sites to co-occur is based on a binding site scoring function which incorporates dependencies between positions, uses information about the structural class of each transcription factor (major/minor groove binder), and considers the possible implications of varying GC content of the sequences. Significant tendencies (dependencies) have been detected by non-parametric statistical methodology (permutation tests). The results have been evaluated in several ways: reports from literature (many of the significant dependencies between transcription factors have previously been confirmed experimentally); dependencies between transcription factors are not biased due to similarities in their DNA-binding sites; the number of dependent transcription factors that belong to the same functional and structural class is significantly higher than would be expected by chance; supporting evidence from GO clustering of targeting genes. Based on dependencies between two transcription factor binding sites (second-order dependencies), it is possible to construct higher-order dependencies (networks). Moreover, results about transcription factor binding site dependencies can be used for prediction of groups of dependent transcription factors on a given promoter sequence. Our results, as well as a scanning tool for predicting groups of dependent transcription factor binding sites, are available on the Internet. Conclusion: We show that the computational analysis of transcription factor site dependencies is a valuable complement to experimental approaches for discovering transcription regulatory interactions and networks. Scanning promoter sequences with dependent groups of transcription factor binding sites improves the quality of transcription factor predictions.
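
    The abstract states that dependencies were detected with permutation tests but does not give the procedure. The sketch below is a generic permutation test for co-occurrence of two binding sites across promoters, not the authors' method: each site is a 0/1 presence vector over the same promoters, the statistic is the number of promoters containing both sites, and the null distribution comes from shuffling one vector. All names, the number of permutations and the input vectors are illustrative assumptions.

```python
import random
from typing import Sequence


def cooccurrence(a: Sequence[int], b: Sequence[int]) -> int:
    """Count promoters in which both sites are present."""
    return sum(1 for x, y in zip(a, b) if x and y)


def permutation_p_value(
    a: Sequence[int], b: Sequence[int], n_perm: int = 2000, seed: int = 0
) -> float:
    """One-sided p-value for observing at least this much co-occurrence by chance."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = cooccurrence(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any association between the two sites
        if cooccurrence(a, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    site_a = [1] * 10 + [0] * 10  # hypothetical presence/absence per promoter
    site_b = [1] * 10 + [0] * 10  # perfectly co-occurring with site_a
    print(permutation_p_value(site_a, site_b))
```

    Shuffling one vector preserves how often each site occurs overall, so a small p-value reflects co-occurrence beyond what the two site frequencies alone would produce.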