101 research outputs found

    DolphinNext: a distributed data processing platform for high throughput genomics

    BACKGROUND: The emergence of high throughput technologies that produce vast amounts of genomic data, such as next-generation sequencing (NGS), is transforming biological research. The dramatic increase in the volume of data and the variety and continuous change of data processing tools, algorithms and databases make analysis the main bottleneck for scientific discovery. The processing of high throughput datasets typically involves many different computational programs, each of which performs a specific step in a pipeline. Given the wide range of applications and organizational infrastructures, there is a great need for highly parallel, flexible, portable, and reproducible data processing frameworks. Several platforms currently exist for the design and execution of complex pipelines. Unfortunately, current platforms lack the necessary combination of parallelism, portability, flexibility and/or reproducibility that is required by the current research environment. To address these shortcomings, workflow frameworks that provide a platform to develop and share portable pipelines have recently arisen. We complement these new platforms by providing a graphical user interface to create, maintain, and execute complex pipelines. Such a platform will simplify robust and reproducible workflow creation for non-technical users as well as provide a robust platform to maintain pipelines for large organizations.

    RESULTS: To simplify development, maintenance, and execution of complex pipelines we created DolphinNext. DolphinNext facilitates building and deployment of complex pipelines using a modular approach implemented in a graphical interface that relies on the powerful Nextflow workflow framework by providing:
    1. A drag and drop user interface that visualizes pipelines and allows users to create pipelines without familiarity with the underlying programming languages.
    2. Modules to execute and monitor pipelines in distributed computing environments such as high-performance clusters and/or cloud.
    3. Reproducible pipelines with version tracking and stand-alone versions that can be run independently.
    4. Modular process design with process revisioning support to increase reusability and pipeline development efficiency.
    5. Pipeline sharing with GitHub and automated testing.
    6. Extensive reports with R-markdown and shiny support for interactive data visualization and analysis.

    CONCLUSION: DolphinNext is a flexible, intuitive, web-based data processing and analysis platform that enables creating, deploying, sharing, and executing complex Nextflow pipelines with extensive revisioning and interactive reporting to enhance reproducible results.

    A resampling-based meta-analysis for detection of differential gene expression in breast cancer

    Background: Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures.

    Methods: A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of genes selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes was tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis.

    Results: The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes. The expression results of the selected genes obtained through real-time qRT-PCR supported the meta-analysis results.

    Conclusion: The proposed meta-analysis approach has the ability to detect a set of differentially expressed genes with the least amount of within-group variability, thus providing highly stable gene lists for class prediction. Increased statistical power and stringent filtering criteria used in the present study also make identification of novel candidate genes possible and may provide further insight to improve our understanding of breast cancer development.

    A MSFD complementary approach for the assessment of pressures, knowledge and data gaps in Southern European Seas : the PERSEUS experience

    The PERSEUS project aims to identify the most relevant pressures exerted on the ecosystems of the Southern European Seas (SES), highlighting knowledge and data gaps that endanger the achievement of SES Good Environmental Status (GES) as mandated by the Marine Strategy Framework Directive (MSFD). A complementary approach has been adopted, based on a meta-analysis of existing literature on pressure/impact/knowledge gaps, summarized in tables related to the MSFD descriptors and discriminating open waters from coastal areas. A comparative assessment of the Initial Assessments (IAs) for five SES countries has also been independently performed. The comparison between meta-analysis results and IAs shows similarities for coastal areas only. Major knowledge gaps have been detected for the biodiversity, marine food web, marine litter and underwater noise descriptors. The meta-analysis also allowed the identification of additional research themes targeting topics that are required for the achievement of GES. 2015 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license.

    Analysis of the common genetic component of large-vessel vasculitides through a meta-Immunochip strategy

    Giant cell arteritis (GCA) and Takayasu's arteritis (TAK) are major forms of large-vessel vasculitis (LVV) that share clinical features. To evaluate their genetic similarities, we analysed Immunochip genotyping data from 1,434 LVV patients and 3,814 unaffected controls. Genetic pleiotropy was also estimated. The HLA region harboured the main disease-specific associations. GCA was mostly associated with class II genes (HLA-DRB1/HLA-DQA1) whereas TAK was mostly associated with class I genes (HLA-B/MICA). Both the statistical significance and effect size of the HLA signals were considerably reduced in the cross-disease meta-analysis in comparison with the analysis of GCA and TAK separately. Consequently, no significant genetic correlation between these two diseases was observed when HLA variants were tested. Outside the HLA region, only one polymorphism located near the IL12B gene surpassed the study-wide significance threshold in the meta-analysis of the discovery datasets (rs755374, P = 7.54E-07; OR_GCA = 1.19, OR_TAK = 1.50). This marker was confirmed as a novel GCA risk factor using four additional cohorts (P_GCA = 5.52E-04, OR_GCA = 1.16). Taken together, our results provide evidence of strong genetic differences between GCA and TAK in the HLA. Outside this region, common susceptibility factors were suggested, especially within the IL12B locus.

    Search for dark matter produced in association with bottom or top quarks in √s = 13 TeV pp collisions with the ATLAS detector

    A search for weakly interacting massive particle dark matter produced in association with bottom or top quarks is presented. Final states containing third-generation quarks and missing transverse momentum are considered. The analysis uses 36.1 fb⁻¹ of proton–proton collision data recorded by the ATLAS experiment at √s = 13 TeV in 2015 and 2016. No significant excess of events above the estimated backgrounds is observed. The results are interpreted in the framework of simplified models of spin-0 dark-matter mediators. For colour-neutral spin-0 mediators produced in association with top quarks and decaying into a pair of dark-matter particles, mediator masses below 50 GeV are excluded assuming a dark-matter candidate mass of 1 GeV and unitary couplings. For scalar and pseudoscalar mediators produced in association with bottom quarks, the search sets limits on the production cross-section of 300 times the predicted rate for mediators with masses between 10 and 50 GeV and assuming a dark-matter mass of 1 GeV and unitary coupling. Constraints on colour-charged scalar simplified models are also presented. Assuming a dark-matter particle mass of 35 GeV, mediator particles with mass below 1.1 TeV are excluded for couplings yielding a dark-matter relic density consistent with measurements.

    Measurements of top-quark pair differential cross-sections in the eμ channel in pp collisions at √s = 13 TeV using the ATLAS detector


    Search for single production of vector-like quarks decaying into Wb in pp collisions at √s = 8 TeV with the ATLAS detector


    Measurement of the W boson polarisation in tt̄ events from pp collisions at √s = 8 TeV in the lepton + jets channel with ATLAS


    Measurement of the charge asymmetry in top-quark pair production in the lepton-plus-jets final state in pp collision data at √s = 8 TeV with the ATLAS detector


    Charged-particle distributions at low transverse momentum in √s = 13 TeV pp interactions measured with the ATLAS detector at the LHC
