143 research outputs found

    Comparison of leading parallel NAS file systems on commodity hardware

    High-performance computing has experienced tremendous gains in system performance over the past 20 years. Unfortunately, other system capabilities, such as file I/O, have not grown commensurately. In this activity, we present the results of our tests of two leading file systems (GPFS and Lustre) on the same physical hardware. This hardware is the standard commodity storage solution in use at LLNL and, while much smaller in size, is intended to enable us to learn about differences between the two systems in terms of performance, ease of use and resilience. To the authors' knowledge, this work represents the first hardware-consistent study of the two leading file systems.

    Empirical Bayes models for multiple probe type microarrays at the probe level

    Background: When analyzing microarray data, a primary objective is often to find differentially expressed genes. With empirical Bayes and penalized t-tests, the sample variances are adjusted towards a global estimate, producing more stable results compared to ordinary t-tests. However, for Affymetrix-type data a clear dependency between variability and intensity level generally exists, even for logged intensities, most clearly for data at the probe level but also for probe-set summaries such as the MAS5 expression index. As a consequence, adjustment towards a global estimate results in an intensity-level-dependent false positive rate. Results: We propose two new methods for finding differentially expressed genes, Probe level Locally moderated Weighted median-t (PLW) and Locally Moderated Weighted-t (LMW). Both methods use an empirical Bayes model that takes the dependency between variability and intensity level into account. A global covariance matrix is also used, allowing for differing variances between arrays as well as array-to-array correlations. PLW is specially designed for Affymetrix-type arrays (or other multiple-probe arrays). Instead of making inference on probe-set summaries, comparisons are made separately for each perfect-match probe and are then summarized into one score for the probe-set. Conclusion: The proposed methods are compared to 14 existing methods using five spike-in data sets. For RMA and GCRMA processed data, PLW has the most accurate ranking of regulated genes in four out of the five data sets, and LMW consistently performs better than all examined moderated t-tests when used on RMA, GCRMA, and MAS5 expression indexes.
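
    The adjustment "towards a global estimate" that this abstract attributes to existing moderated t-tests can be illustrated with a minimal sketch. It shows the generic variance-shrinkage form used by moderated t-statistics, not the PLW or LMW methods themselves. The function names and the parameters `s0_sq` (global prior variance) and `d0` (prior degrees of freedom) are assumptions introduced here for illustration.

```python
from math import sqrt


def moderated_variance(s_sq, d, s0_sq, d0):
    """Shrink a per-gene sample variance towards a global estimate.

    s_sq  : per-gene sample variance with d degrees of freedom
    s0_sq : global (prior) variance with d0 prior degrees of freedom
    Both s0_sq and d0 are assumed to be estimated elsewhere.
    """
    return (d0 * s0_sq + d * s_sq) / (d0 + d)


def moderated_t(mean_diff, s_sq, d, s0_sq, d0, n1, n2):
    """Two-group t-statistic that uses the shrunken variance."""
    s_tilde_sq = moderated_variance(s_sq, d, s0_sq, d0)
    return mean_diff / sqrt(s_tilde_sq * (1.0 / n1 + 1.0 / n2))


if __name__ == "__main__":
    # A gene with a noisy variance estimate is pulled towards the global value.
    print(moderated_variance(4.0, 2, 1.0, 2))
    print(moderated_t(1.0, 4.0, 2, 1.0, 2, 2, 2))
```

    Because `s0_sq` is a single global value in this sketch, it ignores the intensity dependence of the variance, which is the limitation the abstract says PLW and LMW address by moderating locally.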

    Discovering collectively informative descriptors from high-throughput experiments

    Background: Improvements in high-throughput technology and its increasing use have led to the generation of many highly complex datasets that often address similar biological questions. Combining information from these studies can increase the reliability and generalizability of results and also yield new insights that guide future research. Results: This paper describes a novel algorithm called BLANKET for symmetric analysis of two experiments that assess informativeness of descriptors. The experiments are required to be related only in that their descriptor sets intersect substantially and their definitions of case and control are consistent. From resulting lists of n descriptors ranked by informativeness, BLANKET determines shortlists of descriptors from each experiment, generally of different lengths p and q. For any pair of shortlists, four numbers are evident: the number of descriptors appearing in both shortlists, in exactly one shortlist (counted separately for each experiment), or in neither shortlist. From the associated contingency table, BLANKET computes Right Fisher Exact Test (RFET) values used as scores over a plane of possible pairs of shortlist lengths [1, 2]. BLANKET then chooses a pair or pairs with RFET score less than a threshold; the threshold depends upon n and shortlist length limits and represents a quality of intersection achieved by less than 5% of random lists. Conclusions: Researchers seek within a universe of descriptors some minimal subset that collectively and efficiently predicts experimental outcomes. Ideally, any smaller subset should be insufficient for reliable prediction and any larger subset should have little additional accuracy. As a method, BLANKET is easy to conceptualize and presents only moderate computational complexity. Many existing databases could be mined using BLANKET to suggest optimal sets of predictive descriptors.
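
    The contingency table and right-tailed Fisher exact score described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the BLANKET implementation: it assumes both ranked lists contain the same n descriptors, and the function names `shortlist_table` and `right_fisher_exact` are hypothetical. The threshold selection over the plane of (p, q) pairs is not shown.

```python
from math import comb


def shortlist_table(ranked_a, ranked_b, p, q):
    """Return (both, only_a, only_b, neither) for the top-p and top-q shortlists.

    Assumes ranked_a and ranked_b are rankings of the same n descriptors.
    """
    top_a = set(ranked_a[:p])
    top_b = set(ranked_b[:q])
    both = len(top_a & top_b)
    n = len(ranked_a)
    return both, p - both, q - both, n - p - q + both


def right_fisher_exact(n, p, q, k):
    """P(overlap >= k) if shortlists of sizes p and q were chosen at random
    from n descriptors (upper tail of the hypergeometric distribution)."""
    upper = min(p, q)
    tail = sum(comb(p, i) * comb(n - p, q - i) for i in range(k, upper + 1))
    return tail / comb(n, q)


if __name__ == "__main__":
    ranked_a = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
    ranked_b = ["g2", "g1", "g3", "g9", "g5", "g4", "g10", "g8", "g7", "g6"]
    both, only_a, only_b, neither = shortlist_table(ranked_a, ranked_b, 3, 3)
    print(both, only_a, only_b, neither)
    print(right_fisher_exact(len(ranked_a), 3, 3, both))
```

    A small score means the observed overlap would rarely arise from random lists; scanning it over many (p, q) pairs gives the plane of scores the abstract refers to.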

    Comparative Functional Genomics Analysis of NNK Tobacco-Carcinogen Induced Lung Adenocarcinoma Development in Gprc5a-Knockout Mice

    Background: Improved understanding of lung cancer development and progression, including insights from studies of animal models, is needed to combat this fatal disease. Previously, we found that mice with a knockout (KO) of G-protein coupled receptor 5A (Gprc5a) develop lung tumors after a long latent period (12 to 24 months). Methodology/Principal Findings: To determine whether a tobacco carcinogen will enhance tumorigenesis in this model, we administered 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) i.p. to 2-month-old Gprc5a-KO mice and sacrificed groups (n = 5) of mice 6, 9, 12, and 18 months later. Compared to control Gprc5a-KO mice, NNK-treated mice developed lung tumors at least 6 months earlier, exhibited 2- to 4-fold increased tumor incidence and multiplicity, and showed a dramatic increase in lesion size. A gene expression signature, NNK-ADC, of differentially expressed genes derived by transcriptome analysis of epithelial cell lines from normal lungs of Gprc5a-KO mice and from NNK-induced adenocarcinoma was highly similar to differential expression patterns observed between normal and tumorigenic human lung cells. The NNK-ADC expression signature also separated both mouse and human adenocarcinomas from adjacent normal lung tissues based on publicly available microarray datasets. A key feature of the signature, up-regulation of Ube2c, Mcm2, and Fen1, was validated in mouse normal lung and adenocarcinoma tissues and cells by immunohistochemistry and western blotting, respectively.

    An expression meta-analysis of predicted microRNA targets identifies a diagnostic signature for lung cancer

    Background: Patients diagnosed with lung adenocarcinoma (AD) and squamous cell carcinoma (SCC), two major histologic subtypes of lung cancer, currently receive similar standard treatments, but resistance to adjuvant chemotherapy is prevalent. Identification of differentially expressed genes marking AD and SCC may prove to be of diagnostic value, help unravel the molecular basis of their histogenesis and biology, and lead to more effective and specific systemic therapy. Methods: MiRNA target genes were predicted by the union of miRanda, TargetScan, and PicTar, followed by screening for matched gene symbols in NCBI human sequences and Gene Ontology (GO) terms using the PANTHER database, which was also used for analyzing the significance of biological processes and pathways within each ontology term. Microarray data were extracted from the Gene Expression Omnibus repository, and tumor subtype prediction by gene expression used Prediction Analysis of Microarrays. Results: Computationally predicted target genes of three microRNAs, miR-34b/34c/449, that were detected in human lung, testis, and fallopian tubes but not in other normal tissues, were filtered by representation of GO terms and their ability to classify lung cancer subtypes, followed by a meta-analysis of microarray data to classify AD and SCC. Expression of a minimal set of 17 predicted miR-34b/34c/449 target genes derived from the developmental process GO category was identified from a training set to classify 41 AD and 17 SCC, and correctly predicted on average 87% of 354 AD and 82% of 282 SCC specimens from a total of 9 independent published datasets. The accuracy of prediction remains comparable when classifying 103 AD and 79 SCC samples from another 4 published datasets that have only 14 to 16 of the 17 genes available for prediction (84% and 85% for AD and SCC, respectively). Expression of this signature in two published datasets of epithelial cells obtained at bronchoscopy from cigarette smokers, if combined with cytopathology of the cells, yielded 89–90% sensitivity for lung cancer detection and 87–90% negative predictive value for non-cancer patients. Conclusion: This study focuses on predicted targets of three lung-enriched miRNAs, compares their expression patterns in lung cancer by their GO terms, and identifies a minimal set of genes differentially expressed in AD and SCC, followed by validating this gene signature in multiple published datasets. Expression of this gene signature in bronchial epithelial cells of cigarette smokers also shows high sensitivity for identifying patients with lung cancer when combined with cytopathology of the cells.

    Meta-analysis of muscle transcriptome data using the MADMuscle database reveals biologically relevant gene patterns

    Background: DNA microarray technology has had a great impact on muscle research, and microarray gene expression data have been widely used to identify gene signatures characteristic of the studied conditions. With the rapid accumulation of muscle microarray data, it is of great interest to understand how to compare and combine data across multiple studies. Meta-analysis of transcriptome data is a valuable method for achieving this. It highlights gene signatures that are conserved between multiple independent studies. However, its use is made difficult by the diversity of the available data: different microarray platforms, different gene nomenclature, different species studied, etc. Description: We have developed a system dedicated to muscle transcriptome data. This system comprises a collection of microarray data as well as a query tool. The latter allows the user to extract similar clusters of co-expressed genes from the database, using an input gene list. Common and relevant gene signatures can thus be searched more easily. The dedicated database consists of a large compendium of public data (more than 500 data sets) related to muscle (skeletal and heart). These studies included seven different animal species from invertebrates (Drosophila melanogaster, Caenorhabditis elegans) and vertebrates (Homo sapiens, Mus musculus, Rattus norvegicus, Canis familiaris, Gallus gallus). After a renormalization step, clusters of co-expressed genes were identified in each dataset. The lists of co-expressed genes were annotated using a unified re-annotation procedure. These gene lists were compared to find significant overlaps between studies. Conclusions: Applied to this large compendium of data sets, meta-analyses demonstrated that conserved patterns between species could be identified. Focusing on a specific pathology (Duchenne Muscular Dystrophy), we validated results across independent studies and revealed robust biomarkers and new pathways of interest. The meta-analyses performed with MADMuscle show the usefulness of this approach. Our method can be applied to all public transcriptome data.

    Immuno-Therapy with Anti-CTLA4 Antibodies in Tolerized and Non-Tolerized Mouse Tumor Models

    Monoclonal antibodies specific for cytotoxic T lymphocyte-associated antigen 4 (anti-CTLA4) are a novel form of cancer immunotherapy. While preclinical studies in mouse tumor models have shown anti-tumor efficacy of anti-CTLA4 injection or expression, anti-CTLA4 treatment in patients with advanced cancers had disappointing therapeutic benefit. These discrepancies have to be addressed in more adequate preclinical models. We employed two tumor models. The first model is based on C57Bl/6 mice and syngeneic TC-1 tumors expressing HPV16 E6/E7. In this model, the HPV antigens are neo-antigens, against which no central tolerance exists. The second model involves mice transgenic for the proto-oncogene neu and syngeneic mouse mammary carcinoma (MMC) cells. In this model, tolerance to Neu involves both central and peripheral mechanisms. Anti-CTLA4 delivery as a protein or expression from gene-modified tumor cells was therapeutically efficacious in the non-tolerized TC-1 tumor model, but had no effect in the MMC model. We also used the two tumor models to test an immuno-gene therapy approach for anti-CTLA4. Recently, we used an approach based on hematopoietic stem cells (HSC) to deliver the relaxin gene to tumors and showed that this approach facilitates pre-existing anti-tumor T-cells to control tumor growth in the MMC tumor model. However, unexpectedly, when used for anti-CTLA4 gene delivery in this study, the HSC-based approach was therapeutically detrimental in both the TC-1 and MMC models. Anti-CTLA4 expression in these models resulted in an increase in the number of intratumoral CD1d+ NKT cells and in the expression of TGF-β1. At the same time, levels of pro-inflammatory cytokines and chemokines, which potentially can support anti-tumor T-cell responses, were lower in tumors of mice that received anti-CTLA4-HSC therapy. The differences in outcomes between the tolerized and non-tolerized models also provide a potential explanation for the low efficacy of CTLA4 blockade approaches in cancer immunotherapy trials.

    Protein Signature of Lung Cancer Tissues

    Lung cancer remains the most common cause of cancer-related mortality. We applied a highly multiplexed proteomic technology (SOMAscan) to compare protein expression signatures of non-small-cell lung cancer (NSCLC) tissues with healthy adjacent and distant tissues from surgical resections. In this first report of SOMAscan applied to tissues, we highlight 36 proteins that exhibit the largest expression differences between matched tumor and non-tumor tissues. The concentrations of twenty proteins increased and sixteen decreased in tumor tissue, thirteen of which are novel for NSCLC. NSCLC tissue biomarkers identified here overlap with a core set identified in a large serum-based NSCLC study with SOMAscan. We show that large-scale comparative analysis of protein expression can be used to develop novel histochemical probes. As expected, relative differences in protein expression are greater in tissues than in serum. The combined results from tissue and serum present the most extensive view to date of the complex changes in NSCLC protein expression and provide important implications for diagnosis and treatment.

    A Critical Review on the Structural Health Monitoring Methods of the Composite Wind Turbine Blades

    With increasing turbine size, monitoring of blades becomes increasingly important in order to prevent catastrophic damage and unnecessary maintenance, minimize downtime and labor cost, and improve safety and reliability. The present work provides a review and classification of various structural health monitoring (SHM) methods, such as strain measurement utilizing optical fiber sensors and Fiber Bragg Gratings (FBGs), the active/passive acoustic emission method, the vibration-based method, the thermal imaging method and ultrasonic methods, based on recent investigations and promising novel techniques. Since accuracy, comprehensiveness and cost-effectiveness are the fundamental parameters in selecting the SHM method, a systematically summarized investigation encompassing the methods' capabilities/limitations and sensor types is needed. Furthermore, the types of damage included in the present work are fiber breakage, matrix cracking, delamination, fiber debonding, crack opening at the leading/trailing edge and ice accretion. Taking into account the types of sensors relevant to the different SHM methods, the advantages/capabilities and disadvantages/limitations of the presented methods are identified and analyzed.

    Large-scale integration of cancer microarray data identifies a robust common cancer signature

    Background: There is a continuing need to develop molecular diagnostic tools which complement histopathologic examination to increase the accuracy of cancer diagnosis. DNA microarrays provide a means for measuring gene expression signatures which can then be used as components of genomic-based diagnostic tests to determine the presence of cancer. Results: In this study, we collect and integrate ~1500 microarray gene expression profiles from 26 published cancer data sets across 21 major human cancer types. We then apply a statistical method, referred to as the Top-Scoring Pair of Groups (TSPG) classifier, and a repeated random sampling strategy to the integrated training data sets and identify a common cancer signature consisting of 46 genes. These 46 genes are naturally divided into two distinct groups; those in one group are typically expressed less than those in the other group for cancer tissues. Given a new expression profile, the classifier discriminates cancer from normal tissues by ranking the expression values of the 46 genes in the cancer signature and comparing the average ranks of the two groups. This signature is then validated by applying this decision rule to independent test data. Conclusion: By combining the TSPG method and repeated random sampling, a robust common cancer signature has been identified from large-scale microarray data integration. Upon further validation, this signature may be useful as a robust and objective diagnostic test for cancer.
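
    The decision rule described in this abstract (rank the signature genes within one profile, then compare the average ranks of the two groups) can be sketched as below. This is an illustration under stated assumptions rather than the published TSPG classifier: the gene names and group contents are made up, the function name `tspg_predict` is hypothetical, and ties in expression or in average rank are resolved arbitrarily here because the abstract does not say how they are handled.

```python
def tspg_predict(profile, low_in_cancer, high_in_cancer):
    """Classify one expression profile by comparing average within-profile ranks.

    profile        : dict mapping gene name -> expression value
    low_in_cancer  : genes assumed to be expressed less in cancer tissue
    high_in_cancer : genes assumed to be expressed more in cancer tissue
    """
    genes = list(low_in_cancer) + list(high_in_cancer)
    # Rank 1 is the lowest expression value among the signature genes.
    ordered = sorted(genes, key=lambda g: profile[g])
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}
    mean_low = sum(rank[g] for g in low_in_cancer) / len(low_in_cancer)
    mean_high = sum(rank[g] for g in high_in_cancer) / len(high_in_cancer)
    # Cancer is called when the "low" group really does rank below the "high" group.
    return "cancer" if mean_low < mean_high else "normal"


if __name__ == "__main__":
    low_group = ["geneA", "geneB"]    # hypothetical signature genes
    high_group = ["geneC", "geneD"]
    tumor_like = {"geneA": 1.2, "geneB": 2.0, "geneC": 5.5, "geneD": 6.1}
    normal_like = {"geneA": 6.0, "geneB": 5.1, "geneC": 1.0, "geneD": 2.2}
    print(tspg_predict(tumor_like, low_group, high_group))
    print(tspg_predict(normal_like, low_group, high_group))
```

    Because only the ordering of values within a single profile is used, the rule does not depend on the measurement scale of each array, which is a plausible reason a rank-based rule suits data integrated from many platforms.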