604 research outputs found

    PAC-Bayesian Learning of Optimization Algorithms

    We apply PAC-Bayes theory to the setting of learning-to-optimize. To the best of our knowledge, we present the first framework to learn optimization algorithms with provable generalization guarantees (PAC-bounds) and an explicit trade-off between a high probability of convergence and a high convergence speed. Even in the limit case, where convergence is guaranteed, our learned optimization algorithms provably outperform related algorithms based on a (deterministic) worst-case analysis. Our results rely on PAC-Bayes bounds for general, unbounded loss functions based on exponential families. By generalizing existing ideas, we reformulate the learning procedure into a one-dimensional minimization problem and study the possibility of finding a global minimum, which enables the algorithmic realization of the learning procedure. As a proof of concept, we learn hyperparameters of standard optimization algorithms to empirically underline our theory. Comment: Accepted to AISTATS 202

    LS-NMF: A modified non-negative matrix factorization algorithm utilizing uncertainty estimates

    BACKGROUND: Non-negative matrix factorisation (NMF), a machine learning algorithm, has been applied to the analysis of microarray data. A key feature of NMF is the ability to identify patterns that together explain the data as a linear combination of expression signatures. Microarray data generally include individual estimates of uncertainty for each gene in each condition; however, NMF does not exploit this information. Previous work has shown that such uncertainties can be extremely valuable for pattern recognition. RESULTS: We have created a new algorithm, least squares non-negative matrix factorization (LS-NMF), which integrates uncertainty measurements of gene expression data into the NMF updating rules. While the LS-NMF algorithm maintains the advantages of the original NMF algorithm, such as easy implementation and a guaranteed locally optimal solution, its performance in terms of linking functionally related genes is improved. LS-NMF significantly exceeds NMF in identifying functionally related genes as determined from annotations in the MIPS database. CONCLUSION: Uncertainty measurements on gene expression data provide valuable information for data analysis, and use of this information in the LS-NMF algorithm significantly improves the power of the NMF technique.
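
    The abstract above does not state the LS-NMF updating rules, so the following is only a minimal sketch of one common way to fold per-entry uncertainties into NMF: multiplicative updates for a weighted least-squares objective, with weights assumed to be 1/sigma^2. The function names (`weighted_nmf`, `weighted_error`) and all parameters are hypothetical and not taken from the paper.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def hadamard(a, b):
    """Element-wise product of two equally shaped matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def weighted_nmf(data, sigma, rank, iterations=200, seed=0, eps=1e-12):
    """Factor data ~= w * h with non-negative w and h.

    Assumed objective: sum over entries of ((data - w*h) / sigma)^2,
    minimised with weighted multiplicative updates.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(data), len(data[0])
    # Assumption: each entry is weighted by the inverse variance 1 / sigma^2.
    weights = [[1.0 / (s * s) for s in row] for row in sigma]
    weighted_data = hadamard(weights, data)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # Update h: h *= (w^T (U o V)) / (w^T (U o (w h)))
        wt = transpose(w)
        num = matmul(wt, weighted_data)
        den = matmul(wt, hadamard(weights, matmul(w, h)))
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # Update w: w *= ((U o V) h^T) / ((U o (w h)) h^T)
        ht = transpose(h)
        num = matmul(weighted_data, ht)
        den = matmul(hadamard(weights, matmul(w, h)), ht)
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


def weighted_error(data, sigma, w, h):
    """Sum of squared residuals, each scaled by its uncertainty."""
    approx = matmul(w, h)
    return sum(((d - a) / s) ** 2
               for rd, ra, rs in zip(data, approx, sigma)
               for d, a, s in zip(rd, ra, rs))
```

    With uniform `sigma` these updates reduce to the ordinary least-squares NMF updates; entries with large uncertainty simply contribute less to both numerator and denominator.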

    Determination of strongly overlapping signaling activity from microarray data

    BACKGROUND: As numerous diseases involve errors in signal transduction, modern therapeutics often target proteins involved in cellular signaling. Interpretation of the activity of signaling pathways during disease development or therapeutic intervention would assist in drug development, design of therapy, and target identification. Microarrays provide a global measure of cellular response; however, linking these responses to signaling pathways requires an analytic approach tuned to the underlying biology. An ongoing issue in pattern recognition in microarrays has been how to determine the number of patterns (or clusters) to use for data interpretation. This is a critical issue, as measures of statistical significance in gene ontology or pathways rely on proper separation of genes into groups. RESULTS: Here we introduce a method relying on gene annotation coupled to decompositional analysis of global gene expression data that allows us to estimate specific activity on strongly coupled signaling pathways and, in some cases, activity of specific signaling proteins. We demonstrate the technique using the Rosetta yeast deletion mutant data set, decompositional analysis by Bayesian Decomposition, and annotation analysis using ClutrFree. We determined from measurements of gene persistence in patterns across multiple potential dimensionalities that 15 basis vectors provide the correct dimensionality for interpreting the data. Using gene ontology and data on gene regulation in the Saccharomyces Genome Database, we identified the transcriptional signatures of several cellular processes in yeast, including cell wall creation, ribosomal disruption, chemical blocking of protein synthesis, and, critically, individual signatures of the strongly coupled mating and filamentation pathways. CONCLUSION: This work demonstrates that microarray data can provide downstream indicators of pathway activity through use of either gene ontology or transcription factor databases. This can be used to investigate the specificity and success of targeted therapeutics as well as to elucidate signaling activity in normal and disease processes.

    Pathway-specific differences between tumor cell lines and normal and tumor tissue cells

    BACKGROUND: Cell lines are used in experimental investigation of cancer, but their capacity to represent tumor cells has yet to be quantified. The aim of the study was to identify significant alterations in pathway usage in cell lines in comparison with normal and tumor tissue. METHODS: This study utilized a pathway-specific enrichment analysis of publicly accessible microarray data and quantified the gene expression differences between cell lines, tumor, and normal tissue cells for six different tissue types. KEGG pathways that differ significantly between cell lines and tumors, cell lines and normal tissues, and tumor and normal tissue were identified through enrichment tests on gene lists obtained using Significance Analysis of Microarrays (SAM). RESULTS: Cellular pathways that were significantly upregulated in cell lines compared to tumor cells and normal cells of the same tissue type included ATP synthesis, cell communication, cell cycle, oxidative phosphorylation, purine, pyrimidine and pyruvate metabolism, and proteasome. Results on metabolic pathways suggested an increase in the velocity of nucleotide metabolism and RNA production. Pathways that were downregulated in cell lines compared to tumor and normal tissue included cell communication, cell adhesion molecules (CAMs), and ECM-receptor interaction. Only a fraction of the significantly altered genes in the tumor-to-normal comparison had similar expression in cancer cell lines and tumor cells. These genes were tissue-specific and were distributed sparsely among multiple pathways. CONCLUSION: Significantly altered genes in tumors compared to normal tissue were largely tissue-specific. Among these genes, downregulation was a major trend. In contrast, cell lines contained large sets of significantly upregulated genes that were common to multiple tissue types. Pathway upregulation in cell lines was most pronounced over metabolic pathways, including cell nucleotide metabolism and oxidative phosphorylation. Signaling pathways involved in adhesion and communication of cultured cancer cells were downregulated. The three-way pathway comparison presented in this study sheds light on the differences in the use of cellular pathways by tumor cells and cancer cell lines.
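
    The abstract above does not say which enrichment test was applied to the SAM gene lists. As an illustration only, a one-sided hypergeometric test is a common choice for asking whether a pathway is over-represented in a gene list; the sketch below assumes that test, and the function name and arguments are hypothetical.

```python
from math import comb


def enrichment_p_value(population, pathway_size, selected, overlap):
    """One-sided hypergeometric tail probability P(X >= overlap).

    population   -- total number of genes measured
    pathway_size -- number of those genes annotated to the pathway
    selected     -- number of genes in the significant gene list
    overlap      -- number of selected genes that belong to the pathway
    """
    total = comb(population, selected)
    upper = min(pathway_size, selected)
    # Sum the probabilities of seeing `overlap` or more pathway genes by chance.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

    A small p-value suggests the pathway holds more significant genes than random sampling would explain; in practice the values would still need correction for testing many pathways at once.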

    Hybrid Modeling of Cell Signaling and Transcriptional Reprogramming and Its Application in C. elegans Development

    Modeling of signal-driven transcriptional reprogramming is critical for understanding organism development, human disease, and cell biology. Many current modeling techniques discount key features of the biological sub-systems when modeling multiscale, organism-level processes. We present a mechanistic hybrid model, GESSA, which integrates a novel pooled probabilistic Boolean network model of cell signaling and a stochastic simulation of transcription and translation responding to a diffusion model of extracellular signals. We apply the model to simulate the well-studied cell fate decision process of the vulval precursor cells (VPCs) in C. elegans, using experimentally derived rate constants wherever possible and shared parameters to avoid overfitting. We demonstrate that GESSA recovers (1) the effects of varying scaffold protein concentration on signal strength, (2) amplification of signals in expression, (3) the relative external ligand concentration in a known geometry, and (4) feedback in biochemical networks. We demonstrate that setting model parameters based on wild-type and LIN-12 loss-of-function mutants in C. elegans leads to correct prediction of a wide variety of mutants, including partial penetrance of phenotypes. Moreover, the model is relatively insensitive to parameters, retaining the wild-type phenotype for a wide range of cell signaling rate parameters.

    Ceramic matrix composite turbine engine vane

    A vane has an airfoil shell and a spar within the shell. The vane has an outboard shroud at an outboard end of the shell and an inboard platform at an inboard end of the shell. The spar has a first chamber essentially along the suction side and a second chamber along the pressure side opposite the first chamber

    Gene expression signatures modulated by epidermal growth factor receptor activation and their relationship to cetuximab resistance in head and neck squamous cell carcinoma.

    BACKGROUND: Aberrant activation of signaling pathways downstream of epidermal growth factor receptor (EGFR) has been hypothesized to be one of the mechanisms of cetuximab (a monoclonal antibody against EGFR) resistance in head and neck squamous cell carcinoma (HNSCC). To infer relevant and specific pathway activation downstream of EGFR from gene expression in HNSCC, we generated gene expression signatures using immortalized keratinocytes (HaCaT) subjected to ligand stimulation and transfected with EGFR, RELA/p65, or HRASVal12D. RESULTS: The gene expression patterns that distinguished the HaCaT variants and conditions were inferred using the Markov chain Monte Carlo (MCMC) matrix factorization algorithm Coordinated Gene Activity in Pattern Sets (CoGAPS). This approach inferred gene expression signatures with greater relevance to cell signaling pathway activation than the expression signatures inferred with standard linear models. Furthermore, the pathway signature generated using HaCaT-HRASVal12D was associated with the cetuximab treatment response in isogenic cetuximab-sensitive (UMSCC1) and -resistant (1CC8) cell lines. CONCLUSIONS: Our data suggest that the CoGAPS algorithm can generate gene expression signatures that are pertinent to downstream effects of receptor signaling pathway activation and may be useful in modeling resistance mechanisms to targeted therapies.