PAC-Bayesian Learning of Optimization Algorithms
We apply the PAC-Bayes theory to the setting of learning-to-optimize. To the
best of our knowledge, we present the first framework to learn optimization
algorithms with provable generalization guarantees (PAC-bounds) and explicit
trade-off between a high probability of convergence and a high convergence
speed. Even in the limit case, where convergence is guaranteed, our learned
optimization algorithms provably outperform related algorithms based on a
(deterministic) worst-case analysis. Our results rely on PAC-Bayes bounds for
general, unbounded loss-functions based on exponential families. By
generalizing existing ideas, we reformulate the learning procedure into a
one-dimensional minimization problem and study the possibility to find a global
minimum, which enables the algorithmic realization of the learning procedure.
As a proof-of-concept, we learn hyperparameters of standard optimization
algorithms to empirically underline our theory. Comment: Accepted to AISTATS 202
LS-NMF: A modified non-negative matrix factorization algorithm utilizing uncertainty estimates
BACKGROUND: Non-negative matrix factorisation (NMF), a machine learning algorithm, has been applied to the analysis of microarray data. A key feature of NMF is the ability to identify patterns that together explain the data as a linear combination of expression signatures. Microarray data generally include individual estimates of uncertainty for each gene in each condition; however, NMF does not exploit this information. Previous work has shown that such uncertainties can be extremely valuable for pattern recognition. RESULTS: We have created a new algorithm, least squares non-negative matrix factorization (LS-NMF), which integrates uncertainty measurements of gene expression data into the NMF updating rules. While the LS-NMF algorithm maintains the advantages of the original NMF algorithm, such as easy implementation and a guaranteed locally optimal solution, its performance in linking functionally related genes is improved. LS-NMF significantly outperforms NMF in identifying functionally related genes, as determined from annotations in the MIPS database. CONCLUSION: Uncertainty measurements on gene expression data provide valuable information for data analysis, and use of this information in the LS-NMF algorithm significantly improves the power of the NMF technique.
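The idea of folding per-measurement uncertainties into the NMF updating rules can be illustrated with a minimal sketch. It assumes the objective is the squared error weighted by 1/sigma², minimized with the standard weighted multiplicative updates; this is one common formulation and not necessarily the exact rules of the published LS-NMF algorithm. The names `weighted_nmf`, `chi2`, `sigma`, and `eps` are hypothetical.

```python
import numpy as np


def weighted_nmf(D, sigma, rank, n_iter=500, seed=0, eps=1e-12):
    """Factor non-negative D (genes x conditions) as W @ H, weighting each
    entry by 1 / sigma**2 (assumed form of the uncertainty weighting)."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = D.shape
    W = rng.random((n_rows, rank)) + eps   # amplitude matrix, kept non-negative
    H = rng.random((rank, n_cols)) + eps   # pattern matrix, kept non-negative
    inv_var = 1.0 / np.square(sigma)       # per-entry weights
    Dw = D * inv_var                       # weighted data, reused every iteration
    for _ in range(n_iter):
        # Multiplicative update for H with W held fixed.
        H *= (W.T @ Dw) / (W.T @ ((W @ H) * inv_var) + eps)
        # Multiplicative update for W with the new H held fixed.
        W *= (Dw @ H.T) / (((W @ H) * inv_var) @ H.T + eps)
    return W, H


def chi2(D, sigma, W, H):
    """Uncertainty-weighted squared error of the reconstruction W @ H."""
    return float(np.sum(np.square((D - W @ H) / sigma)))
```

Setting `sigma` to a constant array recovers the ordinary least-squares multiplicative updates, so entries with larger uncertainty simply contribute less to the fit.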
Association of hydrophobic organic compounds with dissolved soil organic carbon
Volatile hydrophobic organic compounds (HOCs) brought into soil and sediment
systems represent a serious threat to the environment. Dissolved
organic carbon (DOC) may exert an important influence over the total
aqueous solubility and mobility of organic pollutants through their
incorporation into micelles or the formation of soluble complexes.
To date, however, it is not clear whether DOC interacts with nonionic
volatile HOCs under natural conditions. Also, very little is known
about the influence of DOC-type and other environmental parameters.
A static headspace analysis technique was used in combination with
gas chromatography to determine the extent of interactions between
DOC and benzene, toluene, and chlorobenzene. Three different types
of DOC were compared: a natural water-soluble extract (WSE) obtained
from a Cumulic Humaquept (Labish Series), a commercial humic acid
(HA), and a high purity fulvic acid (FA). In factorial experiments,
the effect of type and concentration of DOC, solution pH, preparative
treatment of DOC solutions, and hydrophobicity of HOCs was examined.
DOC concentrations in the experiments ranged from 1.3 to 36.5
mmol·L⁻¹, the ionic background was 0.10 M KCl, and the temperature was held constant at 25 °C. The solution pH in different
experiments was 4.0, 6.5, and 9.0. The acidity of classes of DOC
functional groups, the influence of solution pH on the degree of
DOC-protonation, and the maximum number of protons complexed per mol
DOC were determined for WSE, HA, and FA through continuous
potentiometric titrations. Basic solutions 9.36 mM in DOC were
titrated with HCl under N₂ in a 0.1 M KCl ionic background at 25 °C.
At DOC concentrations ≥ 9.3 mmol·L⁻¹, a small percentage of HOC
molecules associated with DOC. However, no statistically significant
linear relation between DOC concentration and DOC-HOC interaction was
observed. Differences in the association of benzene, toluene, and
chlorobenzene with DOC were not consistent across the different DOC sources.
Toluene exhibited a higher affinity for unfiltered and aged WSE- and
HA-solutions when compared with freshly prepared, filtered solutions
of the same materials. Most HOC molecules associated with DOC at pH
6.5. HA and FA showed a significantly greater affinity for HOCs than
WSE. This reflects the acidic functional group properties of the
three DOC-sources.
WSE displayed the greatest capacity to complex protons,
although the FA and HA functional groups were more acidic. WSE, FA,
and HA complexed a maximum number of 0.196, 0.158, and 0.136 mol
H·mol⁻¹ DOC, respectively. Formation functions of WSE, HA, and FA
calculated from titration data were fitted to a chemical model
through a non-linear least-squares minimization program in order to
obtain conditional protonation constants (cK) for classes of
functional groups. Log cK values were 5.52 and 9.12 for WSE, 4.87 and 8.77 for FA, and 4.75, 7.62, and 9.39 for HA. These values were
in agreement with protonation constants obtained through a graphical
procedure. A comparison of the data obtained for WSE, HA, and FA
from association and titration experiments shows that the capacity to
complex protons and the affinity for HOCs are inversely related.
The relatively high functional group content of WSE corresponds to a
relatively low hydrophobicity, as expressed in the small extent of
WSE-HOC interactions when compared with HA and FA.
Determination of strongly overlapping signaling activity from microarray data
BACKGROUND: As numerous diseases involve errors in signal transduction, modern therapeutics often target proteins involved in cellular signaling. Interpretation of the activity of signaling pathways during disease development or therapeutic intervention would assist in drug development, design of therapy, and target identification. Microarrays provide a global measure of cellular response; however, linking these responses to signaling pathways requires an analytic approach tuned to the underlying biology. An ongoing issue in pattern recognition in microarrays has been how to determine the number of patterns (or clusters) to use for data interpretation, and this is a critical issue as measures of statistical significance in gene ontology or pathways rely on proper separation of genes into groups. RESULTS: Here we introduce a method relying on gene annotation coupled to decompositional analysis of global gene expression data that allows us to estimate specific activity on strongly coupled signaling pathways and, in some cases, activity of specific signaling proteins. We demonstrate the technique using the Rosetta yeast deletion mutant data set, decompositional analysis by Bayesian Decomposition, and annotation analysis using ClutrFree. We determined from measurements of gene persistence in patterns across multiple potential dimensionalities that 15 basis vectors provide the correct dimensionality for interpreting the data. Using gene ontology and data on gene regulation in the Saccharomyces Genome Database, we identified the transcriptional signatures of several cellular processes in yeast, including cell wall creation, ribosomal disruption, chemical blocking of protein synthesis, and, critically, individual signatures of the strongly coupled mating and filamentation pathways. CONCLUSION: This work demonstrates that microarray data can provide downstream indicators of pathway activity either through use of gene ontology or transcription factor databases. This can be used to investigate the specificity and success of targeted therapeutics as well as to elucidate signaling activity in normal and disease processes.
Pathway-specific differences between tumor cell lines and normal and tumor tissue cells
BACKGROUND: Cell lines are used in experimental investigation of cancer but their capacity to represent tumor cells has yet to be quantified. The aim of the study was to identify significant alterations in pathway usage in cell lines in comparison with normal and tumor tissue. METHODS: This study utilized a pathway-specific enrichment analysis of publicly accessible microarray data and quantified the gene expression differences between cell lines, tumor, and normal tissue cells for six different tissue types. KEGG pathways that are significantly different between cell lines and tumors, cell lines and normal tissues, and tumor and normal tissue were identified through enrichment tests on gene lists obtained using Significance Analysis of Microarrays (SAM). RESULTS: Cellular pathways that were significantly upregulated in cell lines compared to tumor cells and normal cells of the same tissue type included ATP synthesis, cell communication, cell cycle, oxidative phosphorylation, purine, pyrimidine and pyruvate metabolism, and proteasome. Results on metabolic pathways suggested an increase in the velocity of nucleotide metabolism and RNA production. Pathways that were downregulated in cell lines compared to tumor and normal tissue included cell communication, cell adhesion molecules (CAMs), and ECM-receptor interaction. Only a fraction of the significantly altered genes in the tumor-to-normal comparison had similar expression in cancer cell lines and tumor cells. These genes were tissue-specific and were distributed sparsely among multiple pathways. CONCLUSION: Significantly altered genes in tumors compared to normal tissue were largely tissue specific. Among these genes, downregulation was a major trend. In contrast, cell lines contained large sets of significantly upregulated genes that were common to multiple tissue types. Pathway upregulation in cell lines was most pronounced in metabolic pathways, including nucleotide metabolism and oxidative phosphorylation. Signaling pathways involved in adhesion and communication of cultured cancer cells were downregulated. The three-way pathway comparison presented in this study sheds light on the differences in the use of cellular pathways by tumor cells and cancer cell lines.
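The abstract does not say which enrichment test was applied to the SAM gene lists. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for pathway over-representation, not confirmed by the source), the p-value for one pathway could be computed as follows. The function name `enrichment_p_value` and its parameters are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric tail P(X >= n_overlap).

    n_universe: genes measured on the array
    n_pathway:  genes annotated to the pathway
    n_selected: genes in the significant list (for example, from SAM)
    n_overlap:  significant genes that also belong to the pathway
    """
    total = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

In practice the resulting p-values would be computed once per pathway and then corrected for multiple testing before any pathway is called significant.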
Hybrid Modeling of Cell Signaling and Transcriptional Reprogramming and Its Application in C. elegans Development
Modeling of signal-driven transcriptional reprogramming is critical for understanding organism development, human disease, and cell biology. Many current modeling techniques discount key features of the biological sub-systems when modeling multiscale, organism-level processes. We present a mechanistic hybrid model, GESSA, which integrates a novel pooled probabilistic Boolean network model of cell signaling and a stochastic simulation of transcription and translation responding to a diffusion model of extracellular signals. We apply the model to simulate the well-studied cell fate decision process of the vulval precursor cells (VPCs) in C. elegans, using experimentally derived rate constants wherever possible and shared parameters to avoid overfitting. We demonstrate that GESSA recovers (1) the effects of varying scaffold protein concentration on signal strength, (2) amplification of signals in expression, (3) the relative external ligand concentration in a known geometry, and (4) feedback in biochemical networks. We demonstrate that setting model parameters based on wild-type and LIN-12 loss-of-function mutants in C. elegans leads to correct prediction of a wide variety of mutants, including partial penetrance of phenotypes. Moreover, the model is relatively insensitive to parameters, retaining the wild-type phenotype for a wide range of cell signaling rate parameters.
Ceramic matrix composite turbine engine vane
A vane has an airfoil shell and a spar within the shell. The vane has an outboard shroud at an outboard end of the shell and an inboard platform at an inboard end of the shell. The spar has a first chamber essentially along the suction side and a second chamber along the pressure side opposite the first chamber.
Gene expression signatures modulated by epidermal growth factor receptor activation and their relationship to cetuximab resistance in head and neck squamous cell carcinoma.
BACKGROUND: Aberrant activation of signaling pathways downstream of epidermal growth factor receptor (EGFR) has been hypothesized to be one of the mechanisms of cetuximab (a monoclonal antibody against EGFR) resistance in head and neck squamous cell carcinoma (HNSCC). To infer relevant and specific pathway activation downstream of EGFR from gene expression in HNSCC, we generated gene expression signatures using immortalized keratinocytes (HaCaT) subjected to ligand stimulation and transfected with EGFR, RELA/p65, or HRASVal12D.
RESULTS: The gene expression patterns that distinguished the HaCaT variants and conditions were inferred using the Markov chain Monte Carlo (MCMC) matrix factorization algorithm Coordinated Gene Activity in Pattern Sets (CoGAPS). This approach inferred gene expression signatures with greater relevance to cell signaling pathway activation than the expression signatures inferred with standard linear models. Furthermore, the pathway signature generated using HaCaT-HRASVal12D was associated with the cetuximab treatment response in isogenic cetuximab-sensitive (UMSCC1) and -resistant (1CC8) cell lines.
CONCLUSIONS: Our data suggest that the CoGAPS algorithm can generate gene expression signatures that are pertinent to downstream effects of receptor signaling pathway activation and may be useful in modeling resistance mechanisms to targeted therapies.