
    Reliability Enhancement of Predictive Computational Models in Hydroscience and Engineering

    Source: ICHE Conference Archive - https://mdi-de.baw.de/icheArchiv

    Evolutionary nonnegative matrix factorization for data compression

    This paper aims to improve non-negative matrix factorization (NMF) to facilitate data compression. An evolutionary updating strategy is proposed to solve the NMF problem iteratively based on three sets of updating rules: multiplicative, firefly and survival-of-the-fittest rules. For data compression applications, the quality of the factorized matrices can be evaluated by measurements such as sparsity, orthogonality and factorization error, which assess compression quality in terms of storage space consumption, redundancy in the data matrix and data approximation accuracy. Thus, the fitness score function that drives the evolving procedure is designed as a composite score that takes into account all these measurements. A hybrid initialization scheme is performed to improve the rate of convergence, allowing multiple initial candidates generated by different types of NMF initialization approaches. Effectiveness of the proposed method is demonstrated using the Yale and ORL image datasets.
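
    As a rough illustration of the ingredients named in this abstract, the sketch below runs plain multiplicative NMF updates (the standard Lee-Seung rule for the Frobenius objective) and scores the result on factorization error, sparsity and orthogonality. It is not the paper's evolutionary method: the firefly and survival-of-the-fittest rules are omitted, and the function names, the sparsity and orthogonality definitions, and the equal weights in `composite_fitness` are all assumptions made for this example.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate V ~= W @ H with non-negative W, H (Lee-Seung updates)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def factorization_error(V, W, H):
    """Relative Frobenius reconstruction error."""
    return float(np.linalg.norm(V - W @ H, "fro") / np.linalg.norm(V, "fro"))

def sparsity(M, tol=1e-6):
    """Fraction of near-zero entries (one simple sparsity proxy, assumed here)."""
    return float(np.mean(M < tol))

def orthogonality_penalty(W, eps=1e-12):
    """Distance of the column-normalized Gram matrix from the identity."""
    Wn = W / (np.linalg.norm(W, axis=0, keepdims=True) + eps)
    G = Wn.T @ Wn
    return float(np.linalg.norm(G - np.eye(G.shape[0]), "fro"))

def composite_fitness(V, W, H, w_sparse=1.0, w_orth=1.0, w_err=1.0):
    """Hypothetical composite score: higher is better. Weights are assumptions."""
    return (w_sparse * sparsity(H)
            - w_orth * orthogonality_penalty(W)
            - w_err * factorization_error(V, W, H))
```

    In an evolutionary scheme of the kind described, a score like `composite_fitness` would rank several candidate pairs `(W, H)` produced from different initializations; how the paper actually combines the three measurements is not stated in the abstract.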

    Quantifying the Informativeness of Similarity Measurements

    In this paper, we describe an unsupervised measure for quantifying the 'informativeness' of correlation matrices formed from the pairwise similarities or relationships among data instances. The measure quantifies the heterogeneity of the correlations and is defined as the distance between a correlation matrix and the nearest correlation matrix with constant off-diagonal entries. This non-parametric notion generalizes existing test statistics for equality of correlation coefficients by allowing for alternative distance metrics, such as the Bures and other distances from quantum information theory. For several distance and dissimilarity metrics, we derive closed-form expressions of informativeness, which can be applied as objective functions for machine learning applications. Empirically, we demonstrate that informativeness is a useful criterion for selecting kernel parameters, choosing the dimension for kernel-based nonlinear dimensionality reduction, and identifying structured graphs. We also consider the problem of finding a maximally informative correlation matrix around a target matrix, and explore parameterizing the optimization in terms of the coordinates of the sample or through a lower-dimensional embedding. In the latter case, we find that maximizing the Bures-based informativeness measure, which is maximal for centered rank-1 correlation matrices, is equivalent to minimizing a specific matrix norm, and present an algorithm to solve the minimization problem using the norm's proximal operator. The proposed correlation denoising algorithm consistently improves spectral clustering. Overall, we find informativeness to be a novel and useful criterion for identifying non-trivial correlation structure.
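
    To make the definition concrete, the sketch below computes one instance of the measure under the assumption that the distance is the Frobenius norm. In that case the nearest matrix with unit diagonal and constant off-diagonal entries uses the mean of the off-diagonal correlations, which is always a valid correlation matrix when the input is one. The function names are hypothetical, and the paper's closed forms for the Bures and other distances are not reproduced here.

```python
import numpy as np

def nearest_constant_correlation(C):
    """Frobenius-nearest matrix with unit diagonal and constant off-diagonals."""
    n = C.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    rho = C[off_diagonal].mean()  # least-squares fit of a single constant
    E = np.full((n, n), rho)
    np.fill_diagonal(E, 1.0)
    return E

def frobenius_informativeness(C):
    """Distance from C to its nearest constant-off-diagonal correlation matrix."""
    return float(np.linalg.norm(C - nearest_constant_correlation(C), "fro"))

# Two tight clusters: heterogeneous correlations, so the score is positive.
blocks = np.array([
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
])
# All pairs equally correlated: no structure, so the score is zero.
uniform = np.full((4, 4), 0.3)
np.fill_diagonal(uniform, 1.0)
```

    Under this assumption, `frobenius_informativeness(uniform)` is zero while the block-structured matrix scores higher, matching the intuition that heterogeneous correlations carry more structure.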

    Numerical Simulation of Chemical Spills and Assessment of Environmental Impacts

    Source: ICHE Conference Archive - https://mdi-de.baw.de/icheArchiv

    Mutation signature analysis identifies increased mutation caused by tobacco smoke associated DNA adducts in larynx squamous cell carcinoma compared with oral cavity and oropharynx.

    Squamous cell carcinomas of the head and neck (HNSCC) arise from mucosal keratinocytes of the upper aero-digestive tract. Despite a common cell of origin and similar driver-gene mutations which divert cell fate from differentiation to proliferation, HNSCC are considered a heterogeneous group of tumors categorized by site of origin within the aero-digestive mucosa, and the presence or absence of HPV infection. Tobacco use is a major driver of carcinogenesis in HNSCC and is a poor prognosticator that has previously been associated with poor immune cell infiltration and higher mutation numbers. Here, we study patterns of mutations in HNSCC that are derived from the specific nucleotide changes and their surrounding nucleotide context (also known as mutation signatures). We identify that mutations linked to DNA adducts associated with tobacco smoke exposure are predominantly found in the larynx. Presence of this class of mutation, termed COSMIC signature 4, is responsible for the increased burden of mutation in this anatomical sub-site. In addition, we show that another mutation pattern, COSMIC signature 5, is positively associated with age in HNSCC from non-smokers and that larynx SCC from non-smokers have a greater number of signature 5 mutations compared with other HNSCC sub-sites. Immunohistochemistry demonstrates a significantly lower Ki-67 proliferation index in size-matched larynx SCC compared with oral cavity SCC and oropharynx SCC. Collectively, these observations support a model where larynx SCC are characterized by slower growth and increased susceptibility to mutations from tobacco carcinogen DNA adducts.
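
    The phrase "specific nucleotide changes and their surrounding nucleotide context" can be illustrated with the trinucleotide labelling commonly used in COSMIC-style signature analyses, where each substitution is reported relative to the pyrimidine strand together with its two flanking bases. The sketch below assumes that convention; the abstract does not describe the authors' pipeline, and the function names and toy inputs are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def mutation_channel(ref_seq, pos, alt):
    """Label a substitution at 0-based `pos` as e.g. 'A[C>A]G'.

    The label is normalized so the mutated reference base is a
    pyrimidine (C or T), as in the usual 96-channel convention.
    """
    if pos < 1 or pos > len(ref_seq) - 2:
        raise ValueError("need one flanking base on each side")
    context = ref_seq[pos - 1:pos + 2].upper()
    alt = alt.upper()
    if context[1] == alt:
        raise ValueError("alt equals the reference base")
    if context[1] in "AG":
        # Purine reference: report the change on the opposite strand.
        context = reverse_complement(context)
        alt = reverse_complement(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"

def channel_counts(ref_seq, substitutions):
    """Count channels for an iterable of (pos, alt) pairs."""
    return Counter(mutation_channel(ref_seq, p, a) for p, a in substitutions)
```

    A per-tumor vector of such counts is the usual input from which signature contributions are estimated; the estimation step itself is outside the scope of this sketch.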