1,474 research outputs found

    Automated Gene Classification using Nonnegative Matrix Factorization on Biomedical Literature

    Understanding functional gene relationships is a challenging problem for biological applications. High-throughput technologies such as DNA microarrays have inundated biologists with a wealth of information; processing that information, however, remains problematic. To help with this problem, researchers have begun applying text mining techniques to the biological literature. This work extends previous work based on Latent Semantic Indexing (LSI) by examining Nonnegative Matrix Factorization (NMF). Whereas LSI uses the singular value decomposition (SVD) to approximate data in a dense, mixed-sign space, NMF produces a parts-based factorization that is directly interpretable. This space can, in theory, be used to augment existing ontologies and annotations by identifying themes within the literature. NMF comes at a price, however: it has a large number of parameters. This work analyzes the effects of some of the NMF parameters on both convergence and labeling accuracy. Since there is a dearth of automated label evaluation techniques as well as “gold standard” hierarchies, a method to produce “correct” trees is proposed, along with a technique to label trees and to evaluate those labels.
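    To make the NMF idea above concrete, here is a minimal sketch that factors a small nonnegative term-by-document matrix A into nonnegative factors W and H. It assumes the classic multiplicative update rules for the Frobenius-norm objective; the abstract does not say which NMF algorithm, initialization, or parameters the authors used, and the function names and the toy matrix below are purely illustrative.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(a, w, h):
    """Frobenius norm of A - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for ra, rb in zip(a, wh) for x, y in zip(ra, rb)) ** 0.5


def nmf(a, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate A (m x n, nonnegative) as W (m x rank) times H (rank x n).

    Assumption: multiplicative updates with random initialization;
    eps guards against division by zero.
    """
    rng = random.Random(seed)
    m, n = len(a), len(a[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T A) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, a)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # W <- W * (A H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(a, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h


# Hypothetical toy data: rows are terms, columns are documents,
# with two loosely separated "themes" (first two vs. last two documents).
A = [
    [2.0, 3.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 1.0],
    [0.0, 0.0, 2.0, 3.0],
]

W, H = nmf(A, rank=2)
print("reconstruction error:", frobenius_error(A, W, H))
```

    Because every entry of W and H stays nonnegative, each column of W can be read as a weighted set of terms (a candidate theme) and each row of H as how strongly each document expresses that theme. The rank, the iteration count, and the random seed are exactly the kind of parameters whose effect on convergence the abstract refers to.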

    Finding Functional Gene Relationships Using the Semantic Gene Organizer (SGO)

    Understanding functional gene relationships is a major challenge in bioinformatics and computational biology. Currently, many approaches extract gene relationships from the biomedical literature via term co-occurrence models. However, many genes that are experimentally identified as related have not previously been studied together. As a result, many automated models fail to help researchers understand the nature of the relationships. In this work, the particular scheme used to mine genomic data is Latent Semantic Indexing (LSI). LSI performs a singular value decomposition (SVD) to produce a low-rank approximation of the data set. Effectively, it allows queries to be interpreted in a more concept-based space, so that gene relationships can be discovered that other models would ordinarily overlook.

    Swim Search: An Online Sports Management Information Retrieval System


    From splashing to bouncing: the influence of viscosity on the impact of suspension droplets on a solid surface

    We experimentally investigated the splashing of dense suspension droplets impacting a solid surface, extending prior work to the regime where the viscosity of the suspending liquid becomes a significant parameter. The overall behavior can be described by a combination of two trends. The first is that splashing is favored when the kinetic energy of individual particles at the surface of a droplet overcomes the confinement produced by surface tension. This is expressed by a particle-based Weber number $We_p$. The second is that splashing is suppressed by increasing the viscosity of the solvent. This is expressed by the Stokes number $St$, which influences the effective coefficient of restitution of colliding particles. We developed a phase diagram in which the splashing onset is delineated as a function of both $We_p$ and $St$. A surprising result occurs at very small Stokes number, where not only splashing but also plastic deformation of the droplet is suppressed. This leads to a situation where droplets can bounce back after impact, an observation we are able to reproduce using discrete particle numerical simulations that take into account viscous interaction between particles and elastic energy.

    Gene Tree Labeling Using Nonnegative Matrix Factorization on Biomedical Literature

    Identifying functional groups of genes is a challenging problem for biological applications. Text mining approaches can be used to build hierarchical clusters or trees from the information in the biological literature. In particular, nonnegative matrix factorization (NMF) is examined as one approach to labeling hierarchical trees. A generic labeling algorithm and an evaluation technique are proposed, and the effects of different NMF parameters on convergence and labeling accuracy are discussed. The primary goals of this study are to provide a qualitative assessment of NMF and its various parameters and initializations, to provide an automated way to classify biomedical data, and to provide a method for evaluating labeled data assuming a static input tree. As a byproduct, a method for generating gold standard trees is proposed.