316 research outputs found

    OWL2Vec*: Embedding of OWL Ontologies

    Get PDF
    Semantic embedding of knowledge graphs has been widely studied and used for prediction and statistical analysis tasks across various domains such as Natural Language Processing and the Semantic Web. However, less attention has been paid to developing robust methods for embedding OWL (Web Ontology Language) ontologies. In this paper, we propose a language model based ontology embedding method named OWL2Vec*, which encodes the semantics of an ontology by taking into account its graph structure, lexical information and logic constructors. Our empirical evaluation with three real world datasets suggests that OWL2Vec* benefits from these three different aspects of an ontology in class membership prediction and class subsumption prediction tasks. Furthermore, OWL2Vec* often significantly outperforms the state-of-the-art methods in our experiments

    The Iterative Signature Algorithm for the analysis of large scale gene expression data

    Full text link
    We present a new approach for the analysis of genome-wide expression data. Our method is designed to overcome the limitations of traditional techniques, when applied to large-scale data. Rather than alloting each gene to a single cluster, we assign both genes and conditions to context-dependent and potentially overlapping transcription modules. We provide a rigorous definition of a transcription module as the object to be retrieved from the expression data. An efficient algorithm, that searches for the modules encoded in the data by iteratively refining sets of genes and conditions until they match this definition, is established. Each iteration involves a linear map, induced by the normalized expression matrix, followed by the application of a threshold function. We argue that our method is in fact a generalization of Singular Value Decomposition, which corresponds to the special case where no threshold is applied. We show analytically that for noisy expression data our approach leads to better classification due to the implementation of the threshold. This result is confirmed by numerical analyses based on in-silico expression data. We discuss briefly results obtained by applying our algorithm to expression data from the yeast S. cerevisiae.Comment: Latex, 36 pages, 8 figure

    Weak pairwise correlations imply strongly correlated network states in a neural population

    Get PDF
    Biological networks have so many possible states that exhaustive sampling is impossible. Successful analysis thus depends on simplifying hypotheses, but experiments on many systems hint that complicated, higher order interactions among large groups of elements play an important role. In the vertebrate retina, we show that weak correlations between pairs of neurons coexist with strongly collective behavior in the responses of ten or more neurons. Surprisingly, we find that this collective behavior is described quantitatively by models that capture the observed pairwise correlations but assume no higher order interactions. These maximum entropy models are equivalent to Ising models, and predict that larger networks are completely dominated by correlation effects. This suggests that the neural code has associative or error-correcting properties, and we provide preliminary evidence for such behavior. As a first test for the generality of these ideas, we show that similar results are obtained from networks of cultured cortical neurons.Comment: Full account of work presented at the conference on Computational and Systems Neuroscience (COSYNE), 17-20 March 2005, in Salt Lake City, Utah (http://cosyne.org

    Numerical investigation of geogrid back-anchored sheet pile walls

    Get PDF
    peer reviewedIn the last decades, geosynthetic reinforcement has been widely used in geotech-nical applications. Recently, geogrid has also been used to back-anchor sheet pile walls. However, this system has not received sufficient attention neither in research nor in construction. Due to the complex interactions between soil, geogrid and sheet pile wall, the applicability of common design guidelines for conventionally back-anchored walls to this particular system has to be proven. To develop a fundamental understanding about the influence of various components of the system on its behaviour, numerical investigations have been conducted within this study. In this paper the influence of geogrid inclination, design of geogrid-sheet pile connection including prestressing and geogrid position on the earth pressure distribution and wall deformation is discussed. The numerical results revealed that the position of geogrid and design of geogrid-sheet pile connection significantly affect the earth pressure distribution. The wall deformations are mainly influenced by the geogrid position

    svdPPCS: an effective singular value decomposition-based method for conserved and divergent co-expression gene module identification

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Comparative analysis of gene expression profiling of multiple biological categories, such as different species of organisms or different kinds of tissue, promises to enhance the fundamental understanding of the universality as well as the specialization of mechanisms and related biological themes. Grouping genes with a similar expression pattern or exhibiting co-expression together is a starting point in understanding and analyzing gene expression data. In recent literature, gene module level analysis is advocated in order to understand biological network design and system behaviors in disease and life processes; however, practical difficulties often lie in the implementation of existing methods.</p> <p>Results</p> <p>Using the singular value decomposition (SVD) technique, we developed a new computational tool, named svdPPCS (<b>SVD</b>-based <b>P</b>attern <b>P</b>airing and <b>C</b>hart <b>S</b>plitting), to identify conserved and divergent co-expression modules of two sets of microarray experiments. In the proposed methods, gene modules are identified by splitting the two-way chart coordinated with a pair of left singular vectors factorized from the gene expression matrices of the two biological categories. Importantly, the cutoffs are determined by a data-driven algorithm using the well-defined statistic, SVD-p. The implementation was illustrated on two time series microarray data sets generated from the samples of accessory gland (ACG) and malpighian tubule (MT) tissues of the line W<sup>118 </sup>of <it>M. drosophila</it>. Two conserved modules and six divergent modules, each of which has a unique characteristic profile across tissue kinds and aging processes, were identified. The number of genes contained in these models ranged from five to a few hundred. Three to over a hundred GO terms were over-represented in individual modules with FDR < 0.1. One divergent module suggested the tissue-specific relationship between the expressions of mitochondrion-related genes and the aging process. This finding, together with others, may be of biological significance. The validity of the proposed SVD-based method was further verified by a simulation study, as well as the comparisons with regression analysis and cubic spline regression analysis plus PAM based clustering.</p> <p>Conclusions</p> <p>svdPPCS is a novel computational tool for the comparative analysis of transcriptional profiling. It especially fits the comparison of time series data of related organisms or different tissues of the same organism under equivalent or similar experimental conditions. The general scheme can be directly extended to the comparisons of multiple data sets. It also can be applied to the integration of data sets from different platforms and of different sources.</p

    Six Human-Centered Artificial Intelligence Grand Challenges

    Get PDF
    Widespread adoption of artificial intelligence (AI) technologies is substantially affecting the human condition in ways that are not yet well understood. Negative unintended consequences abound including the perpetuation and exacerbation of societal inequalities and divisions via algorithmic decision making. We present six grand challenges for the scientific community to create AI technologies that are human-centered, that is, ethical, fair, and enhance the human condition. These grand challenges are the result of an international collaboration across academia, industry and government and represent the consensus views of a group of 26 experts in the field of human-centered artificial intelligence (HCAI). In essence, these challenges advocate for a human-centered approach to AI that (1) is centered in human well-being, (2) is designed responsibly, (3) respects privacy, (4) follows human-centered design principles, (5) is subject to appropriate governance and oversight, and (6) interacts with individuals while respecting human’s cognitive capacities. We hope that these challenges and their associated research directions serve as a call for action to conduct research and development in AI that serves as a force multiplier towards more fair, equitable and sustainable societies

    Microarray gene expression profiling and analysis in renal cell carcinoma

    Get PDF
    BACKGROUND: Renal cell carcinoma (RCC) is the most common cancer in adult kidney. The accuracy of current diagnosis and prognosis of the disease and the effectiveness of the treatment for the disease are limited by the poor understanding of the disease at the molecular level. To better understand the genetics and biology of RCC, we profiled the expression of 7,129 genes in both clear cell RCC tissue and cell lines using oligonucleotide arrays. METHODS: Total RNAs isolated from renal cell tumors, adjacent normal tissue and metastatic RCC cell lines were hybridized to affymatrix HuFL oligonucleotide arrays. Genes were categorized into different functional groups based on the description of the Gene Ontology Consortium and analyzed based on the gene expression levels. Gene expression profiles of the tissue and cell line samples were visualized and classified by singular value decomposition. Reverse transcription polymerase chain reaction was performed to confirm the expression alterations of selected genes in RCC. RESULTS: Selected genes were annotated based on biological processes and clustered into functional groups. The expression levels of genes in each group were also analyzed. Seventy-four commonly differentially expressed genes with more than five-fold changes in RCC tissues were identified. The expression alterations of selected genes from these seventy-four genes were further verified using reverse transcription polymerase chain reaction (RT-PCR). Detailed comparison of gene expression patterns in RCC tissue and RCC cell lines shows significant differences between the two types of samples, but many important expression patterns were preserved. CONCLUSIONS: This is one of the initial studies that examine the functional ontology of a large number of genes in RCC. Extensive annotation, clustering and analysis of a large number of genes based on the gene functional ontology revealed many interesting gene expression patterns in RCC. Most notably, genes involved in cell adhesion were dominantly up-regulated whereas genes involved in transport were dominantly down-regulated. This study reveals significant gene expression alterations in key biological pathways and provides potential insights into understanding the molecular mechanism of renal cell carcinogenesis

    Multivariate curve resolution of time course microarray data

    Get PDF
    BACKGROUND: Modeling of gene expression data from time course experiments often involves the use of linear models such as those obtained from principal component analysis (PCA), independent component analysis (ICA), or other methods. Such methods do not generally yield factors with a clear biological interpretation. Moreover, implicit assumptions about the measurement errors often limit the application of these methods to log-transformed data, destroying linear structure in the untransformed expression data. RESULTS: In this work, a method for the linear decomposition of gene expression data by multivariate curve resolution (MCR) is introduced. The MCR method is based on an alternating least-squares (ALS) algorithm implemented with a weighted least squares approach. The new method, MCR-WALS, extracts a small number of basis functions from untransformed microarray data using only non-negativity constraints. Measurement error information can be incorporated into the modeling process and missing data can be imputed. The utility of the method is demonstrated through its application to yeast cell cycle data. CONCLUSION: Profiles extracted by MCR-WALS exhibit a strong correlation with cell cycle-associated genes, but also suggest new insights into the regulation of those genes. The unique features of the MCR-WALS algorithm are its freedom from assumptions about the underlying linear model other than the non-negativity of gene expression, its ability to analyze non-log-transformed data, and its use of measurement error information to obtain a weighted model and accommodate missing measurements
    corecore