14 research outputs found

    Improving Word Association Measures in Repetitive Corpora with Context Similarity Weighting

    Peer reviewed

    Keyphrases analysis of BIM standards through occurrence of most common BIM uses

    The 8th PSU-UNS International Conference on Engineering and Technology (ICET-2017), Novi Sad, Serbia, June 8-10, 2017, University of Novi Sad, Faculty of Technical Sciences. Abstract: Building Information Modeling (BIM) represents not only the virtual model of a facility but a comprehensive approach consisting of technology, processes, stakeholders' behavior and accompanying standards. Given the fast evolution of BIM, this paper analyses trends in the development of BIM standards over the years by applying the keyphrase analysis method to some of the most common BIM uses and recognizable phrases in the BIM industry.

    Word Knowledge and Word Usage

    Word storage and processing define a multi-factorial domain of scientific inquiry whose thorough investigation goes well beyond the boundaries of traditional disciplinary taxonomies and requires the synergic integration of a wide range of methods, techniques, and empirical and experimental findings. The present book approaches a few central issues concerning the organization, structure and functioning of the Mental Lexicon by asking domain experts to look at common, central topics from complementary standpoints and to discuss the advantages of developing converging perspectives. The book explores the connections between computational and algorithmic models of the mental lexicon, word frequency distributions and information-theoretical measures of word families, statistical correlations across psycholinguistic and cognitive evidence, principles of machine learning, and integrative brain models of word storage and processing. The main goal of the book is to map out the landscape of future research in this area, to foster the development of interdisciplinary curricula, and to help single-domain specialists understand and address issues and questions as they are raised in other disciplines.

    A study of model parameters for scaling up word to sentence similarity tasks in distributional semantics

    PhD. Representation of sentences that captures semantics is an essential part of natural language processing systems, such as information retrieval or machine translation. The representation of a sentence is commonly built by combining the representations of the words that the sentence consists of. Similarity between words is widely used as a proxy to evaluate semantic representations. Word similarity models are well studied and have been shown to correlate positively with human similarity judgements. Current evaluation of models of sentential similarity builds on the results obtained in lexical experiments. The main focus is on how the lexical representations are used, rather than on what they should be. It is often assumed that the optimal representations for word similarity are also optimal for sentence similarity. This work discards this assumption and systematically looks for lexical representations that are optimal for similarity measurement between sentences. We find that the best representation for word similarity is not always the best for sentence similarity and vice versa. The best models in word similarity tasks perform best with additive composition. However, the best result on compositional tasks is achieved with Kronecker-based composition. There are representations that are equally good in both tasks when used with multiplicative composition. The systematic study of the parameters of similarity models reveals that the more information lexical representations contain, the more attention should be paid to noise. In particular, the word vectors in models with a feature size on the order of the vocabulary size should be sparse, but if a small number of context features is used then the vectors should be dense. Given the right lexical representations, compositional operators achieve state-of-the-art performance, improving over models that use neural word embeddings. To avoid overfitting, either several test datasets should be used or parameter selection should be based on the parameters' average behaviours. EPSRC grant EP/J002607/1.
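The three composition operators named in this abstract can be illustrated with a minimal sketch. The function names and the toy vectors below are hypothetical, and the Kronecker variant shown is only the plain Kronecker product of two vectors; the abstract does not say how the thesis applies it to full sentences, so this is an illustration of the building blocks rather than the author's model.

```python
import math
from typing import List

Vector = List[float]


def add_compose(u: Vector, v: Vector) -> Vector:
    """Additive composition: element-wise sum of two word vectors."""
    return [a + b for a, b in zip(u, v)]


def mult_compose(u: Vector, v: Vector) -> Vector:
    """Multiplicative composition: element-wise product of two word vectors."""
    return [a * b for a, b in zip(u, v)]


def kron_compose(u: Vector, v: Vector) -> Vector:
    """Kronecker product of two vectors, flattened to length len(u) * len(v)."""
    return [a * b for a in u for b in v]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy two-dimensional "word vectors" (made-up values for illustration).
    red, car = [1.0, 2.0], [3.0, 4.0]
    blue, van = [2.0, 1.0], [1.0, 1.0]

    for name, compose in [("additive", add_compose),
                          ("multiplicative", mult_compose),
                          ("kronecker", kron_compose)]:
        phrase_a = compose(red, car)
        phrase_b = compose(blue, van)
        print(name, round(cosine(phrase_a, phrase_b), 4))
```

One property worth noting: for the plain Kronecker product, the cosine between two composed phrases equals the product of the cosines of their parts, so a mismatch in either word lowers the phrase similarity multiplicatively.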

    Computational Analysis of Metabolic Reprogramming in Tumors

    BACKGROUND: Cancer is a direct consequence of genomic aberrations, such as somatic copy number alterations, which occur frequently in the cancer genome and affect not only oncogenic genes but also multiple passenger and potential co-driver genes. An intrinsic feature resulting from such a disruption of the genome is deregulation of the tumor metabolic landscape, as a result of which multiple metabolic genes have been identified as oncogenes, tumor suppressor genes or targets of oncogenic signaling. RESULTS: Here we show that linear proximity of metabolic and cancer-causing genes in the genome can lead to metabolic remodeling through copy number co-alterations. We observed that cancer-metabolic gene pairs are unexpectedly often proximally positioned in the chromosomes and share loci with altered copy number, thus being either co-deleted or co-amplified across all cancers analyzed (19 cancer types from The Cancer Genome Atlas). We have developed an analysis pipeline, Identification of Metabolic Cancer Genes (iMetCG), to infer the functional impact on oncogenic metabolism of such co-alteration events and to distinguish genes truly driving cancer metabolism from those that are neutral. Using this approach, we have identified novel and well-known metabolic genes that target crucial pathways relevant for tumors. Moreover, using these identified metabolic genes we were able to classify tumors based on their tissue and developmental origins. We further observed that these putative metabolic cancer genes (identified across cancers) had higher network connectivity, were indicators of patient survival, had significant overlap with known cancer metabolic genes, and shared similar features with known cancer genes in terms of their isoform diversity, evolutionary rate and selection pressure. CONCLUSIONS: This thesis provides novel insights into the functional mechanism of metabolic regulation and rewiring of the metabolic landscape in cancer cells. Our pan-cancer, genomic-data-driven approach revealed a hitherto unknown generic mechanism for large-scale metabolic reprogramming in cancer cells based on linear gene proximities and identified 119 new metabolic cancer genes likely to be involved in remodeling tumor cell metabolism. Furthermore, our newly identified metabolic cancer genes will serve as a vital resource for the experimental community engaged in tumor metabolism and genomics research to further expand the scope of this field.

    A comparison of windowless and window-based computational association measures as predictors of syntagmatic human associations
