156,097 research outputs found
NOUS: Construction and Querying of Dynamic Knowledge Graphs
The ability to construct domain specific knowledge graphs (KG) and perform
question-answering or hypothesis generation is a transformative capability.
Despite their value, automated construction of knowledge graphs remains an
expensive technical challenge that is beyond the reach for most enterprises and
academic institutions. We propose an end-to-end framework for developing custom
knowledge graph driven analytics for arbitrary application domains. The
uniqueness of our system lies A) in its combination of curated KGs along with
knowledge extracted from unstructured text, B) support for advanced trending
and explanatory questions on a dynamic KG, and C) the ability to answer queries
where the answer is embedded across multiple data sources.Comment: Codebase: https://github.com/streaming-graphs/NOU
Dependence relationships between Gene Ontology terms based on TIGR gene product annotations
The Gene Ontology is an important tool for the representation and processing of information about gene products and functions. It provides controlled vocabularies for the designations of cellular components, molecular functions, and biological processes used in the annotation of genes and gene products. These constitute
three separate ontologies, of cellular components), molecular functions and biological processes, respectively. The question we address here is: how are the terms in these three separate ontologies related to each other? We use statistical methods and formal ontological principles as a first step towards finding answers to this question
Redefining part-of-speech classes with distributional semantic models
This paper studies how word embeddings trained on the British National Corpus
interact with part of speech boundaries. Our work targets the Universal PoS tag
set, which is currently actively being used for annotation of a range of
languages. We experiment with training classifiers for predicting PoS tags for
words based on their embeddings. The results show that the information about
PoS affiliation contained in the distributional vectors allows us to discover
groups of words with distributional patterns that differ from other words of
the same part of speech.
This data often reveals hidden inconsistencies of the annotation process or
guidelines. At the same time, it supports the notion of `soft' or `graded' part
of speech affiliations. Finally, we show that information about PoS is
distributed among dozens of vector components, not limited to only one or two
features
Distributed Representations of Words and Phrases and their Compositionality
The recently introduced continuous Skip-gram model is an efficient method for
learning high-quality distributed vector representations that capture a large
number of precise syntactic and semantic word relationships. In this paper we
present several extensions that improve both the quality of the vectors and the
training speed. By subsampling of the frequent words we obtain significant
speedup and also learn more regular word representations. We also describe a
simple alternative to the hierarchical softmax called negative sampling. An
inherent limitation of word representations is their indifference to word order
and their inability to represent idiomatic phrases. For example, the meanings
of "Canada" and "Air" cannot be easily combined to obtain "Air Canada".
Motivated by this example, we present a simple method for finding phrases in
text, and show that learning good vector representations for millions of
phrases is possible
Management of insomnia in sleep disordered breathing
Both obstructive sleep apnoea (OSA) and chronic insomnia disorder are highly prevalent in the general population. Whilst both disorders may occur together by mere coincidence, it appears that they share clinical features and that they may aggravate each other as a result of reciprocally adverse pathogenetic mechanisms. Comorbidity between chronic insomnia disorder and OSA is a clinically relevant condition that may confront practitioners with serious diagnostic and therapeutic challenges. Current data, while still scarce, advocate an integrated and multidisciplinary approach that seems superior over the isolated treatment of each sleep disorder alone
Fast Search for Dynamic Multi-Relational Graphs
Acting on time-critical events by processing ever growing social media or
news streams is a major technical challenge. Many of these data sources can be
modeled as multi-relational graphs. Continuous queries or techniques to search
for rare events that typically arise in monitoring applications have been
studied extensively for relational databases. This work is dedicated to answer
the question that emerges naturally: how can we efficiently execute a
continuous query on a dynamic graph? This paper presents an exact subgraph
search algorithm that exploits the temporal characteristics of representative
queries for online news or social media monitoring. The algorithm is based on a
novel data structure called the Subgraph Join Tree (SJ-Tree) that leverages the
structural and semantic characteristics of the underlying multi-relational
graph. The paper concludes with extensive experimentation on several real-world
datasets that demonstrates the validity of this approach.Comment: SIGMOD Workshop on Dynamic Networks Management and Mining (DyNetMM),
201
Efficient Estimation of Word Representations in Vector Space
We propose two novel model architectures for computing continuous vector
representations of words from very large data sets. The quality of these
representations is measured in a word similarity task, and the results are
compared to the previously best performing techniques based on different types
of neural networks. We observe large improvements in accuracy at much lower
computational cost, i.e. it takes less than a day to learn high quality word
vectors from a 1.6 billion words data set. Furthermore, we show that these
vectors provide state-of-the-art performance on our test set for measuring
syntactic and semantic word similarities
- …