478 research outputs found
YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia
We present YAGO2, an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space. YAGO2 is built automatically from Wikipedia, GeoNames, and WordNet. It contains 80 million facts about 9.8 million entities. Human evaluation confirmed an accuracy of 95% of the facts in YAGO2. In this paper, we present the extraction methodology, the integration of the spatio-temporal dimension, and our knowledge representation SPOTL, an extension of the original SPO-triple model to time and space.
Massive-Scale RDF Processing Using Compressed Bitmap Indexes
The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL queries generate complex intermediate join queries; to compute these joins efficiently, we propose a new strategy based on bitmap indexes. We store the RDF data in column-oriented structures as compressed bitmaps along with two dictionaries. This paper makes three new contributions. (i) We present an efficient parallel strategy for parsing the raw RDF data, building dictionaries of unique entities, and creating compressed bitmap indexes of the data. (ii) We utilize the constructed bitmap indexes to efficiently answer SPARQL queries, simplifying the join evaluations. (iii) To quantify the performance impact of using bitmap indexes, we compare our approach to the state-of-the-art triple-store RDF-3X. We find that our bitmap index-based approach to answering queries is up to an order of magnitude faster for a variety of SPARQL queries, on gigascale RDF data sets.
Knowledge Questions from Knowledge Graphs
We address the novel problem of automatically generating quiz-style knowledge questions from a knowledge graph such as DBpedia. Questions of this kind have ample applications, for instance, to educate users about or to evaluate their knowledge in a specific domain. To solve the problem, we propose an end-to-end approach. The approach first selects a named entity from the knowledge graph as an answer. It then generates a structured triple-pattern query, which yields the answer as its sole result. If a multiple-choice question is desired, the approach selects alternative answer options. Finally, our approach uses a template-based method to verbalize the structured query and yield a natural language question. A key challenge is estimating how difficult the generated question is to human users. To do this, we make use of historical data from the Jeopardy! quiz show and a semantically annotated Web-scale document collection, engineer suitable features, and train a logistic regression classifier to predict question difficulty. Experiments demonstrate the viability of our overall approach.
Limits of minimal models and continuous orbifolds
The lambda=0 't Hooft limit of the 2d W_N minimal models is shown to be
equivalent to the singlet sector of a free boson theory, thus paralleling
exactly the structure of the free theory in the Klebanov-Polyakov proposal. In
2d, the singlet sector does not describe a consistent theory by itself since
the corresponding partition function is not modular invariant. However, it can
be interpreted as the untwisted sector of a continuous orbifold, and this point
of view suggests that it can be made consistent by adding in the appropriate
twisted sectors. We show that these twisted sectors account for the `light
states' that were not included in the original 't Hooft limit. We also show
that, for the Virasoro minimal models (N=2), the twisted sector of our orbifold
agrees precisely with the limit theory of Runkel & Watts. In particular, this
implies that our construction satisfies crossing symmetry. Comment: 33 pages; v2: minor improvements and references added, published
version
On "Dotsenko-Fateev" representation of the toric conformal blocks
We demonstrate that the recent ansatz of arXiv:1009.5553, inspired by the
original remark due to R.Dijkgraaf and C.Vafa, reproduces the toric conformal
blocks in the same sense that the spherical blocks are given by the integral
representation of arXiv:1001.0563 with a peculiar choice of open integration
contours for screening insertions. In other words, we provide some evidence
that the toric conformal blocks are reproduced by appropriate beta-ensembles
not only in the large-N limit, but also at finite N. The check is explicitly
performed at the first two levels for the 1-point toric functions.
Generalizations to higher genera are briefly discussed. Comment: 10 pages
Challenges of beta-deformation
A brief review of problems arising in the study of the beta-deformation,
also known as "refinement", which appears as a central difficult element in a
number of related modern subjects: beta \neq 1 is responsible for deviation
from free fermions in 2d conformal theories, from symmetric omega-backgrounds
with epsilon_2 = - epsilon_1 in instanton sums in 4d SYM theories, from
eigenvalue matrix models to beta-ensembles, from HOMFLY to super-polynomials in
Chern-Simons theory, from quantum groups to elliptic and hyperbolic algebras
etc. The main attention is paid to the context of the AGT relation and its possible
generalizations. Comment: 20 pages
Cross-tissue immune cell analysis reveals tissue-specific adaptations and clonal architecture in humans
Despite their crucial role in health and disease, our knowledge of immune cells within human tissues remains limited. Here, we surveyed the immune compartment of 15 tissues of six deceased adult donors by single-cell RNA sequencing and paired VDJ sequencing. To systematically resolve immune cell heterogeneity across tissues, we developed CellTypist, a machine learning tool for rapid and precise cell type annotation. Using this approach, combined with detailed curation, we determined the tissue distribution of 45 finely phenotyped immune cell types and states, revealing hitherto unappreciated tissue-specific features and clonal architecture of T and B cells. In summary, our multi-tissue approach lays the foundation for identifying highly resolved immune cell types by leveraging a common reference dataset, tissue-integrated expression analysis and antigen receptor sequencing. One Sentence Summary: We provide an immune cell atlas, including antigen receptor repertoire profiling, across lymphoid and non-lymphoid human tissues.