1,451 research outputs found
School and the Alcohol Question: Lecture delivered at the "12th International Congress against Alcoholism" in July 1909 in London
Entropy and Graph Based Modelling of Document Coherence using Discourse Entities: An Application
We present two novel models of document coherence and their application to
information retrieval (IR). Both models approximate document coherence using
discourse entities, e.g. the subject or object of a sentence. Our first model
views text as a Markov process generating sequences of discourse entities
(entity n-grams); we use the entropy of these entity n-grams to approximate the
rate at which new information appears in text, reasoning that as more new words
appear, the topic increasingly drifts and text coherence decreases. Our second
model extends the work of Guinaudeau & Strube [28], which represents text as a
graph of discourse entities, linked by different relations, such as their
distance or adjacency in text. We use several graph topology metrics to
approximate different aspects of the discourse flow that can indicate
coherence, such as the average clustering or betweenness of discourse entities
in text. Experiments with several instantiations of these models show that: (i)
our models perform on a par with two other well-known models of text coherence
even without any parameter tuning, and (ii) reranking retrieval results
according to their coherence scores gives notable performance gains, confirming
a relation between document coherence and relevance. This work contributes two
novel models of document coherence, whose application to IR complements recent
work on integrating document cohesiveness or comprehensibility into ranking
[5, 56].
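The first model's entropy computation can be sketched in a few lines. The entity sequence below is hypothetical (a real pipeline would extract sentence subjects/objects with a syntactic parser), and the function simply measures the Shannon entropy of the empirical n-gram distribution:

```python
from collections import Counter
from math import log2

# Hypothetical discourse-entity sequence: one entity (subject or object) per
# sentence, in order of appearance. Real input would come from a parser.
entities = ["model", "model", "entropy", "text", "text", "coherence",
            "model", "entropy", "text", "coherence"]

def ngram_entropy(seq, n=2):
    """Shannon entropy (in bits) of the empirical n-gram distribution.

    Higher entropy means new entity combinations keep appearing, which the
    model reads as topic drift and hence lower coherence."""
    grams = [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

print(ngram_entropy(entities, n=2))
```

A perfectly repetitive sequence (one entity throughout) scores entropy 0, while a sequence that never repeats an n-gram scores the maximum, log2 of the number of n-grams.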
Deep Learning Relevance: Creating Relevant Information (as Opposed to Retrieving it)
What if Information Retrieval (IR) systems did not just retrieve relevant
information that is stored in their indices, but could also "understand" it and
synthesise it into a single document? We present a preliminary study that makes
a first step towards answering this question. Given a query, we train a
Recurrent Neural Network (RNN) on existing relevant information to that query.
We then use the RNN to "deep learn" a single, synthetic, and we assume,
relevant document for that query. We design a crowdsourcing experiment to
assess how relevant the "deep learned" document is, compared to existing
relevant documents. Users are shown a query and four wordclouds (of three
existing relevant documents and our deep learned synthetic document). The
synthetic document is ranked on average most relevant of all.
Comment: Neu-IR '16 SIGIR Workshop on Neural Information Retrieval, July 21,
2016, Pisa, Italy
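The train-then-sample pipeline can be illustrated with a minimal character-level RNN. This is a sketch under toy assumptions, not the study's actual architecture or data: the corpus is a repeated sentence standing in for the query's relevant documents, and all sizes and step counts are chosen so it runs in seconds.

```python
import numpy as np

# Toy stand-in for "existing relevant information to that query".
text = "information retrieval finds relevant documents. " * 40
chars = sorted(set(text))
V = len(chars)
c2i = {c: i for i, c in enumerate(chars)}
i2c = {i: c for c, i in c2i.items()}

H, T, lr = 64, 24, 0.1                  # hidden units, window length, learning rate
rng = np.random.default_rng(0)
Wxh = rng.normal(0, 0.01, (H, V))       # input -> hidden
Whh = rng.normal(0, 0.01, (H, H))       # hidden -> hidden
Why = rng.normal(0, 0.01, (V, H))       # hidden -> output
bh, by = np.zeros(H), np.zeros(V)
params = [Wxh, Whh, Why, bh, by]

def forward_backward(inputs, targets):
    """One truncated-BPTT pass over a window; returns loss and gradients."""
    xs, hs, ps = {}, {-1: np.zeros(H)}, {}
    loss = 0.0
    for t in range(len(inputs)):
        x = np.zeros(V); x[inputs[t]] = 1.0
        xs[t] = x
        hs[t] = np.tanh(Wxh @ x + Whh @ hs[t - 1] + bh)
        y = Why @ hs[t] + by
        e = np.exp(y - y.max()); ps[t] = e / e.sum()
        loss -= np.log(ps[t][targets[t]])
    grads = [np.zeros_like(p) for p in params]
    dWxh, dWhh, dWhy, dbh, dby = grads
    dhnext = np.zeros(H)
    for t in reversed(range(len(inputs))):
        dy = ps[t].copy(); dy[targets[t]] -= 1.0
        dWhy += np.outer(dy, hs[t]); dby += dy
        dz = (1.0 - hs[t] ** 2) * (Why.T @ dy + dhnext)
        dWxh += np.outer(dz, xs[t]); dWhh += np.outer(dz, hs[t - 1]); dbh += dz
        dhnext = Whh.T @ dz
    for g in grads:
        np.clip(g, -5, 5, out=g)        # clip to avoid exploding gradients
    return loss, grads

losses, mems = [], [np.zeros_like(p) for p in params]
for step in range(300):                 # Adagrad updates on sliding windows
    p = (step * T) % (len(text) - T - 1)
    ids = [c2i[c] for c in text[p:p + T + 1]]
    loss, grads = forward_backward(ids[:-1], ids[1:])
    losses.append(loss)
    for w, g, m in zip(params, grads, mems):
        m += g * g
        w -= lr * g / np.sqrt(m + 1e-8)

def sample(seed, n):
    """Generate n characters -- the 'deep learned' synthetic text."""
    h, idx, out = np.zeros(H), c2i[seed], [seed]
    for _ in range(n):
        x = np.zeros(V); x[idx] = 1.0
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        y = Why @ h + by
        e = np.exp(y - y.max()); pr = e / e.sum()
        idx = int(rng.choice(V, p=pr))
        out.append(i2c[idx])
    return "".join(out)

print(sample("i", 120))
```

After a few hundred updates on such redundant text the training loss drops sharply and the sampled string starts to resemble the corpus; in the study, the sampled text plays the role of the synthetic document shown to crowd workers.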
Near-optimal adjacency labeling scheme for power-law graphs
An adjacency labeling scheme is a method that assigns labels to the vertices
of a graph such that adjacency between vertices can be inferred directly from
the assigned label, without using a centralized data structure. We devise
adjacency labeling schemes for the family of power-law graphs, a family that
has been used to model many types of networks, e.g. the Internet AS-level
graph. Furthermore, we prove an almost matching lower bound for this family. We
also provide an asymptotically near-optimal labeling scheme for sparse graphs.
Finally, we validate the efficiency of our labeling scheme by an experimental
evaluation using both synthetic data and real-world networks of up to hundreds
of thousands of vertices.
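The general idea of an adjacency labeling scheme can be shown with a deliberately naive construction (not the paper's scheme): each vertex's label stores its id plus its sorted neighbor list, so adjacency is decided from two labels alone, with no centralized structure. In a power-law graph most vertices have small degree, so most labels stay short; the paper's contribution is a scheme whose label length is near-optimal rather than degree-proportional.

```python
# Illustrative toy scheme (NOT the paper's construction):
# label(v) = (v, sorted neighbors of v).

def make_labels(edges):
    """Build labels for every vertex from an undirected edge list."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return {v: (v, tuple(sorted(ns))) for v, ns in nbrs.items()}

def adjacent(label_u, label_v):
    """Decide adjacency using only the two labels, no shared data structure."""
    u, nu = label_u
    v, nv = label_v
    return v in nu or u in nv

# A small "hub" graph: vertex 0 plays the high-degree role.
labels = make_labels([(0, 1), (0, 2), (0, 3), (1, 2)])
print(adjacent(labels[0], labels[3]))   # hub is adjacent to 3
print(adjacent(labels[1], labels[3]))   # 1 and 3 are not adjacent
```

The label length here is proportional to a vertex's degree; schemes like the one in the abstract compress this so that even high-degree "hub" vertices get labels of near-optimal length.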
Inquiry-based learning in the Humanities: Moving from topics to problems using "the Humanities Imagination"