Maximizing the Diversity of Exposure in a Social Network
Social-media platforms have created new ways for citizens to stay informed
and participate in public debates. However, to enable a healthy environment for
information sharing, social deliberation, and opinion formation, citizens need
to be exposed to sufficiently diverse viewpoints that challenge their
assumptions, instead of being trapped inside filter bubbles. In this paper, we
take a step in this direction and propose a novel approach to maximize the
diversity of exposure in a social network. We formulate the problem in the
context of information propagation, as a task of recommending a small number of
news articles to selected users. We propose a realistic setting where we take
into account content and user leanings, and the probability of further sharing
an article. This setting allows us to capture the balance between maximizing
the spread of information and ensuring the exposure of users to diverse
viewpoints.
The resulting problem can be cast as maximizing a monotone and submodular
function subject to a matroid constraint on the allocation of articles to
users. It is a challenging generalization of the influence maximization
problem. Yet, we are able to devise scalable approximation algorithms by
introducing a novel extension to the notion of random reverse-reachable sets.
We experimentally demonstrate the efficiency and scalability of our algorithm
on several real-world datasets.
The Minimum Description Length Principle for Pattern Mining: A Survey
This is about the Minimum Description Length (MDL) principle applied to
pattern mining. The length of this description is kept to the minimum.
Mining patterns is a core task in data analysis and, beyond issues of
efficient enumeration, the selection of patterns constitutes a major challenge.
The MDL principle, a model selection method grounded in information theory, has
been applied to pattern mining with the aim to obtain compact high-quality sets
of patterns. After giving an outline of relevant concepts from information
theory and coding, as well as of work on the theory behind the MDL and similar
principles, we review MDL-based methods for mining various types of data and
patterns. Finally, we open a discussion on some issues regarding these methods,
and highlight currently active related data analysis problems.
Phrase table pruning for Statistical Machine Translation
Phrase-Based Statistical Machine Translation systems model the translation process using pairs of corresponding sequences of words extracted from parallel corpora. These biphrases are stored in phrase tables that typically contain several million such entries, making it difficult to assess their quality without going to the end of the translation process. Our work is based on an exemplifying study of phrase tables generated from the Europarl data, from French to English. We give some statistical information about the biphrases contained in the phrase table, evaluate the coverage of previously unseen sentences and analyse the effects of pruning on the translation
Speech intelligibility of English, Polish, Arabic and Mandarin under different room acoustic conditions