Time Dynamic Topic Models

Author: Jähnichen Patrick
Publication date: 22/03/2016

Information extraction from large corpora can be a useful tool for many applications in industry and academia. For instance, political communication science has only recently begun to use the opportunities that come with the massive amounts of information available through the Internet and the computational tools that natural language processing can provide. We give a linguistically motivated interpretation of topic modeling, a state-of-the-art algorithm for extracting latent semantic sets of words from large text corpora, and extend this interpretation to cover issues and issue-cycles as theoretical constructs coming from political communication science. We build on a dynamic topic model, a model whose semantic sets of words are allowed to evolve over time governed by a Brownian motion stochastic process, and apply a new form of analysis to its result. Generally, this analysis is based on the notion of volatility, as in the rate of change of stocks or derivatives known from econometrics. We claim that the rate of change of sets of semantically related words can be interpreted as issue-cycles, and the word sets as describing the underlying issue. Generalizing over the existing work, we introduce dynamic topic models that are driven by general Gaussian processes (Brownian motion is a special case of our model), a family of stochastic processes defined by the function that determines their covariance structure. We use the above assumption and apply a certain class of covariance functions to allow for an appropriate rate of change in word sets while preserving the semantic relatedness among words. Applying our findings to a large newspaper data set, the New York Times Annotated corpus (all articles between 1987 and 2007), we are able to identify sub-topics in time (time-localized topics) and find patterns in their behavior over time. However, we have to drop the assumption of semantic relatedness over all available time for any one topic.
Time-localized topics are consistent in themselves but do not necessarily share semantic meaning between each other. They can, however, be interpreted to capture the notion of issues, and their behavior that of issue-cycles.

Qucosa - Publikationsserver der Universität Leipzig

Time Dynamic Topic Models

Author: Jähnichen Patrick
Publication date: 22/03/2016

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Qucosa - Publikationsserver der Universität Leipzig

Time Dynamic Topic Models

Author: Jähnichen Patrick

HSSS - Hochschulschriftenserver der SLUB

The praxis of social knowledge federation

Author: Bleier Arnim
Jähnichen Patrick
Maicher Lutz
Schulze Uta

There are currently two streams that dominate the research on knowledge federation: one is the trend towards Linked Data, leading to fine-grained structuring of information that is machine readable; the other is the reuse and co-creation of information, which spreads the burden of its creation to the public and enables the availability of large knowledge corpora. In this contribution we outline the design principles and architecture of a prototype platform that harnesses the praxis of user behavior to tackle issues of signal-to-noise ratio as well as corrupting bits of information that impair the machine readability of such distributedly generated content.

Fraunhofer-ePrints

VIST - a Variant-Information Search Tool for precision oncology

Author: A Singhal
BJ Ainscough
C-H Wei
D Chakravarty
Damian Rieke
David Luis Wiegandt
F Pedregosa
H Yuan
J Kim
J Köhler
J Pfeiffer
J Starlinger
J Ševa
Johannes Starlinger
Julian Götze
Jurica Ševa
K Roberts
KD Doig
Kelsy C Cotto
KS Hughes
L Huang
LA Garraway
M Griffith
M Habibi
Madeleine Kittner
Mario Lamping
MJ Landrum
N Fiorini
P Ernst
P Liu
P Thomas
Patrick Jähnichen
R Leaman
Reinhold Schäfer
Simon Baker
SL Topalian
Steffen Pallarz
Ulf Leser
Ulrich Keilholz
Publication venue: Springer Science and Business Media LLC

Crossref