Contemporary American cartographic research: a review and prospective

Author: Clarke Keith C
Johnson J Michael
Trainor Tim
Publication venue: eScholarship, University of California
Publication date: 04/05/2019

eScholarship - University of California

Exploratory topic modeling with distributional semantics

Author: A Treisman
DA Keim
DM Blei
J Risch
L Barth
M Bostock
S Fortunato
S Lohmann
S Palmer
Y Bengio
Publication date: 16/07/2015

As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge about the content is required and highly open-ended tasks can be supported. In the past few years, probabilistic topic modeling has emerged as a popular approach to this problem. Nevertheless, the representation of the latent topics as aggregations of semi-coherent terms limits their interpretability and level of detail. This paper presents an alternative approach to topic modeling that maps topics as a network for exploration, based on distributional semantics using learned word vectors. From the granular level of terms and their semantic similarity relations, global topic structures emerge as clustered regions and gradients of concepts. Moreover, the paper discusses the visual interactive representation of the topic map, which plays an important role in supporting its exploration.

Comment: Conference: The Fourteenth International Symposium on Intelligent Data Analysis (IDA 2015)

arXiv.org e-Print Archive

Crossref

Topic Map Generation Using Text Mining

Author: Böhm Karsten
Heyer Gerhard
Quasthoff Uwe
Wolff Christian
Publication venue: Springer Verlag
Publication date: 28/06/2002

Starting from text corpus analysis with linguistic and statistical analysis algorithms, an infrastructure for text mining is described which uses collocation analysis as a central tool. This text mining method may be applied to different domains as well as languages. Some examples taken from large reference databases motivate the applicability to knowledge management using declarative standards of information structuring and description. The ISO/IEC Topic Map standard is introduced as a candidate for rich metadata description of information resources, and it is shown how text mining can be used for automatic topic map generation.

University of Regensburg Publication Server

Mapping Topics and Topic Bursts in PNAS

Author: Börner Katy
Mane Ketan
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 13/02/2004

Scientific research is highly dynamic. New areas of science continually evolve; others gain or lose importance, merge or split. Due to the steady increase in the number of scientific publications, it is hard to keep an overview of the structure and dynamic development of one's own field of science, much less all scientific domains. However, knowledge of hot topics, emergent research frontiers, or change of focus in certain areas is a critical component of resource allocation decisions in research labs, governmental institutions, and corporations. This paper demonstrates the utilization of Kleinberg's burst detection algorithm, co-word occurrence analysis, and graph layout techniques to generate maps that support the identification of major research topics and trends. The approach was applied to analyze and map the complete set of papers published in the Proceedings of the National Academy of Sciences (PNAS) in the years 1982-2001. Six domain experts examined and commented on the resulting maps in an attempt to reconstruct the evolution of major research areas covered by PNAS.

arXiv.org e-Print Archive

Crossref

PubMed Central

Analyzing the Language of Food on Social Media

Author: Bell Dane
Fried Daniel
Hingle Melanie
Kobourov Stephen
Surdeanu Mihai
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 11/09/2014

We investigate the predictive power behind the language of food on social media. We collect a corpus of over three million food-related posts from Twitter and demonstrate that many latent population characteristics can be directly predicted from this data: overweight rate, diabetes rate, political leaning, and home geographical location of authors. For all tasks, our language-based models significantly outperform the majority-class baselines. Performance is further improved with more complex natural language processing, such as topic modeling. We analyze which textual features have most predictive power for these datasets, providing insight into the connections between the language of food, geographic locale, and community characteristics. Lastly, we design and implement an online system for real-time query and visualization of the dataset. Visualization tools, such as geo-referenced heatmaps, semantics-preserving wordclouds and temporal histograms, allow us to discover more complex, global patterns mirrored in the language of food.

Comment: An extended abstract of this paper will appear in IEEE Big Data 201

arXiv.org e-Print Archive

Crossref

Automated construction and analysis of political networks via open government and media sources

Author: Arias Vicente Marta
García-Olano Diego
Larriba Pey Josep
Publication date: 01/01/2016

We present a tool to generate real-world political networks from user-provided lists of politicians and news sites. Additional output includes visualizations, interactive tools and maps that allow a user to better understand the politicians and their surrounding environments as portrayed by the media. As a case study, we construct a comprehensive list of current Texas politicians, select news sites that convey a spectrum of political viewpoints covering Texas politics, and examine the results. We propose a “Combined” co-occurrence distance metric to better reflect the relationship between two entities. A topic modeling technique is also proposed as a novel, automated way of labeling communities that exist within a politician’s “extended” network.

Peer Reviewed. Postprint (author's final draft)

UPCommons. Portal del coneixement obert de la UPC