Search CORE

408 research outputs found

Analyzing Large Collections of Electronic Text Using OLAP

Author: Kaser Owen
Keith Steven
Lemire Daniel
Publication venue
Publication date: 01/01/2005
Field of study

Computer-assisted reading and analysis of text has various applications in the humanities and social sciences. The increasing size of many electronic text archives has the advantage of a more complete analysis but the disadvantage of taking longer to obtain results. On-Line Analytical Processing is a method used to store and quickly analyze multidimensional data. By storing text analysis information in an OLAP system, a user can obtain solutions to inquiries in a matter of seconds as opposed to minutes, hours, or even days. This analysis is user-driven allowing various users the freedom to pursue their own direction of research

arXiv.org e-Print Archive

CiteSeerX

Recommended from our members

Semantics-Space-Time Cube. A Conceptual Framework for Systematic Analysis of Texts in Space and Time

Author: Andrienko G.
Andrienko N.
Chen S.
Chen W.
Li J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2018
Field of study

We propose an approach to analyzing data in which texts are associated with spatial and temporal references with the aim to understand how the text semantics vary over space and time. To represent the semantics, we apply probabilistic topic modeling. After extracting a set of topics and representing the texts by vectors of topic weights, we aggregate the data into a data cube with the dimensions corresponding to the set of topics, the set of spatial locations (e.g., regions), and the time divided into suitable intervals according to the scale of the planned analysis. Each cube cell corresponds to a combination (topic, location, time interval) and contains aggregate measures characterizing the subset of the texts concerning this topic and having the spatial and temporal references within these location and interval. Based on this structure, we systematically describe the space of analysis tasks on exploring the interrelationships among the three heterogeneous information facets, semantics, space, and time. We introduce the operations of projecting and slicing the cube, which are used to decompose complex tasks into simpler subtasks. We then present a design of a visual analytics system intended to support these subtasks. To reduce the complexity of the user interface, we apply the principles of structural, visual, and operational uniformity while respecting the specific properties of each facet. The aggregated data are represented in three parallel views corresponding to the three facets and providing different complementary perspectives on the data. The views have similar look-and-feel to the extent allowed by the facet specifics. Uniform interactive operations applicable to any view support establishing links between the facets. The uniformity principle is also applied in supporting the projecting and slicing operations on the data cube. We evaluate the feasibility and utility of the approach by applying it in two analysis scenarios using geolocated social media data for studying people's reactions to social and natural events of different spatial and temporal scales

City Research Online

Crossref

Fraunhofer-ePrints

A Descriptive Framework for Temporal Data Visualizations Based on Generalized Space-Time Cubes

Author: Ahn
Aigner
Aigner
Andrienko
Andrienko
Andrienko
Andrienko
Archambault
Archambault
Archambault
Bach
Bach
Bach
Bach
Baker
Bartoli
Becker
Borgo
Boyandin
Brandes
Brandes
Brunsdon
Burch
DemÅ¡ar
Dwyer
Elzen
Elzen
Farrugia
Feldman
Ghani
Gray
Griffen
Hadlak
Harris
Holman
Hurter
Hurter
Hurter
Jansen
Kristensson
MacEachren
Marey
Matthews
McCloud
McDonnel
Misue
Munzner
Muybridge
Petrovic
Rensink
Robertson
Roth
Rufiange
Schroeder
Shneiderman
Shneiderman
Smith
Thakur
Thudt
Todd
Tominski
Truong
Tufte
Turdukulov
Tversky
Vrotsou
Wanger
Ware
Ware
Publication venue: 'Wiley'
Publication date: 01/01/2016
Field of study

International audienceWe present the generalized space-time cube, a descriptive model for visualizations of temporal data. Visualizations are described as operations on the cube, which transform the cube's 3D shape into readable 2D visualizations. Operations include extracting subparts of the cube, flattening it across space or time or transforming the cubes geometry and content. We introduce a taxonomy of elementary space-time cube operations and explain how these operations can be combined and parameterized. The generalized space-time cube has two properties: (1) it is purely conceptual without the need to be implemented, and (2) it applies to all datasets that can be represented in two dimensions plus time (e.g. geo-spatial, videos, networks, multivariate data). The proper choice of space-time cube operations depends on many factors, for example, density or sparsity of a cube. Hence, we propose a characterization of structures within space-time cubes, which allows us to discuss strengths and limitations of operations. We finally review interactive systems that support multiple operations, allowing a user to customize his view on the data. With this framework, we hope to facilitate the description, criticism and comparison of temporal data visualizations, as well as encourage the exploration of new techniques and systems. This paper is an extension of Bach et al.'s (2014) work

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Cronfa at Swansea University

Hal-Diderot

HAL - UPEC / UPEM

HAL-Rennes 1

Integration of Data Mining and Data Warehousing: a practical methodology

Author: Pears R
Usman M
Publication venue: Advanced Institute of Convergence IT (AICIT)
Publication date: 12/08/2011
Field of study

The ever growing repository of data in all fields poses new challenges to the modern analytical systems. Real-world datasets, with mixed numeric and nominal variables, are difficult to analyze and require effective visual exploration that conveys semantic relationships of data. Traditional data mining techniques such as clustering clusters only the numeric data. Little research has been carried out in tackling the problem of clustering high cardinality nominal variables to get better insight of underlying dataset. Several works in the literature proved the likelihood of integrating data mining with warehousing to discover knowledge from data. For the seamless integration, the mined data has to be modeled in form of a data warehouse schema. Schema generation process is complex manual task and requires domain and warehousing familiarity. Automated techniques are required to generate warehouse schema to overcome the existing dependencies. To fulfill the growing analytical needs and to overcome the existing limitations, we propose a novel methodology in this paper that permits efficient analysis of mixed numeric and nominal data, effective visual data exploration, automatic warehouse schema generation and integration of data mining and warehousing. The proposed methodology is evaluated by performing case study on real-world data set. Results show that multidimensional analysis can be performed in an easier and flexible way to discover meaningful knowledge from large datasets

AUT Scholarly Commons