General Latent Feature Modeling for Data Exploration Tasks
This paper introduces a general Bayesian nonparametric latent feature model
suitable for automatic exploratory analysis of heterogeneous datasets,
where the attributes describing each object can be either discrete, continuous
or mixed variables. The proposed model presents several important properties.
First, it accounts for heterogeneous data while allowing inference in time
linear in the number of objects and attributes. Second, its Bayesian
nonparametric nature allows us to automatically infer the model complexity from
the data, i.e., the number of features necessary to capture the latent
structure in the data. Third, the latent features in the model are
binary-valued variables, which eases the interpretation of the obtained latent
features in data exploration tasks.
Semantics-Space-Time Cube. A Conceptual Framework for Systematic Analysis of Texts in Space and Time
We propose an approach to analyzing data in which texts are associated with spatial and temporal references, with the aim of understanding how the text semantics vary over space and time. To represent the semantics, we apply probabilistic topic modeling. After extracting a set of topics and representing the texts by vectors of topic weights, we aggregate the data into a data cube with the dimensions corresponding to the set of topics, the set of spatial locations (e.g., regions), and the time divided into suitable intervals according to the scale of the planned analysis. Each cube cell corresponds to a combination (topic, location, time interval) and contains aggregate measures characterizing the subset of the texts concerning this topic and having spatial and temporal references within this location and interval. Based on this structure, we systematically describe the space of analysis tasks for exploring the interrelationships among the three heterogeneous information facets: semantics, space, and time. We introduce the operations of projecting and slicing the cube, which are used to decompose complex tasks into simpler subtasks. We then present a design of a visual analytics system intended to support these subtasks. To reduce the complexity of the user interface, we apply the principles of structural, visual, and operational uniformity while respecting the specific properties of each facet. The aggregated data are represented in three parallel views corresponding to the three facets and providing different complementary perspectives on the data. The views have a similar look-and-feel to the extent allowed by the facet specifics. Uniform interactive operations applicable to any view support establishing links between the facets. The uniformity principle is also applied in supporting the projecting and slicing operations on the data cube.
We evaluate the feasibility and utility of the approach by applying it in two analysis scenarios using geolocated social media data for studying people's reactions to social and natural events of different spatial and temporal scales.
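
The cube aggregation, projection, and slicing described in the abstract above can be illustrated with a minimal sketch. It assumes that each text already has a vector of topic weights, an integer location index, and an integer time-interval index, and that the aggregate measure in each cell is simply the sum of topic weights. The function names (build_cube, project, slice_cube) and the toy data are hypothetical and are not taken from the paper, which may use different measures and data structures.

```python
import numpy as np

def build_cube(topic_weights, location_ids, interval_ids, n_locations, n_intervals):
    """Sum per-text topic weights into a (topic, location, interval) cube.

    Assumption: the aggregate measure per cell is the summed topic weight.
    """
    topic_weights = np.asarray(topic_weights, dtype=float)
    n_topics = topic_weights.shape[1]
    cube = np.zeros((n_topics, n_locations, n_intervals))
    for weights, loc, t in zip(topic_weights, location_ids, interval_ids):
        # Each text contributes its topic weights to one (location, interval) column.
        cube[:, loc, t] += weights
    return cube

def project(cube, axis):
    """Projection: aggregate one facet away (0 = topic, 1 = location, 2 = time)."""
    return cube.sum(axis=axis)

def slice_cube(cube, axis, index):
    """Slicing: fix one facet at a single value and keep the other two."""
    return np.take(cube, index, axis=axis)

# Toy data (hypothetical): three texts, two topics, two locations, two intervals.
topic_weights = [[0.75, 0.25], [0.5, 0.5], [0.0, 1.0]]
location_ids = [0, 1, 1]
interval_ids = [0, 0, 1]

cube = build_cube(topic_weights, location_ids, interval_ids,
                  n_locations=2, n_intervals=2)
topic_by_location = project(cube, axis=2)      # time aggregated away
topic_1_slice = slice_cube(cube, axis=0, index=1)  # location x time for topic 1
```

Projecting removes one facet so that the remaining two can be compared directly, while slicing restricts the analysis to a single topic, location, or interval; combining the two is one way a complex task could be broken into simpler subtasks.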