18,384 research outputs found
Collaborative OLAP with Tag Clouds: Web 2.0 OLAP Formalism and Experimental Evaluation
Increasingly, business projects are ephemeral. New Business Intelligence
tools must support ad-lib data sources and quick perusal. Meanwhile, tag clouds
are a popular community-driven visualization technique. Hence, we investigate
tag-cloud views with support for OLAP operations such as roll-ups, slices,
dices, clustering, and drill-downs. As a case study, we implemented an
application where users can upload data and immediately navigate through its ad
hoc dimensions. To support social networking, views can be easily shared and
embedded in other Web sites. Algorithmically, our tag-cloud views are
approximate range top-k queries over spontaneous data cubes. We present
experimental evidence that iceberg cuboids provide adequate online
approximations. We benchmark several browser-oblivious tag-cloud layout
optimizations.
Comment: Software at https://github.com/lemire/OLAPTagClou
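As a rough illustration of answering a tag-cloud (top-k) query from an iceberg cuboid, the sketch below keeps only the group-by cells whose count reaches a threshold and then ranks tags from those cells. It is a minimal sketch, not the authors' implementation: the function names, the sample rows, and the threshold are all hypothetical.

```python
from collections import Counter

def iceberg_cuboid(rows, dims, min_count):
    """Group rows by `dims` and keep only cells with at least `min_count` rows."""
    counts = Counter(tuple(row[d] for d in dims) for row in rows)
    return {cell: n for cell, n in counts.items() if n >= min_count}

def top_k_tags(cuboid, tag_index, k, predicate=lambda cell: True):
    """Approximate top-k tags from the cells that satisfy `predicate` (a slice or dice)."""
    totals = Counter()
    for cell, n in cuboid.items():
        if predicate(cell):
            totals[cell[tag_index]] += n
    return totals.most_common(k)

# Hypothetical sample data: one row per sale.
rows = (
    [{"city": "Paris", "product": "tea"}] * 3
    + [{"city": "Paris", "product": "coffee"}] * 2
    + [{"city": "Rome", "product": "tea"}] * 1
    + [{"city": "Rome", "product": "coffee"}] * 4
    + [{"city": "Rome", "product": "milk"}] * 1
)

cuboid = iceberg_cuboid(rows, dims=("city", "product"), min_count=2)

# Top-2 products over all cities, computed from the iceberg cells only.
overall = top_k_tags(cuboid, tag_index=1, k=2)

# Same query restricted to the slice city == "Paris".
paris = top_k_tags(cuboid, tag_index=1, k=2, predicate=lambda cell: cell[0] == "Paris")
```

In this toy data the exact count for "tea" is 4, while the iceberg-based answer reports 3 because the single Rome row falls below the threshold. The ranking is unchanged, which is the kind of approximation the abstract refers to.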
Interactive Data Exploration with Smart Drill-Down
We present smart drill-down, an operator for interactively exploring a
relational table to discover and summarize "interesting" groups of tuples. Each
group of tuples is described by a rule. For instance, the rule (a, b, ⋆, 1000)
tells us that there are a thousand tuples with value a in the first column and
b in the second column (and any value in the third column).
Smart drill-down presents an analyst with a list of rules that together
describe interesting aspects of the table. The analyst can tailor the
definition of interesting, and can interactively apply smart drill-down on an
existing rule to explore that part of the table. We demonstrate that the
underlying optimization problems are NP-Hard, and describe an algorithm
for finding the approximately optimal list of rules to display when the user
uses a smart drill-down, and a dynamic sampling scheme for efficiently
interacting with large tables. Finally, we perform experiments on real datasets
on our experimental prototype to demonstrate the usefulness of smart drill-down
and study the performance of our algorithms.
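To make the notion of a rule concrete, the sketch below represents a rule as a tuple with wildcards, counts the tuples it covers, and expands one wildcard column into more specific rules. This is only an illustration of rule coverage: it does not implement the paper's scoring, optimization, or sampling, and the names `STAR`, `covers`, `rule_count`, and `drill_down` are hypothetical.

```python
from collections import Counter

STAR = "*"  # wildcard: matches any value in that column

def covers(rule, row):
    """True if every non-wildcard entry of `rule` equals the row's value."""
    return all(r == STAR or r == v for r, v in zip(rule, row))

def rule_count(rule, table):
    """Number of tuples in `table` covered by `rule`."""
    return sum(covers(rule, row) for row in table)

def drill_down(rule, table, column):
    """Replace the wildcard in `column` by each observed value, most frequent first."""
    if rule[column] != STAR:
        raise ValueError("column is already fixed in this rule")
    counts = Counter(row[column] for row in table if covers(rule, row))
    return [
        (rule[:column] + (value,) + rule[column + 1:], n)
        for value, n in counts.most_common()
    ]

# Hypothetical three-column table.
table = [
    ("a", "b", "x"),
    ("a", "b", "y"),
    ("a", "c", "x"),
    ("d", "b", "x"),
]

n_ab = rule_count(("a", "b", STAR), table)        # tuples with a, b, and anything
sub_rules = drill_down(("a", STAR, STAR), table, column=1)
```

Here `sub_rules` lists the refinements of `("a", "*", "*")` on the second column together with their counts, which is the simplest form of drilling down on an existing rule.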
Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses
A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, whether a total sales amount of 1,000 items indicates good or bad sales performance is still unclear. From the decision makers' point of view, the semantics that convey the meaning of the data, rather than the raw numbers, are very important. In this paper, we explore the use of fuzzy technology to provide these semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses.
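The step from a numerical summary to a categorical one can be sketched with fuzzy membership functions, as below. This is a minimal sketch under stated assumptions, not the paper's model or operators: the linguistic terms, the trapezoid breakpoints, and the function names are all hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Membership that rises from a to b, equals 1 on [b, c], and falls from c to d."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

INF = float("inf")

# Hypothetical linguistic terms for "total items sold".
TERMS = {
    "low": (0, 0, 500, 1000),
    "medium": (500, 1000, 1500, 2000),
    "high": (1500, 2000, INF, INF),
}

def memberships(total):
    """Degree to which `total` belongs to each linguistic term."""
    return {term: trapezoid(total, *shape) for term, shape in TERMS.items()}

def describe(total):
    """Linguistic term with the highest membership degree."""
    degrees = memberships(total)
    return max(degrees, key=degrees.get)

label_1000 = describe(1000)
degrees_750 = memberships(750)
```

With these assumed breakpoints, a total of 1,000 items is fully "medium", while 750 is half "low" and half "medium". The point is only that the label depends on the chosen term definitions, which is the semantics the raw number lacks.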
A model for Business Intelligence Systems’ Development
Often, Business Intelligence Systems (BIS) require historical data or data collected from various sources. The solution is found in data warehouses, which are the main technology used to extract, transform, load and store data in organizational Business Intelligence projects. The development cycle of a data warehouse involves considerable resources, time and cost and, above all, the warehouse is built only for some specific tasks. In this paper, we present some aspects of BI systems' development: architecture, lifecycle, modeling techniques and, finally, some evaluation criteria for the system's performance.
Keywords: BIS (Business Intelligence Systems), Data Warehouses, OLAP (On-Line Analytical Processing), Object-Oriented Modeling
Data Management and Mining in Astrophysical Databases
We analyse the issues involved in the management and mining of astrophysical
data. The traditional approach to data management in the astrophysical field is
not able to keep up with the increasing size of the data gathered by modern
detectors. Automatic tools for extracting information from large datasets,
i.e. data mining techniques such as clustering and classification algorithms,
will play an essential role in astrophysical research. This calls
for an approach to data management based on data warehousing, emphasizing the
efficiency and simplicity of data access; efficiency is obtained using
multidimensional access methods and simplicity is achieved by properly handling
metadata. Clustering and classification techniques, on large datasets, pose
additional requirements: computational and memory scalability with respect to
the data size, interpretability and objectivity of clustering or classification
results. In this study we address some possible solutions.
Comment: 10 pages, Late
Attribute oriented induction with star schema
This paper proposes a novel star schema attribute induction as a new attribute
induction paradigm that improves on current attribute oriented induction. We
compare star schema attribute induction with current attribute oriented
induction, based on characteristic rules and a non-rule-based concept
hierarchy, by implementing both approaches. Star schema attribute induction
introduces several improvements: it eliminates the threshold number that
controls the maximum number of tuples in the generalization result, it does
not use ANY as the most general concept, it replaces the concept hierarchy
with a concept tree, it simplifies the steps of the generalization strategy,
and it eliminates the attribute oriented induction algorithm. Star schema
attribute induction is more powerful than current attribute oriented induction
because it produces a small number of final generalization tuples and no ANY
appears in the results.
Comment: 23 Pages, IJDM
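For readers unfamiliar with attribute oriented induction, the basic generalization step can be sketched as follows: each attribute value is replaced by its parent in a concept tree, and tuples that become identical are merged with a count. This illustrates the classic generalization idea only, not the star schema variant proposed in the paper; the concept tree, the sample rows, and the function names are hypothetical.

```python
from collections import Counter

def climb(value, parent_of, steps=1):
    """Move `value` up the concept tree `steps` levels; values without a parent stay put."""
    for _ in range(steps):
        value = parent_of.get(value, value)
    return value

def generalize(rows, concept_trees, steps=1):
    """Generalize each column that has a concept tree and merge identical tuples."""
    merged = Counter()
    for row in rows:
        general = tuple(
            climb(value, concept_trees.get(i, {}), steps)
            for i, value in enumerate(row)
        )
        merged[general] += 1
    return merged

# Hypothetical concept tree for column 0 (child -> parent).
city_tree = {"Paris": "France", "Lyon": "France", "Rome": "Italy"}

rows = [
    ("Paris", "tea"),
    ("Lyon", "tea"),
    ("Rome", "tea"),
    ("Paris", "coffee"),
]

result = generalize(rows, {0: city_tree})
```

The four input tuples collapse into three generalized tuples, with the two French tea rows merged into one tuple of count 2.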