3,994 research outputs found
A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing
The overwhelmingly increasing amount of stored data has spurred researchers
seeking different methods in order to optimally take advantage of it which
mostly have faced a response time problem as a result of this enormous size of
data. Most of solutions have suggested materialization as a favourite solution.
However, such a solution cannot attain Real- Time answers anyhow. In this paper
we propose a framework illustrating the barriers and suggested solutions in the
way of achieving Real-Time OLAP answers that are significantly used in decision
support systems and data warehouses
Mining Event Logs to Support Workflow Resource Allocation
Workflow technology is widely used to facilitate the business process in
enterprise information systems (EIS), and it has the potential to reduce design
time, enhance product quality and decrease product cost. However, significant
limitations still exist: as an important task in the context of workflow, many
present resource allocation operations are still performed manually, which are
time-consuming. This paper presents a data mining approach to address the
resource allocation problem (RAP) and improve the productivity of workflow
resource management. Specifically, an Apriori-like algorithm is used to find
the frequent patterns from the event log, and association rules are generated
according to predefined resource allocation constraints. Subsequently, a
correlation measure named lift is utilized to annotate the negatively
correlated resource allocation rules for resource reservation. Finally, the
rules are ranked using the confidence measures as resource allocation rules.
Comparative experiments are performed using C4.5, SVM, ID3, Na\"ive Bayes and
the presented approach, and the results show that the presented approach is
effective in both accuracy and candidate resource recommendations.Comment: T. Liu et al., Mining event logs to support workflow resource
allocation, Knowl. Based Syst. (2012), http://dx.doi.org/
10.1016/j.knosys.2012.05.01
Efficient Evaluation of Sparse Data Cubes
Computing data cubes requires the aggregation of measures over arbitrary combinations of dimensions in a data set. Efficient data cube evaluation remains challenging because of the potentially very large sizes of input datasets (e.g., in the data warehousing context), the well-known curse of dimensionality, and the complexity of queries that need to be supported. This paper proposes a new dynamic data structure called SST (Sparse Statistics Trees) and a novel, in-teractive, and fast cube evaluation algorithm called CUPS (Cubing by Pruning SST), which is especially well suitable for computing aggregates in cubes whose data sets are sparse. SST only stores the aggregations of non-empty cube cells instead of the detailed records. Furthermore, it retains in memory the dense cubes (a.k.a. iceberg cubes) whose aggregate values are above a threshold. Sparse cubes are stored on disks. This allows a fast, accurate approximation for queries. If users desire more refined answers, related sparse cubes are aggregated. SST is incrementally maintainable, which makes CUPS suitable for data warehousing and analysis of streaming data. Experiment results demonstrate the excellent performance and good scalability of our approach
Attribute Value Reordering For Efficient Hybrid OLAP
The normalization of a data cube is the ordering of the attribute values. For
large multidimensional arrays where dense and sparse chunks are stored
differently, proper normalization can lead to improved storage efficiency. We
show that it is NP-hard to compute an optimal normalization even for 1x3
chunks, although we find an exact algorithm for 1x2 chunks. When dimensions are
nearly statistically independent, we show that dimension-wise attribute
frequency sorting is an optimal normalization and takes time O(d n log(n)) for
data cubes of size n^d. When dimensions are not independent, we propose and
evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is
already 19%-30% more efficient than ROLAP, but normalization can improve it
further by 9%-13% for a total gain of 29%-44% over ROLAP
Integration of Data Mining and Data Warehousing: a practical methodology
The ever growing repository of data in all fields poses new challenges to the modern analytical
systems. Real-world datasets, with mixed numeric and nominal variables, are difficult to analyze and require effective visual exploration that conveys semantic relationships of data. Traditional data mining techniques such as clustering clusters only the numeric data. Little research has been carried out in tackling the problem of clustering high cardinality nominal variables to get better insight of underlying dataset. Several works in the literature proved the likelihood of integrating data mining with warehousing to discover knowledge from data. For the seamless integration, the mined data has
to be modeled in form of a data warehouse schema. Schema generation process is complex manual
task and requires domain and warehousing familiarity. Automated techniques are required to generate warehouse schema to overcome the existing dependencies. To fulfill the growing analytical needs and to overcome the existing limitations, we propose a novel methodology in this paper that permits efficient analysis of mixed numeric and nominal data, effective visual data exploration, automatic warehouse schema generation and integration of data mining and warehousing. The proposed methodology is evaluated by performing case study on real-world data set. Results show that multidimensional analysis can be performed in an easier and flexible way to discover meaningful
knowledge from large datasets
Multi-agent evolutionary systems for the generation of complex virtual worlds
Modern films, games and virtual reality applications are dependent on
convincing computer graphics. Highly complex models are a requirement for the
successful delivery of many scenes and environments. While workflows such as
rendering, compositing and animation have been streamlined to accommodate
increasing demands, modelling complex models is still a laborious task. This
paper introduces the computational benefits of an Interactive Genetic Algorithm
(IGA) to computer graphics modelling while compensating the effects of user
fatigue, a common issue with Interactive Evolutionary Computation. An
intelligent agent is used in conjunction with an IGA that offers the potential
to reduce the effects of user fatigue by learning from the choices made by the
human designer and directing the search accordingly. This workflow accelerates
the layout and distribution of basic elements to form complex models. It
captures the designer's intent through interaction, and encourages playful
discovery
- …