1,418 research outputs found

A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing

Authors: Alzeini H I; Habaebi M H; Hameed Sh A
Publication date: 01/12/2013

The rapidly growing volume of stored data has spurred researchers to seek methods for exploiting it, most of which have run into response-time problems caused by the sheer size of the data. Most proposed solutions favour materialization. However, such a solution cannot attain Real-Time answers. In this paper we propose a framework illustrating the barriers to, and suggested solutions for, achieving Real-Time OLAP answers, which are widely used in decision support systems and data warehouses.

arXiv.org e-Print Archive

Crossref

The International Islamic University Malaysia Repository

A high-accuracy optical linear algebra processor for finite element applications

Authors: Casasent D.; Taylor B. K.

Optical linear processors are computationally efficient computers for solving matrix-matrix and matrix-vector oriented problems. Optical system errors limit their dynamic range to 30-40 dB, which limits their accuracy to 9-12 bits. Large problems, such as the finite element problem in structural mechanics (with tens or hundreds of thousands of variables), which can exploit the speed of optical processors, require the 32 bit accuracy obtainable from digital machines. To obtain this required 32 bit accuracy with an optical processor, the data can be digitally encoded, thereby reducing the dynamic range requirements of the optical system (i.e., decreasing the effect of optical errors on the data) while providing increased accuracy. This report describes a new digitally encoded optical linear algebra processor architecture for solving finite element and banded matrix-vector problems. A linear static plate bending case study is described which quantifies the processor requirements. Multiplication by digital convolution is explained, and the digitally encoded optical processor architecture is advanced.
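
The abstract mentions multiplication by digital convolution. As a rough illustration of the general idea only (not the report's optical architecture), the product of two integers can be obtained by convolving their digit sequences and then weighting each entry by its positional value. The function name and the default base 2 below are assumptions made for this sketch.

```python
def multiply_by_convolution(a: int, b: int, base: int = 2) -> int:
    """Multiply two non-negative integers by convolving their digit sequences.

    Illustrative sketch; the name and the default base are assumptions.
    """
    def digits(n: int) -> list:
        # Least-significant digit first; zero is represented as [0].
        out = []
        while n:
            n, r = divmod(n, base)
            out.append(r)
        return out or [0]

    da, db = digits(a), digits(b)

    # Discrete convolution of the two digit sequences. Entries of the
    # result may exceed base - 1, because no carries are propagated here.
    conv = [0] * (len(da) + len(db) - 1)
    for i, x in enumerate(da):
        for j, y in enumerate(db):
            conv[i + j] += x * y

    # Weighting entry k by base**k resolves the carries implicitly.
    return sum(c * base**k for k, c in enumerate(conv))
```

Because the convolution entries are small sums of digit products, each intermediate value needs far less dynamic range than the full-precision operands, which is the property the abstract attributes to digital encoding.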

NASA Technical Reports Server

The End of Slow Networks: It's Time for a Redesign

Authors: Binnig Carsten; Crotty Andrew; Galakatos Alex; Kraska Tim; Zamanian Erfan
Publication date: 19/12/2015

Next generation high-performance RDMA-capable networks will require a fundamental rethinking of the design and architecture of modern distributed DBMSs. These systems are commonly designed and optimized under the assumption that the network is the bottleneck: the network is slow and "thin", and thus needs to be avoided as much as possible. Yet this assumption no longer holds true. With InfiniBand FDR 4x, the bandwidth available to transfer data across the network is in the same ballpark as the bandwidth of one memory channel, and it increases even further with the most recent EDR standard. Moreover, with the increasing advances of RDMA, the latency improves similarly fast. In this paper, we first argue that the "old" distributed database design is not capable of taking full advantage of the network. Second, we propose architectural redesigns for OLTP, OLAP and advanced analytical frameworks to take better advantage of the improved bandwidth, latency and RDMA capabilities. Finally, for each of the workload categories, we show that remarkable performance improvements can be achieved.

arXiv.org e-Print Archive

TUbiblio

Integrating E-Commerce and Data Mining: Architecture and Challenges

Authors: Ansari Suhail; Kohavi Ron; Mason Llew; Zheng Zijian
Publication date: 01/01/2000

We show that the e-commerce domain can provide all the right ingredients for successful data mining and claim that it is a killer domain for data mining. We describe an integrated architecture, based on our experience at Blue Martini Software, for supporting this integration. The architecture can dramatically reduce the pre-processing, cleaning, and data understanding effort often documented to take 80% of the time in knowledge discovery projects. We emphasize the need for data collection at the application server layer (not the web server) in order to support logging of data and metadata that is essential to the discovery process. We describe the data transformation bridges required from the transaction processing systems and customer event streams (e.g., clickstreams) to the data warehouse. We detail the mining workbench, which needs to provide multiple views of the data through reporting, data mining algorithms, visualization, and OLAP. We conclude with a set of challenges. Comment: KDD workshop: WebKDD 200

arXiv.org e-Print Archive

CiteSeerX

Efficient Multi-way Theta-Join Processing Using MapReduce

Authors: Chen Lei; Wang Min; Zhang Xiaofei
Publication date: 01/01/2012

Multi-way Theta-join queries are powerful in describing complex relations and are therefore widely employed in practice. However, existing solutions from traditional distributed and parallel databases for multi-way Theta-join queries cannot be easily extended to fit a shared-nothing distributed computing paradigm, which is proven to be able to support OLAP applications over immense data volumes. In this work, we study the problem of efficient processing of multi-way Theta-join queries using MapReduce from a cost-effective perspective. Although there have been some works using the (key,value) pair-based programming model to support join operations, efficient processing of multi-way Theta-join queries has never been fully explored. The substantial challenge lies in, given a number of processing units (that can run Map or Reduce tasks), mapping a multi-way Theta-join query to a number of MapReduce jobs and having them executed in a well scheduled sequence, such that the total processing time span is minimized. Our solution mainly includes two parts: 1) cost metrics for both a single MapReduce job and a number of MapReduce jobs executed in a certain order; 2) the efficient execution of a chain-typed Theta-join with only one MapReduce job. Compared with the query evaluation strategy proposed in [23] and the widely adopted Pig Latin and Hive SQL solutions, our method achieves significant improvement of the join processing efficiency. Comment: VLDB201
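
To make the setting concrete, the following is a minimal single-process sketch of how a two-way Theta-join can be spread over reducers with (key,value) pairs. It uses a generic grid partitioning of the cross product, which is an assumption chosen for illustration and not the paper's cost-based method. The function name, the `rows`/`cols` parameters, and the index-based assignment are all hypothetical.

```python
from collections import defaultdict
from itertools import product


def theta_join_grid(r, s, theta, rows=2, cols=2):
    """Evaluate a theta-join by partitioning R x S into a rows x cols grid.

    Illustrative sketch: each grid cell stands in for one reduce task.
    """
    # "Map" phase: an R tuple is assigned to one grid row and replicated to
    # every column of that row; an S tuple is assigned to one grid column
    # and replicated to every row of that column.
    cells = defaultdict(lambda: ([], []))
    for i, t in enumerate(r):
        row = i % rows
        for col in range(cols):
            cells[(row, col)][0].append(t)
    for j, u in enumerate(s):
        col = j % cols
        for row in range(rows):
            cells[(row, col)][1].append(u)

    # "Reduce" phase: each cell checks the predicate on its local pairs.
    # Every (t, u) pair meets in exactly one cell, so no duplicates arise.
    out = []
    for key in sorted(cells):
        r_part, s_part = cells[key]
        out.extend((t, u) for t, u in product(r_part, s_part) if theta(t, u))
    return out
```

For example, `theta_join_grid([1, 5, 9], [3, 7], lambda t, u: t < u)` returns the pairs satisfying the inequality, in cell order rather than sorted order. The replication factor grows with the grid size, which is the kind of trade-off a cost model would have to weigh.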

arXiv.org e-Print Archive

University of Memphis Digital Commons

CiteSeerX

Hong Kong University of Science and Technology Institutional Repository

HaoLap: a Hadoop based OLAP system for big data

Authors: Guo Chaopeng; Pierson Jean-Marc; Song Jie; Wang Zhi; Yu Ge; Zhang Yichan
Publication venue: Elsevier BV
Publication date: 01/04/2015

In recent years, facing an information explosion, industry and academia have adopted distributed file systems and the MapReduce programming model to address the new challenges that big data has brought. Based on these technologies, this paper presents HaoLap (Hadoop based oLap), an OLAP (OnLine Analytical Processing) system for big data. Drawing on the experience of Multidimensional OLAP (MOLAP), HaoLap adopts a specified multidimensional model to map the dimensions and the measures; a dimension coding and traversal algorithm to achieve the roll-up operation on dimension hierarchies; a partition and linearization algorithm to store dimensions and measures; a chunk selection algorithm to optimize OLAP performance; and MapReduce to execute OLAP. The paper illustrates the key techniques of HaoLap, including system architecture, dimension definition, dimension coding and traversing, partition, data storage, OLAP and the data loading algorithm. We evaluated HaoLap on a real application and compared it with Hive, HadoopDB, HBaseLattice, and Olap4Cloud. The experimental results show that HaoLap boosts the efficiency of data loading, has a great advantage in OLAP performance with respect to data set size and query complexity, and fully supports dimension operations.
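
The abstract does not spell out HaoLap's linearization algorithm. As a hedged illustration of what linearizing a multidimensional cell address can look like, the sketch below uses plain row-major ordering, a common scheme assumed here for the example. The function names and the tuple-based interface are hypothetical.

```python
def linearize(coords, sizes):
    """Map per-dimension codes to one integer index (row-major order).

    Illustrative sketch; not taken from the HaoLap paper.
    """
    index = 0
    for c, s in zip(coords, sizes):
        if not 0 <= c < s:
            raise ValueError(f"coordinate {c} outside dimension of size {s}")
        index = index * s + c
    return index


def delinearize(index, sizes):
    """Recover the per-dimension codes from a row-major linear index."""
    coords = []
    for s in reversed(sizes):
        index, c = divmod(index, s)
        coords.append(c)
    return tuple(reversed(coords))
```

With dimension sizes `(4, 3, 5)`, the cell `(2, 1, 3)` maps to index 38, and `delinearize(38, (4, 3, 5))` maps it back. Storing measures under such an index lets a multidimensional cube be laid out in a flat file or key space.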

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Multidimensional database modelling with differentiated multiple aggregations

Authors: Hassan Ali; Ravat Franck; Teste Olivier; Tournier Ronan; Zurfluh Gilles
Publication venue: Informa UK Limited
Publication date: 01/01/2014

Many solutions have been defined for multidimensional database modelling. These proposals apply the same aggregation function to compute the values of an indicator at every level of granularity in the multidimensional space. We provide a more flexible conceptual model that supports multiple differentiated aggregations. Multiple aggregations allow different aggregation functions to be associated with the same measure for each dimension and for each hierarchy. Differentiated aggregation allows specific aggregations at each level (parameter). Our model is based on a double graphical formalism, expressive enough to control the validity of aggregation functions. We also study the consequences of this conceptual modelling for building lattices of pre-computed aggregates in a relational online analytical processing (R-OLAP) environment.
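
As a small illustration of what differentiated aggregation means in practice, the sketch below rolls a measure up a time hierarchy using a different function at each level (average from day to month, maximum from month to year). The data, the `roll_up` helper, and the choice of functions are assumptions for the example and are not taken from the paper's model.

```python
from collections import defaultdict
from statistics import mean


def roll_up(facts, key_fn, agg_fn):
    """Group (key, value) facts by a coarser key and aggregate each group.

    Illustrative helper; hypothetical name and interface.
    """
    groups = defaultdict(list)
    for key, value in facts:
        groups[key_fn(key)].append(value)
    return sorted((k, agg_fn(v)) for k, v in groups.items())


# Hypothetical daily measurements keyed by (year, month, day).
daily = [
    (("2014", "01", "01"), 2.0),
    (("2014", "01", "02"), 4.0),
    (("2014", "02", "01"), 7.0),
]

# Day -> month uses the average; month -> year uses the maximum.
monthly = roll_up(daily, lambda k: k[:2], mean)
yearly = roll_up(monthly, lambda k: k[:1], max)
```

Because the function changes between levels, the yearly value depends on the path taken through the hierarchy, which is why a model needs a way to check that a given chain of aggregations is valid.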

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Toulouse Capitole Publications

Toulouse 1 Capitole Publications

Towards an agent based traffic regulation and recommendation system for the on-road air quality control

Authors: Abdelaziz El Fazziki; Abderrahmane Sadiq; Jamal Ouarzazi; Mohamed Sadgal
Publication venue: Springer Nature
Publication date: 01/01/2016

Springer - Publisher Connector