Database Vs Data Warehouse

Authors: Gheorghe Matei; Manole Velicanu

Data warehouse technology comprises a set of concepts and methods that give users useful information for decision making. The need to build a data warehouse arises from the need to improve the quality of information in the organization. Data coming from different sources, in a variety of forms, both structured and unstructured, are filtered according to business rules and integrated into a single large data collection. Using informatics solutions, managers have understood that the data stored in operational systems, including databases, are an informational gold mine that must be exploited. Data warehouses have been developed to answer the increasing demand for complex analysis, which could not be properly achieved with operational databases. The present paper emphasizes some of the criteria that information application developers can use to choose between a database solution and a data warehouse one.

Keywords: data warehouse, database, database management systems, information systems, data organisation in external memory, business intelligence

Research Papers in Economics

Scalable Model-Based Management of Correlated Dimensional Time Series in ModelarDB+

Authors: Jensen Søren Kejser; Pedersen Torben Bach; Thomsen Christian
Publication date: 01/01/2019

To monitor critical infrastructure, high quality sensors sampled at a high frequency are increasingly used. However, as they produce huge amounts of data, only simple aggregates are stored. This removes outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing time series with dimensions that exploits correlation in and among time series. Specifically, we propose compressing groups of correlated time series using an extensible set of model types within a user-defined error bound (possibly zero). We name this new category of model-based compression methods for time series Multi-Model Group Compression (MMGC). We present the first MMGC method GOLEMM and extend model types to compress time series groups. We propose primitives for users to effectively define groups for differently sized data sets, and based on these, an automated grouping method using only the time series dimensions. We propose algorithms for executing simple and multi-dimensional aggregate queries on models. Last, we implement our methods in the Time Series Management System (TSMS) ModelarDB (ModelarDB+). Our evaluation shows that compared to widely used formats, ModelarDB+ provides up to 13.7 times faster ingestion due to high compression, 113 times better compression due to the adaptivity of GOLEMM, 630 times faster aggregates by using models, and close to linear scalability. It is also extensible and supports online query processing.

Comment: 12 pages, 28 figures, and 1 table

arXiv.org e-Print Archive

Crossref

VBN

ShenZhen transportation system (SZTS): a novel big data benchmark suite

Authors: Bei Zhengdong; Eeckhout Lieven; Xiong Wen; Xu Chengzhong; Yu Zhibin; Zhang Fan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Data analytics is at the core of the supply chain for both products and services in modern economies and societies. Big data workloads, however, are placing unprecedented demands on computing technologies, calling for a deep understanding and characterization of these emerging workloads. In this paper, we propose ShenZhen Transportation System (SZTS), a novel big data Hadoop benchmark suite composed of real-life transportation analysis applications with real-life input data sets from Shenzhen in China. SZTS uniquely focuses on a specific, real-life application domain, whereas other existing Hadoop benchmark suites, such as HiBench and CloudRank-D, consist of generic algorithms with synthetic inputs. We perform a cross-layer workload characterization at the microarchitecture level, the operating system (OS) level, and the job level, revealing unique characteristics of SZTS compared to existing Hadoop benchmarks as well as general-purpose multi-core PARSEC benchmarks. We also study the sensitivity of workload behavior with respect to input data size, and we propose a methodology for identifying representative input data sets.

Ghent University Academic Bibliography

Comprehensive characterization of an open source document search engine

Authors: Antoniou Georgia; Hadjilambrou Zacharias; Kleanthous Marios; Portero Antoni; Sazeides Yiannakis
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2019

This work performs a thorough characterization and analysis of the open source Lucene search library. The article describes in detail the architecture, functionality, and micro-architectural behavior of the search engine, and investigates prominent online document search research issues. In particular, we study how intra-server index partitioning affects the response time and throughput, explore the potential use of low power servers for document search, and examine the sources of performance degradation and the causes of tail latencies. Some of our main conclusions are the following: (a) intra-server index partitioning can reduce tail latencies but with diminishing benefits as incoming query traffic increases, (b) low power servers, given enough partitioning, can provide the same average and tail response times as conventional high performance servers, (c) index search is a CPU-intensive, cache-friendly application, and (d) C-states are the main culprits for performance degradation in document search.

DSpace at VSB Technical University of Ostrava

An Open Source Based Data Warehouse Architecture to Support Decision Making in the Tourism Sector

Authors: Francesco Mola; Raffaele Miele

In this paper, an alternative tourism-oriented data warehousing architecture is proposed that makes use of recent free and open source technologies such as Java, PostgreSQL and XML. The architecture aims to support the decision-making process and to give an integrated view of the whole tourism reality in an established context (local, regional, national, etc.) without requiring large investments in the necessary software.

Keywords: Tourism, Data warehousing architecture

Research Papers in Economics

Exploiting a perdurantist foundational ontology and graph database for semantic data integration

Author: Foy George
Publication venue: Brunel University London
Publication date: 01/01/2015

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. The view of reality that is inherent to perdurantist philosophical ontologies, often termed four dimensional (4D) ontologies, has not been widely adopted within the mainstream of information system design practice. However, as the closed world of enterprise systems is opened to Internet scale Semantic Web and Open Data information sources, there is a need to better understand the semantics of both internal and external data and how they can be integrated. Philosophical foundational ontologies can help establish this understanding and there is, therefore, an emerging need to research how they can be applied to the problem of semantic data integration. Therefore, a prime objective of this research was to develop a framework through which to apply a 4D foundational ontology and a graph database to the problem of semantic data integration, and to assess the effectiveness of the approach. The research employed design science, a methodology which is applicable to undertaking research within information systems as it encompasses methods through which the research can be undertaken and the resultant artefacts evaluated. This methodology has a number of discrete stages: problem awareness; a core design-build-evaluate iterative cycle through which the research is conducted; and a conclusion stage. The design science research was conducted through the development of a number of artefacts, the prime being the 4D-Semantic Extract Load (4D-SETL) framework. The effectiveness of the framework was assessed by applying it to semantically interpret and integrate a number of large scale datasets and to instantiate a prototype graph database warehouse to persist the resultant ontology.
A series of technical experiments confirmed that directly reflecting the model patterns of 4D ontology within a prototype data warehouse proved an effective means of both structuring and semantically integrating complex datasets and that the artefacts produced by 4D-SETL could function at scale. Through an illustrative scenario, the effectiveness of the approach is described in relation to the ability of the framework to address a number of weaknesses in current approaches. Furthermore, the major advantages of 4D-SETL are elaborated; these include the ability of the framework to combine foundational, domain and instance level ontological models in a single coherent system that dispenses with much of the translation normally undertaken between conceptual, logical and physical data models. Additionally, adopting a perdurantist realist foundational ontology provided a clear means of establishing and maintaining the identity of physical objects as their constituent temporal and spatial parts unfold over the course of time.

Brunel University Research Archive

Mission scheduler for a rail guided vehicle system

Author: Giuliani Roberto
Publication venue: Alma Mater Studiorum - Università di Bologna
Publication date: 18/12/2019

A transport system with automatic guided vehicles (AGVs) is a fully automatic system that provides logistics services in industrial environments such as warehouses and production plants. These systems have reached such a degree of maturity as to allow, in their daily use, the application of heuristic algorithms for the optimization of the various operations they perform: for instance, finding the shortest paths between working stations and storage areas, assigning movements and strategic positions to idle vehicles, and operating efficient and long-life battery management. A relevant algorithm, presented and developed in this thesis, concerns the sorting of products in the shipping phase, which affects the scheduling of the tasks assigned to the autonomous vehicles. The scheduler aims to determine which operations have stricter constraints and higher priority than others. Studies and practice have shown that the adoption of a valid scheduler implies considerable improvements in system performance; consequently, it is advisable to dedicate time and effort to the search for the right one. The algorithms presented obtained a successful outcome and have been implemented in production at a modern automated warehouse located in the city of Cesena, Italy. The thesis is divided into four chapters, with a further one dedicated to conclusions.

AMS Tesi di Laurea

ARENA—augmented reality to enhanced experimentation in smart warehouses

Authors: Kalempa Vivian Cremer; Leitão Paulo; Limeira Marcelo; Oliveira André Schneider de; Piardi Luis
Publication venue: MDPI AG
Publication date: 01/01/2019

The current industrial scenario demands advances that depend on expensive and sophisticated solutions. Augmented Reality (AR) can complement the real world with virtual elements. Given these features, an AR experience can meet the demand for prototype testing and new solutions, predicting problems and failures that may only exist in real situations. This work presents an environment for experimentation with advanced behaviors in smart factories, allowing experimentation with multi-robot systems (MRS) that are interconnected, cooperative, and interacting with virtual elements. The concept of ARENA introduces a novel approach to realistic and immersive experimentation in industrial environments, aiming to evaluate new technologies aligned with Industry 4.0. The proposed method consists of a small-scale warehouse, inspired by a real scenario characterized in this paper, managed by a group of fully interconnected autonomous forklifts, which are embodied by a swarm of tiny robots developed and prepared to operate in the small-scale scenario. AR is employed to enhance the capabilities of the swarm robots, allowing box handling and virtual forklifts. Virtual laser range finders (LRF) are specially designed as a segmentation of a global RGB-D camera to improve robot perception, allowing obstacle avoidance and environment mapping. This infrastructure enables the evaluation of new strategies to improve manufacturing productivity without compromising production through automation faults.

Multidisciplinary Digital Publishing Institute

Biblioteca Digital do IPB

The use of alternative data models in data warehousing environments

Author: Gonzalez Castro Victor
Publication venue: Heriot-Watt University
Publication date: 01/05/2009

Data Warehouses are increasing their data volume at an accelerated rate; high disk space consumption, slow query response times and complex database administration are common problems in these environments. The lack of a proper data model and of an adequate architecture specifically targeted at these environments are the root causes of these problems. Inefficient management of stored data includes duplicate values at column level and poor management of data sparsity, which derives from a low data density and affects the final size of Data Warehouses. It has been demonstrated that the Relational Model and Relational technology are not the best techniques for managing duplicates and data sparsity. The novelty of this research is to compare some data models, considering their data density and their data sparsity management, in order to optimise Data Warehouse environments. The Binary-Relational, the Associative/Triple Store and the Transrelational models have been investigated and, based on the research results, a novel Alternative Data Warehouse Reference architectural configuration has been defined. For the Transrelational model, no database implementation existed. Therefore it was necessary to develop an instantiation of its storage mechanism, and as far as could be determined this is the first public domain instantiation available of the storage mechanism for the Transrelational model.

ROS: The Research Output Service. Heriot-Watt University Edinburgh