194 research outputs found
Requirement-driven creation and deployment of multidimensional and ETL designs
We present our tool for assisting designers in the error-prone and time-consuming tasks carried out at the early stages of a data warehousing project. Our tool semi-automatically produces multidimensional (MD) and ETL conceptual designs from a given set of business requirements (such as SLAs) and data source descriptions. It then translates both the MD and ETL conceptual designs into physical designs so that they can be deployed on a DBMS and an ETL engine. In this paper, we describe the system architecture and present our demonstration proposal by means of an example.
DBMSs Should Talk Back Too
Natural language user interfaces to database systems have been studied for
several decades now. They have mainly focused on parsing and interpreting
natural language queries to express them in a formal database language. We
envision the reverse functionality, where the system would be able to take the
internal result of that translation, say in SQL form, translate it back into
natural language, and show it to the initiator of the query for verification.
Likewise, information extraction has received considerable attention in the
past ten years or so, identifying structured information in free text so that
it may then be stored appropriately and queried. Validation of the records
stored with a backward translation into text would again be very powerful.
Verification and validation of query and data input of a database system
correspond to just one example of the many important applications that would
benefit greatly from having mature techniques for translating such database
constructs into free-flowing text. The problem appears deceptively
simple, as there are no ambiguities or other complications in interpreting
internal database elements, so initially a straightforward translation appears
adequate. Reality teaches us quite the opposite, however, as the resulting text
should be expressive, i.e., accurate in capturing the underlying queries or
data, and effective, i.e., allowing fast and unique interpretation of them.
Achieving both of these qualities is very difficult and raises several
technical challenges that need to be addressed. In this paper, we first expose
the reader to several situations and applications that need translation into
natural language, thereby, motivating the problem. We then outline, by example,
the research problems that need to be solved, separately for data translations
and query translations.
Comment: CIDR 200
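The envisioned reverse translation can be illustrated with a toy sketch. The pattern-matching approach and the English template below are invented for illustration only; a real system would need far richer linguistic machinery than this:

```python
import re

def sql_to_english(query: str) -> str:
    """Render a simple SELECT ... FROM ... [WHERE ...] query as English.

    A toy illustration of the 'talk back' idea: the query shape and the
    wording template are assumptions made here, not any published system.
    """
    m = re.match(
        r"SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
        r"(?:\s+WHERE\s+(?P<cond>.+))?$",
        query.strip(), re.IGNORECASE,
    )
    if m is None:
        raise ValueError("unsupported query shape")
    text = f"Show the {m.group('cols')} of every row in table {m.group('table')}"
    if m.group("cond"):
        text += f" where {m.group('cond')}"
    return text + "."

print(sql_to_english("SELECT name, salary FROM employees WHERE salary > 50000"))
# Show the name, salary of every row in table employees where salary > 50000.
```

Even this trivial template shows the expressiveness/effectiveness tension the abstract raises: the output is accurate for the queries it covers, but natural-sounding phrasing for arbitrary joins or nested queries is much harder.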
Effectiveness of Economic Adjustment Programmes for Debt Crises Implemented in the Southern European Union Countries
This article addresses the effectiveness of the economic adjustment programmes for debt crises implemented in the southern European Union countries, a contemporary and much-disputed issue. All South-European countries that faced a debt crisis had already adopted the European single currency, the Euro. Our literature review depicts contemporary research on debt crises and their economic and social implications, either generally or, more relevantly to our work, specific to the South-European countries. Our research is based on a wide range of statistical indices, in an effort to appreciate the effectiveness of the economic adjustment programmes holistically. The countries addressed were Greece, Portugal, Spain and Cyprus. The applied statistical indices were grouped into six pillars considered essential to social prosperity: financial prosperity, employment, healthcare, education, governance and entrepreneurship. All data were eventually incorporated into a single index, the `Social Prosperity Index', in an attempt to attain a holistic view of the effectiveness of these programmes. This approach contrasts with the mainstream approach of purely financially oriented assessments. Portugal scores first in this appraisal, not only fully recovering but even improving social prosperity standards for its citizens, followed closely by Spain and Cyprus. Greece recorded the worst classification, although its index is recovering to pre-crisis levels. Our empirical results suggest that these programmes had a significant impact on the countries in which they were implemented. In solely financial terms, the programmes proved quite effective for all countries. However, their effectiveness is rather questionable if we take into consideration all pillars of social prosperity. The most problematic pillar is employment, which challenges governments and especially their citizens.
European and sovereign policies must urgently address employment problems, whereas economists are already talking about a `lost generation'.
Keywords: sovereign debt crisis, Euro, social prosperity, economic adjustment programmes, South Europe
 
Adversarial Learning in Real-World Fraud Detection: Challenges and Perspectives
The data economy relies on data-driven systems, and complex machine learning
applications are fueled by them. Unfortunately, however, machine learning
models are exposed to fraudulent activities and adversarial attacks, which
threaten their security and trustworthiness. In the last decade or so,
research interest in adversarial machine learning has grown significantly,
revealing how learning applications can be severely impacted by effective
attacks. Although early results of adversarial machine learning indicate the
huge potential of the approach in specific domains such as image processing,
there is still a gap in both the research literature and practice regarding how
to generalize adversarial techniques to other domains and applications. Fraud
detection is a critical defense mechanism for the data economy, as it is for other
applications as well, and it poses several challenges for machine learning. In
this work, we describe how attacks against fraud detection systems differ from
other applications of adversarial machine learning, and propose a number of
interesting directions to bridge this gap.
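As a minimal illustration of the kind of evasion attack alluded to above, the sketch below perturbs a flagged transaction just enough to flip a linear fraud scorer. The weights and feature values are invented assumptions for illustration; real detectors are far more complex and impose feasibility constraints on what an attacker can actually change:

```python
# Sketch of an evasion attack on a hypothetical linear fraud scorer:
# a transaction is fraudulent when score(x) = w.x + b > 0, and the
# attacker shifts x minimally along -w to drop the score below zero.

def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def evade(w, b, x, margin=1e-3):
    """Return the minimally shifted copy of x whose score is just below 0."""
    s = score(w, b, x)
    if s <= 0:
        return list(x)  # already classified as legitimate
    norm_sq = sum(wi * wi for wi in w)
    step = (s + margin) / norm_sq
    return [xi - step * wi for wi, xi in zip(w, x)]

w, b = [0.8, 1.5, 0.2], -2.0   # invented weights: amount, velocity, hour
x = [3.0, 1.0, 0.5]            # flagged transaction (score = 2.0 > 0)
x_adv = evade(w, b, x)
print(score(w, b, x) > 0, score(w, b, x_adv) <= 0)  # True True
```

The gap the abstract identifies is visible even here: unlike pixels in an image, features such as transaction amount cannot be perturbed arbitrarily, which is one reason image-domain attack results do not transfer directly to fraud detection.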
GEM: requirement-driven generation of ETL and multidimensional conceptual designs
Technical Report
At the early stages of a data warehouse design project, the main objective is to collect the business requirements and needs, and translate them into an appropriate conceptual, multidimensional design. Typically, this task is performed manually, through a series of interviews involving two different parties: the business analysts and the technical designers. Producing an appropriate conceptual design is an error-prone task that undergoes several rounds of reconciliation and redesign until the business needs are satisfied. It is of great importance for the business of an enterprise to facilitate and automate such a process. The goal of our research is to provide designers with a semi-automatic means for producing conceptual multidimensional designs and also conceptual representations of the extract-transform-load (ETL) processes that orchestrate the data flow from the operational sources to the data warehouse constructs. In particular, we describe a method that combines information about the data sources with the business requirements, for validating and completing, if necessary, these requirements, producing a multidimensional design, and identifying the ETL operations needed. We present our method in terms of the TPC-DS benchmark and show its applicability and usefulness.
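The requirement-to-design step can be caricatured in a few lines. The phrase pattern and the naming conventions below are invented assumptions, not GEM's actual method, which works over richer requirement and source descriptions:

```python
# Toy sketch of mapping a business requirement to a star-schema outline:
# a phrase of the form "analyze <measure> by <dim> and <dim>" yields one
# fact table and one dimension per grouping attribute. Pattern and
# fact_/dim_ naming are illustrative assumptions.
import re

def requirement_to_star(req: str) -> dict:
    m = re.match(r"analyze\s+(\w+)\s+by\s+(.+)$", req.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError("unsupported requirement phrasing")
    measure = m.group(1)
    dims = [d.strip() for d in re.split(r",|\band\b", m.group(2)) if d.strip()]
    return {"fact": f"fact_{measure}", "measure": measure,
            "dimensions": [f"dim_{d}" for d in dims]}

print(requirement_to_star("analyze revenue by customer and month"))
# {'fact': 'fact_revenue', 'measure': 'revenue',
#  'dimensions': ['dim_customer', 'dim_month']}
```

The hard part that such a sketch omits, and that the paper addresses, is validating the derived design against the actual data sources and deriving the ETL operations needed to populate it.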
XWeB: the XML Warehouse Benchmark
With the emergence of XML as a standard for representing business data, new
decision support applications are being developed. These XML data warehouses
aim at supporting On-Line Analytical Processing (OLAP) operations that
manipulate irregular XML data. To ensure feasibility of these new tools,
important performance issues must be addressed. Performance is customarily
assessed with the help of benchmarks. However, decision support benchmarks do
not currently support XML features. In this paper, we introduce the XML
Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from
the relational decision support benchmark TPC-H. It is mainly composed of a
test data warehouse that is based on a unified reference model for XML
warehouses and that features XML-specific structures, and its associated XQuery
decision support workload. XWeB's usage is illustrated by experiments on
several XML database management systems.
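The kind of decision-support operation such a benchmark exercises can be sketched as an aggregation over irregular XML. The document shape below is invented for illustration and is not XWeB's actual reference model; a real workload would be expressed in XQuery against the benchmark warehouse:

```python
# Rough illustration of an OLAP-style rollup over irregular XML data:
# total sales per region, tolerating records with missing elements.
import xml.etree.ElementTree as ET
from collections import defaultdict

doc = """<sales>
  <sale><region>EU</region><amount>120.0</amount></sale>
  <sale><region>US</region><amount>80.5</amount></sale>
  <sale><region>EU</region><amount>40.0</amount></sale>
  <sale><region>US</region></sale>  <!-- irregular: no amount element -->
</sales>"""

totals = defaultdict(float)
for sale in ET.fromstring(doc).findall("sale"):
    amount = sale.findtext("amount")   # None when the element is absent
    if amount is not None:
        totals[sale.findtext("region")] += float(amount)

print(dict(totals))  # {'EU': 160.0, 'US': 80.5}
```

Handling such structural irregularity efficiently at warehouse scale is exactly the performance question XWeB is designed to measure.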
Quality measures for ETL processes: from goals to implementation
Extraction-transformation-loading (ETL) processes play an increasingly important role in the support of modern business operations. These business processes are centred around artifacts with high variability and diverse lifecycles, which correspond to key business entities. The apparent complexity of these activities has been examined through the prism of business process management, mainly focusing on functional requirements and performance optimization. However, the quality dimension has not yet been thoroughly investigated, and there is a need for a more human-centric approach to bring these processes closer to business users' requirements. In this paper, we take a first step in this direction by defining a sound model for ETL process quality characteristics and quantitative measures for each characteristic, based on the existing literature. Our model shows dependencies among quality characteristics and can provide the basis for subsequent analysis using goal modeling techniques. We showcase the use of goal modeling for ETL process design through a use case, where we employ a goal model that includes quantitative components (i.e., indicators) for the evaluation and analysis of alternative design decisions.
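How quantitative indicators support comparing design alternatives can be sketched in miniature. The characteristics, weights, and scores below are illustrative assumptions, not the paper's actual quality model:

```python
# Minimal sketch of a goal-level indicator built from normalized
# quality measures, used to compare two hypothetical ETL designs.
# Characteristic names and weights are invented for illustration.

def indicator(measures: dict, weights: dict) -> float:
    """Weighted score in [0, 1]; each measure is already normalized."""
    total = sum(weights.values())
    return sum(weights[k] * measures[k] for k in weights) / total

design_a = {"reliability": 0.9, "freshness": 0.6, "cost": 0.7}
design_b = {"reliability": 0.7, "freshness": 0.9, "cost": 0.8}
weights  = {"reliability": 3, "freshness": 2, "cost": 1}

for name, m in [("A", design_a), ("B", design_b)]:
    print(name, round(indicator(m, weights), 3))
# A 0.767
# B 0.783
```

Changing the weights flips which design wins, which is why the dependencies among quality characteristics captured by the paper's model matter for principled trade-off analysis.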