2,149 research outputs found

    Extending ETL framework using service oriented architecture

    Extraction, Transformation and Loading (ETL) represents a large portion of a data warehouse project. The difficulty of extending components is a main problem in the ETL area, because ETL components are tightly coupled to each other in the current ETL framework. This lack of extensibility impedes adding new components to the current ETL framework to meet special business needs. This paper shows how to restructure the current ETL framework based on Service Oriented Architecture (SOA) so that it is easier to extend. The restructuring distributes the ETL into interoperable components; SOA makes it possible to distribute the Extraction, Transformation and Loading components while keeping them interoperable. As a proof of the extensibility concept, a Classified-Fragmentation component that enhances report generation speed is added to the new framework. The result of this work is an extensible ETL framework that includes the Classified-Fragmentation component as an extension.
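The loose coupling this abstract describes can be illustrated with a minimal Python sketch. It is not the paper's framework: the `Extractor`, `Transformer` and `Loader` interfaces, the concrete classes and `run_pipeline` are all hypothetical names chosen for illustration, and the stages run in one process rather than as distributed services. The point is only that stages which depend on small interfaces, not on each other, can be extended by adding a new stage without changing existing ones.

```python
from typing import Iterable, Iterator, List, Protocol

Row = dict  # one record flowing through the pipeline


class Extractor(Protocol):
    def extract(self) -> Iterable[Row]: ...


class Transformer(Protocol):
    def transform(self, rows: Iterable[Row]) -> Iterable[Row]: ...


class Loader(Protocol):
    def load(self, rows: Iterable[Row]) -> int: ...


class ListExtractor:
    """Hypothetical extractor that reads rows from an in-memory list."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)

    def extract(self) -> Iterator[Row]:
        return iter(self._rows)


class UppercaseNames:
    """Hypothetical transformer: upper-cases the 'name' field of each row."""

    def transform(self, rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            yield {**row, "name": row["name"].upper()}


class MemoryLoader:
    """Hypothetical loader that appends rows to a list and reports the count."""

    def __init__(self) -> None:
        self.stored: List[Row] = []

    def load(self, rows: Iterable[Row]) -> int:
        before = len(self.stored)
        self.stored.extend(rows)
        return len(self.stored) - before


def run_pipeline(extractor: Extractor,
                 transformers: List[Transformer],
                 loader: Loader) -> int:
    """Chain the stages; each stage only sees the interface of the next."""
    rows = extractor.extract()
    for transformer in transformers:
        rows = transformer.transform(rows)
    return loader.load(rows)
```

Under these assumptions, an extension is just another object with a `transform` method appended to the `transformers` list; in a service-oriented deployment each stage would sit behind a network service contract instead of a local method call.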

    Framework for Interoperable and Distributed Extraction-Transformation-Loading (ETL) Based on Service Oriented Architecture

    Extraction, Transformation and Loading (ETL) are the major functionalities in data warehouse (DW) solutions. Lack of component distribution and interoperability is a gap that leads to many problems in the ETL domain; it stems from the tightly coupled components in the current ETL framework. This research discusses how to distribute the Extraction, Transformation and Loading components so that they remain interoperable. In addition, it shows how the ETL framework can be extended. To achieve that, Service Oriented Architecture (SOA) is adopted to supply the missing distribution and interoperability features by restructuring the current ETL framework. This research contributes to the field of ETL by adding the distribution and interoperability concepts to the ETL framework. This leads to contributions to the area of data warehousing and business intelligence, because ETL is a core concept in this area. The Design Science Approach (DSA) and Scrum methodologies were adopted and integrated to achieve the research objectives. The new ETL framework is realized by developing and testing a prototype based on it. This prototype is successfully evaluated using three case studies that are conducted using the data and tools of three different organizations. These organizations use data warehouse solutions to generate statistical reports that help their top management make decisions. Results of the case studies show that distribution and interoperability can be achieved by using the new ETL framework.

    A unified view of data-intensive flows in business intelligence systems : a survey

    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that are still to be addressed, and how the current solutions can be applied for addressing these challenges.

    Extending the Communication Capabilities of Agents

    Agent technology is in principle well suited for realizing various kinds of distributed systems, but in practice agents are seldom chosen for realizing real-world applications. One reason hindering the use of agents in practice is their cumbersome communication mechanism, focused on speech-act-based message exchange, which makes them hard to use for practitioners accustomed to working in an object-oriented way. To broaden the application spectrum of agent technology in practice and make agents more accessible to object-oriented developers, this paper presents additional communication means for agents. First, it will be shown how agents can interact using strongly typed service interfaces resorting to asynchronous future-based methods. These allow keeping agents autonomous and further support several recurrent interaction patterns within one method call, i.e. without having to use complex message protocols. Second, an extension for binary data streaming via virtual connections will be presented. Its usage resembles established input and output streaming APIs and lets developers transfer data between agents in the same simple way as, e.g., a file is written to hard disk. Furthermore, virtual connections allow failure-tolerant transmission by multiplexing data across different physical connections. The usefulness of the extensions will be further explained with a real-world example application from the area of business intelligence workflows.
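The idea of a typed service interface whose methods return futures can be sketched in Python with `asyncio`. This is only an illustration under assumptions: `ReportService`, `ReportAgent`, `row_count` and `main` are hypothetical names, and the abstract does not say which agent platform or language the paper uses.

```python
import asyncio
from typing import Dict, List, Protocol


class ReportService(Protocol):
    """Hypothetical typed service interface offered by an agent."""

    def row_count(self, table: str) -> "asyncio.Future[int]": ...


class ReportAgent:
    """Hypothetical agent that answers service calls on its own schedule."""

    def __init__(self, tables: Dict[str, List[int]]) -> None:
        self._tables = tables

    def row_count(self, table: str) -> "asyncio.Future[int]":
        loop = asyncio.get_running_loop()
        future: "asyncio.Future[int]" = loop.create_future()
        # The result is delivered later by the event loop, not during the
        # call itself, so the caller is never blocked by the provider.
        loop.call_soon(future.set_result, len(self._tables.get(table, [])))
        return future


async def main() -> int:
    service: ReportService = ReportAgent({"sales": [10, 20, 30]})
    pending = service.row_count("sales")  # returns immediately
    # The caller is free to do other work here before collecting the result.
    return await pending
```

Running `asyncio.run(main())` resolves the future once the providing agent has set its result. The caller sees an ordinary typed method call, while the provider keeps control over when it answers.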

    SOA enabled ELTA: approach in designing business intelligence solutions in Era of Big Data

    The current work presents a new approach for designing business intelligence solutions. In the Era of Big Data, established and robust analytical concepts and utilities need to adapt to the changed market circumstances. The main focus of this work is to accelerate the process of building a “data-centric” Business Intelligence (BI) solution while preparing BI solutions for Big Data utilization. This research addresses the following goals: reducing the time spent during the design phase of a business intelligence solution; achieving flexibility of the BI solution by adding new data sources; and preparing the BI solution for utilizing Big Data concepts. This research proposes an extension of the existing Extract, Load and Transform (ELT) approach to a new one, Extract, Load, Transform and Analyze (ELTA), supported by the service-orientation concept. Additionally, the proposed model incorporates the Service-Oriented Architecture concept as a mediator for the transformation phase. On one side, such incorporation brings flexibility to the BI solution; on the other, it reduces the complexity of the whole system by moving some responsibilities to external authorities.

    ENHANCED BI SYSTEMS WITH ON-DEMAND DATA BASED ON SEMANTIC-ENABLED ENTERPRISE SOA

    Since the 1990s, companies have been investing in IT infrastructure initiatives such as Enterprise Resource Planning (ERP) systems, Supply Chain Management (SCM) systems, and Customer Relationship Management (CRM) systems in order to increase efficiency, effectiveness, and internal process integration, among other goals. The current value of Business Intelligence (BI) for companies could be summarized by two main achievements: improvement of management of processes and improvement of operational processes. This paper will identify current requirements of BI and present a linkage to service-oriented architectures, including the value they add. Semantic-enabled Enterprise Service-Oriented Architecture (SESOA) is an enterprise solution that links businesses to external systems based on Web Services and the SOA concept. It is a lightweight web application that annotates Web Services coming from different service providers with semantics so that the indexing and discovery of these services can be more comprehensive. BI applications can be considered service consumers in SESOA and can discover, select and invoke the services supplied by the external systems (service providers). In this way, SESOA forms the bridge between SOA and BI concepts to deliver “on-demand” data as services in real time, and this opens the BI market to include SMEs as main resources of these services.

    Container-Managed ETL Applications for Integrating Data in Near Real-Time

    As the analytical capabilities and applications of e-business systems expand, providing real-time access to critical business performance indicators to improve the speed and effectiveness of business operations has become crucial. The monitoring of business activities requires focused, yet incremental enterprise application integration (EAI) efforts and balancing information requirements in real-time with historical perspectives. The decision-making process in traditional data warehouse environments is often delayed because data cannot be propagated from the source system to the data warehouse in a timely manner. In this paper, we present an architecture for a container-based ETL (extraction, transformation, loading) environment, which supports continual near real-time data integration with the aim of decreasing the time it takes to make business decisions and minimizing the latency between the cause and effect of a business decision. Instead of using vendor proprietary ETL solutions, we use an ETL container for managing ETLets (pronounced “et-lets”) for the ETL processing tasks. The architecture takes full advantage of existing J2EE (Java 2 Platform, Enterprise Edition) technology and enables the implementation of a distributed, scalable, near real-time ETL environment. We have fully implemented the proposed architecture. Furthermore, we compare the ETL container to alternative continuous data integration approaches.