2,064 research outputs found

    A Design Framework for Efficient Distributed Analytics on Structured Big Data

    Distributed analytics architectures often comprise two elements: a compute engine and a storage system. Conventional distributed storage systems usually store data as files or key-value pairs. This abstraction simplifies how an application developer accesses and reasons about the data. However, the separation of compute and storage makes it difficult to optimize costly disk and network operations: by design, the storage system is isolated from the workload and its performance requirements, such as block co-location and replication. Furthermore, optimizing fine-grained data access requests becomes difficult because the storage layer is hidden behind such abstractions. Using a clean-slate approach, this thesis proposes a modular distributed analytics system design centered around a unified interface for distributed data objects, named the DDO. The interface couples the key mechanisms that utilize storage, memory, and compute resources, which makes it well suited to optimizing data access requests across all levels of the memory hierarchy with respect to the workload and its performance requirements. A complementary DDO controller implementation manages the logical view of DDOs, their replication, and their distribution across the cluster. A proof-of-concept implementation improves mean query time by 3-6x on the TPC-H and TPC-DS benchmarks, and by more than an order of magnitude in many cases.
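    The coupling the abstract describes, one handle that ties storage layout, a memory tier, and placement decisions together, can be sketched in a few lines. This is a minimal illustration under assumed names (`DDO`, `DDOController`, `register`), not the thesis's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class DDO:
    """Hypothetical distributed data object (DDO): one handle coupling
    storage layout, an in-memory cache tier, and placement metadata."""
    name: str
    partitions: list              # logical partitions of the data
    replication: int = 1          # replication factor, set per workload
    _cache: dict = field(default_factory=dict)

    def read(self, part_id):
        # Serve from the memory tier if cached; otherwise "fetch" from
        # storage (a list access stands in for a disk read) and promote.
        if part_id not in self._cache:
            self._cache[part_id] = self.partitions[part_id]
        return self._cache[part_id]

class DDOController:
    """Hypothetical controller: owns the logical view of DDOs and
    decides replication and distribution across cluster nodes."""
    def __init__(self):
        self.registry = {}

    def register(self, ddo, nodes):
        # Round-robin partition placement honoring the DDO's
        # replication factor.
        placement = {
            i: [nodes[(i + r) % len(nodes)] for r in range(ddo.replication)]
            for i in range(len(ddo.partitions))
        }
        self.registry[ddo.name] = (ddo, placement)
        return placement

# Usage: two partitions, replication factor 2, three nodes.
ddo = DDO("lineitem", partitions=[["row1"], ["row2"]], replication=2)
placement = DDOController().register(ddo, ["n0", "n1", "n2"])
```

    Because the same object that answers reads also carries replication and placement, a controller in this style can trade cache residency against replica locality per workload, which is the optimization opportunity the separation of compute and storage normally forecloses.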

    The Data Lakehouse: Data Warehousing and More

    Relational Database Management Systems designed for Online Analytical Processing (RDBMS-OLAP) have for many years been foundational to democratizing data and enabling analytical use cases such as business intelligence and reporting. However, RDBMS-OLAP systems present some well-known challenges. They are optimized primarily for relational workloads; they lead to a proliferation of data copies that can become unmanageable; and, because data is stored in proprietary formats, they can cause vendor lock-in, restricting access to engines, tools, and capabilities beyond what the vendor offers. As the demand for data-driven decision making surges, the need for a more robust data architecture to address these challenges becomes ever more critical. Cloud data lakes have addressed some of the shortcomings of RDBMS-OLAP systems, but they present their own set of challenges. More recently, organizations have often followed a two-tier architectural approach, leveraging both cloud data lakes and RDBMS-OLAP systems to take advantage of both platforms. However, this approach brings additional challenges, complexities, and overhead. This paper discusses how a data lakehouse, a new architectural approach, achieves the combined benefits of an RDBMS-OLAP system and a cloud data lake while also providing additional advantages. We break today's data warehousing down into implementation-independent components, capabilities, and practices, and show how a lakehouse architecture satisfies them. We then go a step further and discuss the additional capabilities and benefits a lakehouse architecture provides over an RDBMS-OLAP system.

    Benefits and barriers of self-service business intelligence implementation in micro-enterprises: a case of ABC Travel & Consulting

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Information Systems and Technologies Management. Small and medium-sized enterprises (hereinafter: SMEs) represent 99.8% of firms in the non-financial business sector of the European Union. SMEs cover three different types of companies, namely micro-, small-, and medium-sized enterprises. Micro-enterprises are the most common type of SME in the European Economic Area, accounting for 93.2% of the non-financial business sector (Muller, Julius, Herr & Peycheva, 2017). Due to their importance, this work focuses on micro-enterprises. They are defined by two factors: first, the number of employees must be lower than ten, and second, the turnover or the total assets must be lower than or equal to two million euros (European Commission, 2014). Business intelligence systems (hereinafter: BIS) have become significantly important in the business world and the academic community over the last two decades (Chen, Chiang & Storey, 2012). Global revenue reached $18.3 billion in 2017 and is forecast to reach $22.8 billion by the end of 2020; modern BIS continue to expand more rapidly than the overall market (Moore, 2017). The benefits of integrating BIS are realized over the long term, and users are typically decision makers at higher organizational levels (Puklavec, Oliveira & Popovic, 2014). With BIS, knowledge workers such as executives, managers, and analysts can make better and faster decisions (Chaudhuri, Dayal & Narasayya, 2011). Proper usage of BIS can be seen as a prerequisite for business success, but these tools are often complex and require a high level of expertise to work with (Davenport, 2017). Implementing BIS is a challenge for micro companies because they often have only a limited set of financial and human resources (Puklavec, Oliveira & Popovic, 2014).
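    The two-factor definition above (headcount under ten, and turnover or total assets of at most two million euros) is mechanical enough to state as code. The sketch below is illustrative only; the function name and signature are my own, not the dissertation's:

```python
def is_micro_enterprise(employees, turnover_eur=None, total_assets_eur=None):
    """Micro-enterprise test per the European Commission (2014)
    definition as summarized above: fewer than 10 employees AND
    turnover or total assets of at most EUR 2 million."""
    if employees >= 10:
        return False
    # Either financial criterion suffices; ignore the ones not supplied.
    financials = [v for v in (turnover_eur, total_assets_eur) if v is not None]
    return any(v <= 2_000_000 for v in financials)

print(is_micro_enterprise(5, turnover_eur=1_500_000))    # True
print(is_micro_enterprise(12, turnover_eur=500_000))     # False: headcount fails
# Turnover exceeds the cap, but total assets qualify:
print(is_micro_enterprise(5, turnover_eur=3_000_000, total_assets_eur=1_900_000))  # True
```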

    The United States Marine Corps Data Collaboration Requirements: Retrieving and Integrating Data From Multiple Databases

    The goal of this research is to develop an information sharing and database integration model and to suggest a framework that fully satisfies the United States Marine Corps' collaboration requirements as well as its information sharing and database integration needs. This research is exploratory; it focuses on only one initiative, IT-21, which is grounded in Technology for the United States Navy and Marine Corps, 2000-2035: Becoming a 21st-Century Force. The IT-21 initiative states that the Navy and Marine Corps information infrastructure will be based largely on commercial systems and services, and that the Department of the Navy must ensure that these systems are seamlessly integrated and that information transported over the infrastructure is protected and secure. The Delphi Technique, a qualitative method, was used to develop a Holistic Model and to suggest a framework for information sharing and database integration. Data were primarily collected from mid-level to senior information officers, with a focus on Chief Information Officers. In addition, an extensive literature review was conducted to gain insight into known similarities and differences in strategic information management, information sharing strategies, and database integration strategies. It is hoped that the Armed Forces and the Department of Defense will benefit from future development of the information sharing and database integration Holistic Model.

    1st doctoral symposium of the international conference on software language engineering (SLE) : collected research abstracts, October 11, 2010, Eindhoven, The Netherlands

    The first Doctoral Symposium organised by the series of International Conferences on Software Language Engineering (SLE) will be held on October 11, 2010 in Eindhoven, as part of the 3rd edition of SLE. This conference series aims to integrate the different sub-communities of the software-language engineering community to foster cross-fertilisation and strengthen research overall. The Doctoral Symposium at SLE 2010 contributes towards these goals by providing a forum for both early- and late-stage Ph.D. students to present their research and get detailed feedback and advice from researchers both in and outside their particular research area. Consequently, the main objectives of this event are: – to give Ph.D. students an opportunity to write about and present their research; – to provide Ph.D. students with constructive feedback from their peers and from established researchers in their own and in different SLE sub-communities; – to build bridges for potential research collaboration; and – to foster integrated thinking about SLE challenges across sub-communities. All Ph.D. students participating in the Doctoral Symposium submitted an extended abstract describing their doctoral research. From a strong set of submissions, we were able to accept 13 for participation in the Doctoral Symposium. These proceedings present the final revised versions of the accepted research abstracts. We are particularly happy to note that the submissions covered a wide range of SLE topics drawn from all SLE sub-communities. In selecting submissions, we were supported by the members of the Doctoral-Symposium Selection Committee (SC), senior researchers representing all areas of the SLE community. We would like to thank them for their substantial effort, without which this Doctoral Symposium would not have been possible. Throughout, they provided reviews that go beyond the normal format, taking extra care to point out potential areas of improvement in the research and its presentation. We hope these reviews will themselves contribute substantially towards the goals of the symposium and help students improve and advance their work. Furthermore, all submitting students were asked to provide two reviews for other submissions, and the members of the SC went out of their way to comment on the quality of these reviews, helping students improve their reviewing skills.

    Role of Access Control in Information Security: A Security Analysis Approach

    Information plays a vital role in decision-making and in driving the ever-growing digital world forward. Authorization, which comes immediately after authentication, is essential for restricting access to information. Various access control models have been proposed to enforce authorization by specifying access control policies. Security analysis of access control policies is a highly challenging task, and the analysis of decentralized policies is more complex still: decentralization simplifies policy administration but raises security concerns. An efficient security analysis approach is therefore required to ensure the correctness of access control policies. This chapter presents a propositional rule-based machine learning approach for analyzing Role-Based Access Control (RBAC) policies. Specifically, the proposed method maps RBAC policies onto propositional rules, which are then analyzed. Extensive experiments on various datasets containing RBAC policies demonstrate that the machine learning-based approach can offer valuable insight into RBAC policy analysis.
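    To make the mapping concrete: an RBAC policy's user-role and role-permission assignments can be flattened into propositional (user, permission) facts that a rule learner or analyzer could consume. The sketch below is an illustrative assumption about such an encoding, not the chapter's actual implementation:

```python
def rbac_to_rules(user_roles, role_perms):
    """Flatten user->role and role->permission assignments into
    propositional facts of the form (user, permission).  Each fact is
    the conclusion of the rule: assigned(u, r) AND granted(r, p)
    implies access(u, p)."""
    rules = set()
    for user, roles in user_roles.items():
        for role in roles:
            for perm in role_perms.get(role, ()):
                rules.add((user, perm))
    return rules

def can_access(rules, user, perm):
    # A query against the flattened policy is a simple membership test.
    return (user, perm) in rules

# Usage: a two-role policy.
user_roles = {"alice": ["admin"], "bob": ["viewer"]}
role_perms = {"admin": ["read", "write"], "viewer": ["read"]}
rules = rbac_to_rules(user_roles, role_perms)
```

    Once policies are in this propositional form, questions such as "can any non-admin ever write?" become searches over a finite rule set, which is what makes the representation amenable to machine learning analysis.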