
    Continuous Delivery in Data Warehousing

    Continuous delivery is an activity in the field of continuous software engineering, while data warehousing lies within information systems research. This dissertation combines these two traditionally separate concerns. Its motivation stems from a practical problem: how to shorten the time from a reporting idea to the point where the analysis is available to users. Data warehousing has traditionally been considered tedious and delicate: distinct steps take place one after another in a predefined, unalterable sequence, and the traditional approach brings everything to a production environment at once, with all the pieces of a data warehouse in place before production use. If development follows agile iterations, why do releases to production not follow the same iterations? This dissertation shows how reporting and data warehouse teams can build business intelligence solutions together, in increments. Joint working enhances communication between developers, shortens the feedback cycle from end-users to developers, and makes the feedback more direct. Continuous delivery practices support releasing frequently to a production environment. A two-layer data warehouse architecture separates analytical and transactional processing; separating the two kinds of processing enables better testing and, thus, continuous delivery. When deploying frequently with continuous delivery practices, development time can be shortened further by automating the implementation of data transformations. This dissertation introduces an information model for automating the implementation of transformations, both for getting data into a data warehouse and for getting data out of it. The research evaluation followed design science guidelines, and the research was conducted in collaboration with industry and universities. These ideas have been tested in real projects with promising results and have thus been proven to work.
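
    The dissertation's information model is not reproduced in the abstract; as a loose, hypothetical sketch of what metadata-driven transformation automation can look like, the snippet below generates a data warehouse load statement from a small source-to-target mapping (all table and column names are invented):

```python
# Hypothetical sketch of metadata-driven transformation generation:
# a small "information model" maps source columns to target columns,
# and the SQL for the load step is generated rather than hand-written.
# Table and column names (stg_orders, dw_fact_orders, ...) are invented.

from dataclasses import dataclass

@dataclass
class Mapping:
    source_table: str
    target_table: str
    columns: dict  # target column -> source expression

def generate_load_sql(m: Mapping) -> str:
    """Generate an INSERT ... SELECT statement from the mapping metadata."""
    targets = ", ".join(m.columns.keys())
    sources = ", ".join(m.columns.values())
    return (
        f"INSERT INTO {m.target_table} ({targets})\n"
        f"SELECT {sources}\nFROM {m.source_table};"
    )

orders = Mapping(
    source_table="stg_orders",
    target_table="dw_fact_orders",
    columns={
        "order_key": "order_id",
        "customer_key": "customer_id",
        "order_total": "CAST(total AS DECIMAL(12,2))",
    },
)

print(generate_load_sql(orders))
```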

    Big Data and Its Applications in Smart Real Estate and the Disaster Management Life Cycle: A Systematic Analysis

    Big data refers to the enormous amounts of data generated daily in different fields due to the increased use of technology and internet sources. Despite various advancements and the hope of better understanding, big data management and analysis remain a challenge, calling for more rigorous and detailed research, as well as the identification of methods by which big data can be tackled and put to good use. The existing research falls short in discussing and evaluating the pertinent tools and technologies for analyzing big data efficiently, which calls for a comprehensive and holistic analysis of the published articles to summarize the concept of big data and survey field-specific applications. To address this gap and keep a recent focus, research articles published in the last decade in top-tier, high-impact journals were retrieved using the search engines of Google Scholar, Scopus, and Web of Science and narrowed down to a set of 139 relevant research articles. Several analyses were conducted on the retrieved papers, including bibliometric analysis, keyword analysis, big data search trends, and the authors' names, countries, and affiliated institutes contributing most to the field of big data. The comparative analyses show that, conceptually, big data lies at the intersection of the storage, statistics, technology, and research fields and emerged as an amalgam of these four fields, with interlinked aspects such as data hosting and computing, data management, data refining, data patterns, and machine learning. The results further show that the major characteristics of big data can be summarized by the seven Vs: variety, volume, variability, value, visualization, veracity, and velocity. Furthermore, the existing methods for big data analysis, their shortcomings, and possible directions for harnessing technology so that data analysis tools can be upgraded to be fast and efficient were also explored. The major challenges in handling big data include efficient storage, retrieval, analysis, and visualization of large heterogeneous data, which can be tackled through authentication mechanisms such as Kerberos and encrypted files, logging of attacks, secure communication through Secure Sockets Layer (SSL) and Transport Layer Security (TLS), data imputation, building learning models, dividing computations into sub-tasks, checkpointing for recursive tasks, and using Solid State Drives (SSD) and Phase Change Memory (PCM) for storage. In terms of frameworks for big data management, two main frameworks exist, Hadoop and Apache Spark, which must be used together to capture the holistic essence of the data and make the analyses meaningful and fast. Further field-specific applications of big data in two promising and integrated fields, smart real estate and disaster management, were investigated, and a framework for field-specific applications, as well as a merger of the two areas through big data, was highlighted. The proposed frameworks show that big data can tackle the ever-present issue of customer regret caused by poor or missing information in smart real estate, increasing customer satisfaction through an intermediate organization that processes and checks the data provided to customers by sellers and real estate managers. Similarly, for disaster risk management, data from social media, drones, multimedia, and search engines can be used to tackle natural disasters such as floods, bushfires, and earthquakes, as well as to plan emergency responses. In addition, a merged framework for smart real estate and disaster risk management shows that big data generated from smart real estate, in the form of occupant data, facilities management, and building integration and maintenance, can be shared with disaster risk management and emergency response teams to help prevent, prepare for, respond to, and recover from disasters.
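
    As a minimal sketch of two of the techniques listed above, dividing computations into sub-tasks and checkpointing, the following PySpark snippet partitions a toy dataset, aggregates it in parallel, and checkpoints the result; it assumes a local Spark installation, and the data and checkpoint path are illustrative:

```python
# Minimal Apache Spark (PySpark) sketch: the dataset is split across
# partitions (sub-tasks) and aggregated in parallel; checkpointing
# truncates long lineages in iterative/recursive workloads.
# Assumes a local Spark installation; the paths and data are invented.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("big-data-sketch").getOrCreate()
sc = spark.sparkContext
sc.setCheckpointDir("/tmp/spark-checkpoints")  # illustrative path

# Parallelize sample data across 4 partitions (sub-tasks).
readings = sc.parallelize(
    [("sensor-a", 3), ("sensor-b", 5), ("sensor-a", 4)], numSlices=4
)

totals = readings.reduceByKey(lambda x, y: x + y)
totals.checkpoint()  # persist intermediate state for iterative jobs

print(totals.collect())  # e.g. [('sensor-a', 7), ('sensor-b', 5)]
spark.stop()
```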

    Enabling Ubiquitous OLAP Analyses

    An OLAP analysis session is carried out as a sequence of OLAP operations applied to multidimensional cubes. At each step of a session, an operation is applied to the result of the previous step in an incremental fashion. Due to its simplicity and flexibility, OLAP is the most widely adopted paradigm for exploring the data stored in data warehouses. With the goal of broadening the use of OLAP analyses, this thesis touches on several critical topics. We first present our contributions to data extraction from service-oriented sources, which are nowadays used to provide access to many databases and analytic platforms. By addressing data extraction from these sources, we take a step toward the integration of external databases into the data warehouse, thus providing richer data that can be analyzed through OLAP sessions. The second topic we study is the visualization of multidimensional data, which we exploit to enable OLAP on devices with limited screen and bandwidth capabilities (i.e., mobile devices). Finally, we propose solutions for obtaining multidimensional schemata from unconventional sources (e.g., sensor networks), which are crucial for performing multidimensional analyses.
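
    As a toy illustration of such an incremental session, the following snippet uses pandas in place of a real OLAP engine: each step is applied to the result of the previous one (the cube, its dimensions, and its measure are invented):

```python
# A toy incremental OLAP session over an invented cube, using pandas
# in place of a real OLAP engine. Each step operates on the result of
# the previous step, as the abstract describes.

import pandas as pd

cube = pd.DataFrame({
    "year":    [2023, 2023, 2024, 2024],
    "country": ["IT", "FR", "IT", "FR"],
    "product": ["books", "books", "music", "music"],
    "sales":   [100, 80, 120, 90],
})

# Step 1: roll-up from (year, country, product) to (year, country).
step1 = cube.groupby(["year", "country"], as_index=False)["sales"].sum()

# Step 2: slice the previous result on country = "IT".
step2 = step1[step1["country"] == "IT"]

# Step 3: roll-up again, aggregating over all years.
step3 = step2["sales"].sum()
print(step3)  # 220
```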

    Shellhive: Towards a Collaborative Visual Programming Language for UNIX Workflows

    Big data is a relatively new keyword in the software industry. Every day, data is generated in such quantities that it becomes difficult to manage with traditional data management tools, motivating an adaptation of traditional programming toward paradigms and architectures that can process large amounts of data. Unix-based operating systems have long provided programming tools built around such paradigms, focused on processing large amounts of data. In this thesis, we propose a solution that leverages these Unix tools to empower programmers with little programming experience to create, understand, and modify big-data-related tasks more easily. The application itself allows users to design workflows collaboratively, so that beginners can help one another and potentially learn from more experienced users.
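
    Shellhive's actual data model and interface are not described in the abstract; as a hypothetical sketch of the underlying idea, the snippet below represents a workflow of classic Unix tools as plain data and compiles it into a shell pipeline (the log file name is illustrative):

```python
# Hypothetical sketch: a workflow of classic Unix data-processing
# tools, represented as data, is compiled into a shell pipeline.
# Shellhive's actual model and API are not reproduced here.

import subprocess  # only needed for the optional run at the bottom

workflow = [
    ["cat", "access.log"],   # read input (file name is illustrative)
    ["grep", "ERROR"],       # filter lines
    ["sort"],                # order lines
    ["uniq", "-c"],          # count duplicates
]

def compile_pipeline(stages):
    """Join the stages into a single shell pipeline string."""
    return " | ".join(" ".join(stage) for stage in stages)

pipeline = compile_pipeline(workflow)
print(pipeline)  # cat access.log | grep ERROR | sort | uniq -c

# Running it requires a Unix-like system and the input file:
# subprocess.run(pipeline, shell=True, check=True)
```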

    Database Principles and Technologies – Based on Huawei GaussDB

    This open access book contains eight chapters that deal with database technologies, including the development history of databases, database fundamentals, an introduction to SQL syntax, the classification of SQL syntax, database security fundamentals, the database development environment, database design fundamentals, and the application of Huawei's cloud database product GaussDB. This book can be used as a textbook for database courses in colleges and universities, and is also suitable as a reference book for the HCIA-GaussDB V1.5 certification examination. The Huawei GaussDB (for MySQL) used in the book is a Huawei cloud-based, high-performance, widely applicable relational database that fully supports the syntax and functionality of the open-source database MySQL. All the experiments in this book can be run on this database platform. As the world's leading provider of ICT (information and communication technology) infrastructure and smart terminals, Huawei offers products ranging from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence.
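
    Because GaussDB (for MySQL) supports the syntax of MySQL, the book's SQL examples can in principle be run through any standard MySQL client library; the sketch below uses PyMySQL, with a placeholder endpoint, credentials, and schema:

```python
# Minimal sketch: connecting to a MySQL-compatible database such as
# GaussDB (for MySQL) with PyMySQL. The host, credentials, database
# name, and table are placeholders, not values from the book.

import pymysql

conn = pymysql.connect(
    host="gaussdb.example.com",  # placeholder endpoint
    user="student",
    password="***",
    database="coursework",
)
try:
    with conn.cursor() as cur:
        cur.execute(
            "CREATE TABLE IF NOT EXISTS enrollment ("
            "  student_id INT PRIMARY KEY,"
            "  course VARCHAR(64) NOT NULL)"
        )
        cur.execute(
            "INSERT INTO enrollment (student_id, course) VALUES (%s, %s)",
            (1, "Database Principles"),
        )
    conn.commit()
finally:
    conn.close()
```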

    Building Blocks for IoT Analytics Internet-of-Things Analytics

    Internet-of-Things (IoT) analytics is an integral element of most IoT applications, as it provides the means to extract knowledge, drive actuation services, and optimize decision making. IoT analytics will be a major contributor to IoT business value in the coming years, enabling organizations to process and fully leverage large amounts of IoT data that are nowadays largely underutilized. Building Blocks for IoT Analytics is devoted to presenting the main technology building blocks that comprise advanced IoT analytics systems. It introduces IoT analytics as a special case of BigData analytics and accordingly presents leading-edge technologies that can be deployed to successfully confront the main challenges of IoT analytics applications. Special emphasis is placed on technologies for IoT streaming and on semantic interoperability across diverse IoT streams. Furthermore, the role of cloud computing and BigData technologies in IoT analytics is presented, along with practical tools for implementing, deploying, and operating non-trivial IoT applications. Along with the main building blocks of IoT analytics systems and applications, the book presents a series of practical applications that illustrate the use of these technologies in pragmatic settings. Technical topics discussed in the book include: cloud computing and BigData for IoT analytics; searching the Internet of Things; development tools for IoT analytics applications; IoT Analytics-as-a-Service; semantic modelling and reasoning for IoT analytics; IoT analytics for smart buildings; IoT analytics for smart cities; operationalization of IoT analytics; and ethical aspects of IoT analytics. The book contains both research-oriented and applied articles on IoT analytics, including several articles reflecting work undertaken in recent European Commission funded projects under the FP7 and H2020 programmes; these articles present the projects' results on IoT analytics platforms and applications. Even though the articles have been contributed by different authors, they are structured in a well-thought-out order that allows the reader either to follow the evolution of the book or to focus on specific topics depending on his or her background and interest in IoT and IoT analytics technologies. The compilation of these articles in this edited volume has been largely motivated by the close collaboration of the co-authors in working groups and IoT events organized by the Internet-of-Things Research Cluster (IERC), which is currently part of the EU's Alliance for Internet of Things Innovation (AIOTI).
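
    As a minimal sketch of one building block mentioned above, streaming analysis of IoT data, the following snippet computes a sliding-window moving average over a simulated temperature stream (the window size and readings are invented):

```python
# Minimal sketch of a streaming IoT-analytics primitive: a fixed-size
# sliding window computing a moving average. The sensor readings and
# window size are simulated values for illustration only.

from collections import deque

def sliding_average(stream, window_size=3):
    """Yield the moving average over the last `window_size` readings."""
    window = deque(maxlen=window_size)
    for reading in stream:
        window.append(reading)
        yield sum(window) / len(window)

# Simulated temperature stream from a single sensor.
temperatures = [21.0, 21.5, 22.1, 23.0, 22.4]
for avg in sliding_average(temperatures):
    print(f"moving average: {avg:.2f}")
```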

    Images on the Move: Materiality - Networks - Formats

    In contemporary society, digital images have become increasingly mobile. They are networked, shared on social media, and circulated across small and portable screens. Accordingly, the discourses of spreadability and circulation have come to supersede the focus on production, indexicality, and manipulability, which had dominated early conceptions of digital photography and film. However, the mobility of images is neither technologically nor conceptually limited to the realm of the digital. This edited volume re-examines the historical, aesthetic, and theoretical relevance of image mobility. The contributors provide a materialist account of images on the move - ranging from wired photography to postcards to streaming media.

    Adaptive P2P platform for data sharing

    Ph.D. (Doctor of Philosophy)