605 research outputs found

    Distributed and scalable parsing solution for telecom network data

    Get PDF
    The growing usage of mobile devices and the introduction of 5G networks have increased the significance of network data for the telecom business. The success of telecom organizations can depend on employing efficient data engineering techniques for transforming raw network data into useful information by analytics and machine learning (ML). Elisa Oyj., a Finnish telecommunications company, receives massive amounts of network data from network equipment manufactured by various vendors. The effectiveness of data analytics depends on efficient data engineering processes. This thesis presents a scalable data parsing solution that leverages Spark, a distributed programming framework, for parallelizing parsing routines from an existing parsing solution. We design and deploy this solution as a component of the organization's data engineering pipeline to enable automation of data-centric operations. Experimental results indicate that the efficiency of the proposed solution is heavily dependent on the individual file size distribution. The proposed parsing solution demonstrates reliability, scalability, and speed during empirical evaluation and processes a 24-hour network data within 3 hours. The main outcome of the project is an optimized setup with the minimum number of data partitions to ensure zero failures and thus minimum execution time. A smaller execution time leads to lower costs of the continuously running infrastructure provisioned on the cloud

    Building Blocks for IoT Analytics Internet-of-Things Analytics

    Get PDF
    Internet-of-Things (IoT) Analytics are an integral element of most IoT applications, as it provides the means to extract knowledge, drive actuation services and optimize decision making. IoT analytics will be a major contributor to IoT business value in the coming years, as it will enable organizations to process and fully leverage large amounts of IoT data, which are nowadays largely underutilized. The Building Blocks of IoT Analytics is devoted to the presentation the main technology building blocks that comprise advanced IoT analytics systems. It introduces IoT analytics as a special case of BigData analytics and accordingly presents leading edge technologies that can be deployed in order to successfully confront the main challenges of IoT analytics applications. Special emphasis is paid in the presentation of technologies for IoT streaming and semantic interoperability across diverse IoT streams. Furthermore, the role of cloud computing and BigData technologies in IoT analytics are presented, along with practical tools for implementing, deploying and operating non-trivial IoT applications. Along with the main building blocks of IoT analytics systems and applications, the book presents a series of practical applications, which illustrate the use of these technologies in the scope of pragmatic applications. Technical topics discussed in the book include: Cloud Computing and BigData for IoT analyticsSearching the Internet of ThingsDevelopment Tools for IoT Analytics ApplicationsIoT Analytics-as-a-ServiceSemantic Modelling and Reasoning for IoT AnalyticsIoT analytics for Smart BuildingsIoT analytics for Smart CitiesOperationalization of IoT analyticsEthical aspects of IoT analyticsThis book contains both research oriented and applied articles on IoT analytics, including several articles reflecting work undertaken in the scope of recent European Commission funded projects in the scope of the FP7 and H2020 programmes. These articles present results of these projects on IoT analytics platforms and applications. Even though several articles have been contributed by different authors, they are structured in a well thought order that facilitates the reader either to follow the evolution of the book or to focus on specific topics depending on his/her background and interest in IoT and IoT analytics technologies. The compilation of these articles in this edited volume has been largely motivated by the close collaboration of the co-authors in the scope of working groups and IoT events organized by the Internet-of-Things Research Cluster (IERC), which is currently a part of EU's Alliance for Internet of Things Innovation (AIOTI)

    Technologies and Applications for Big Data Value

    Get PDF
    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part “Technologies and Methods” contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part “Processes and Applications” details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses with leading researchers to harness the value of data to benefit society, business, science, and industry. The book is of interest to two primary audiences, first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI. Second, practitioners and industry experts engaged in data-driven systems, software design and deployment projects who are interested in employing these advanced methods to address real-world problems

    Quarry: A user-centered big data integration platform

    Get PDF
    Obtaining valuable insights and actionable knowledge from data requires cross-analysis of domain data typically coming from various sources. Doing so, inevitably imposes burdensome processes of unifying different data formats, discovering integration paths, and all this given specific analytical needs of a data analyst. Along with large volumes of data, the variety of formats, data models, and semantics drastically contribute to the complexity of such processes. Although there have been many attempts to automate various processes along the Big Data pipeline, no unified platforms accessible by users without technical skills (like statisticians or business analysts) have been proposed. In this paper, we present a Big Data integration platform (Quarry) that uses hypergraph-based metadata to facilitate (and largely automate) the integration of domain data coming from a variety of sources, and provides an intuitive interface to assist end users both in: (1) data exploration with the goal of discovering potentially relevant analysis facets, and (2) consolidation and deployment of data flows which integrate the data, and prepare them for further analysis (descriptive or predictive), visualization, and/or publishing. We validate Quarry’s functionalities with the use case of World Health Organization (WHO) epidemiologists and data analysts in their fight against Neglected Tropical Diseases (NTDs).This work is partially supported by GENESIS project, funded by the Spanish Ministerio de Ciencia, Innovación y Universidades under project TIN2016-79269-R.Peer ReviewedPostprint (author's final draft
    • …
    corecore