2 research outputs found

    Novel Trends in Scaling Up Machine Learning Algorithms

    Get PDF
    Big Data has been a catalyst force for the Machine Learning (ML) area, forcing us to rethink existing strategies in order to create innovative solutions that will push forward the field. This paper presents an overview of the strategies for using machine learning in Big Data with emphasis on the high-performance parallel implementations on many-core hardware. The rationale is to increase the practical applicability of ML implementations to large-scale data problems. The common underlying thread has been the recent progress in usability, cost effectiveness and diversity of parallel computing platforms, specifically, the Graphics Processing Units (GPUs), tailored for a broad set of data analysis and Machine Learning tasks. In this context, we provide the main outcomes of a GPU Machine Learning Library (GPUMLib) framework, which empowers researchers with the capacity to tackle larger and more complex problems, by using high-performance implementations of wellknown ML algorithms. Moreover, we attempt to give insights on the future trends of Big Data Analytics and the challenges lying ahead

    CHALLENGES IN THE DEPLOYMENT AND OPERATION OF MACHINE LEARNING IN PRACTICE

    Get PDF
    Machine learning has recently emerged as a powerful technique to increase operational efficiency or to develop new value propositions. However, the translation of a prediction algorithm into an operationally usable machine learning model is a time-consuming and in various ways challenging task. In this work, we target to systematically elicit the challenges in deployment and operation to enable broader practical dissemination of machine learning applications. To this end, we first identify relevant challenges with a structured literature analysis. Subsequently, we conduct an interview study with machine learning practitioners across various industries, perform a qualitative content analysis, and identify challenges organized along three distinct categories as well as six overarching clusters. Eventually, results from both literature and interviews are evaluated with a comparative analysis. Key issues identified include automated strategies for data drift detection and handling, standardization of machine learning infrastructure, and appropriate communication and expectation management
    corecore