
    Per-flow cardinality estimation based on virtual LogLog sketching

    Flow cardinality estimation is the problem of estimating the number of distinct elements in a data flow, often under a stringent memory constraint. It has wide applications in network traffic measurement and in database systems. The virtual LogLog algorithm recently proposed by Xiao, Chen, Chen and Ling estimates the cardinalities of a large number of flows in compact memory. The purpose of this thesis is to explore two new perspectives on the estimation process of this algorithm. First, we propose and investigate a family of estimators that generalizes the original vHLL estimator, and we evaluate the performance of the vHLL estimator against other estimators in this family. Second, we propose an alternative solution to the estimation problem by deriving a maximum-likelihood estimator. Empirical evidence from both perspectives suggests the near-optimality of the vHLL estimator for per-flow estimation, analogous to the near-optimality of the HLL estimator for single-flow estimation.
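For readers unfamiliar with the single-flow HLL estimator that the abstract uses as its point of comparison, the following is a minimal sketch of plain HyperLogLog. It is not the thesis's vHLL estimator, which additionally shares one register pool across many flows. The class name, the choice of SHA-256 as the hash function, and the default of 2^10 registers are assumptions made only for illustration.

```python
import hashlib
import math


def _hash64(item: str) -> int:
    """Map an item to a 64-bit integer (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HyperLogLog:
    """Plain single-flow HyperLogLog with m = 2**p registers (hypothetical class)."""

    def __init__(self, p: int = 10) -> None:
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        h = _hash64(item)
        index = h >> (64 - self.p)                 # first p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # rank = 1-based position of the leftmost 1-bit in the remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def estimate(self) -> float:
        # Bias-correction constant; this closed form is the usual one for m >= 128.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        harmonic = sum(2.0 ** -r for r in self.registers)
        raw = alpha * self.m * self.m / harmonic
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros > 0:
            # Small-range correction (linear counting on empty registers)
            return self.m * math.log(self.m / zeros)
        return raw


# Usage: duplicates do not change the sketch, only distinct items do.
sketch = HyperLogLog(p=10)
for i in range(20000):
    sketch.add(f"item-{i % 5000}")   # 5000 distinct items, each seen 4 times
approx_distinct = sketch.estimate()
```

With 1024 registers the relative standard error is roughly 1.04 / sqrt(1024), about 3%, so `approx_distinct` should land near 5000 without ever storing the items themselves.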

    Patient Dropout Prediction in Virtual Health: A Multimodal Dynamic Knowledge Graph and Text Mining Approach

    Virtual health has been acclaimed as a transformative force in healthcare delivery. Yet its dropout problem is critical: it leads to poor health outcomes and to increased health, societal, and economic costs. Timely prediction of patient dropout enables stakeholders to take proactive steps to address patients' concerns, potentially improving retention rates. In virtual health, the information asymmetries inherent in its delivery format, between different stakeholders, and across different healthcare delivery systems hinder the performance of existing predictive methods. To resolve those information asymmetries, we propose a Multimodal Dynamic Knowledge-driven Dropout Prediction (MDKDP) framework that learns implicit and explicit knowledge from doctor-patient dialogues and from the dynamic and complex networks of various stakeholders in both online and offline healthcare delivery systems. We evaluate MDKDP by partnering with one of the largest virtual health platforms in China. MDKDP improves the F1-score by 3.26 percentage points relative to the best benchmark. Comprehensive robustness analyses show that integrating stakeholder attributes, knowledge dynamics, and compact bilinear pooling significantly improves performance. Our work provides significant implications for healthcare IT by revealing the value of mining relations and knowledge across different service modalities. Practically, MDKDP offers a novel design artifact for virtual health platforms in patient dropout management.

    Empowering engineering with data, machine learning and artificial intelligence: a short introductive review

    Simulation-based engineering has been a major protagonist of the technology of the last century. However, models based on well-established physics sometimes fail to describe the observed reality, and their predictions often differ noticeably from measurements. This difference has several causes: practical (uncertainty and variability of the parameters involved in the models) and epistemic (the models themselves are in many cases a crude approximation of a rich reality). On the other hand, approaching reality from experimental data is valuable because of its generality. However, this approach entails many difficulties: model and experimental variability; the need for a large number of measurements to accurately represent rich solutions (extremely nonlinear or fluctuating), together with the associated cost and the technical difficulty of performing them; and finally, the difficulty of explaining and certifying the results, both of which are key aspects in most engineering applications. This work overviews some of the most remarkable progress in the field in recent years.

    Forecasting and Assessing Risk of Individual Electricity Peaks

    The overarching aim of this open access book is to present self-contained theory and algorithms for the investigation and prediction of electric demand peaks. A cross-section of popular demand forecasting algorithms from statistics, machine learning and mathematics is presented, followed by extreme value theory techniques with examples. Good forecasts of peaks are essential for achieving carbon targets. For instance, shifting demand or charging a battery depends on correct and timely demand predictions. Historically, the majority of forecasting algorithms focused on average load prediction. To model the peaks, methods from extreme value theory are applied. This allows us to study extremes without making any assumption about the central part of the demand distribution and to predict beyond the range of available data. Although applied here to individual loads, the techniques described in this book extend naturally to substations or to commercial settings. The extreme value theory techniques presented can also be used in other disciplines, for example for predicting heavy rainfall, wind speed, solar radiation and extreme weather events. The book is intended for students, academics, engineers and professionals who are interested in short-term load prediction, energy data analytics, battery control, demand side response and data science in general.
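As an illustration of how extreme value theory can extrapolate beyond the observed range, the sketch below uses the peaks-over-threshold approach: values above a high threshold are fitted with a generalized Pareto distribution (GPD), and the fit is used to compute a level exceeded with a given small probability. This is one standard technique and is not taken from the book. The function names, the method-of-moments fit (chosen here to avoid external dependencies; maximum likelihood is the more common choice), and the synthetic exponential "load" data are all assumptions.

```python
import math
import random
from statistics import mean, variance


def fit_gpd_moments(exceedances):
    """Method-of-moments GPD fit; only meaningful when the true shape is below 0.5."""
    m = mean(exceedances)
    v = variance(exceedances)
    ratio = m * m / v
    shape = 0.5 * (1.0 - ratio)        # from var = scale^2 / ((1-shape)^2 (1-2*shape))
    scale = 0.5 * m * (1.0 + ratio)    # from mean = scale / (1 - shape)
    return shape, scale


def return_level(data, threshold, tail_prob):
    """Level x with estimated P(X > x) = tail_prob, for tail_prob below P(X > threshold)."""
    exceedances = [x - threshold for x in data if x > threshold]
    shape, scale = fit_gpd_moments(exceedances)
    frac = len(exceedances) / len(data)   # empirical P(X > threshold)
    q = tail_prob / frac                  # tail probability conditional on exceeding
    if abs(shape) < 1e-9:                 # exponential-tail limit of the GPD
        return threshold - scale * math.log(q)
    return threshold + scale / shape * (q ** (-shape) - 1.0)


# Synthetic demo data (assumption): unit-rate exponential "loads".
rng = random.Random(0)
loads = [rng.expovariate(1.0) for _ in range(20000)]
threshold = sorted(loads)[int(0.95 * len(loads))]   # empirical 95% quantile
shape_hat, scale_hat = fit_gpd_moments([x - threshold for x in loads if x > threshold])
level = return_level(loads, threshold, tail_prob=1e-4)
```

For exponential data the exceedances are again exponential, so the fitted shape should be close to 0 and the scale close to 1. Only the values above the threshold enter the fit, which is why no assumption about the bulk of the distribution is needed.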

    MC^2S: a Mobile Component-based CrowdSensing framework

    CrowdSensing often refers to sharing data collected by sensing devices with the aim of measuring a phenomenon of common interest. In this thesis we describe MC^2S, a novel component-based framework for the easy development of multiple, secure, portable, interoperable and concurrent MCS applications. The framework has been built in a collaboration between the University of Pisa and Trinity College Dublin that started in September 2015. It exploits both the Apache Felix implementation of the OSGi framework specification, to support composite applications, and the Java environment, to guarantee portability across a huge range of heterogeneous hardware. Although the MC^2S framework already offers several advanced capabilities, many additional features may be introduced in future versions.