63 research outputs found

    Automated machine learning for COVID-19 forecasting

    In the context of the current COVID-19 pandemic, various sophisticated epidemic and machine learning models have been used for forecasting. These models, however, rely on carefully selected architectures and detailed data that is often only available for specific regions. Automated machine learning (AutoML) addresses these challenges by allowing forecasting pipelines to be created automatically in a data-driven manner, resulting in high-quality predictions. In this paper, we study the role of open data along with AutoML systems in acquiring high-performance forecasting models for COVID-19. Here, we adapted the AutoML framework auto-sklearn to the time series forecasting task and introduced two variants for multi-step ahead COVID-19 forecasting, which we refer to as (a) multi-output and (b) repeated single output forecasting. We studied the usefulness of anonymised open mobility datasets (place visits and the use of different transportation modes) in addition to open mortality data. We evaluated three drift adaptation strategies to deal with concept drifts in data by (i) refitting our models on part of the data, (ii) the full data, or (iii) retraining the models completely. We compared the performance of our AutoML methods in terms of RMSE with five baselines on two testing periods (over 2020 and 2021). Our results show that combining mobility features and mortality data improves forecasting accuracy. Furthermore, we show that when faced with concept drifts, our method refitted on recent data using place visits mobility features outperforms all other approaches for 22 of the 26 countries considered in our study.
    Algorithms and the Foundations of Software technolog
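As a rough illustration of the two multi-step variants named in the abstract above, the sketch below contrasts a model that predicts all horizon steps at once with a one-step model applied repeatedly. It rests on assumptions: "repeated single output" is read here as feeding each one-step prediction back into the input window, which the abstract does not confirm, and a plain least-squares model stands in for whatever pipeline auto-sklearn would select. All function names are hypothetical.

```python
import numpy as np

def make_windows(series, n_lags, horizon):
    """Slice a 1-D series into (lag window, next `horizon` values) pairs."""
    X, Y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        Y.append(series[t:t + horizon])
    return np.array(X), np.array(Y)

def fit_linear(X, Y):
    """Least-squares fit with intercept (assumed stand-in for an AutoML pipeline)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef

def predict_linear(coef, x):
    """Apply the fitted linear map to a single lag window."""
    return np.append(x, 1.0) @ coef

def forecast_multi_output(series, n_lags, horizon):
    """One model maps the last window to all `horizon` future values at once."""
    X, Y = make_windows(series, n_lags, horizon)
    coef = fit_linear(X, Y)
    return predict_linear(coef, series[-n_lags:])

def forecast_repeated_single(series, n_lags, horizon):
    """A one-step model is applied `horizon` times, reusing its own predictions."""
    X, Y = make_windows(series, n_lags, 1)
    coef = fit_linear(X, Y[:, 0])
    window = list(series[-n_lags:])
    preds = []
    for _ in range(horizon):
        y = float(predict_linear(coef, np.array(window)))
        preds.append(y)
        window = window[1:] + [y]  # slide the window over the new prediction
    return np.array(preds)

series = np.arange(30, dtype=float)  # toy linear trend, not COVID-19 data
multi = forecast_multi_output(series, n_lags=3, horizon=4)
repeated = forecast_repeated_single(series, n_lags=3, horizon=4)
```

On real data the two variants generally differ: the repeated variant can accumulate its own errors over the horizon, while the multi-output variant has to learn every step jointly.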

    MultiETSC: automated machine learning for early time series classification

    Early time series classification (EarlyTSC) involves the prediction of a class label based on partial observation of a given time series. Most EarlyTSC algorithms consider the trade-off between accuracy and earliness as two competing objectives, using a single dedicated hyperparameter. Obtaining insights into this trade-off requires finding a set of non-dominated (Pareto efficient) classifiers. So far, this has been approached through manual hyperparameter tuning. Since the trade-off hyperparameters only provide indirect control over the earliness-accuracy trade-off, manual tuning is tedious and tends to result in many sub-optimal hyperparameter settings. This complicates the search for optimal hyperparameter settings and forms a hurdle for the application of EarlyTSC to real-world problems. To address these issues, we propose an automated approach to hyperparameter tuning and algorithm selection for EarlyTSC, building on developments in the fast-moving research area known as automated machine learning (AutoML). To deal with the challenging task of optimising two conflicting objectives in early time series classification, we propose MultiETSC, a system for multi-objective algorithm selection and hyperparameter optimisation (MO-CASH) for EarlyTSC. MultiETSC can potentially leverage any existing or future EarlyTSC algorithm and produces a set of Pareto optimal algorithm configurations from which a user can choose a posteriori. As an additional benefit, our proposed framework can incorporate and leverage time-series classification algorithms not originally designed for EarlyTSC for improving performance on EarlyTSC; we demonstrate this property using a newly defined, "naive" fixed-time algorithm. In an extensive empirical evaluation of our new approach on a benchmark of 115 data sets, we show that MultiETSC performs substantially better than baseline methods, ranking highest (avg. rank 1.98) compared to conceptually simpler single-algorithm (2.98) and single-objective alternatives (4.36).
    Computer Science
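The notion of a non-dominated (Pareto efficient) set used in the abstract above can be illustrated with a small sketch. It assumes each configuration is summarised by two numbers to be minimised, earliness and error rate; the values are invented for illustration and the function name is hypothetical, not part of MultiETSC.

```python
def pareto_front(points):
    """Return the points not dominated by any other (both objectives minimised)."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            # q dominates p if it is no worse on both objectives and better on one
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front

# (earliness, error rate) for five hypothetical classifier configurations
configs = [(0.2, 0.30), (0.4, 0.20), (0.5, 0.25), (0.8, 0.10), (0.9, 0.10)]
front = pareto_front(configs)
```

Here `(0.5, 0.25)` is dropped because `(0.4, 0.20)` is both earlier and more accurate, and `(0.9, 0.10)` is dropped because `(0.8, 0.10)` is earlier at equal error. The remaining three configurations are the ones a user would choose among a posteriori.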

    VPint: value propagation-based spatial interpolation

    Given the common problem of missing data in real-world applications from various fields, such as remote sensing, ecology and meteorology, the interpolation of missing spatial and spatio-temporal data can be of tremendous value. Existing methods for spatial interpolation, most notably Gaussian processes and spatial autoregressive models, tend to suffer from (a) a trade-off between modelling local or global spatial interaction, (b) the assumption there is only one possible path between two points, and (c) the assumption of homogeneity of intermediate locations between points. Addressing these issues, we propose a value propagation-based spatial interpolation method called VPint, inspired by Markov reward processes (MRPs), and introduce two variants thereof: (i) a static discount (SD-MRP) and (ii) a data-driven weight prediction (WP-MRP) variant. Both these interpolation variants operate locally, while implicitly accounting for global spatial relationships in the entire system through recursion. We evaluated our proposed methods by comparing the mean absolute error, root mean squared error, peak signal-to-noise ratio and structural similarity of interpolated grid cells to those of 8 common baselines. Our analysis involved detailed experiments on a synthetic and two real-world datasets, as well as experiments on convergence and scalability. Empirical results demonstrate the competitive advantage of VPint on randomly missing data, where it performed better than baselines in terms of mean absolute error and structural similarity, as well as spatially clustered missing data, where it performed best on 2 out of 3 datasets.
    Algorithms and the Foundations of Software technolog
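To make the idea of local value propagation with a static discount more concrete, here is a minimal sketch loosely inspired by the description above. It is not the authors' VPint implementation: the update rule (each missing cell becomes a discounted mean of its four grid neighbours), the discount `gamma`, and the fixed iteration count are all assumptions chosen for illustration.

```python
import numpy as np

def propagate_values(grid, gamma=0.9, n_iter=100):
    """Fill NaN cells by repeatedly averaging discounted neighbour values.

    Known cells stay fixed; missing cells start at 0 and are updated from
    their up/down/left/right neighbours on every sweep, so information
    from distant known cells reaches them through repeated local updates.
    """
    known = ~np.isnan(grid)
    values = np.where(known, grid, 0.0)
    rows, cols = grid.shape
    for _ in range(n_iter):
        updated = values.copy()
        for r in range(rows):
            for c in range(cols):
                if known[r, c]:
                    continue
                neighbours = [
                    values[rr, cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]
                updated[r, c] = gamma * np.mean(neighbours)
        values = updated
    return values

# Toy 3x3 grid with one missing cell in the centre
grid = np.array([[1.0, 1.0, 1.0],
                 [1.0, np.nan, 1.0],
                 [1.0, 1.0, 1.0]])
filled = propagate_values(grid, gamma=0.9)
```

In this toy grid the centre cell settles at `gamma` times the mean of its four known neighbours. With several adjacent missing cells, the repeated sweeps let values spread across the gap, which is the recursive, locally operating behaviour the abstract refers to.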

    Identifying Stops and Moves in WiFi Tracking Data

    Algorithms and the Foundations of Software technolog

    Automated machine learning for satellite data: integrating remote sensing pre-trained models into AutoML systems

    Algorithms and the Foundations of Software technolog

    Automated machine learning for remaining useful life estimation of aircraft engines

    Algorithms and the Foundations of Software technolog

    A novel data-driven approach to examine children’s movements and social behaviour in schoolyard environments

    Social participation at schoolyards is crucial for children's development. Yet, schoolyard environments contain features that can hinder children's social participation. In this paper, we empirically examine schoolyards to identify existing obstacles. Traditionally, this type of study requires huge amounts of detailed information about children in a given environment. Collecting such data is exceedingly difficult and expensive. In this study, we present a novel sensor data-driven approach for gathering this information and examining the effect of schoolyard environments on children's behaviours in light of schoolyard affordances and individual effectivities. Sensor data is collected from 150 children at two primary schools, using location trackers, proximity tags, and Multi-Motion receivers to measure locations, face-to-face contacts, and activities. Results show strong potential for this data-driven approach, as it allows collecting data from individuals and their interactions with schoolyard environments, examining the triad of physical, social, and cultural affordances in schoolyards, and identifying factors that significantly impact children's behaviours. Based on this approach, we further obtain better knowledge on the impact of these factors and identify limitations in schoolyard designs, which can inform schools, designers, and policymakers about current problems and practical solutions.
    Algorithms and the Foundations of Software technolog

    Mixing characterisation for a serpentine microchannel equipped with embedded barriers

    This paper describes the design, simulation, fabrication and experimental analysis of a passive micromixer for the mixing of biological solvents. The mixer consists of a T-junction, followed by a serpentine microchannel. The serpentine has three arcs, each equipped with circular barriers that are patterned as two opposing triangles. The barriers are engineered to induce periodic perturbations in the flow field and enhance the mixing. A CFD (Computational Fluid Dynamics) method is applied to optimise the geometric variables of the mixer before fabrication. The mixer is made from PDMS (Polydimethylsiloxane) using photo- and soft-lithography techniques. Experimental measurements are performed using yellow and blue food dyes as the mixing fluids. The mixing is measured by analysing the composition of the flow's colour across the outlet channel. The performance of the mixer is examined over a wide range of flow rates from 0.5 to 10 µl/min. Mixing efficiencies higher than 99.4% are obtained in the experiments, confirming the results of numerical simulations. The proposed mixer can be employed as a part of lab-on-a-chip for biomedical applications.