
    Benchmarking Change Detector Algorithms from Different Concept Drift Perspectives

    The stream mining paradigm has become increasingly popular due to the vast number of algorithms and methodologies it provides to address the current challenges of Internet of Things (IoT) and modern machine learning systems. Change detection algorithms, which identify drifts in the data distribution during the operation of a machine learning solution, are a crucial aspect of this paradigm. However, selecting the best change detection method for different types of concept drift can be challenging. This work provides a benchmark of four drift detection algorithms (EDDM, DDM, HDDM_W, and HDDM_A) for abrupt, gradual, and incremental drift types. To shed light on the capabilities and trade-offs involved in selecting a concept drift algorithm, we compare their detection capability, detection time, and detection delay. The experiments were carried out on synthetic datasets, where attributes such as stream size, number of drifts, and drift duration can be controlled and manipulated by our synthetic stream generator. Our results show that HDDM_W provides the best trade-off among all performance indicators, demonstrating superior consistency in detecting abrupt drifts, but it has suboptimal time consumption and a limited ability to detect incremental drifts. Nevertheless, it outperforms the other algorithms in detection delay for both abrupt and gradual drifts while keeping detection and detection-time performance efficient.
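    A rough sketch of this kind of benchmark, assuming the scikit-multiflow implementations of the four detectors: a synthetic binary error stream with a single abrupt drift is fed to each detector, and the alarm position and delay are reported. The stream length, error rates, and drift position are illustrative choices, not the paper's settings.

```python
import numpy as np
from skmultiflow.drift_detection import DDM, EDDM, HDDM_A, HDDM_W

rng = np.random.default_rng(42)
# Synthetic binary error stream with one abrupt drift at index 1000:
# the misclassification rate jumps from 5% to 40% (illustrative values).
stream = np.concatenate([rng.binomial(1, 0.05, 1000),
                         rng.binomial(1, 0.40, 1000)])

detectors = {"DDM": DDM(), "EDDM": EDDM(),
             "HDDM_A": HDDM_A(), "HDDM_W": HDDM_W()}

for name, det in detectors.items():
    for i, err in enumerate(stream):
        det.add_element(err)            # feed the 0/1 error indicator
        if det.detected_change():
            # Detection delay = alarm index minus the true drift position.
            print(f"{name}: drift flagged at i={i}, delay={i - 1000}")
            break
```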

    Detecting and mitigating adversarial examples in regression tasks: A photovoltaic power generation forecasting case study

    With data collected by Internet of Things sensors, deep learning (DL) models can forecast the generation capacity of photovoltaic (PV) power plants. This functionality is especially relevant for PV power operators and users, as PV plants exhibit irregular behavior related to environmental conditions. However, DL models are vulnerable to adversarial examples, which may lead to increased prediction error and wrong operational decisions. This work proposes a new scheme to detect adversarial examples and mitigate their impact on DL forecasting models. The approach is based on one-class classifiers and features extracted from the data fed to the forecasting models. Tests were performed using data collected from a real-world PV power plant along with adversarial samples generated by the Fast Gradient Sign Method under multiple attack patterns and magnitudes. One-class Support Vector Machine and Local Outlier Factor were evaluated as detectors of attacks on Long Short-Term Memory and Temporal Convolutional Network forecasting models. According to the results, the proposed scheme showed a high capability of detecting adversarial samples, with an average F1-score close to 90%. Moreover, the detection and mitigation approach strongly reduced the increase in prediction error caused by adversarial samples.
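    The detection idea can be sketched as follows: train one-class detectors on clean forecasting inputs and flag perturbed inputs at inference time. This minimal example uses scikit-learn's OneClassSVM and LocalOutlierFactor; the input windows are synthetic stand-ins for PV sensor data, and the FGSM perturbation is approximated by a random sign pattern with a hypothetical epsilon, since the forecasting model (and hence its gradient) is not reproduced here.

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# Stand-in for clean input windows fed to the forecaster
# (rows = sliding windows of normalized sensor readings).
clean = rng.normal(0.0, 1.0, size=(500, 24))

# FGSM adds epsilon * sign(gradient); the sign pattern is random here
# because the real gradient of the DL model is not computed in this sketch.
epsilon = 0.3
adversarial = clean[:100] + epsilon * rng.choice([-1.0, 1.0], size=(100, 24))

# Train the one-class detectors on clean data only.
ocsvm = OneClassSVM(nu=0.05, gamma="scale").fit(clean)
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(clean)

for name, det in [("OC-SVM", ocsvm), ("LOF", lof)]:
    pred = det.predict(adversarial)     # -1 means flagged as adversarial
    print(name, "detection rate:", np.mean(pred == -1))
```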

    Time series segmentation based on stationarity analysis to improve new samples prediction

    A wide range of applications based on sequential data, named time series, have become increasingly popular in recent years, mainly those based on the Internet of Things (IoT). Several machine learning algorithms exploit the patterns extracted from sequential data to support multiple tasks. However, these data can suffer from unreliable readings that lead to low-accuracy models due to the low-quality training sets available. Detecting the change points between highly representative segments is an important ally in finding and treating biased subsequences. By constructing a framework based on the Augmented Dickey-Fuller (ADF) test for data stationarity, two proposals to automatically segment subsequences in a time series were developed. The former, called Change Detector segmentation, relies on change detection methods from data stream mining. The latter, called ADF-based segmentation, is built on a new change detector derived from the ADF test alone. Experiments on real-life IoT databases and benchmarks showed the improvement provided by our proposals for prediction tasks with traditional Autoregressive Integrated Moving Average (ARIMA) and deep learning (Long Short-Term Memory and Temporal Convolutional Network) methods. Results obtained by the Long Short-Term Memory predictive model reduced the relative prediction error from 1 to 0.67 compared to time series without segmentation.
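    A minimal sketch of the ADF-based idea, assuming statsmodels' adfuller test: a fixed-size window is marked as a segment boundary whenever the test cannot reject non-stationarity. The window size, significance level, and splitting rule are simplifications for illustration, not the paper's detector.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def adf_segments(series, window=100, alpha=0.05):
    """Split a series at windows the ADF test deems non-stationary
    (simplified illustration of ADF-based segmentation)."""
    boundaries = [0]
    for start in range(0, len(series) - window, window):
        p_value = adfuller(series[start:start + window], autolag="AIC")[1]
        if p_value > alpha:             # cannot reject the unit-root hypothesis
            boundaries.append(start + window)
    boundaries.append(len(series))
    return boundaries

rng = np.random.default_rng(1)
# Stationary noise followed by a random walk (non-stationary regime).
series = np.concatenate([rng.normal(0, 1, 300),
                         np.cumsum(rng.normal(0, 1, 300))])
print(adf_segments(series))
```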

    Information and telecommunications project for a digital city: a brazilian case study

    Making information and telecommunications available is a permanent challenge for cities concerned with their social, urban, and local planning and development, focused on the quality of life of their citizens and on the effectiveness of public management. Such a challenge requires the involvement of everyone in the city. The objective of this work is to describe the information and telecommunications project arising from the planning of a digital city carried out in Vinhedo-SP, Brazil. It was built as a telecommunications infrastructure of the "open access metropolitan area network" kind, which enables the integration of citizens into a single telecommunications environment. The research methodology emphasized a case study that evolved into action research, comprising the municipal administration and its local units. The results describe, by means of a methodology, the phases, sub-phases, activities, approval points, and resulting products, and formalize their respective challenges and difficulties. The contributions concern the practical feasibility of the project and the execution of its methodology. The conclusion reiterates the importance of the project, collectively implemented and accepted, as a tool to support the management of cities, the implementation of Strategic Digital City Projects, the decisions of public administration managers, and the quality of life of their citizens.

    Evaluating the Four-Way Performance Trade-Off for Data Stream Classification in Edge Computing

    Edge computing (EC) is a promising technology capable of bridging the gap between Cloud computing services and the demands of emerging technologies such as the Internet of Things (IoT). Most EC-based solutions, from wearable devices to smart city architectures, benefit from Machine Learning (ML) methods to perform various tasks, such as classification. In these cases, ML solutions need to deal efficiently with a huge amount of data while balancing predictive performance, memory and time costs, and energy consumption. The fact that these data usually come in the form of a continuous and evolving data stream makes the scenario even more challenging. Many algorithms have been proposed to cope with data stream classification, e.g., the Very Fast Decision Tree (VFDT) and Strict VFDT (SVFDT). Recently, Online Local Boosting (OLBoost) has also been introduced to improve predictive performance without modifying the underlying structure of the decision tree produced by these algorithms. In this work, we compare the four-way trade-off among time efficiency, energy consumption, predictive performance, and memory costs, tuning the hyperparameters of VFDT and the two versions of SVFDT, with and without OLBoost. Experiments over six benchmark datasets using an EC device revealed that VFDT and SVFDT-I were the most energy-friendly algorithms, with SVFDT-I also significantly reducing memory consumption. OLBoost, as expected, improved predictive performance but caused a deterioration in memory and energy consumption.
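    As a baseline illustration of the stream classification setup, the sketch below runs a prequential (test-then-train) evaluation of a VFDT using scikit-multiflow's HoeffdingTreeClassifier on a SEA generator stream, timing the loop as a rough proxy for the time cost discussed above. SVFDT and OLBoost are not part of scikit-multiflow, so only the baseline tree is shown, and the stream size is an arbitrary choice.

```python
import time
from skmultiflow.data import SEAGenerator
from skmultiflow.trees import HoeffdingTreeClassifier

# Prequential evaluation: test on each sample first, then train on it.
stream = SEAGenerator(random_state=1)
vfdt = HoeffdingTreeClassifier()

n_samples, correct = 20_000, 0
start = time.perf_counter()
for _ in range(n_samples):
    X, y = stream.next_sample()
    if vfdt.predict(X)[0] == y[0]:      # test first ...
        correct += 1
    vfdt.partial_fit(X, y)              # ... then train
elapsed = time.perf_counter() - start

print(f"accuracy={correct / n_samples:.3f}  time={elapsed:.1f}s")
```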