iSAGE: An Incremental Version of SAGE for Online Explanation on Data Streams
Existing methods for explainable artificial intelligence (XAI), including
popular feature importance measures such as SAGE, are mostly restricted to the
batch learning scenario. However, machine learning is often applied in dynamic
environments, where data arrives continuously and learning must be done in an
online manner. Therefore, we propose iSAGE, a time- and memory-efficient
incrementalization of SAGE, which is able to react to changes in the model as
well as to drift in the data-generating process. We further provide efficient
feature removal methods that break (interventional) and retain (observational)
feature dependencies. Moreover, we formally analyze our explanation method to
show that iSAGE adheres to similar theoretical properties as SAGE. Finally, we
evaluate our approach in a thorough experimental analysis based on
well-established data sets and data streams with concept drift.
Semantics-Empowered Big Data Processing with Applications
We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. We organize our research around the Five Vs of Big Data, where four of the Vs are harnessed to produce the fifth V: value. To handle the challenge of Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle the challenge of Variety, we resort to the use of semantic models and annotations of data so that much of the intelligent processing can be done at a level independent of the heterogeneity of data formats and media. To handle the challenge of Velocity, we seek to use a continuous semantics capability to dynamically create event- or situation-specific models and recognize relevant new concepts, entities and facts. To handle Veracity, we explore the formalization of trust models and approaches to glean trustworthiness. The above four Vs of Big Data are harnessed by semantics-empowered analytics to derive value for supporting practical applications transcending the physical-cyber-social continuum.
Scalability Benchmarking of Cloud-Native Applications Applied to Event-Driven Microservices
Cloud-native applications constitute a recent trend for designing large-scale software systems. This thesis introduces the Theodolite benchmarking method, allowing researchers and practitioners to conduct empirical scalability evaluations of cloud-native applications, their frameworks, configurations, and deployments. The benchmarking method is applied to event-driven microservices, a specific type of cloud-native application that employs distributed stream processing frameworks to scale with massive data volumes. Extensive experimental evaluations benchmark and compare the scalability of various stream processing frameworks under different configurations and deployments, including different public and private cloud environments. These experiments show that the presented benchmarking method provides statistically sound results in an adequate amount of time. In addition, three case studies demonstrate that the Theodolite benchmarking method can be applied to a wide range of applications beyond stream processing.
Real-Time Big Data Analytics in Smart Cities from LoRa-Based IoT Networks
The current burst of Internet of Things (IoT) technologies
implies the emergence of new lines of investigation regarding not only hardware
and protocols but also new methods for analyzing the produced data that satisfy the
constraints of the IoT environment: a real-time and a big data approach. The real-time
restriction concerns the continuous generation of data by the endpoints
connected to an IoT network; due to the connection and scaling capabilities of an IoT
network, the amount of data to process is so high that big data techniques
become essential. In this article, we present a system consisting of two main
modules: on the one hand, the infrastructure, a complete LoRa-based network designed,
tested and deployed at the Pablo de Olavide University; on the other hand, the
analytics, a big data streaming system that processes the inputs produced by the
network to obtain useful, valid and hidden information.
Ministerio de Economía y Competitividad TIN2017-88209-C2-1-
- …