22 research outputs found

    A Lightweight, Efficient and Explainable-by-Design Convolutional Neural Network for Internet Traffic Classification

    Full text link
    Traffic classification, i.e. the identification of the type of applications flowing in a network, is a strategic task for numerous activities (e.g., intrusion detection, routing). This task faces critical challenges that current deep learning approaches do not address. The design of current approaches does not take into consideration the fact that networking hardware (e.g., routers) often runs with limited computational resources. Further, they do not meet the need for faithful explainability highlighted by regulatory bodies. Finally, these traffic classifiers are evaluated on small datasets which fail to reflect the diversity of applications in real-world settings. Therefore, this paper introduces a Lightweight, Efficient and eXplainable-by-design convolutional neural network (LEXNet) for Internet traffic classification, which relies on a new residual block (for lightweight and efficiency purposes) and a prototype layer (for explainability). Based on a commercial-grade dataset, our evaluation shows that LEXNet maintains the same accuracy as the best-performing state-of-the-art neural network, while providing the additional features previously mentioned. Moreover, we illustrate the explainability feature of our approach, which stems from the communication of detected application prototypes to the end-user, and we highlight the faithfulness of LEXNet explanations through a comparison with post hoc methods.
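The prototype-layer idea behind explainability-by-design can be pictured with a minimal sketch. This is a toy illustration, not LEXNet's actual architecture: the classes, feature vectors, and distance rule below are all made up. A flow is classified by its nearest class prototype, and that prototype itself is returned as the explanation.

```python
import math

# Hypothetical learned prototypes: one representative feature vector
# per application class (values are illustrative, not from LEXNet).
PROTOTYPES = {
    "video_streaming": [0.9, 0.1, 0.8],
    "web_browsing":    [0.2, 0.7, 0.3],
    "voip":            [0.1, 0.9, 0.1],
}

def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_with_explanation(features):
    """Predict the class whose prototype is nearest to the input, and
    return that prototype so the end-user sees what the flow matched."""
    label, proto = min(PROTOTYPES.items(),
                       key=lambda kv: euclidean(features, kv[1]))
    return label, proto

label, proto = classify_with_explanation([0.85, 0.15, 0.75])
# label is "video_streaming"; proto is the matched vector, i.e. the explanation
```

Because the explanation is the classifier's own decision rule (nearest prototype), it is faithful by construction, unlike a post hoc approximation.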

    LCE: An Augmented Combination of Bagging and Boosting in Python

    Full text link
    lcensemble is a high-performing, scalable and user-friendly Python package for the general tasks of classification and regression. The package implements Local Cascade Ensemble (LCE), a machine learning method that further enhances the prediction performance of the current state-of-the-art methods Random Forest and XGBoost. LCE combines their strengths and adopts a complementary diversification approach to obtain a better-generalizing predictor. The package is compatible with scikit-learn, so it can interact with scikit-learn pipelines and model selection tools. It is distributed under the Apache 2.0 license, and its source code is available at https://github.com/LocalCascadeEnsemble/LCE

    XPM: An explainable-by-design pattern-based estrus detection approach to improve resource use in dairy farms

    Get PDF
    Powerful automatic detection of estrus, the only period when the cow is susceptible to pregnancy, is a key driver to help farmers with reproduction management and subsequently to improve milk production resource use in dairy farms. Automatic solutions to detect both types of estrus (behavioral and silent estrus) based on the combination of affordable phenotyping data (activity, body temperature) exist, but they do not provide faithful explanations to support their alerts, in ways that farmers can understand based on the behaviors they could observe in animals. In this paper, we first propose XPM, a novel pattern-based classifier that detects both types of estrus with real-world affordable sensor data (activity, body temperature) and supports its predictions with perfectly faithful explanations. Then, we show that our approach performs better than a commercial reference in estrus detection, driven by the detection of silent estrus. Finally, we present the explainability of our solution, which stems from communicating to farmers the presence and/or absence of a limited number of patterns determinant of estrus detection, thereby reducing mistrust in the solution and supporting farmers' decision-making.
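The pattern-based detection scheme can be sketched in a few lines. The two patterns and all thresholds below are hypothetical stand-ins, not XPM's actual discriminative patterns; the point is only that the prediction is explained by which patterns are present or absent.

```python
# Hypothetical patterns over (activity, body temperature) sensor series.
# The prediction's explanation is exactly the set of matched patterns.

def has_activity_spike(activity, factor=1.5):
    """Pattern: latest activity reading well above the series mean
    (the 1.5x factor is an illustrative threshold)."""
    baseline = sum(activity) / len(activity)
    return activity[-1] > factor * baseline

def has_temperature_rise(temperature, delta=0.3):
    """Pattern: body temperature rose by more than `delta` degrees C
    since the series minimum (illustrative threshold)."""
    return temperature[-1] - min(temperature) > delta

def detect_estrus(activity, temperature):
    patterns = {
        "activity_spike": has_activity_spike(activity),
        "temperature_rise": has_temperature_rise(temperature),
    }
    # Toy rule: a temperature rise flags estrus even without an activity
    # spike, so silent estrus (no behavioral signal) is still caught.
    prediction = patterns["temperature_rise"]
    return prediction, patterns  # the pattern dict IS the explanation

pred, why = detect_estrus([10, 12, 11, 25], [38.4, 38.5, 38.9])
```

A farmer receiving `why` can check each claimed pattern against the animal's observable behavior, which is what makes this style of explanation perfectly faithful: it is the decision rule itself.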

    MDSC: Modelling Distributed Stream Processing across the Edge-to-Cloud Continuum

    Get PDF
    The growth of the Internet of Things is resulting in an explosion of data volumes at the Edge of the Internet. To reduce costs incurred due to data movement and centralized cloud-based processing, it is becoming increasingly important to process and analyze such data closer to the data sources. Exploiting Edge computing capabilities for stream-based processing is however challenging. It requires addressing the complex characteristics and constraints imposed by all the resources along the data path, as well as the large set of heterogeneous data processing and management frameworks. Consequently, the community needs tools that can facilitate the modeling of this complexity and can integrate the various components involved. In this work, we introduce MDSC, a hierarchical approach for modeling distributed stream-based applications on Edge-to-Cloud continuum infrastructures. We demonstrate how MDSC can be applied to a concrete real-life ML-based application (early earthquake warning) to help answer questions such as: when is it worth decentralizing the classification load from the Cloud to the Edge, and how?
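A toy placement model conveys the kind of question such modeling answers. This is not MDSC's model; the latency formula and every parameter value are assumptions chosen for illustration: decentralize to the Edge when the data-movement cost to the Cloud outweighs the Edge's slower compute.

```python
def end_to_end_latency(data_mb, link_mbps, compute_ms):
    """Transfer time plus processing time, in milliseconds
    (deliberately simplified: bandwidth-only link, fixed compute)."""
    transfer_ms = data_mb * 8 / link_mbps * 1000
    return transfer_ms + compute_ms

def place_classifier(data_mb, edge_compute_ms=50.0, cloud_compute_ms=5.0,
                     uplink_mbps=10.0):
    """Pick Edge or Cloud by comparing end-to-end latency.
    All default parameters are illustrative assumptions."""
    edge = end_to_end_latency(0.0, uplink_mbps, edge_compute_ms)   # data stays local
    cloud = end_to_end_latency(data_mb, uplink_mbps, cloud_compute_ms)
    return ("edge" if edge < cloud else "cloud"), edge, cloud

choice, _, _ = place_classifier(data_mb=1.0)  # 1 MB of sensor data per window
```

With these numbers, shipping 1 MB over a 10 Mbps uplink costs 800 ms of transfer, so the slower Edge (50 ms) wins; for very small windows the fast Cloud compute dominates and the placement flips.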

    ÉCLAIRE - Effects of Climate Change on Air Pollution Impacts and Response Strategies for European Ecosystems - second periodic report 01/04/2013 to 30/09/2014

    Get PDF

    ECLAIRE: Effects of Climate Change on Air Pollution Impacts and Response Strategies for European Ecosystems. Project final report

    Get PDF
    The central goal of ECLAIRE is to assess how climate change will alter the extent to which air pollutants threaten terrestrial ecosystems. Particular attention has been given to nitrogen compounds, especially nitrogen oxides (NOx) and ammonia (NH3), as well as Biogenic Volatile Organic Compounds (BVOCs) in relation to tropospheric ozone (O3) formation, including their interactions with aerosol components. ECLAIRE has combined a broad program of field and laboratory experimentation and modelling of pollution fluxes and ecosystem impacts, advancing mechanistic understanding and providing support to European policy makers. The central finding of ECLAIRE is that future climate change is expected to worsen the threat that air pollutants pose to Europe's ecosystems. Firstly, climate warming is expected to increase the emissions of many trace gases, such as agricultural NH3, the soil component of NOx emissions and key BVOCs. Experimental data and numerical models show how these effects will tend to increase atmospheric N deposition in future. The net effect on tropospheric O3 is less clear, because parallel increases in atmospheric CO2 concentrations will offset the temperature-driven increase for some BVOCs, such as isoprene. By contrast, there is currently insufficient evidence to be confident that CO2 will offset anticipated climate-driven increases in monoterpene emissions. Secondly, climate warming is likely to increase the vulnerability of ecosystems to air pollutant exposure or atmospheric deposition. Such effects may occur as a consequence of combined perturbation, as well as through specific interactions, such as between drought, O3, N and aerosol exposure. These combined effects of climate change are expected to offset part of the benefit of current emissions control policies.
Unless decisive mitigation actions are taken, it is anticipated that ongoing climate warming will increase agricultural and other biogenic emissions, posing a challenge for national emissions ceilings and air quality objectives related to nitrogen and ozone pollution. The O3 effects will be further worsened if progress is not made to curb increases in methane (CH4) emissions in the northern hemisphere. Other key findings of ECLAIRE are that: 1) N deposition and O3 have adverse synergistic effects. Exposure to ambient O3 concentrations was shown to reduce the Nitrogen Use Efficiency of plants, both decreasing agricultural production and posing an increased risk of other forms of nitrogen pollution, such as nitrate leaching (NO3-) and the greenhouse gas nitrous oxide (N2O); 2) within-canopy dynamics for volatile aerosol can increase dry deposition and shorten atmospheric lifetimes; 3) ambient aerosol levels reduce the ability of plants to conserve water under drought conditions; 4) low-resolution mapping studies tend to underestimate the extent of local critical loads exceedance; 5) new dose-response functions can be used to improve the assessment of costs, including estimation of the value of damage due to air pollution effects on ecosystems; 6) scenarios can be constructed that combine technical mitigation measures with dietary change options (reducing livestock products in food down to levels recommended by health criteria), with the balance between the two strategies being a matter for future societal discussion. ECLAIRE has supported the revision process for the National Emissions Ceilings Directive and will continue to deliver scientific underpinning into the future for the UNECE Convention on Long-range Transboundary Air Pollution.

    ECLAIRE third periodic report

    Get PDF
    The ÉCLAIRE project (Effects of Climate Change on Air Pollution Impacts and Response Strategies for European Ecosystems) is a four-year (2011-2015) project funded by the EU's Seventh Framework Programme for Research and Technological Development (FP7).

    XCM: An Explainable Convolutional Neural Network for Multivariate Time Series Classification

    No full text
    Multivariate Time Series (MTS) classification has gained importance over the past decade with the increase in the number of temporal datasets in multiple domains. The current state-of-the-art MTS classifier is a heavyweight deep learning approach, which outperforms the second-best MTS classifier only on large datasets. Moreover, this deep learning approach cannot provide faithful explanations as it relies on post hoc model-agnostic explainability methods, which could prevent its use in numerous applications. In this paper, we present XCM, an eXplainable Convolutional neural network for MTS classification. XCM is a new compact convolutional neural network which extracts information relative to the observed variables and time directly from the input data. The XCM architecture thus enables good generalization on both large and small datasets, while allowing the full exploitation of a faithful post hoc model-specific explainability method (Gradient-weighted Class Activation Mapping) by precisely identifying the observed variables and timestamps of the input data that are important for predictions. We first show that XCM outperforms the state-of-the-art MTS classifiers on both the large and small public UEA datasets. Then, we illustrate how XCM reconciles performance and explainability on a synthetic dataset, and show that XCM enables a more precise identification of the regions of the input data that are important for predictions compared to the current deep learning MTS classifier that also provides faithful explainability. Finally, we present how XCM can outperform the most accurate current state-of-the-art algorithm on a real-world application while enhancing explainability by providing faithful and more informative explanations.
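The kind of output XCM's explanations provide (which variables and which timestamps mattered for a prediction) can be mimicked on a toy model with a finite-difference saliency map. This is a generic illustration with a made-up model, not XCM's Grad-CAM computation: the stand-in "classifier" below is rigged so that only one (variable, timestamp) cell drives its output, and the saliency map recovers exactly that cell.

```python
def toy_model(mts):
    """Stand-in 'classifier' score over an MTS given as {variable: [values]}.
    By construction, only variable 'v1' at timestamp 2 strongly drives
    the output; 'v2' contributes a small uniform term."""
    return 3.0 * mts["v1"][2] + 0.1 * sum(mts["v2"])

def saliency(model, mts, eps=1e-4):
    """Importance of each (variable, timestamp) cell as the magnitude
    of a one-sided finite-difference gradient of the model output."""
    base = model(mts)
    scores = {}
    for var, series in mts.items():
        for t in range(len(series)):
            perturbed = {v: list(s) for v, s in mts.items()}
            perturbed[var][t] += eps
            scores[(var, t)] = abs(model(perturbed) - base) / eps
    return scores

mts = {"v1": [0.2, 0.4, 0.9, 0.1], "v2": [0.5, 0.5, 0.5, 0.5]}
scores = saliency(toy_model, mts)
top = max(scores, key=scores.get)  # the truly important (variable, timestamp)
```

On this rigged model the map is exact; the appeal of a model-specific method like Grad-CAM on XCM is that the attributions come from the network's own gradients rather than an external surrogate.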

    Improving the Performance and Explainability of Multivariate Time Series Machine Learning Methods (Amélioration de la Performance et de l'Explicabilité des Méthodes d'Apprentissage Automatique de Séries Temporelles Multivariées)

    No full text
    The prevalent deployment and usage of sensors in a wide range of sectors generate an abundance of multivariate data which has proven to be instrumental for research, business and policy-making. More specifically, multivariate data which integrates temporal evolution, i.e. Multivariate Time Series (MTS), has received significant interest in recent years, driven by high-resolution monitoring applications (e.g. healthcare, mobility) and machine learning. However, for many applications, the adoption of machine learning methods cannot rely solely on their prediction performance. For example, the European Union's General Data Protection Regulation, which became enforceable on 25 May 2018, introduces a right to explanation for all individuals so that they can obtain "meaningful explanations of the logic involved" when automated decision-making has "legal effects" on individuals or similarly "significantly affects" them. The current best-performing state-of-the-art MTS machine learning methods are "black-box" models, i.e. complicated-to-understand models, which rely on explainability methods that can provide explanations for any machine learning model to support their predictions (post hoc model-agnostic). The main line of work in post hoc model-agnostic explainability methods approximates the decision surface of a model using an explainable surrogate model. However, the explanations from the surrogate model cannot be perfectly faithful with respect to the original model, which is a prerequisite for numerous applications. Faithfulness is critical as it corresponds to the level of trust an end-user can have in the explanations of model predictions, i.e. the degree to which the explanations reflect what the model actually computes. This thesis introduces new approaches to enhance both the performance and explainability of MTS machine learning methods, and derives insights from the new methods about two real-world applications.
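The faithfulness limitation of surrogate-based post hoc explanations can be seen in a few lines. Everything below is a toy illustration: a linear surrogate is fitted by ordinary least squares to a nonlinear "black-box", and the gap between the two quantifies how unfaithful the surrogate's explanation is to what the model actually computes.

```python
# Toy black-box model: nonlinear in its single input.
def black_box(x):
    return x * x

# Fit a linear surrogate y ~ a*x + b by ordinary least squares
# on points sampled from the black-box over [-2, 2].
xs = [i / 10 for i in range(-20, 21)]
ys = [black_box(x) for x in xs]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
b = mean_y - a * mean_x

# Infidelity: the surrogate's predictions diverge from the model
# it is supposed to "explain" -- here badly, since x*x is symmetric
# and the best linear fit collapses to a near-constant.
max_gap = max(abs(black_box(x) - (a * x + b)) for x in xs)
```

The best linear approximation of a symmetric quadratic has slope near zero, so its "explanation" (the coefficient `a`) says the input barely matters, while the black-box depends on it strongly. An explainable-by-design model avoids this gap because the explanation and the prediction come from the same computation.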

    A Performance-Explainability Framework to Benchmark Machine Learning Methods: Application to Multivariate Time Series Classifiers

    No full text
    Our research aims to propose a new performance-explainability analytical framework to assess and benchmark machine learning methods. The framework details a set of characteristics that systematize the performance-explainability assessment of existing machine learning methods. In order to illustrate the use of the framework, we apply it to benchmark the current state-of-the-art multivariate time series classifiers.
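One way to picture such a framework is as a structured record per method, so that benchmarking becomes a comparison over explicit fields rather than an ad hoc judgment. The fields and method names below are assumptions for the sketch, not the paper's actual characteristic set.

```python
from dataclasses import dataclass

@dataclass
class MethodAssessment:
    """Illustrative performance-explainability record for one ML method.
    The fields are hypothetical stand-ins for a framework's characteristics."""
    name: str
    accuracy_rank: int      # rank on a benchmark, lower is better
    faithful: bool          # are its explanations faithful to the model?
    explanation_scope: str  # e.g. "none", "local", "global"

# Two made-up methods to compare under the sketch.
methods = [
    MethodAssessment("BlackBoxNet", accuracy_rank=1, faithful=False,
                     explanation_scope="none"),
    MethodAssessment("XCM-like", accuracy_rank=2, faithful=True,
                     explanation_scope="local"),
]

# Benchmark query the framework makes possible: among methods whose
# explanations are faithful, which performs best?
best_explainable = min((m for m in methods if m.faithful),
                       key=lambda m: m.accuracy_rank)
```

Making the trade-off explicit in data is the point: the raw accuracy leader can be disqualified by an explainability requirement, and the framework records why.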