120 research outputs found

    Code-Upload AI Challenges on EvalAI

    Artificial intelligence develops techniques and systems whose performance must be evaluated on a regular basis in order to certify and foster progress in the discipline. We have developed several tools, such as EvalAI, which help us evaluate the performance of these systems and push the frontiers of machine learning and artificial intelligence. Initially, the AI community focussed on simple and traditional methods of evaluating these systems in the form of prediction upload challenges, but with the advent of deep learning, larger datasets, and complex AI agents, these methods are no longer sufficient for evaluation. One technique for evaluating these AI agents is to upload their code, run it on the sequestered test dataset, and report the results on the leaderboard. In this work, we introduced code upload evaluation of AI agents on EvalAI for all kinds of AI tasks, i.e., reinforcement learning, supervised learning, and unsupervised learning. We offer features such as a scalable backend, prioritized submission evaluation, a secure test environment, and running AI agents' code in an isolated sanitized environment. The end-to-end pipeline is extremely flexible, modular, and portable, and can later be extended to multi-agent setups and evaluation on dynamic datasets. We also proposed a procedure using GitHub for AI challenge creation to version, maintain, and reduce the friction in this conglomerate process. Finally, we focused on providing analytics to all the users of the platform along with easing the hosting of EvalAI on private servers as an internal evaluation platform. M.S.

    GLocalX - From Local to Global Explanations of Black Box AI Models

    Artificial Intelligence (AI) has come to prominence as one of the major components of our society, with applications in most aspects of our lives. In this field, complex and highly nonlinear machine learning models such as ensemble models, deep neural networks, and Support Vector Machines have consistently shown remarkable accuracy in solving complex tasks. Although accurate, AI models are often “black boxes” that we are not able to understand. Relying on these models has a multifaceted impact and raises significant concerns about their transparency. Applications in sensitive and critical domains are a strong motivational factor in trying to understand the behavior of black boxes. We propose to address this issue by providing an interpretable layer on top of black box models by aggregating “local” explanations. We present GLOCALX, a “local-first” model-agnostic explanation method. Starting from local explanations expressed in the form of local decision rules, GLOCALX iteratively generalizes them into global explanations by hierarchically aggregating them. Our goal is to learn accurate yet simple interpretable models to emulate the given black box, and, if possible, replace it entirely. We validate GLOCALX in a set of experiments in standard and constrained settings with limited or no access to either data or local explanations. Experiments show that GLOCALX is able to accurately emulate several models with simple and small models, reaching state-of-the-art performance against natively global solutions. Our findings show how it is often possible to achieve a high level of both accuracy and comprehensibility of classification models, even in complex domains with high-dimensional data, without necessarily trading one property for the other.
This is a key requirement for a trustworthy AI, necessary for adoption in high-stakes decision making applications.

    Performance Evaluation And Anomaly detection in Mobile BroadBand Across Europe

    With the rapidly growing market for smartphones and users’ expectation of immediate access to high-quality multimedia content, the delivery of video over wireless networks has become a big challenge, making it difficult to provide end-users with flawless quality of service. The growth of the smartphone market goes hand in hand with the development of the Internet, in which current transport protocols are being re-evaluated to deal with traffic growth. QUIC and WebRTC are new and evolving standards. The latter was explicitly developed to meet this demand and enable a high-quality experience for mobile users of real-time communication services. QUIC has been designed to reduce Web latency, integrate security features, and allow a high-quality experience for mobile users. Thus, evaluating the performance of these rising protocols in a non-systematic environment is essential to understand the behavior of the network and provide the end user with a better multimedia delivery service. Since most of the work in the research community is conducted in a controlled environment, we leverage the MONROE platform to investigate the performance of QUIC and WebRTC in real cellular networks using static and mobile nodes. During this Thesis, we conduct measurements of WebRTC and QUIC and make the resulting data-sets public to interested experimenters. Building such data-sets is very welcome in the research community, opening doors to applying data science to network data-sets. The development part of the experiments involves building Docker containers that act as QUIC and WebRTC clients. These containers are publicly available to be used candidly or within the MONROE platform. These key contributions span Chapters 4 and 5, presented in Part II of the Thesis.
We exploit data collection from MONROE to apply data science over network data-sets, which will help identify networking problems, shifting the Thesis focus from performance evaluation to a data science problem. Indeed, the second part of the Thesis focuses on interpretable data science. Identifying network problems leveraging Machine Learning (ML) has gained much visibility in the past few years, resulting in dramatically improved cellular network services. However, critical tasks like troubleshooting cellular networks are still performed manually by experts who monitor the network around the clock. In this context, this Thesis contributes by proposing the use of simple interpretable ML algorithms, moving away from the current trend of high-accuracy ML algorithms (e.g., deep learning) that do not allow interpretation (and hence understanding) of their outcome. We prefer having lower accuracy, since we consider the scenarios misclassified by the ML algorithms to be interesting (anomalous), and we do not want to miss them by overfitting. To this aim, we present CIAN (from Causality Inference of Anomalies in Networks), a practical and interpretable ML methodology, which we implement in the form of a software tool named TTrees (from Troubleshooting Trees), and compare it to a supervised counterpart named STrees (from Supervised Trees). Both methodologies require small volumes of data and are quick at training. Our experiments using real data from operational commercial mobile networks, e.g., sampled with MONROE probes, show that STrees and CIAN can automatically identify and accurately classify network anomalies (e.g., cases for which a low network performance is not justified by operational conditions), training with just a few hundred data samples, hence enabling precise troubleshooting actions. Most importantly, our experiments show that a fully automated unsupervised approach is viable and efficient. These contributions are presented in Part III of the Thesis, which includes Chapters 6 and 7.
In conclusion, in this Thesis, we go through a data-driven networking roller coaster, from evaluating the performance of upcoming network protocols in real mobile networks to building methodologies that help identify and classify the root cause of networking problems, emphasizing the fact that these methodologies are easy to implement and can be deployed in production environments. This work has been supported by IMDEA Networks Institute. Doctoral Program in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. President: Matteo Sereno. Secretary: Antonio de la Oliva Delgado. Member: Raquel Barco Moren

    Mapping the Current Landscape of Research Library Engagement with Emerging Technologies in Research and Learning: Final Report

    The generation, dissemination, and analysis of digital information is a significant driver, and consequence, of technological change. As data and information stewards in physical and virtual space, research libraries are thoroughly entangled in the challenges presented by the Fourth Industrial Revolution: a societal shift powered not by steam or electricity, but by data, and characterized by a fusion of the physical and digital worlds. Organizing, structuring, preserving, and providing access to growing volumes of the digital data generated and required by research and industry will become a critically important function. As partners with the community of researchers and scholars, research libraries are also recognizing and adapting to the consequences of technological change in the practices of scholarship and scholarly communication. Technologies that have emerged or become ubiquitous within the last decade have accelerated information production and have catalyzed profound changes in the ways scholars, students, and the general public create and engage with information. The production of an unprecedented volume and diversity of digital artifacts, the proliferation of machine learning (ML) technologies, and the emergence of data as the “world’s most valuable resource,” among other trends, present compelling opportunities for research libraries to contribute in new and significant ways to the research and learning enterprise. Librarians are all too familiar with predictions of the research library’s demise in an era when researchers have so much information at their fingertips. A growing body of evidence provides a resounding counterpoint: that the skills, experience, and values of librarians, and the persistence of libraries as an institution, will become more important than ever as researchers contend with the data deluge and the ephemerality and fragility of much digital content.
This report identifies strategic opportunities for research libraries to adopt and engage with emerging technologies, with a roughly five-year time horizon. It considers the ways in which research library values and professional expertise inform and shape this engagement, the ways library and library worker roles will be reconceptualized, and the implication of a range of technologies on how the library fulfills its mission. The report builds on a literature review covering the last five years of published scholarship, primarily North American information science literature, and interviews with a dozen library field experts, completed in fall 2019. It begins with a discussion of four cross-cutting opportunities that permeate many or all aspects of research library services. Next, specific opportunities are identified in each of five core research library service areas: facilitating information discovery, stewarding the scholarly and cultural record, advancing digital scholarship, furthering student learning and success, and creating learning and collaboration spaces. Each section identifies key technologies shaping user behaviors and library services, and highlights exemplary initiatives. Underlying much of the discussion in this report is the idea that “digital transformation is increasingly about change management”: that adoption of or engagement with emerging technologies must be part of a broader strategy for organizational change, for “moving emerging work from the periphery to the core,” and a broader shift in conceptualizing the research library and its services. Above all, libraries are benefitting from the ways in which emerging technologies offer opportunities to center users and move from a centralized and often siloed service model to embedded, collaborative engagement with the research and learning enterprise.

    Technologies and Applications for Big Data Value

    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part, “Technologies and Methods”, contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part, “Processes and Applications”, details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses with leading researchers to harness the value of data to benefit society, business, science, and industry. The book is of interest to two primary audiences: first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI; second, practitioners and industry experts engaged in data-driven systems, software design and deployment projects who are interested in employing these advanced methods to address real-world problems.

    Reproducibility and Replicability in Unmanned Aircraft Systems and Geographic Information Science

    Multiple scientific disciplines face a so-called crisis of reproducibility and replicability (R&R) in which the validity of methodologies is questioned due to an inability to confirm experimental results. Trust in information technology (IT)-intensive workflows within geographic information science (GIScience), remote sensing, and photogrammetry depends on solutions to R&R challenges affecting multiple computationally driven disciplines. To date, there have only been very limited efforts to overcome R&R-related issues in remote sensing workflows in general, let alone those tied to disruptive technologies such as unmanned aircraft systems (UAS) and machine learning (ML). To accelerate an understanding of this crisis, a review was conducted to identify the issues preventing R&R in GIScience. Key barriers included: (1) awareness of time and resource requirements, (2) accessibility of provenance, metadata, and version control, (3) conceptualization of geographic problems, and (4) geographic variability between study areas. As a case study, a replication of a GIScience workflow utilizing YOLOv3 algorithms to identify objects in UAS imagery was attempted. Despite access to the source data and workflow steps, the lack of accessible provenance and metadata for each small step of the work prevented successful replication. Finally, a novel method for provenance generation was proposed to address these issues. It was found that artificial intelligence (AI) could be used to quickly create robust provenance records for workflows that do not exceed time and resource constraints and provide the information needed to replicate work. Such information can bolster trust in scientific results and provide access to cutting edge technology that can improve everyday life.

    DevOps for Trustworthy Smart IoT Systems

    ENACT is a research project funded by the European Commission under its H2020 program. The project consortium consists of twelve industry and research member organisations spread across the whole EU. The overall goal of the ENACT project was to provide a novel set of solutions to enable DevOps in the realm of trustworthy Smart IoT Systems. Smart IoT Systems (SIS) are complex systems involving not only sensors but also actuators, with control loops distributed all across the IoT, Edge and Cloud infrastructure. Since smart IoT systems typically operate in a changing and often unpredictable environment, the ability of these systems to continuously evolve and adapt to their new environment is decisive to ensure and increase their trustworthiness, quality and user experience. DevOps has established itself as a software development life-cycle model that encourages developers to continuously bring new features to the system under operation without sacrificing quality. This book reports on the ENACT work to empower the development and operation as well as the continuous and agile evolution of SIS, which is necessary to adapt the system to changes in its environment, such as newly appearing trustworthiness threats.