CEPS Task Force on Artificial Intelligence and Cybersecurity: Technology, Governance and Policy Challenges. Evaluation of the HLEG Trustworthy AI Assessment List (Pilot Version). CEPS Task Force Report, 22 January 2020
The Centre for European Policy Studies launched a Task Force on Artificial Intelligence (AI) and
Cybersecurity in September 2019. The goal of this Task Force is to bring attention to the market,
technical, ethical and governance challenges posed by the intersection of AI and cybersecurity,
focusing on both AI for cybersecurity and cybersecurity for AI. The Task Force is multi-stakeholder
by design and composed of academics, industry players from various sectors, policymakers and civil
society.
The Task Force is currently discussing issues such as the state and evolution of the application of AI
in cybersecurity and cybersecurity for AI; the debate on the role that AI could play in the dynamics
between cyber attackers and defenders; the increasing need for sharing information on threats and
how to deal with the vulnerabilities of AI-enabled systems; options for policy experimentation; and
possible EU policy measures to ease the adoption of AI in cybersecurity in Europe.
As part of these activities, this report aims to assess the Ethics Guidelines for Trustworthy AI of the
High-Level Expert Group on AI (HLEG), presented on April 8, 2019. In particular, this report analyses and
makes suggestions on the Trustworthy AI Assessment List (Pilot version), a non-exhaustive list aimed
at helping the public and private sectors operationalise Trustworthy AI. The list comprises 131 items
intended to guide AI designers and developers throughout the design, development, and deployment
of AI, although it is not intended as guidance for ensuring compliance with applicable laws. The list is
in its piloting phase and is currently undergoing a
revision that will be finalised in early 2020.
This report aims to contribute to this revision by addressing in particular the interplay between
AI and cybersecurity. This evaluation has been made according to specific criteria: whether and how
the items of the Assessment List refer to existing legislation (e.g. GDPR, EU Charter of Fundamental
Rights); whether they refer to moral principles (but not laws); whether they consider that AI attacks
are fundamentally different from traditional cyberattacks; whether they are compatible with
different risk levels; whether they are flexible enough to be clearly measured and easily implemented
by AI developers and SMEs; and, overall, whether they are likely to create obstacles
for the industry.
The HLEG is a diverse group, with more than 50 members representing different stakeholders, such
as think tanks, academia, EU Agencies, civil society, and industry, who were given the difficult task of
producing a simple checklist for a complex issue. The public engagement exercise appears successful
overall, with more than 450 stakeholders having registered and contributing to the process.
The next sections of this report present the items listed by the HLEG, followed by the analysis and
suggestions raised by the Task Force (see the list of Task Force members in Annex 1).
Explaining Explanations in AI
Recent work on interpretability in machine learning and AI has focused on building simplified models that approximate the true criteria used to make decisions. These models are a useful pedagogical device for teaching trained professionals how to predict what decisions will be made by the complex system, and most importantly how the system might break. However, when considering any such model it is important to remember Box's maxim that "all models are wrong but some are useful." We focus on the distinction between these models and explanations in philosophy and sociology. These models can be understood as a "do it yourself kit" for explanations, allowing a practitioner to directly answer "what if" questions or generate contrastive explanations without external assistance. Although this is a valuable ability, giving these models as explanations appears more difficult than necessary, and other forms of explanation may not have the same trade-offs. We contrast the different schools of thought on what makes an explanation, and suggest that machine learning might benefit from viewing the problem more broadly.
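As a rough illustration of the kind of simplified model this abstract describes, the sketch below (a hypothetical example, not taken from the paper; the dataset, model choices, and feature names are assumptions) fits a shallow decision tree to the predictions of a black-box classifier and uses it to pose a simple "what if" query.

```python
# Hypothetical sketch: approximating a black-box model with a simple
# surrogate that a practitioner can inspect and query ("what if ...?").
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the complex system's training data.
X, y = make_classification(n_samples=2000, n_features=5, random_state=0)

# The complex, hard-to-interpret decision-maker.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Surrogate: a shallow tree trained to mimic the black box's outputs,
# not the original labels -- it approximates the criteria actually used.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# The surrogate acts as a "do it yourself kit": print its rules and pose a
# contrastive "what if" query by perturbing one feature of an instance.
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(5)]))
instance = X[:1].copy()
altered = instance.copy()
altered[0, 2] += 1.0  # what if feature x2 were higher?
print("original:", surrogate.predict(instance), "altered:", surrogate.predict(altered))
```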
Reasoning and learning services for coalition situational understanding
Situational understanding requires an ability to assess the current situation and anticipate future situations, requiring both pattern recognition and inference. A coalition involves multiple agencies sharing information and analytics. This paper considers how to harness distributed information sources, including multimodal sensors, together with machine learning and reasoning services, to perform situational understanding in a coalition context. To exemplify the approach, we focus on a technology integration experiment in which multimodal data (including video and still imagery, geospatial and weather data) is processed and fused in a service-oriented architecture by heterogeneous pattern recognition and inference components. We show how the architecture: (i) provides awareness of the current situation and prediction of future states, (ii) is robust to individual service failure, (iii) supports the generation of 'why' explanations for human analysts (including from components based on 'black box' deep neural networks, which pose particular challenges to explanation generation), and (iv) allows for the imposition of information sharing constraints in a coalition context where there are varying levels of trust between partner agencies.
Foundation Metrics: Quantifying Effectiveness of Healthcare Conversations powered by Generative AI
Generative Artificial Intelligence is set to revolutionize healthcare
delivery by transforming traditional patient care into a more personalized,
efficient, and proactive process. Chatbots, serving as interactive
conversational models, will probably drive this patient-centered transformation
in healthcare. By providing services such as diagnosis, personalized lifestyle recommendations, and
mental health support, these models aim to substantially improve patient health outcomes while
mitigating the workload burden on healthcare providers. The life-critical
nature of healthcare applications necessitates establishing a unified and
comprehensive set of evaluation metrics for conversational models. Existing
evaluation metrics proposed for generic large language models (LLMs) fail to
capture medical and health concepts and
their significance in promoting patients' well-being. Moreover, these metrics
neglect pivotal user-centered aspects, including trust-building, ethics,
personalization, empathy, user comprehension, and emotional support. The
purpose of this paper is to explore state-of-the-art LLM-based evaluation
metrics that are specifically applicable to the assessment of interactive
conversational models in healthcare. Subsequently, we present a comprehensive
set of evaluation metrics designed to thoroughly assess the performance of
healthcare chatbots from an end-user perspective. These metrics encompass an
evaluation of language processing abilities, impact on real-world clinical
tasks, and effectiveness in user-interactive conversations. Finally, we engage
in a discussion concerning the challenges associated with defining and
implementing these metrics, with particular emphasis on confounding factors
such as the target audience, evaluation methods, and prompt techniques involved
in the evaluation process.
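To make the idea of a unified, end-user-oriented metric set concrete, the sketch below shows one hypothetical way per-category scores could be aggregated into a weighted summary; the category names, metric names, and weights are assumptions for illustration only, not the metrics proposed in the paper.

```python
# Hypothetical illustration only: aggregating per-category scores for a
# healthcare chatbot into one weighted summary. Category names, metric
# names, and weights are assumptions, not the paper's proposed metrics.
from statistics import mean

scores = {
    "language_processing": {"fluency": 0.86, "factual_consistency": 0.78},
    "clinical_task_impact": {"triage_accuracy": 0.81, "guideline_adherence": 0.74},
    "user_interaction": {"trust": 0.70, "empathy": 0.77, "comprehension": 0.83},
}
weights = {"language_processing": 0.3, "clinical_task_impact": 0.4, "user_interaction": 0.3}

# Average within each category, then combine with the assumed weights.
category_scores = {cat: mean(m.values()) for cat, m in scores.items()}
overall = sum(weights[cat] * s for cat, s in category_scores.items())

print(category_scores)
print(f"overall: {overall:.3f}")
```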
Algorithm Auditing: Managing the Legal, Ethical, and Technological Risks of Artificial Intelligence, Machine Learning, and Associated Algorithms
Algorithms are becoming ubiquitous. However, companies are increasingly alarmed about their algorithms causing major financial or reputational damage. A new industry is envisaged: auditing and assurance of algorithms with the remit to validate artificial intelligence, machine learning, and associated algorithms
Trustworthy Federated Learning: A Survey
Federated Learning (FL) has emerged as a significant advancement in the field
of Artificial Intelligence (AI), enabling collaborative model training across
distributed devices while maintaining data privacy. As the importance of FL
increases, addressing trustworthiness issues in its various aspects becomes
crucial. In this survey, we provide an extensive overview of the current state
of Trustworthy FL, exploring existing solutions and well-defined pillars
relevant to Trustworthy FL. Despite the growth in literature on trustworthy
centralized Machine Learning (ML)/Deep Learning (DL), further efforts are
necessary to identify trustworthiness pillars and evaluation metrics specific
to FL models, as well as to develop solutions for computing trustworthiness
levels. We propose a taxonomy that encompasses three main pillars:
Interpretability, Fairness, and Security & Privacy. Each pillar represents a
dimension of trust, further broken down into different notions. Our survey
covers trustworthiness challenges at every level in FL settings. We present a
comprehensive architecture of Trustworthy FL, addressing the fundamental
principles underlying the concept, and offer an in-depth analysis of trust
assessment mechanisms. In conclusion, we identify key research challenges
related to every aspect of Trustworthy FL and suggest future research
directions. This comprehensive survey serves as a valuable resource for
researchers and practitioners working on the development and implementation of
Trustworthy FL systems, contributing to a more secure and reliable AI
landscape.
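For readers unfamiliar with federated learning itself, the minimal sketch below illustrates the federated averaging idea the abstract builds on: clients train locally on private data and share only model parameters, which a server averages. It is an illustrative assumption on synthetic data, not code or a method from the survey.

```python
# Minimal federated averaging (FedAvg) sketch for a linear model -- an
# illustrative assumption, not code from the survey. Clients keep their
# data local and share only model weights with the server.
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])

# Each client holds a private local dataset.
clients = []
for _ in range(5):
    X = rng.normal(size=(200, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=200)
    clients.append((X, y))

def local_update(w, X, y, lr=0.05, epochs=5):
    """A few steps of local gradient descent on one client's private data."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Server loop: broadcast global weights, collect local updates, average them.
global_w = np.zeros(3)
for _ in range(20):
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_ws, axis=0)

print("estimated weights:", np.round(global_w, 3))
```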