Machine Learning of User Profiles: Representational Issues
As more information becomes available electronically, tools for finding
information of interest to users become increasingly important. The goal of
the research described here is to build a system for generating comprehensible
user profiles that accurately capture user interests with minimal user
interaction. We focus on the importance of a suitable generalization hierarchy
and representation for learning profiles that are predictively accurate and
comprehensible. In our experiments we
evaluated both traditional features based on weighted term vectors as well as
subject features corresponding to categories which could be drawn from a
thesaurus. Our experiments, conducted in the context of a content-based
profiling system for on-line newspapers on the World Wide Web (the IDD News
Browser), demonstrate the importance of a generalization hierarchy and the
promise of combining natural language processing techniques with machine
learning (ML) to address an information retrieval (IR) problem.
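The contrast drawn above between weighted term vectors and thesaurus-derived subject features can be sketched minimally as follows. This is an illustration of the two representations in general, not the IDD News Browser's actual code; the toy documents and the tiny thesaurus mapping are invented for the example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Weighted term vectors: term frequency scaled by inverse document frequency."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

# Hypothetical thesaurus mapping raw terms to broader subject categories,
# standing in for the generalization hierarchy discussed above.
THESAURUS = {"striker": "sports", "goal": "sports",
             "senate": "politics", "vote": "politics"}

def subject_features(doc):
    """Generalize raw terms to subject categories where the thesaurus allows."""
    return sorted({THESAURUS[t] for t in doc.lower().split() if t in THESAURUS})

docs = ["striker scores late goal", "senate schedules budget vote"]
print(tfidf_vectors(docs)[0])
print(subject_features(docs[0]))   # ['sports']
```

The subject features are far fewer and human-readable, which is why a generalization hierarchy helps produce profiles that are comprehensible as well as accurate.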
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view, as well as from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
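Fuzzy matching against a translation memory, the first topic listed above, can be sketched as character-level similarity scoring. This is a generic illustration of the idea, not SCATE's actual matcher, and the example segments are invented.

```python
from difflib import SequenceMatcher

def fuzzy_match(segment, memory, threshold=0.7):
    """Return (source, target, score) for the best TM match, or None below threshold."""
    best = None
    for source, target in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    return best if best and best[2] >= threshold else None

# Toy translation memory (invented English-Dutch pairs).
tm = [("Press the start button.", "Druk op de startknop."),
      ("Close the cover before use.", "Sluit het deksel voor gebruik.")]

# A new segment that is close to, but not identical with, a stored one.
match = fuzzy_match("Press the red start button.", tm)
print(match)
```

A production matcher would also weight word order and terminology, which is part of what "improved fuzzy matching" refers to, but the threshold-based retrieval pattern is the same.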
Privacy-preserving scoring of tree ensembles : a novel framework for AI in healthcare
Machine Learning (ML) techniques now impact a wide variety of domains. Highly regulated industries such as healthcare and finance have stringent compliance and data-governance policies around data sharing. Advances in secure multiparty computation (SMC) for privacy-preserving machine learning (PPML) can help transform these regulated industries by allowing ML computations over encrypted data containing personally identifiable information (PII). Yet very little SMC-based PPML has been put into practice so far. In this paper we present the first framework for privacy-preserving classification of tree ensembles with an application in healthcare. We first describe the underlying cryptographic protocols that enable a healthcare organization to send encrypted data securely to an ML scoring service and obtain encrypted class labels without the scoring service ever seeing that input in the clear. We then describe the deployment challenges we solved to integrate these protocols into a cloud-based, scalable risk-prediction platform with multiple ML models for healthcare AI. We include system internals and evaluations of our deployment, which supports physicians in driving better clinical outcomes in an accurate, scalable, and provably secure manner. To the best of our knowledge, this is the first such applied framework with SMC-based privacy-preserving machine learning for healthcare.
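The basic primitive underlying such SMC protocols, additive secret sharing, can be illustrated with a toy sketch. This is not the paper's protocol: real tree-ensemble scoring additionally needs secure comparisons and oblivious path evaluation. The sketch only shows how two non-colluding servers can jointly compute a linear score over a patient's features without either server seeing the features; the feature values and weights are invented.

```python
import random

P = 2**61 - 1  # public prime modulus

def share(value, n=2):
    """Split an integer into n additive shares mod P; any n-1 shares look random."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# A hypothetical patient feature vector, kept private from the scoring service.
x = [120, 80, 37]   # e.g. systolic BP, diastolic BP, temperature
w = [2, 3, 5]       # public weights (linear part of a risk score)

shares = [share(v) for v in x]        # one share list per feature
server0 = [s[0] for s in shares]      # what server 0 sees: uniformly random values
server1 = [s[1] for s in shares]      # what server 1 sees

# Each server computes its share of the weighted sum locally; only the
# combination of both partial results reveals the score.
partial0 = sum(wi * si for wi, si in zip(w, server0)) % P
partial1 = sum(wi * si for wi, si in zip(w, server1)) % P

assert reconstruct([partial0, partial1]) == sum(wi * xi for wi, xi in zip(w, x))
```

Neither server's view alone conveys anything about `x`, which is the property that lets regulated organizations outsource scoring over PII.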
A knowledge-based approach towards human activity recognition in smart environments
It has long been known that the population of older persons is on the rise. A recent report estimates that globally, the share of the population aged 65 years or over is expected to increase from 9.3 percent in 2020 to around 16.0 percent in 2050 [1]. This point has been one of the main sources of motivation for active research in the domain of human activity recognition (HAR) in smart homes. The ability to perform activities of daily living (ADL) without assistance from other people can be considered a reference for estimating the independent living level of an older person. Conventionally, this has been assessed by health-care domain experts via a qualitative evaluation of the ADL. Since this evaluation is qualitative, it can vary based on the person being monitored and the caregiver's experience. A significant amount of research work is implicitly or explicitly aimed at augmenting the health-care domain expert's qualitative evaluation with quantitative data or knowledge obtained from HAR. From a medical perspective, there is a lack of evidence about the technology readiness level of smart-home architectures supporting older persons by recognizing ADL [2]. We hypothesize that this may be due to a lack of effective collaboration between smart-home researchers/developers and health-care domain experts, especially when considering HAR. We foresee an increase in HAR systems being developed in close collaboration with caregivers and geriatricians to support their qualitative evaluation of ADL with explainable quantitative outcomes of the HAR systems. This has been a motivation for the work in this thesis. The recognition of human activities, in particular ADL, need not be limited to supporting the health and well-being of older people; it can be relevant to home users in general. For instance, HAR could support digital assistants or companion robots in providing contextually relevant and proactive support to home users, whether young adults or old. This has also been a motivation for the work in this thesis.
Given our motivations, namely (i) facilitating iterative development and ease of collaboration between HAR system researchers/developers and health-care domain experts in ADL, and (ii) robust HAR that can support digital assistants or companion robots, there is a need for a HAR framework that is modular and flexible at its core, to facilitate an iterative development process [3], which is an integral part of collaborative work involving develop-test-improve phases. At the same time, the framework should be intelligible for the sake of enriched collaboration with health-care domain experts. Furthermore, it should be scalable, online, and accurate, enabling robust HAR and, in turn, many smart-home applications. The goal of this thesis is to design and evaluate such a framework.
This thesis contributes to the domain of HAR in smart homes. In particular, the contribution can be divided into three parts. The first contribution is Arianna+, a framework to develop networks of ontologies, for knowledge representation and reasoning, that enables smart homes to perform human activity recognition online. The second contribution is OWLOOP, an API that supports the development of HAR system architectures based on Arianna+. It enables the use of the Web Ontology Language (OWL) by means of Object-Oriented Programming (OOP). The third contribution is the evaluation and exploitation of Arianna+ using the OWLOOP API, which has resulted in four HAR system implementations. The evaluations and results of these HAR systems emphasize the novelty of Arianna+.
Ergodicity and You: Adaptive Heuristics in an Uncertain World
Life requires making decisions under uncertainty. Facing complex, dynamic environments, decision-making processes should focus on the consequences of choices, with time as a fundamental consideration. To that end, I recommend honing adaptive heuristics through trial and error while maintaining a margin of safety from ruin.
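The role of time emphasized above can be made concrete with the standard multiplicative-gamble example (my illustration, not taken from the paper): a repeated bet whose ensemble average grows even though its time average shrinks, so an individual playing it repeatedly tends toward ruin.

```python
import math

# A coin flip multiplies wealth by 1.5 (heads) or 0.6 (tails), each with p = 0.5.
up, down, p = 1.5, 0.6, 0.5

# Ensemble perspective: expected one-step growth factor across many parallel agents.
ensemble_growth = p * up + (1 - p) * down                    # 1.05: +5% per step

# Time perspective: expected log growth for one agent over repeated plays.
time_growth = p * math.log(up) + (1 - p) * math.log(down)    # about -0.053 per step

print(ensemble_growth, time_growth)
# The expectation grows, yet a typical single trajectory decays toward zero:
# exactly the gap that makes a margin of safety from ruin essential.
```

This divergence between ensemble and time averages is what "ergodicity" in the title refers to: the gamble is non-ergodic, so averaging over many agents misrepresents what happens to one agent over time.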
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
There has been much discussion of the right to explanation in the EU General
Data Protection Regulation, and its existence, merits, and disadvantages.
Implementing a right to explanation that opens the black box of algorithmic
decision-making faces major legal and technical barriers. Explaining the
functionality of complex algorithmic decision-making systems and their
rationale in specific cases is a technically challenging problem. Some
explanations may offer little meaningful information to data subjects, raising
questions around their value. Explanations of automated decisions need not
hinge on the general public understanding how algorithmic systems function.
Even though such interpretability is of great importance and should be pursued,
explanations can, in principle, be offered without opening the black box.
Looking at explanations as a means to help a data subject act rather than
merely understand, one could gauge the scope and content of explanations
according to the specific goal or action they are intended to support. From the
perspective of individuals affected by automated decision-making, we propose
three aims for explanations: (1) to inform and help the individual understand
why a particular decision was reached, (2) to provide grounds to contest the
decision if the outcome is undesired, and (3) to understand what would need to
change in order to receive a desired result in the future, based on the current
decision-making model. We assess how each of these goals finds support in the
GDPR. We suggest data controllers should offer a particular type of
explanation, unconditional counterfactual explanations, to support these three
aims. These counterfactual explanations describe the smallest change to the
world that can be made to obtain a desirable outcome, or to arrive at the
closest possible world, without needing to explain the internal logic of the
system.
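For a linear model, the "smallest change to the world" described above has a closed form, which makes a convenient toy illustration. The loan-style features and weights below are invented; the paper's proposal applies to arbitrary black-box models, for which the counterfactual must be found by search rather than this formula.

```python
import numpy as np

# Hypothetical linear credit model: approve if w @ x + b >= 0.
w = np.array([0.6, -0.4, 0.2])    # weights for income, debt, tenure (scaled units)
b = -1.0
x = np.array([1.0, 2.0, 1.5])     # an applicant who is currently rejected

score = w @ x + b                  # negative: decision is "reject"

# Minimal L2 counterfactual: project x onto the decision boundary,
# then overshoot slightly so the outcome actually flips.
delta = (-score / (w @ w)) * w
x_cf = x + 1.001 * delta

assert score < 0 and w @ x_cf + b > 0
print("change needed:", np.round(delta, 3))
```

The resulting `delta` is exactly the kind of statement the abstract advocates: "had your income been this much higher and your debt this much lower, the decision would have been approve", stated without exposing the model's internal logic.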
Interactive management control via analytic hierarchy process (AHP). An empirical study in a public university hospital.
Management control in public university hospitals is a challenging task because of continuous
changes due to external pressures (e.g. economic pressures, stakeholder focuses and scientific progress)
and internal complexities (top management turnover, shared leadership, technological evolution, and
researcher-oriented mission). Interactive budgeting has contributed to improving vertical and horizontal communication between the hospital and its stakeholders and between different organizational levels. This paper describes an application of the Analytic Hierarchy Process (AHP) to enhance interactive budgeting in one of the biggest public university hospitals in Italy. AHP improved budget allocation by facilitating the elicitation and formalization of units' needs. Furthermore, AHP facilitated vertical communication among managers and stakeholders, as it allowed a multilevel hierarchical representation of hospital needs, and horizontal communication among staff of the same hospital, as it allowed units' needs to be prioritized and standardized with a scientific multi-criteria approach, without using complex mathematics. Finally, AHP allowed traceability of complex decision-making processes (such as budget allocation), an aspect of paramount importance in the public sector, where managers are called to account to many different stakeholders for their choices.
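The prioritization step that AHP performs can be sketched with the standard eigenvector method on a pairwise comparison matrix. The three hypothetical budget needs and the judgments below are invented for illustration, not drawn from the hospital study.

```python
import numpy as np

# Pairwise comparisons on Saaty's 1-9 scale for three hypothetical needs:
# staffing vs equipment vs research support. A[i, j] states how much more
# important need i is than need j; the matrix is reciprocal by construction.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

# Priorities = principal eigenvector of A, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(A)
principal = eigvecs[:, np.argmax(eigvals.real)].real
weights = principal / principal.sum()

# Consistency ratio: CR = ((lambda_max - n) / (n - 1)) / RI, with RI = 0.58 for n = 3.
n = A.shape[0]
lambda_max = eigvals.real.max()
cr = ((lambda_max - n) / (n - 1)) / 0.58

print(np.round(weights, 3))   # staffing receives the largest share
assert weights[0] > weights[1] > weights[2]
assert cr < 0.1               # judgments are acceptably consistent
```

The normalized weights are the budget priorities, and the consistency ratio is the built-in sanity check that makes the elicited judgments traceable and defensible, which is the property the abstract highlights for public-sector accountability.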
Practical Hidden Voice Attacks against Speech and Speaker Recognition Systems
Voice Processing Systems (VPSes), now widely deployed, have been made
significantly more accurate through the application of recent advances in
machine learning. However, adversarial machine learning has similarly advanced
and has been used to demonstrate that VPSes are vulnerable to the injection of
hidden commands - audio obscured by noise that is correctly recognized by a VPS
but not by human beings. Such attacks, though, are often highly dependent on
white-box knowledge of a specific machine learning model and limited to
specific microphones and speakers, making their use across different acoustic
hardware platforms (and thus their practicality) limited. In this paper, we
break these dependencies and make hidden command attacks more practical through
model-agnostic (blackbox) attacks, which exploit knowledge of the signal
processing algorithms commonly used by VPSes to generate the data fed into
machine learning systems. Specifically, we exploit the fact that multiple
source audio samples have similar feature vectors when transformed by acoustic
feature extraction algorithms (e.g., FFTs). We develop four classes of
perturbations that create unintelligible audio and test them against 12 machine
learning models, including 7 proprietary models (e.g., Google Speech API, Bing
Speech API, IBM Speech API, Azure Speaker API, etc.), and demonstrate successful
attacks against all targets. Moreover, we successfully use our maliciously
generated audio samples in multiple hardware configurations, demonstrating
effectiveness across both models and real systems. In so doing, we demonstrate
that domain-specific knowledge of audio signal processing represents a
practical means of generating successful hidden voice command attacks.
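The core observation, that distinct audio signals can yield near-identical acoustic features, can be demonstrated with a minimal sketch. This illustrates the general principle only, not the paper's four perturbation classes: scrambling FFT phases changes the waveform drastically while leaving the magnitude spectrum, on which many acoustic features are based, essentially untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy audio frame: two tones plus a little noise (stand-in for speech).
t = np.arange(512) / 16000.0
frame = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
frame += 0.05 * rng.standard_normal(t.size)

# Perturbation: keep every FFT bin's magnitude, randomize the phases
# (DC and Nyquist bins keep their phase so the spectrum stays valid).
spectrum = np.fft.rfft(frame)
phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
phases[0], phases[-1] = np.angle(spectrum[0]), np.angle(spectrum[-1])
perturbed = np.fft.irfft(np.abs(spectrum) * np.exp(1j * phases), n=frame.size)

# The waveforms differ substantially...
waveform_gap = np.linalg.norm(frame - perturbed) / np.linalg.norm(frame)
# ...while magnitude-spectrum features match almost exactly.
feature_gap = np.max(np.abs(np.abs(np.fft.rfft(perturbed)) - np.abs(spectrum)))

assert waveform_gap > 0.3 and feature_gap < 1e-8
```

Because feature extraction discards the information the perturbation destroys, a model sees the two signals as the same input even though a human hears something very different, which is the black-box lever the attacks exploit.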