Relative Comparison Kernel Learning with Auxiliary Kernels
In this work we consider the problem of learning a positive semidefinite
kernel matrix from relative comparisons of the form: "object A is more similar
to object B than it is to C", where comparisons are given by humans. Existing
solutions to this problem assume that many comparisons are provided to learn a
high-quality kernel. However, this assumption is unrealistic for many real-world
tasks, since relative assessments require human input, which is often costly or
difficult to obtain. Because of this, only a limited number of these
comparisons may be provided. In this work, we explore methods for aiding the
process of learning a kernel with the help of auxiliary kernels built from more
easily extractable information regarding the relationships among objects. We
propose a new kernel learning approach in which the target kernel is defined as
a conic combination of auxiliary kernels and a kernel whose elements are
learned directly. We formulate a convex optimization to solve for this target
kernel that adds only minor overhead to methods that use no auxiliary
information. Empirical results show that, when few training relative
comparisons are available, our method learns kernels that generalize to more
out-of-sample comparisons than both methods that use no auxiliary information
and similar methods that learn metrics over objects.
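To make the construction concrete, here is a minimal sketch of this kind of convex program in cvxpy, assuming a target kernel K = L + Σ_i μ_i A_i with nonnegative conic weights μ, a directly learned PSD part L, and hinge-style slack on each triplet. The formulation, names, and parameters are illustrative, not the authors' exact model.

```python
# A minimal sketch (illustrative, not the paper's exact formulation).
import cvxpy as cp
import numpy as np

def learn_kernel(aux_kernels, triplets, lam=1.0, margin=1.0):
    """Learn K = L + sum_i mu_i * A_i from triplets (a, b, c) meaning
    "object a is more similar to b than it is to c"."""
    n = aux_kernels[0].shape[0]
    mu = cp.Variable(len(aux_kernels), nonneg=True)   # conic weights on auxiliaries
    L = cp.Variable((n, n), PSD=True)                 # directly learned PSD part
    xi = cp.Variable(len(triplets), nonneg=True)      # slack per comparison

    K = L + sum(mu[i] * aux_kernels[i] for i in range(len(aux_kernels)))
    cons = []
    for t, (a, b, c) in enumerate(triplets):
        d_ab = K[a, a] - 2 * K[a, b] + K[b, b]        # squared feature-space distance
        d_ac = K[a, a] - 2 * K[a, c] + K[c, c]
        cons.append(d_ab + margin <= d_ac + xi[t])    # comparison with soft margin
    prob = cp.Problem(cp.Minimize(cp.norm(L, "fro") + lam * cp.sum(xi)), cons)
    prob.solve()
    return K.value
```

Since the auxiliary kernels enter only through the conic weights, the extra cost over a no-auxiliary formulation is a handful of scalar variables, consistent with the "minor overhead" claim.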
"AI enhances our performance, I have no doubt this one will do the same": The Placebo effect is robust to negative descriptions of AI
Heightened AI expectations facilitate performance in human-AI interactions
through placebo effects. While lowering expectations to control for placebo
effects is advisable, overly negative expectations could induce nocebo effects.
In a letter discrimination task, we informed participants that an AI would
either increase or decrease their performance by adapting the interface, but in
reality, no AI was present in any condition. A Bayesian analysis showed that
participants had high expectations and performed descriptively better
irrespective of the AI description when a sham-AI was present. Using cognitive
modeling, we could trace this advantage back to participants gathering more
information. A replication study verified that negative AI descriptions do not
alter expectations, suggesting that performance expectations with AI are biased
and robust to negative verbal descriptions. We discuss the impact of user
expectations on AI interactions and evaluation and provide a behavioral placebo
marker for human-AI interaction.
Mixture of Kernels and Iterated Semidirect Product of Diffeomorphisms Groups
In the framework of large deformation diffeomorphic metric mapping (LDDMM),
we develop a multi-scale theory for the diffeomorphism group based on previous
works. The purpose of the paper is (1) to develop in detail a variational
approach for multi-scale analysis of diffeomorphisms, (2) to generalise to
several scales the semidirect product representation and (3) to illustrate the
resulting diffeomorphic decomposition on synthetic and real images. We also
show that the approaches presented in other papers and the mixture of kernels
are equivalent.
Comment: 21 pages, revised version without the section on evaluation
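For context, the standard fact behind such mixtures (stated here in illustrative notation, not necessarily the paper's own) is that a sum of reproducing-kernel Hilbert spaces of velocity fields, one per scale, carries the sum of the kernels, with the norm given by the best split across scales:

```latex
% Illustrative notation: V_s is the RKHS of velocity fields at scale s,
% with kernel K_s (e.g. a Gaussian of width sigma_s). Their sum
% V = V_1 + ... + V_S is an RKHS with the mixture kernel
\[
  K(x, y) = \sum_{s=1}^{S} K_s(x, y),
\]
% and the induced norm is obtained by splitting v optimally across scales:
\[
  \|v\|_V^2 = \min \Big\{ \sum_{s=1}^{S} \|v_s\|_{V_s}^2
      \;:\; v = \sum_{s=1}^{S} v_s,\ v_s \in V_s \Big\}.
\]
```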
Conic Multi-Task Classification
Traditionally, Multi-task Learning (MTL) models optimize the average of
task-related objective functions, an intuitive approach that we will refer to
as Average MTL. However, a more general framework,
referred to as Conic MTL, can be formulated by considering conic combinations
of the objective functions instead; in this framework, Average MTL arises as a
special case, when all combination coefficients equal 1. Although the advantage
of Conic MTL over Average MTL has been shown experimentally in previous works,
no theoretical justification has been provided to date. In this paper, we
derive a generalization bound for the Conic MTL method, and demonstrate that
the tightest bound is not necessarily achieved when all combination
coefficients equal 1; hence, Average MTL may not always be the optimal choice,
and it is important to consider Conic MTL. As a byproduct, the generalization
bound also theoretically explains the good experimental results of previous
relevant works. Finally, we propose a new Conic MTL model, whose conic
combination coefficients minimize the generalization bound, instead of choosing
them heuristically as has been done in previous methods. The rationale and
advantage of our model are demonstrated and verified via a series of experiments
by comparing with several other methods.
Comment: Accepted by the European Conference on Machine Learning and Principles
and Practice of Knowledge Discovery in Databases (ECML PKDD) 2014
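As a rough illustration of the difference between Average and Conic MTL (a toy sketch with made-up ridge-regression tasks, not the paper's exact model):

```python
# Toy sketch: conic combination of task objectives. Average MTL is the
# special case lam_t = 1 for all t; Conic MTL allows any lam >= 0.
import numpy as np

def conic_mtl_objective(W, tasks, lam, mu=0.1):
    """W: (T, d) per-task weight vectors; tasks: list of (X_t, y_t);
    lam: (T,) nonnegative combination coefficients."""
    total = 0.0
    for t, (X, y) in enumerate(tasks):
        risk = np.mean((X @ W[t] - y) ** 2) + mu * W[t] @ W[t]  # regularized risk
        total += lam[t] * risk          # conic combination of task objectives
    return total

# Average MTL: lam = np.ones(T). In the paper's proposal, lam is instead
# chosen to minimize the derived generalization bound.
```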
Labeling Neural Representations with Inverse Recognition
Deep Neural Networks (DNNs) demonstrate remarkable capabilities in learning
complex hierarchical data representations, but the nature of these
representations remains largely unknown. Existing global explainability
methods, such as Network Dissection, face limitations such as reliance on
segmentation masks, lack of statistical significance testing, and high
computational demands. We propose Inverse Recognition (INVERT), a scalable
approach for connecting learned representations with human-understandable
concepts by leveraging their capacity to discriminate between these concepts.
In contrast to prior work, INVERT is capable of handling diverse types of
neurons, has lower computational complexity, and does not rely on the
availability of segmentation masks. Moreover, INVERT provides an interpretable
metric assessing the alignment between the representation and its corresponding
explanation, and it delivers a measure of statistical significance. We
demonstrate the applicability of INVERT in various scenarios, including the
identification of representations affected by spurious correlations, and the
interpretation of the hierarchical structure of decision-making within the
models.
Comment: 25 pages, 16 figures
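The core idea of scoring a representation by its capacity to discriminate a concept can be sketched as follows; this toy version uses AUROC with a permutation test and is only an illustration, not the exact INVERT procedure:

```python
# Toy sketch: how well does one neuron's activation separate a concept?
import numpy as np
from sklearn.metrics import roc_auc_score

def neuron_concept_alignment(acts, concept, n_perm=1000, seed=0):
    """acts: (N,) activations of one neuron; concept: (N,) binary labels
    marking whether the concept is present in each input."""
    rng = np.random.default_rng(seed)
    auc = roc_auc_score(concept, acts)            # discriminability score
    null = np.array([
        roc_auc_score(rng.permutation(concept), acts) for _ in range(n_perm)
    ])
    p = (1 + np.sum(null >= auc)) / (1 + n_perm)  # one-sided permutation p-value
    return auc, p
```

Note that this needs only per-input concept labels, not segmentation masks, which mirrors the stated advantage over Network Dissection.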
A Unifying View of Multiple Kernel Learning
Recent research on multiple kernel learning has led to a number of
approaches for combining kernels in regularized risk minimization. The proposed
approaches include different formulations of objectives and varying
regularization strategies. In this paper we present a unifying general
optimization criterion for multiple kernel learning and show how existing
formulations are subsumed as special cases. We also derive the criterion's dual
representation, which is suitable for general smooth optimization algorithms.
Finally, we evaluate multiple kernel learning in this framework analytically
using a Rademacher complexity bound on the generalization error and empirically
in a set of experiments.
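A minimal sketch of the shared template that such formulations instantiate, assuming a list of precomputed Gram matrices and using a deliberately crude reweighting heuristic in place of a principled update rule:

```python
# Toy sketch of the common MKL template: regularized risk minimization
# with a combined kernel K(theta) = sum_m theta_m K_m, alternating between
# an SVM fit and a (crude, illustrative) kernel-weight update.
import numpy as np
from sklearn.svm import SVC

def simple_mkl(kernels, y, iters=10, C=1.0):
    """kernels: list of precomputed (n, n) Gram matrices; y: (n,) labels."""
    theta = np.ones(len(kernels)) / len(kernels)      # initial kernel weights
    for _ in range(iters):
        K = sum(t * Km for t, Km in zip(theta, kernels))
        svm = SVC(kernel="precomputed", C=C).fit(K, y)
        a = np.zeros(len(y))
        a[svm.support_] = svm.dual_coef_[0]           # alpha_i * y_i on support vectors
        s = np.array([a @ Km @ a for Km in kernels])  # per-kernel margin term
        theta = s / s.sum()                           # crude heuristic reweighting
    return theta, svm
```

Real MKL solvers differ exactly in the objective and regularizer that determine the theta update, which is the axis along which the paper's unifying criterion subsumes existing formulations.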
Prognostic Significance of Negative Lymph Node Long Axis in Esophageal Cancer: Results From the Randomized Controlled UK MRC OE02 Trial
OBJECTIVE: To analyze the relationship between negative lymph node (LNneg) size, as a possible surrogate marker of the host antitumor immune response, and overall survival (OS) in esophageal cancer (EC) patients.
BACKGROUND: Lymph node (LN) status is a well-established prognostic factor in EC patients. An increased number of LNnegs is related to better survival in EC. Follicular hyperplasia in LNneg is associated with better survival in cancer-bearing mice and might explain increased LN size.
METHODS: The long axis of 304 LNnegs was measured in hematoxylin-eosin stained sections from resection specimens of 367 OE02 trial patients (188 treated with surgery alone [S], 179 with neoadjuvant chemotherapy plus surgery [C+S]) as a surrogate of LN size. The relationship between LNneg size, LNneg microarchitecture, clinicopathological variables, and OS was analyzed.
RESULTS: Large LNneg size was related to lower pN category (P = 0.01) and lower frequency of lymphatic invasion (P = 0.02) in S patients only. Irrespective of treatment, (y)pN0 patients with large LNneg had the best OS. (y)pN1 patients had the poorest OS irrespective of LNneg size (P < 0.001). Large LNneg contained fewer lymphocytes (P = 0.02) and had a higher germinal centers/lymphocyte ratio (P = 0.05).
CONCLUSIONS: This is the first study to investigate LNneg size in EC patients randomized to neoadjuvant chemotherapy followed by surgery or surgery alone. Our pilot study suggests that LNneg size is a surrogate marker of the host antitumor immune response and a potentially clinically useful new prognostic biomarker for (y)pN0 EC patients. Future studies need to confirm our results and explore the underlying biological mechanisms.
Machine Learning Models that Remember Too Much
Machine learning (ML) is becoming a commodity. Numerous ML frameworks and
services are available to data holders who are not ML experts but want to train
predictive models on their data. It is important that ML models trained on
sensitive inputs (e.g., personal images or documents) not leak too much
information about the training data.
We consider a malicious ML provider who supplies model-training code to the
data holder, does not observe the training, but then obtains white- or
black-box access to the resulting model. In this setting, we design and
implement practical algorithms, some of them very similar to standard ML
techniques such as regularization and data augmentation, that "memorize"
information about the training dataset in the model, while the model remains as
accurate and predictive as a conventionally trained one. We then explain how
the adversary can extract memorized information from the model.
We evaluate our techniques on standard ML tasks for image classification
(CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20
Newsgroups and IMDB). In all cases, we show how our algorithms create models
that have high predictive power yet allow accurate extraction of subsets of
their training data.
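One of the paper's white-box techniques encodes secrets in parameter signs; the toy sketch below (plain-numpy logistic regression with illustrative hyperparameters, not the authors' implementation) shows the flavor: a "malicious" penalty nudges each weight's sign to store one secret bit while the ordinary loss keeps the model accurate.

```python
# Toy sketch of sign encoding: each weight's sign stores one secret bit.
import numpy as np

def train_with_memorization(X, y, secret_bits, lr=0.1, lam=0.5, steps=2000):
    """X: (n, d) features; y: (n,) 0/1 labels; secret_bits: >= d bits to hide."""
    n, d = X.shape
    w = np.zeros(d)
    s = 2.0 * np.asarray(secret_bits[:d]) - 1.0       # bits -> +/-1 sign targets
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))            # logistic regression forward
        grad = X.T @ (p - y) / n                      # ordinary loss gradient
        grad += lam * np.where(np.sign(w) == s, 0.0, -s)  # push sign toward its bit
        w -= lr * grad
    return w

def extract_bits(w):
    """White-box extraction: read the secret back out of the signs."""
    return (np.sign(w) > 0).astype(int)
```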
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.
Comment: 47 pages, 9 figures; chapter accepted into the book 'Support Vector
Machine Applications'
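As a flavor of the attack surface, here is a minimal sketch of a gradient-based evasion attack on a linear SVM (illustrative only, not the chapter's framework, which also covers poisoning and privacy attacks):

```python
# Toy sketch: evade a linear SVM by walking a sample against the gradient
# of its decision function f(x) = w.x + b until it crosses the boundary
# or a feature-perturbation budget is exhausted.
import numpy as np
from sklearn.svm import LinearSVC

def evade(clf, x, step=0.1, max_dist=2.0):
    """Minimally perturb x (predicted +1, e.g. 'malicious') toward class -1."""
    w = clf.coef_[0]
    g = w / np.linalg.norm(w)                  # steepest-descent direction for f
    x_adv = x.copy()
    moved = 0.0
    while clf.decision_function([x_adv])[0] > 0 and moved < max_dist:
        x_adv -= step * g
        moved += step
    return x_adv

# Usage: clf = LinearSVC().fit(X_train, y_train); x_adv = evade(clf, x_malicious)
```

The budget max_dist plays the role of the attacker's cost constraint: an adversary-aware design tries to ensure no small perturbation within that budget flips the decision.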