    Relative Comparison Kernel Learning with Auxiliary Kernels

    In this work we consider the problem of learning a positive semidefinite kernel matrix from relative comparisons of the form "object A is more similar to object B than it is to C", where the comparisons are given by humans. Existing solutions to this problem assume that many comparisons are provided, enough to learn a high-quality kernel. However, this assumption is unrealistic for many real-world tasks, since relative assessments require human input, which is often costly or difficult to obtain; as a result, only a limited number of these comparisons may be available. In this work, we explore methods for aiding the process of learning a kernel with the help of auxiliary kernels built from more easily extractable information about the relationships among objects. We propose a new kernel learning approach in which the target kernel is defined as a conic combination of auxiliary kernels and a kernel whose elements are learned directly. We formulate a convex optimization to solve for this target kernel that adds only minor overhead to methods that use no auxiliary information. Empirical results show that, given few training relative comparisons, our method learns kernels that generalize to more out-of-sample comparisons than methods that do not utilize auxiliary information, as well as similar methods that learn metrics over objects.
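
    As a rough illustration of the formulation described above, the sketch below learns a kernel as a conic combination of auxiliary Gram matrices plus a directly learned PSD component, with hinge penalties on the relative comparisons, using cvxpy. The variable names, the margin of 1, and the Frobenius/sum regularizers are illustrative assumptions, not the paper's exact model.

    import numpy as np
    import cvxpy as cp

    # Illustrative sketch (not the paper's exact formulation): learn
    # K = L + sum_i alpha_i * K_i with alpha >= 0 and L PSD, subject to
    # hinge penalties on comparisons "a is closer to b than to c".
    def learn_kernel(aux_kernels, triplets, n, lam=1.0, mu=1.0):
        alpha = cp.Variable(len(aux_kernels), nonneg=True)  # conic weights
        L = cp.Variable((n, n), PSD=True)                   # directly learned part
        K = L + sum(alpha[i] * aux_kernels[i] for i in range(len(aux_kernels)))
        # Margin-1 hinge loss on each relative comparison (a, b, c)
        loss = sum(cp.pos(1 - (K[a, b] - K[a, c])) for a, b, c in triplets)
        reg = lam * cp.norm(L, "fro") + mu * cp.sum(alpha)
        cp.Problem(cp.Minimize(loss + reg)).solve()
        return K.value

    # Toy usage: two auxiliary kernels over five objects, two comparisons
    n = 5
    aux = [np.eye(n), 0.5 * np.ones((n, n)) + 0.5 * np.eye(n)]
    K = learn_kernel(aux, [(0, 1, 2), (3, 4, 0)], n)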

    "AI enhances our performance, I have no doubt this one will do the same": The Placebo effect is robust to negative descriptions of AI

    Heightened AI expectations facilitate performance in human-AI interactions through placebo effects. While lowering expectations to control for placebo effects is advisable, overly negative expectations could induce nocebo effects. In a letter discrimination task, we informed participants that an AI would either increase or decrease their performance by adapting the interface; in reality, no AI was present in any condition. A Bayesian analysis showed that participants had high expectations and performed descriptively better irrespective of the AI description when a sham AI was present. Using cognitive modeling, we traced this advantage back to participants gathering more information. A replication study verified that negative AI descriptions do not alter expectations, suggesting that performance expectations with AI are biased and robust to negative verbal descriptions. We discuss the impact of user expectations on AI interaction and evaluation, and provide a behavioral placebo marker for human-AI interaction.
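
    The cognitive-modeling claim ("participants gathered more information") maps naturally onto sequential-sampling models. The toy drift-diffusion simulation below shows how a wider decision boundary, meaning more evidence accumulated before responding, raises accuracy at the cost of slower responses. All parameter values are invented for illustration and are not the study's estimates.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy drift-diffusion model: evidence x drifts toward the correct (upper)
    # boundary; a wider boundary means more information gathered per trial.
    def simulate(drift=0.3, boundary=1.0, dt=0.01, noise=1.0, n_trials=2000):
        correct, rts = 0, []
        for _ in range(n_trials):
            x, t = 0.0, 0.0
            while abs(x) < boundary:
                x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
                t += dt
            correct += x > 0          # upper boundary = correct response
            rts.append(t)
        return correct / n_trials, np.mean(rts)

    for a in (0.8, 1.4):              # e.g. a hypothetical sham-AI condition with a higher boundary
        acc, rt = simulate(boundary=a)
        print(f"boundary={a}: accuracy={acc:.3f}, mean RT={rt:.2f}s")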

    Mixture of Kernels and Iterated Semidirect Product of Diffeomorphisms Groups

    In the framework of large deformation diffeomorphic metric mapping (LDDMM), we develop a multi-scale theory for the diffeomorphism group, building on previous works. The purpose of the paper is (1) to develop in detail a variational approach for multi-scale analysis of diffeomorphisms, (2) to generalise the semidirect product representation to several scales, and (3) to illustrate the resulting diffeomorphic decomposition on synthetic and real images. We also show that the approaches presented in previous papers and the mixture-of-kernels approach are equivalent.
    Comment: 21 pages, revised version without the section on evaluation
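
    To make the multi-scale idea concrete, the sketch below builds a mixture of Gaussian kernels at several spatial scales, the kind of kernel used to smooth velocity fields in LDDMM; a nonnegative-weighted sum of positive-definite kernels is again positive definite. The scales and weights here are illustrative choices, not values from the paper.

    import numpy as np

    def gaussian_kernel(x, y, sigma):
        # Pairwise squared distances between point sets x and y
        d2 = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def multiscale_kernel(x, y, sigmas=(0.5, 2.0, 8.0), weights=(1.0, 1.0, 1.0)):
        # Each scale captures deformations at a different granularity
        return sum(w * gaussian_kernel(x, y, s) for w, s in zip(weights, sigmas))

    pts = np.random.default_rng(1).uniform(size=(10, 2))  # 10 points in 2-D
    K = multiscale_kernel(pts, pts)                       # (10, 10) Gram matrix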

    Conic Multi-Task Classification

    Traditionally, Multi-task Learning (MTL) models optimize the average of task-related objective functions, an intuitive approach which we will refer to as Average MTL. However, a more general framework, referred to as Conic MTL, can be formulated by considering conic combinations of the objective functions instead; in this framework, Average MTL arises as the special case in which all combination coefficients equal 1. Although the advantage of Conic MTL over Average MTL has been shown experimentally in previous works, no theoretical justification has been provided to date. In this paper, we derive a generalization bound for the Conic MTL method and demonstrate that the tightest bound is not necessarily achieved when all combination coefficients equal 1; hence, Average MTL may not always be the optimal choice, and it is important to consider Conic MTL. As a byproduct, the generalization bound also theoretically explains the good experimental results of previous relevant works. Finally, we propose a new Conic MTL model whose conic combination coefficients minimize the generalization bound, instead of being chosen heuristically as in previous methods. The rationale and advantage of our model are demonstrated and verified via a series of experiments comparing it with several other methods.
    Comment: Accepted by the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2014
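
    A minimal sketch of the Average-vs-Conic distinction, under the simplifying assumption of a single shared ridge-regression model: the objective is a nonnegative-weighted (conic) combination of per-task losses, and Average MTL is recovered when all weights equal 1. In the paper the weights are chosen to minimize a generalization bound; here they are fixed inputs for illustration.

    import numpy as np

    def conic_mtl_ridge(tasks, weights, lam=0.1):
        # tasks: list of (X_t, y_t); a shared weight vector w minimizes
        # sum_t weights[t] * ||X_t w - y_t||^2 + lam * ||w||^2 in closed form
        d = tasks[0][0].shape[1]
        A = lam * np.eye(d)
        b = np.zeros(d)
        for (X, y), c in zip(tasks, weights):
            A += c * X.T @ X
            b += c * X.T @ y
        return np.linalg.solve(A, b)

    rng = np.random.default_rng(0)
    tasks = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(2)]
    w_avg = conic_mtl_ridge(tasks, weights=[1.0, 1.0])    # Average MTL
    w_conic = conic_mtl_ridge(tasks, weights=[2.0, 0.5])  # a conic reweighting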

    Labeling Neural Representations with Inverse Recognition

    Deep Neural Networks (DNNs) demonstrate remarkable capabilities in learning complex hierarchical data representations, but the nature of these representations remains largely unknown. Existing global explainability methods, such as Network Dissection, face limitations such as reliance on segmentation masks, lack of statistical significance testing, and high computational demands. We propose Inverse Recognition (INVERT), a scalable approach for connecting learned representations with human-understandable concepts by leveraging their capacity to discriminate between these concepts. In contrast to prior work, INVERT is capable of handling diverse types of neurons, has lower computational complexity, and does not rely on the availability of segmentation masks. Moreover, INVERT provides an interpretable metric that assesses the alignment between a representation and its corresponding explanation and delivers a measure of statistical significance. We demonstrate the applicability of INVERT in various scenarios, including the identification of representations affected by spurious correlations and the interpretation of the hierarchical structure of decision-making within models.
    Comment: 25 pages, 16 figures
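
    The core mechanism can be sketched as follows: score a neuron by how well its activation discriminates each concept (via AUC) and attach a nonparametric significance test, here a one-sided Mann-Whitney U test. Function and variable names are illustrative, not INVERT's actual API.

    import numpy as np
    from sklearn.metrics import roc_auc_score
    from scipy.stats import mannwhitneyu

    def label_neuron(activations, concept_masks):
        # activations: (n_samples,) scalar activations of one neuron
        # concept_masks: dict concept -> boolean (n_samples,) presence labels
        best = None
        for concept, mask in concept_masks.items():
            auc = roc_auc_score(mask, activations)
            # One-sided test: are activations higher when the concept is present?
            _, p = mannwhitneyu(activations[mask], activations[~mask],
                                alternative="greater")
            if best is None or auc > best[1]:
                best = (concept, auc, p)
        return best  # (best-matching concept, AUC, p-value)

    rng = np.random.default_rng(0)
    acts = rng.normal(size=200) + np.r_[np.ones(100), np.zeros(100)]
    masks = {"dog": np.r_[np.ones(100, bool), np.zeros(100, bool)],
             "car": rng.random(200) > 0.5}
    print(label_neuron(acts, masks))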

    A Unifying View of Multiple Kernel Learning

    Recent research on multiple kernel learning has led to a number of approaches for combining kernels in regularized risk minimization. The proposed approaches include different formulations of objectives and varying regularization strategies. In this paper we present a unifying general optimization criterion for multiple kernel learning and show how existing formulations are subsumed as special cases. We also derive the criterion's dual representation, which is suitable for general smooth optimization algorithms. Finally, we evaluate multiple kernel learning in this framework analytically, using a Rademacher complexity bound on the generalization error, and empirically, in a set of experiments.
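
    For concreteness, here is a generic MKL baseline in the regularized-risk spirit: alternate between training an SVM on a fixed convex combination of Gram matrices and re-weighting each kernel by a simple kernel-target alignment heuristic. This is an illustrative baseline, not the paper's unified criterion or its dual formulation.

    import numpy as np
    from sklearn.svm import SVC

    def simple_mkl(kernels, y, n_iter=10, C=1.0):
        d = np.ones(len(kernels)) / len(kernels)  # kernel weights on the simplex
        yy = np.outer(y, y)
        for _ in range(n_iter):
            K = sum(w * Km for w, Km in zip(d, kernels))
            clf = SVC(kernel="precomputed", C=C).fit(K, y)
            # Heuristic update: weight each kernel by alignment with the labels
            align = np.array([np.sum(Km * yy) / np.linalg.norm(Km)
                              for Km in kernels])
            d = np.maximum(align, 0)
            d /= d.sum()
        return d, clf

    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 4))
    y = np.sign(X[:, 0] + 0.1 * rng.normal(size=60))
    # Three RBF Gram matrices at different bandwidths
    kernels = [np.exp(-g * ((X[:, None] - X[None]) ** 2).sum(-1))
               for g in (0.1, 1.0, 10.0)]
    weights, model = simple_mkl(kernels, y)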

    Prognostic Significance of Negative Lymph Node Long Axis in Esophageal Cancer: Results From the Randomized Controlled UK MRC OE02 Trial

    OBJECTIVE: To analyze the relationship between negative lymph node (LNneg) size, as a possible surrogate marker of the host antitumor immune response, and overall survival (OS) in esophageal cancer (EC) patients. BACKGROUND: Lymph node (LN) status is a well-established prognostic factor in EC patients. An increased number of LNnegs is related to better survival in EC. Follicular hyperplasia in LNneg is associated with better survival in cancer-bearing mice and might explain increased LN size. METHODS: The long axis of 304 LNnegs was measured in hematoxylin-eosin stained sections from resection specimens of 367 OE02 trial patients [188 treated with surgery alone (S), 179 with neoadjuvant chemotherapy plus surgery (C+S)] as a surrogate of LN size. The relationship between LNneg size, LNneg microarchitecture, clinicopathological variables, and OS was analyzed. RESULTS: Large LNneg size was related to lower pN category (P = 0.01) and lower frequency of lymphatic invasion (P = 0.02) in S patients only. Irrespective of treatment, (y)pN0 patients with large LNneg had the best OS. (y)pN1 patients had the poorest OS irrespective of LNneg size (P < 0.001). Large LNneg contained fewer lymphocytes (P = 0.02) and had a higher germinal center/lymphocyte ratio (P = 0.05). CONCLUSIONS: This is the first study to investigate LNneg size in EC patients randomized to neoadjuvant chemotherapy followed by surgery or surgery alone. Our pilot study suggests that LNneg size is a surrogate marker of the host antitumor immune response and a potentially clinically useful new prognostic biomarker for (y)pN0 EC patients. Future studies need to confirm our results and explore the underlying biological mechanisms.
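
    For readers unfamiliar with the statistics involved, the sketch below shows the generic form of such a survival comparison: overall survival stratified by a binary grouping (here, hypothetically, large vs. small LNneg), compared with a log-rank test and Kaplan-Meier estimates via the lifelines library. The data are synthetic; this is not the trial's analysis code.

    import numpy as np
    from lifelines import KaplanMeierFitter
    from lifelines.statistics import logrank_test

    rng = np.random.default_rng(0)
    n = 150
    large_ln = rng.random(n) > 0.5              # hypothetical grouping variable
    time = rng.exponential(36 + 18 * large_ln)  # survival in months; toy effect
    event = rng.random(n) > 0.3                 # True = death observed

    res = logrank_test(time[large_ln], time[~large_ln],
                       event_observed_A=event[large_ln],
                       event_observed_B=event[~large_ln])
    print(f"log-rank p = {res.p_value:.3f}")

    km = KaplanMeierFitter().fit(time[large_ln], event[large_ln],
                                 label="large LNneg")
    print("median OS (months):", km.median_survival_time_)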

    Machine Learning Models that Remember Too Much

    Machine learning (ML) is becoming a commodity. Numerous ML frameworks and services are available to data holders who are not ML experts but want to train predictive models on their data. It is important that ML models trained on sensitive inputs (e.g., personal images or documents) not leak too much information about the training data. We consider a malicious ML provider who supplies model-training code to the data holder, does not observe the training, but then obtains white- or black-box access to the resulting model. In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model, yet yield models as accurate and predictive as conventionally trained ones. We then explain how the adversary can extract the memorized information from the model. We evaluate our techniques on standard ML tasks for image classification (CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20 Newsgroups and IMDB). In all cases, we show how our algorithms create models that have high predictive power yet allow accurate extraction of subsets of their training data.
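
    One concrete white-box channel in this vein is hiding data in the low-order mantissa bits of float32 parameters, which barely perturbs the values the model computes with; the sketch below is illustrative, and the helper names and 8-bit payload per parameter are our own choices rather than the paper's exact procedure.

    import numpy as np

    def embed_lsb(params, data_bits, n_bits=8):
        # Overwrite the n_bits lowest mantissa bits of each parameter with data
        raw = params.astype(np.float32).view(np.uint32)
        chunks = data_bits.reshape(-1, n_bits)
        vals = (chunks * (1 << np.arange(n_bits))).sum(axis=1).astype(np.uint32)
        mask = ~np.uint32((1 << n_bits) - 1)
        raw[: len(vals)] = (raw[: len(vals)] & mask) | vals
        return raw.view(np.float32)

    def extract_lsb(params, n_values, n_bits=8):
        # Read the payload back out of the low-order bits
        raw = params.astype(np.float32).view(np.uint32)[:n_values]
        vals = raw & np.uint32((1 << n_bits) - 1)
        return ((vals[:, None] >> np.arange(n_bits)) & 1).astype(np.uint8).ravel()

    w = np.random.default_rng(0).normal(size=100).astype(np.float32)
    secret = np.random.default_rng(1).integers(0, 2, 80 * 8, dtype=np.uint8)
    w2 = embed_lsb(w, secret)
    assert np.array_equal(extract_lsb(w2, 80), secret)
    print("max parameter change:", np.abs(w2 - w).max())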

    Security Evaluation of Support Vector Machines in Adversarial Environments

    Support Vector Machines (SVMs) are among the most popular classification techniques adopted in security applications like malware detection, intrusion detection, and spam filtering. However, if SVMs are to be incorporated into real-world security systems, they must be able to cope with attack patterns that can mislead the learning algorithm (poisoning), evade detection (evasion), or gain information about the classifier's internal parameters (privacy breaches). The main contributions of this chapter are twofold. First, we introduce a formal general framework for the empirical evaluation of the security of machine-learning systems. Second, within our framework, we demonstrate the feasibility of evasion, poisoning, and privacy attacks against SVMs in real-world security problems. For each attack technique, we evaluate its impact and discuss whether (and how) it can be countered through an adversary-aware design of SVMs. Our experiments are easily reproducible thanks to open-source code that we have made available, together with all the employed datasets, in a public repository.
    Comment: 47 pages, 9 figures; chapter accepted into the book 'Support Vector Machine Applications'
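
    As a taste of the evasion setting, the sketch below runs a simple gradient-descent evasion against an RBF-kernel SVM: starting from a positively classified sample, it descends the decision function until the sample crosses the boundary. This captures the general idea of gradient-based evasion attacks; the step size, stopping rule, and data are illustrative.

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
    y = np.r_[-np.ones(50), np.ones(50)]
    clf = SVC(kernel="rbf", gamma=0.5).fit(X, y)

    def decision_grad(clf, x):
        # f(x) = sum_i a_i K(s_i, x) + b with K(s, x) = exp(-gamma ||s - x||^2),
        # so grad f(x) = sum_i a_i * K(s_i, x) * 2 * gamma * (s_i - x)
        sv, a, g = clf.support_vectors_, clf.dual_coef_[0], clf.gamma
        k = np.exp(-g * ((sv - x) ** 2).sum(axis=1))
        return (a * k) @ (2 * g * (sv - x))

    x = X[-1].copy()                       # start from a positive sample
    for _ in range(200):
        if clf.decision_function([x])[0] < 0:
            break                          # the sample now evades the classifier
        x -= 0.1 * decision_grad(clf, x)
    print("evading point:", x, "score:", clf.decision_function([x])[0])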