
    Design and Evaluation of a Probabilistic Music Projection Interface

    We describe the design and evaluation of a probabilistic interface for music exploration and casual playlist generation. Predicted subjective features, such as mood and genre, inferred from low-level audio features create a 34-dimensional feature space. We use a nonlinear dimensionality reduction algorithm to create 2D music maps of tracks, and augment these with visualisations of probabilistic mappings of selected features and their uncertainty. We evaluated the system in a longitudinal trial in users’ homes over several weeks. Users said they had fun with the interface and liked the casual nature of the playlist generation. Users preferred to generate playlists from a local neighbourhood of the map rather than from a trajectory, using neighbourhood selection more than three times as often as path selection. Probabilistic highlighting of subjective features led to more focused exploration in mouse activity logs, and 6 of 8 users said they preferred the probabilistic highlighting mode.
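The pipeline in this abstract (per-track feature vectors mapped to a 2D music map) can be sketched in a few lines. The abstract does not name the specific nonlinear dimensionality reduction algorithm, so the sketch below uses classical MDS on pairwise distances purely as an illustrative stand-in; the function name and the toy data are hypothetical.

```python
import numpy as np

def embed_tracks_2d(features: np.ndarray) -> np.ndarray:
    """Embed tracks into 2D with classical MDS (an illustrative
    stand-in for the paper's unspecified nonlinear method)."""
    n = features.shape[0]
    # Squared pairwise Euclidean distances in the 34-D feature space.
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    # Double-centre to recover the Gram matrix B = -0.5 * J d2 J.
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ d2 @ j
    # The top-2 eigenpairs of B give the 2D map coordinates.
    vals, vecs = np.linalg.eigh(b)
    idx = np.argsort(vals)[::-1][:2]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Toy "tracks": 50 tracks, each with 34 predicted subjective features.
rng = np.random.default_rng(0)
tracks = rng.normal(size=(50, 34))
coords = embed_tracks_2d(tracks)  # one (x, y) map position per track
```

Neighbourhood playlist selection then reduces to picking the tracks nearest a clicked point in `coords`.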

    Information visualization for DNA microarray data analysis: A critical review

    Graphical representation may provide an effective means of making sense of the complexity and sheer volume of data produced by DNA microarray experiments, which monitor the expression patterns of thousands of genes simultaneously. The ability to use “abstract” graphical representations to draw attention to areas of interest, and more in-depth visualizations to answer focused questions, would enable biologists to move from a large amount of data to the particular records they are interested in, and therefore gain deeper insight into the microarray experiment results. This paper starts by providing some background knowledge of microarray experiments, then explains how graphical representation can be applied in general to this problem domain, followed by exploring the role of visualization in gene expression data analysis. Having set the problem scene, the paper examines various multivariate data visualization techniques that have been applied to microarray data analysis. These techniques are critically reviewed so that the strengths and weaknesses of each can be tabulated. Finally, several key problem areas, as well as possible solutions to them, are discussed as a source for future work.

    Teaching Categories to Human Learners with Visual Explanations

    We study the problem of computer-assisted teaching with explanations. Conventional approaches for machine teaching typically provide feedback only at the instance level (e.g., the category or label of the instance). However, it is intuitive that clear explanations from a knowledgeable teacher can significantly improve a student's ability to learn a new concept. To address these limitations, we propose a teaching framework that provides interpretable explanations as feedback and models how the learner incorporates this additional information. In the case of images, we show that we can automatically generate explanations that highlight the parts of the image responsible for the class label. Experiments on human learners illustrate that, on average, participants achieve better test set performance on challenging categorization tasks when taught with our interpretable approach compared to existing methods.

    Interpreting CLIP's Image Representation via Text-Based Decomposition

    We investigate the CLIP image encoder by analyzing how individual model components affect the final representation. We decompose the image representation as a sum across individual image patches, model layers, and attention heads, and use CLIP's text representation to interpret the summands. Interpreting the attention heads, we characterize each head's role by automatically finding text representations that span its output space, which reveals property-specific roles for many heads (e.g. location or shape). Next, interpreting the image patches, we uncover an emergent spatial localization within CLIP. Finally, we use this understanding to remove spurious features from CLIP and to create a strong zero-shot image segmenter. Our results indicate that a scalable understanding of transformer models is attainable and can be used to repair and improve models. Project page and code: https://yossigandelsman.github.io/clip_decomposition
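The core decomposition idea, that a residual-stream output is exactly the sum of per-head contributions and can therefore be attributed head by head, can be illustrated on a toy model. The dimensions and the additive residual stream below are a simplification, not CLIP's actual architecture (which also has MLP blocks and layer norms).

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_layers, n_heads = 16, 3, 4

x = rng.normal(size=d)                              # toy residual-stream input
head_out = rng.normal(size=(n_layers, n_heads, d))  # each head's additive write

# Residual stream: every head adds its output to the running state,
# so the final representation is the input plus all head writes.
final = x + head_out.sum(axis=(0, 1))

# Exact decomposition: final = input + sum over (layer, head) terms.
# Each term can then be interpreted in isolation -- e.g. projected
# onto text embeddings, as the paper does with CLIP's text encoder.
contrib = {(l, h): head_out[l, h]
           for l in range(n_layers) for h in range(n_heads)}
recon = x + sum(contrib.values())
```

The point is that the attribution is exact by linearity, not an approximation: `recon` equals `final` to floating-point precision.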

    Prompt Learning with Optimal Transport for Vision-Language Models

    With the increasing attention to large vision-language models such as CLIP, significant effort has been dedicated to building efficient prompts. Unlike conventional methods that learn only a single prompt, we propose to learn multiple comprehensive prompts to describe diverse characteristics of categories, such as intrinsic attributes or extrinsic contexts. However, directly matching each prompt to the same visual feature is problematic, as it pushes the prompts to converge to one point. To solve this problem, we propose to apply optimal transport to match the vision and text modalities. Specifically, we first model the images and categories with visual and textual feature sets. Then, we apply a two-stage optimization strategy to learn the prompts: in the inner loop, we optimize the optimal transport distance to align visual features and prompts via the Sinkhorn algorithm, while in the outer loop, we learn the prompts from the supervised data using this distance. Extensive experiments are conducted on the few-shot recognition task, and the consistent improvements demonstrate the superiority of our method.

    Towards AI-Assisted Disease Diagnosis: Learning Deep Feature Representations for Medical Image Analysis

    Artificial Intelligence (AI) has impacted our lives in many meaningful ways. Our research focuses on improving disease diagnosis systems by analyzing medical images with AI, specifically deep learning technologies, whose recent advances are leading to enhanced performance in medical image analysis and computer-aided disease diagnosis. In this dissertation, we explore a major research area in medical image analysis: image classification, the process of assigning an image a label from a fixed set of categories. We focus on the diagnosis of Alzheimer's Disease (AD), a severe neurological disorder, from 3D structural Magnetic Resonance Imaging (sMRI) and Positron Emission Tomography (PET) brain scans. We address challenges related to Alzheimer's Disease diagnosis and propose several models for improved diagnosis, analyzing sMRI and PET scans to identify the current stage of the disease: Normal Control (CN), Mild Cognitive Impairment (MCI), or Alzheimer's Disease (AD). This dissertation demonstrates ways to improve the performance of a Convolutional Neural Network (CNN) for Alzheimer's Disease diagnosis. In addition, we present approaches to solve the class-imbalance problem and to improve classification performance with limited training data for medical image analysis. To understand the decisions of the CNN, we present methods to visualize the behavior of a CNN model for disease diagnosis; as a case study, we analyzed brain PET scans of AD and CN patients to see how the CNN discriminates among data samples of different classes. Additionally, this dissertation proposes a novel approach to generating synthetic medical images using Generative Adversarial Networks (GANs).
    Working with a limited dataset and a small number of annotated samples makes it difficult to develop a robust automated disease diagnosis model. Our proposed model addresses this issue and generates brain MRI and PET images for the three stages of Alzheimer's Disease: Normal Control (CN), Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD). The approach can be generalized to create synthetic data for other medical image analysis problems and help develop better disease diagnosis models.
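The abstract mentions the class-imbalance problem but does not detail the dissertation's remedy. One common, simple approach is inverse-frequency class weighting of the loss; the sketch below shows that idea with entirely made-up scan counts per diagnostic stage.

```python
import numpy as np

# Hypothetical scan counts per class (CN, MCI, AD) -- illustrative only,
# not the dissertation's actual dataset sizes.
labels = np.array([0] * 300 + [1] * 150 + [2] * 50)
classes, counts = np.unique(labels, return_counts=True)

# Inverse-frequency weights: rare classes (here AD) get larger weight,
# so a weighted cross-entropy loss penalises their errors more heavily.
weights = len(labels) / (len(classes) * counts)
per_sample_weight = weights[labels]      # weight applied to each sample's loss
```

With this "balanced" heuristic the weights average to 1 over the dataset, so the overall loss scale is unchanged while minority-class errors count proportionally more.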