Design and Evaluation of a Probabilistic Music Projection Interface
We describe the design and evaluation of a probabilistic interface for music exploration and casual playlist generation. Predicted subjective features, such as mood and genre, inferred from low-level audio features create a 34-dimensional feature space. We use a nonlinear dimensionality reduction algorithm to create 2D music maps of tracks, and augment these with visualisations of probabilistic mappings of selected features and their uncertainty. We evaluated the system in a longitudinal trial in users' homes over several weeks. Users said they had fun with the interface and liked the casual nature of the playlist generation. Users preferred to generate playlists from a local neighbourhood of the map rather than from a trajectory, using neighbourhood selection more than three times as often as path selection. Probabilistic highlighting of subjective features led to more focused exploration in mouse-activity logs, and 6 of 8 users said they preferred the probabilistic highlighting mode.
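The pipeline above maps a 34-dimensional space of predicted subjective features to 2D map coordinates. The abstract does not name the nonlinear algorithm used, so a linear PCA projection via SVD stands in below purely to illustrate the feature-space-to-map step; the track count and random feature matrix are illustrative.

```python
import numpy as np

def project_to_2d(features):
    """Project an (n_tracks, 34) feature matrix to 2D map coordinates.

    The paper uses a nonlinear dimensionality reduction algorithm
    (unspecified in the abstract); linear PCA via SVD is a stand-in
    here to illustrate producing a 2D music map from features.
    """
    X = features - features.mean(axis=0)             # centre each feature
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T                              # coords on top 2 axes

rng = np.random.default_rng(0)
tracks = rng.normal(size=(100, 34))   # 100 tracks, 34 predicted features
coords = project_to_2d(tracks)
print(coords.shape)                   # (100, 2): one map point per track
```

Each row of `coords` would place one track on the 2D map; a nonlinear method (e.g. a manifold learner) would replace `project_to_2d` in practice.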
Information visualization for DNA microarray data analysis: A critical review
Graphical representation may provide an effective means of making sense of the complexity and sheer volume of data produced by DNA microarray experiments, which monitor the expression patterns of thousands of genes simultaneously. The ability to use “abstract” graphical representations to draw attention to areas of interest, and more in-depth visualizations to answer focused questions, would enable biologists to move from a large amount of data to the particular records they are interested in, and therefore gain deeper insight into the microarray experiment results. This paper starts by providing some background on microarray experiments, then explains how graphical representation can be applied in general to this problem domain, followed by an exploration of the role of visualization in gene expression data analysis. Having set the problem scene, the paper examines various multivariate data visualization techniques that have been applied to microarray data analysis. These techniques are critically reviewed so that the strengths and weaknesses of each can be tabulated. Finally, several key problem areas, as well as possible solutions to them, are discussed as a source of future work.
Teaching Categories to Human Learners with Visual Explanations
We study the problem of computer-assisted teaching with explanations. Conventional approaches to machine teaching typically provide feedback only at the instance level, e.g., the category or label of the instance. However, it is intuitive that clear explanations from a knowledgeable teacher can significantly improve a student's ability to learn a new concept. To address these limitations, we propose a teaching framework that provides interpretable explanations as feedback and models how the learner incorporates this additional information. In the case of images, we show that we can automatically generate explanations that highlight the parts of the image responsible for the class label. Experiments on human learners illustrate that, on average, participants achieve better test-set performance on challenging categorization tasks when taught with our interpretable approach than with existing methods.
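The explanation generator described above highlights the image regions responsible for a class label. The paper's actual method is not specified in this abstract; as a generic stand-in, a CAM-style heatmap can be sketched in numpy by weighting each final-layer feature map with the target class's classifier weight and summing. All shapes and values below are illustrative.

```python
import numpy as np

def explanation_heatmap(feature_maps, class_weights):
    """CAM-style saliency: weight each feature map by its contribution
    to the target class and sum. A generic stand-in, not the paper's
    own explanation generator.

    feature_maps : (C, H, W) activations from the last conv layer
    class_weights: (C,) classifier weights for the target class
    """
    heat = np.tensordot(class_weights, feature_maps, axes=1)  # (H, W)
    heat = np.maximum(heat, 0.0)                  # keep positive evidence
    return heat / heat.max() if heat.max() > 0 else heat  # scale to [0, 1]

rng = np.random.default_rng(1)
maps = rng.random((8, 7, 7))   # 8 channels on a 7x7 spatial grid (toy)
w = rng.random(8)
heat = explanation_heatmap(maps, w)
print(heat.shape)              # (7, 7) saliency map to overlay on the image
```

The normalised heatmap would then be upsampled and overlaid on the input image to show the learner which parts drove the label.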
Interpreting CLIP's Image Representation via Text-Based Decomposition
We investigate the CLIP image encoder by analyzing how individual model components affect the final representation. We decompose the image representation as a sum across individual image patches, model layers, and attention heads, and use CLIP's text representation to interpret the summands. Interpreting the attention heads, we characterize each head's role by automatically finding text representations that span its output space, which reveals property-specific roles for many heads (e.g., location or shape). Next, interpreting the image patches, we uncover an emergent spatial localization within CLIP. Finally, we use this understanding to remove spurious features from CLIP and to create a strong zero-shot image segmenter. Our results indicate that a scalable understanding of transformer models is attainable and can be used to repair and improve models. Project page and code: https://yossigandelsman.github.io/clip_decomposition
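The additive decomposition can be sketched numerically: write the image representation as a sum of per-head contributions, then score each contribution against text embeddings to see which property a head tracks. The dimensions, random vectors, and property labels below are toy values, not CLIP's actual weights.

```python
import numpy as np

# Toy version of the paper's idea: the image representation is an exact
# sum of per-head contributions, and each head is interpreted by its
# similarity to text embeddings. All data here is illustrative.
rng = np.random.default_rng(2)
d, n_heads = 16, 4
head_contribs = rng.normal(size=(n_heads, d))  # one vector per attention head
image_repr = head_contribs.sum(axis=0)         # exact additive decomposition

# Hypothetical text embeddings for properties like "location", "shape", "color"
text_embeds = rng.normal(size=(3, d))

# Cosine similarity of each head's contribution with each text direction
sims = (head_contribs @ text_embeds.T) / (
    np.linalg.norm(head_contribs, axis=1, keepdims=True)
    * np.linalg.norm(text_embeds, axis=1))
best_text_per_head = sims.argmax(axis=1)  # which property each head tracks
print(sims.shape)                          # (4, 3): heads x text properties
```

Removing a "spurious" head from the sum then amounts to dropping its row before summing, which is the spirit of the paper's model-repair result.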
Prompt Learning with Optimal Transport for Vision-Language Models
With the increasing attention to large vision-language models such as CLIP, significant effort has been dedicated to building efficient prompts. Unlike conventional methods that learn only a single prompt, we propose to learn multiple comprehensive prompts to describe diverse characteristics of categories, such as intrinsic attributes or extrinsic contexts. However, directly matching each prompt to the same visual feature is problematic, as it pushes the prompts to converge to one point. To solve this problem, we propose to apply optimal transport to match the vision and text modalities. Specifically, we first model images and categories with visual and textual feature sets. Then, we apply a two-stage optimization strategy to learn the prompts: in the inner loop, we optimize the optimal transport distance to align visual features and prompts via the Sinkhorn algorithm, while in the outer loop, we learn the prompts from this distance on the supervised data. Extensive experiments on the few-shot recognition task demonstrate the superiority of our method.
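The inner-loop alignment relies on the Sinkhorn algorithm for entropic-regularised optimal transport between visual features and prompts. A minimal numpy sketch follows; the regularisation strength, iteration count, and random cost matrix are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.1, n_iter=200):
    """Entropic-regularised OT via Sinkhorn iterations.

    cost : (n, m) cost matrix, e.g. 1 - cosine similarity between
           visual features (rows) and prompt features (columns)
    a, b : marginal weights, each summing to 1
    Returns the (n, m) transport plan. Standard algorithm; hyper-
    parameters here are illustrative.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):          # alternate marginal scalings
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(3)
C = rng.random((5, 4))               # 5 visual features vs. 4 prompts (toy)
a = np.full(5, 1 / 5)                # uniform marginals
b = np.full(4, 1 / 4)
P = sinkhorn(C, a, b)
print(P.sum())                       # total mass 1; marginals match a and b
```

The OT distance used as the inner-loop objective would then be `(P * C).sum()`, differentiated with respect to the prompt features in the outer loop.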
Towards AI-Assisted Disease Diagnosis: Learning Deep Feature Representations for Medical Image Analysis
Artificial Intelligence (AI) has impacted our lives in many meaningful ways. For our research, we focus on improving disease diagnosis systems by analyzing medical images using AI, specifically deep learning technologies. Recent advances in deep learning are leading to enhanced performance in medical image analysis and computer-aided disease diagnosis. In this dissertation, we explore a major research area in medical image analysis: image classification, the process of assigning an image a label from a fixed set of categories. For our research, we focus on the problem of Alzheimer's Disease (AD) diagnosis from 3D structural Magnetic Resonance Imaging (sMRI) and Positron Emission Tomography (PET) brain scans.
Alzheimer's Disease is a severe neurological disorder. In this dissertation, we address challenges related to Alzheimer's Disease diagnosis and propose several models for improved diagnosis. We focus on analyzing 3D structural MRI (sMRI) and Positron Emission Tomography (PET) brain scans to identify the current stage of Alzheimer's Disease: Normal Control (CN), Mild Cognitive Impairment (MCI), or Alzheimer's Disease (AD). This dissertation demonstrates ways to improve the performance of a Convolutional Neural Network (CNN) for Alzheimer's Disease diagnosis. In addition, we present approaches to solving the class-imbalance problem and improving classification performance with limited training data for medical image analysis. To understand the decisions of the CNN, we present methods to visualize the behavior of a CNN model for disease diagnosis. As a case study, we analyzed brain PET scans of AD and CN patients to see how the CNN discriminates among data samples of different classes.
Additionally, this dissertation proposes a novel approach to generating synthetic medical images using Generative Adversarial Networks (GANs). Working with limited datasets and small amounts of annotated samples makes it difficult to develop a robust automated disease diagnosis model. Our proposed model addresses this issue and generates brain MRI and PET images for three different stages of Alzheimer's Disease: Normal Control (CN), Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD). Our approach can be generalized to create synthetic data for other medical image analysis problems and help develop better disease diagnosis models.
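One common remedy for the class-imbalance problem mentioned above is to weight each class inversely to its frequency in the loss, so that rare diagnostic stages count more per sample. The dissertation's specific method is not detailed in this abstract; the sketch below, with illustrative CN/MCI/AD sample counts, shows the inverse-frequency idea in numpy.

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Inverse-frequency class weights: a common class-imbalance remedy
    (a stand-in here, not necessarily the dissertation's exact method).
    Scaled so that each class contributes equal total weight:
    count_c * w_c is the same for every class c.
    """
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * counts)

# Toy imbalanced cohort: 60 CN (0), 30 MCI (1), 10 AD (2) scans
labels = np.array([0] * 60 + [1] * 30 + [2] * 10)
w = inverse_frequency_weights(labels, 3)
print(w)   # rarer classes receive larger per-sample weights
```

These weights would typically be passed into a weighted cross-entropy loss during CNN training so the minority AD class is not overwhelmed by CN samples.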