The detection of interesting patterns in large high-dimensional datasets is
difficult because of their dimensionality and pattern complexity. Therefore,
analysts require automated support for the extraction of relevant patterns. In
this paper, we present FDive, a visual active learning system that helps
analysts create visually explorable relevance models by learning a
pattern-based similarity. We use a small set of user-provided labels to rank
similarity measures, consisting of feature descriptor and distance function
combinations, by their ability to distinguish relevant from irrelevant data.
Based on the best-ranked similarity measure, the system calculates an
interactive Self-Organizing Map-based relevance model, which classifies data
according to cluster affiliation. It also automatically prompts the user for
further relevance feedback to improve its accuracy. Uncertain areas, especially near
the decision boundaries, are highlighted and can be refined by the user. We
evaluate our approach by comparing it to state-of-the-art feature selection
techniques and demonstrate its usefulness in a case study
classifying electron microscopy images of brain cells. The results show that
FDive enhances both the quality and understanding of relevance models and can
thus lead to new insights for brain research.

Comment: 12 pages, 7 figures, 2 tables, LaTeX; corrected typo; added DO
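
The ranking step described in the abstract, in which feature descriptor and distance function combinations are ordered by how well they separate relevant from irrelevant labeled items, can be illustrated with a minimal Python sketch. The scoring rule used here (mean between-class distance divided by mean within-class distance) is an assumption chosen for illustration and is not necessarily the score FDive computes. The names `separation_score`, `rank_similarity_measures`, and the example descriptors are likewise hypothetical.

```python
from itertools import combinations
from statistics import mean


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def separation_score(items, labels, descriptor, distance):
    """Illustrative score (an assumption, not FDive's published measure):
    mean distance between differently labeled pairs divided by the mean
    distance between equally labeled pairs. Higher means better separation.
    Requires at least one same-label pair and one mixed-label pair."""
    feats = [descriptor(item) for item in items]
    within, between = [], []
    for i, j in combinations(range(len(items)), 2):
        d = distance(feats[i], feats[j])
        (within if labels[i] == labels[j] else between).append(d)
    # Small constant avoids division by zero for identical items.
    return mean(between) / (mean(within) + 1e-12)


def rank_similarity_measures(items, labels, descriptors, distances):
    """Score every descriptor/distance combination and sort best-first."""
    scored = [
        (separation_score(items, labels, desc, dist), desc_name, dist_name)
        for desc_name, desc in descriptors.items()
        for dist_name, dist in distances.items()
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Toy data: relevance depends only on the first coordinate.
    items = [(0.0, 5.0), (0.1, 1.0), (1.0, 5.2), (1.1, 0.9)]
    labels = [True, True, False, False]  # user-provided relevance labels
    descriptors = {
        "first_coordinate": lambda v: (v[0],),
        "second_coordinate": lambda v: (v[1],),
    }
    distances = {"euclidean": euclidean, "manhattan": manhattan}
    for score, desc_name, dist_name in rank_similarity_measures(
        items, labels, descriptors, distances
    ):
        print(f"{desc_name} + {dist_name}: {score:.2f}")
```

In this toy setup, combinations using the first coordinate rank above those using the second, because only the first coordinate varies with the labels. The best-ranked combination would then serve as the similarity measure for the downstream relevance model.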