1,631 research outputs found
Towards Data-Driven Large Scale Scientific Visualization and Exploration
Technological advances have enabled us to acquire extremely large
datasets but it remains a challenge to store, process, and extract
information from them. This dissertation builds upon recent advances
in machine learning, visualization, and user interactions to
facilitate exploration of large-scale scientific datasets. First, we
use data-driven approaches to computationally identify regions of
interest in the datasets. Second, we use visual presentation for
effective user comprehension. Third, we provide interactions for
human users to integrate domain knowledge and semantic information
into this exploration process.
Our research shows how to extract, visualize, and explore informative
regions on very large 2D landscape images, 3D volumetric datasets,
high-dimensional volumetric mouse brain datasets with thousands of
spatially-mapped gene expression profiles, and geospatial trajectories
that evolve over time. The contributions of this dissertation include:
(1) We introduce a sliding-window saliency model that discovers
regions of user interest in very large images; (2) We develop visual
segmentation of intensity-gradient histograms to identify meaningful
components from volumetric datasets; (3) We extract boundary surfaces
from a wealth of volumetric gene expression mouse brain profiles to
personalize the reference brain atlas; (4) We show how to efficiently
cluster geospatial trajectories by mapping each sequence of locations
to a high-dimensional point with the kernel distance framework.
We aim to discover patterns, relationships, and anomalies that would
lead to new scientific, engineering, and medical advances. This work
represents one of the first steps toward better visual understanding
of large-scale scientific data by combining machine learning and human
intelligence.
Identifying Place Histories from Activity Traces with an Eye to Parameter Impact.
Events that happened in the past are important for understanding ongoing processes, predicting future developments, and making informed decisions. Important or interesting events tend to attract many people. Some people leave traces of their attendance in the form of computer-processable data, such as records in the databases of mobile phone operators or photos on photo-sharing web sites. We developed a suite of visual analytics methods for reconstructing past events from these activity traces. Our tools combine geocomputations, interactive geovisualizations, and statistical methods to enable integrated analysis of the spatial, temporal, and thematic components of the data, including numeric attributes and texts. We also support interactive investigation of the sensitivity of the analysis results to the parameters used in the computations. For this purpose, statistical summaries of computation results obtained with different combinations of parameter values are visualized in a way that facilitates comparison. We demonstrate the utility of our approach on two large real data sets: mobile phone calls in Milano over 9 days and Flickr photos taken in the British Isles over 5 years.
Reconstruction and Scalable Detection and Tracking of 3D Objects (Rekonstruktion und skalierbare Detektion und Verfolgung von 3D Objekten)
The task of detecting objects in images is essential for autonomous systems to categorize, comprehend, and eventually navigate or manipulate their environment. Since many applications demand not only detection of objects but also estimation of their exact poses, 3D CAD models can prove helpful since they provide means for feature extraction and hypothesis refinement. This work therefore explores two paths: first, we look into methods to create richly textured and geometrically accurate models of real-life objects. Using these reconstructions as a basis, we then investigate how to improve 3D object detection and pose estimation, focusing especially on scalability, i.e. the problem of dealing with multiple objects simultaneously.
Aggregating Local Features into Bundles for High-Precision Object Retrieval
Due to the omnipresence of digital cameras and mobile phones, the number of images stored in image databases has grown tremendously in recent years. It has become apparent that new data management and retrieval techniques are needed to deal with increasingly large image databases. This thesis presents new techniques for content-based image retrieval, where the image content itself is used to retrieve images by visual similarity from databases. We focus on the query-by-example scenario, assuming the image itself is provided as the query to the retrieval engine.
In many image databases, images are often associated with metadata, which may be exploited to improve the retrieval performance. In this work, we present a technique that fuses cues from the visual domain and textual annotations into a single compact representation. This combined multimodal representation performs significantly better compared to the underlying unimodal representations, which we demonstrate on two large-scale image databases consisting of up to 10 million images.
The main focus of this work is on feature bundling for object retrieval and logo recognition. We present two novel feature bundling techniques that aggregate multiple local features into a single visual description. In contrast to many other works, both approaches encode geometric information about the spatial layout of local features into the corresponding visual description itself. Therefore, these descriptions are highly distinctive and suitable for high-precision object retrieval.
We demonstrate the use of both bundling techniques for logo recognition. Here, the recognition is performed by the retrieval of visually similar images from a database of reference images, making the recognition systems easily scalable to a large number of classes. The results show that our retrieval-based methods can successfully identify small objects such as logos with an extremely low false positive rate. In particular, our feature bundling techniques are beneficial because false positives are effectively avoided upfront due to the highly distinctive descriptions.
We further demonstrate and thoroughly evaluate the use of our bundling technique based on min-Hashing for image and object retrieval. Compared to approaches based on conventional bag-of-words retrieval, it has much higher efficiency: the retrieved result lists are shorter and cleaner while recall is at an equal level. The results suggest that this bundling scheme may act as a pre-filtering step in a wide range of scenarios and underline the high effectiveness of this approach.
Finally, we present a new variant for extremely fast re-ranking of retrieval results, which ranks the retrieved images according to the spatial consistency of their local features with those of the query image. The demonstrated method is robust to outliers, performs better than existing methods, and can process several hundred to several thousand images per second on a single thread.
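As a rough illustration of the min-Hashing idea mentioned in this abstract, the sketch below summarizes a set of visual-word IDs with a short signature and estimates set overlap from the fraction of matching signature entries. The abstract does not describe the actual bundling scheme, so everything here is an assumption for illustration: the function names (`make_hash_funcs`, `minhash_sketch`, `estimated_jaccard`), the use of integer visual-word IDs, and the linear hash family are hypothetical and not taken from the thesis.

```python
import random

# Mersenne prime 2**31 - 1; visual-word IDs are assumed to be integers below it.
PRIME = 2_147_483_647


def make_hash_funcs(num_hashes, seed=0):
    """Build hypothetical hash functions of the form h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]
    # Bind a and b as default arguments so each lambda keeps its own pair.
    return [lambda x, a=a, b=b: (a * x + b) % PRIME for a, b in params]


def minhash_sketch(visual_words, hash_funcs):
    """Return one minimum hash value per hash function for a non-empty set."""
    return tuple(min(h(w) for w in visual_words) for h in hash_funcs)


def estimated_jaccard(sketch_a, sketch_b):
    """Fraction of matching entries, an estimate of the Jaccard similarity."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)


if __name__ == "__main__":
    funcs = make_hash_funcs(128, seed=42)
    query = minhash_sketch({3, 17, 42, 99, 256}, funcs)
    candidate = minhash_sketch({3, 17, 42, 99, 512}, funcs)
    print(estimated_jaccard(query, candidate))
```

Two sets with a large overlap agree on most signature entries, so comparing short signatures can serve as a cheap filter before any more expensive verification. How the thesis groups local features into bundles before hashing is not stated in the abstract and is not modeled here.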
Doctor of Philosophy
Correlation is a powerful relationship measure used in many fields to estimate trends and make forecasts. When the data are complex, large, and high dimensional, correlation identification is challenging. Several visualization methods have been proposed to solve these problems, but they all have limitations in accuracy, speed, or scalability. In this dissertation, we propose a methodology that provides new visual designs that show details when possible and aggregates when necessary, along with robust interactive mechanisms that together enable quick identification and investigation of meaningful relationships in large and high-dimensional data. We propose four techniques using this methodology. Depending on data size and dimensionality, the most appropriate visualization technique can be provided to optimize the analysis performance. First, to improve correlation identification tasks between two dimensions, we propose a new correlation task-specific visualization method called correlation coordinate plot (CCP). CCP transforms data into a powerful coordinate system for estimating the direction and strength of correlations among dimensions. Next, we propose three visualization designs to optimize correlation identification tasks in large and multidimensional data. The first is snowflake visualization (Snowflake), a focus+context layout for exploring all pairwise correlations. The next proposed design is a new interactive design for representing and exploring data relationships in parallel coordinate plots (PCPs) for large data, called data scalable parallel coordinate plots (DSPCP). Finally, we propose a novel technique for storing and accessing the multiway dependencies through visualization (MultiDepViz). We evaluate these approaches on various use cases, compare them to prior work, and conduct user studies to demonstrate how our proposed approaches help users explore correlation in large data efficiently.
Our results confirmed that the CCP/Snowflake, DSPCP, and MultiDepViz methods outperform some current visualization techniques, such as scatterplots (SCPs), PCPs, SCP matrix, Corrgram, Angular Histogram, and UntangleMap, in both accuracy and timing. Finally, these approaches are applied in real-world applications such as a debugging tool, large-scale code performance data, and large-scale climate data.
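For readers unfamiliar with the quantity these designs visualize, the following minimal sketch computes the Pearson correlation coefficient for every pair of dimensions in a small table. It is not the CCP, Snowflake, DSPCP, or MultiDepViz method; the abstract does not say which correlation measure the dissertation uses, so Pearson's r is an assumption, and the function names (`pearson`, `pairwise_correlations`) and the sample data are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(columns):
    """Map each pair of column names (in sorted order) to its correlation."""
    return {(a, b): pearson(columns[a], columns[b])
            for a, b in combinations(sorted(columns), 2)}


if __name__ == "__main__":
    # Hypothetical sample data with three dimensions.
    data = {
        "x": [1.0, 2.0, 3.0, 4.0],
        "y": [2.0, 4.0, 6.0, 8.0],
        "z": [4.0, 3.0, 2.0, 1.0],
    }
    for pair, r in pairwise_correlations(data).items():
        print(pair, round(r, 3))
```

The number of pairs grows quadratically with the number of dimensions, which is one reason an overview layout for all pairwise correlations becomes useful as dimensionality increases.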