Viewpoints: A high-performance high-dimensional exploratory data analysis tool
Scientific data sets continue to increase in both size and complexity. In the
past, dedicated graphics systems at supercomputing centers were required to
visualize large data sets, but as the price of commodity graphics hardware has
dropped and its capability has increased, it is now possible, in principle, to
view large complex data sets on a single workstation. To do this in practice,
an investigator will need software that is written to take advantage of the
relevant graphics hardware. The Viewpoints visualization package described
herein is an example of such software. Viewpoints is an interactive tool for
exploratory visual analysis of large, high-dimensional (multivariate) data. It
leverages the capabilities of modern graphics boards (GPUs) to run on a single
workstation or laptop. Viewpoints is minimalist: it attempts to do a small set
of useful things very well (or at least very quickly) in comparison with
similar packages today. Its basic feature set includes linked scatter plots
with brushing, dynamic histograms, normalization and outlier detection/removal.
Viewpoints was originally designed for astrophysicists, but it has since been
used in a variety of fields that range from astronomy, quantum chemistry, fluid
dynamics, machine learning, bioinformatics, and finance to information
technology server log mining. In this article, we describe the Viewpoints
package and show examples of its usage.
Comment: 18 pages, 3 figures, PASP in press; this version corresponds more
closely to that to be published.
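The normalization and outlier detection/removal named in the feature list above can be sketched as follows. This is a hypothetical illustration, not the actual Viewpoints code; `zscore` and `remove_outliers` are invented names, and the robust median/MAD rule is one common choice, not necessarily the one the package uses.

```python
import numpy as np

def zscore(x):
    """Z-score normalization: zero mean, unit variance per column."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def remove_outliers(x, threshold=3.5):
    """Robust outlier removal via the modified z-score (median/MAD),
    which an extreme value cannot mask the way it inflates a plain
    mean/std estimate."""
    med = np.median(x, axis=0)
    mad = np.median(np.abs(x - med), axis=0)
    mz = 0.6745 * np.abs(x - med) / mad
    return x[(mz < threshold).all(axis=1)]

data = np.array([[1.0, 2.0],
                 [1.1, 2.1],
                 [0.9, 1.9],
                 [1.0, 2.05],
                 [50.0, 2.0]])        # last row: obvious outlier in column 0
clean = remove_outliers(data)
print(clean.shape)                    # (4, 2): the outlier row is gone
```

After such cleaning, `zscore(clean)` would put all dimensions on a comparable scale for linked scatter plots.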
From SpaceStat to CyberGIS: Twenty Years of Spatial Data Analysis Software
This essay assesses the evolution of the way in which spatial data analytical methods have been incorporated into software tools over the past two decades. It is part retrospective and prospective, going beyond a historical review to outline some ideas about important factors that drove the software development, such as methodological advances, the open source movement and the advent of the internet and cyberinfrastructure. The review highlights activities carried out by the author and his collaborators and uses SpaceStat, GeoDa, PySAL and recent spatial analytical web services developed at the ASU GeoDa Center as illustrative examples. It outlines a vision for a spatial econometrics workbench as an example of the incorporation of spatial analytical functionality in a cyberGIS.
Visual Analysis of Large Particle Data (Visuelle Analyse großer Partikeldaten)
Particle simulations are an established and widely used numerical method in research and engineering. For example, particle simulations are employed to study fuel atomization in aircraft turbines, and the formation of the universe is investigated by simulating dark matter particles. The amounts of data produced are immense: current simulations contain trillions of particles that move over time and interact with one another. Visualization offers great potential for the exploration, validation, and analysis of scientific data sets and their underlying models. However, the focus is usually on structured data with a regular topology. Particles, in contrast, move freely through space and time, a view known in physics as the Lagrangian frame of reference. Although particles can be converted from the Lagrangian frame into a regular Eulerian one, such as a uniform grid, this conversion entails considerable effort for large numbers of particles and usually causes a loss of precision while increasing memory consumption. In this dissertation, I will investigate new visualization techniques based specifically on the Lagrangian view, enabling efficient and effective visual analysis of large particle data.
Brushing dimensions--a dual visual analysis model for high-dimensional data
In many application fields, data analysts have to deal with datasets that contain many expressions per item. The effective analysis of such multivariate datasets is dependent on the user's ability to understand both the intrinsic dimensionality of the dataset as well as the distribution of the dependent values with respect to the dimensions. In this paper, we propose a visualization model that enables the joint interactive visual analysis of multivariate datasets with respect to their dimensions as well as with respect to the actual data values. We describe a dual setting of visualization and interaction in items space and in dimensions space. The visualization of items is linked to the visualization of dimensions with brushing and focus+context visualization. With this approach, the user is able to jointly study the structure of the dimensions space as well as the distribution of data items with respect to the dimensions. Even though the proposed visualization model is general, we demonstrate its application in the context of a DNA microarray data analysis
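A minimal sketch of such linked brushing between an items view and a dimensions view follows. It is illustrative only, not the paper's model; the function names and the per-dimension mean as "dimensions view" statistic are assumptions made for the example.

```python
import numpy as np

def brush_items(data, dim, lo, hi):
    """Brush in items space: select items whose value in column `dim`
    falls inside the interval [lo, hi]."""
    return np.where((data[:, dim] >= lo) & (data[:, dim] <= hi))[0]

def dimension_view(data, selected):
    """Linked dimensions view: per-dimension mean of the brushed subset
    stacked on the global mean (focus vs. context)."""
    return np.vstack([data[selected].mean(axis=0), data.mean(axis=0)])

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 4))              # 100 items, 4 dimensions
sel = brush_items(data, dim=0, lo=0.0, hi=3.0)  # brush: positive dim-0 items
focus, context = dimension_view(data, sel)
```

Rendering `focus` against `context` per dimension is the focus+context linkage: the brush made in items space immediately reshapes what the dimensions view shows.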
Designing Progressive and Interactive Analytics Processes for High-Dimensional Data Analysis
In interactive data analysis processes, the dialogue between the human and the computer is the enabling mechanism that can lead to actionable observations about the phenomena being investigated. It is of paramount importance that this dialogue is not interrupted by slow computational mechanisms that do not consider any known temporal human-computer interaction characteristics that prioritize the perceptual and cognitive capabilities of the users. In cases where the analysis involves an integrated computational method, for instance to reduce the dimensionality of the data or to perform clustering, such non-optimal processes are often likely. To remedy this, progressive computations, where results are iteratively improved, are getting increasing interest in visual analytics. In this paper, we present techniques and design considerations to incorporate progressive methods within interactive analysis processes that involve high-dimensional data. We define methodologies to facilitate processes that adhere to the perceptual characteristics of users and describe how online algorithms can be incorporated within these. A set of design recommendations and according methods to support analysts in accomplishing high-dimensional data analysis tasks are then presented. Our arguments and decisions here are informed by observations gathered over a series of analysis sessions with analysts from finance. We document observations and recommendations from this study and present evidence on how our approach contribute to the efficiency and productivity of interactive visual analysis sessions involving high-dimensional data
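The progressive, iteratively improving computations described above can be illustrated with a toy online algorithm. This is a sketch, not one of the paper's methods: a running mean over a stream that yields an intermediate estimate after every chunk, so a UI could update without waiting for the full pass.

```python
def progressive_mean(stream, chunk=1000):
    """Online running mean over a data stream, yielding an improving
    estimate after each chunk of inputs (progressive computation)."""
    count, total = 0, 0.0
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) == chunk:
            total += sum(buf)
            count += len(buf)
            buf.clear()
            yield total / count       # intermediate, refinable result
    if buf:                           # flush the final partial chunk
        total += sum(buf)
        count += len(buf)
        yield total / count

data = range(1, 10001)                # true mean is 5000.5
estimates = list(progressive_mean(data, chunk=2500))
print(estimates[-1])                  # 5000.5
```

The same yield-after-each-chunk pattern applies to heavier integrated methods such as clustering or dimensionality reduction, where intermediate centroids or embeddings stand in for the running mean.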
Adaptive multiresolution visualization of large multidimensional multivariate scientific datasets
The sizes of today's scientific datasets range from megabytes to terabytes, making it impossible to directly browse the raw datasets visually. This presents significant challenges for visualization scientists who are interested in supporting these datasets. In this thesis, we present an adaptive data representation model which can be utilized with many of the commonly employed visualization techniques when dealing with large amounts of data. Our hierarchical design also alleviates the long-standing visualization problem due to limited display space. The idea is based on using compactly supported orthogonal wavelets and additional downsizing techniques to generate a hierarchy of fine to coarse approximations of a very large dataset for visualization.
An adaptive data hierarchy, which contains authentic multiresolution approximations and the corresponding error, has many advantages over the original data. First, it allows scientists to visualize the overall structure of a dataset by browsing its coarse approximations. Second, the fine approximations of the hierarchy provide local details of the interesting data subsets. Third, the error of the data representation can provide the scientist with information about the authenticity of the data approximation. Finally, in a client-server network environment, a coarse representation can increase the efficiency of a visualization process by quickly giving users a rough idea of the dataset before they decide whether to continue the transmission or to abort it. For datasets which require long rendering time, an authentic approximation of a very large dataset can speed up the visualization process greatly.
Variations on the main wavelet-based multiresolution hierarchy described in this thesis also lead to other multiresolution representation mechanisms. For example, we investigate the uses of norm projections and principal components to build multiresolution data hierarchies of large multivariate datasets. This leads to the development of a more flexible dual multiresolution visualization environment for large data exploration.
We present the results of experimental studies of our adaptive multiresolution representation using wavelets. Utilizing a multiresolution data hierarchy, we illustrate that information access from a dataset with tens of millions of data values can be achieved in real time. Based on these results, we propose procedures to assist in generating a multiresolution hierarchy of a large dataset. For example, the findings indicate that an ordinary computed tomography volume dataset can be represented effectively for some tasks by an adaptive data hierarchy with less than 1.5% of its original size
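As a toy illustration of such a fine-to-coarse wavelet hierarchy with per-level error, here is a Haar sketch under the assumption of power-of-two signal lengths. The thesis itself uses compactly supported orthogonal wavelets and additional downsizing techniques; the function names below are invented.

```python
import numpy as np

def haar_level(data):
    """One Haar analysis step: pairwise averages give the coarse
    approximation, pairwise differences give the detail (error)."""
    evens, odds = data[0::2], data[1::2]
    approx = (evens + odds) / 2.0
    detail = (evens - odds) / 2.0
    return approx, detail

def build_hierarchy(data, levels):
    """Fine-to-coarse hierarchy of approximations, each paired with
    the maximum detail magnitude lost at that level."""
    hierarchy = [(data, 0.0)]
    for _ in range(levels):
        data, detail = haar_level(data)
        hierarchy.append((data, float(np.abs(detail).max())))
    return hierarchy

signal = np.arange(16, dtype=float)   # length must be a power of two
h = build_hierarchy(signal, levels=3)
print([len(a) for a, _ in h])         # [16, 8, 4, 2]
```

Browsing would start at the coarsest level (length 2 here) and descend only where the stored error indicates that detail matters, mirroring the coarse-overview / fine-detail workflow the thesis describes.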
Designing visual analytics methods for massive collections of movement data
Exploration and analysis of large data sets cannot be carried out using purely visual means but require the involvement of database technologies, computerized data processing, and computational analysis methods. An appropriate combination of these technologies and methods with visualization may facilitate synergetic work of computer and human whereby the unique capabilities of each "partner" can be utilized. We suggest a systematic approach to defining what methods and techniques, and what ways of linking them, can appropriately support such work. The main idea is that software tools prepare and visualize the data so that the human analyst can detect various types of patterns by looking at the visual displays. To facilitate the detection of patterns, we must understand what types of patterns may exist in the data (or, more exactly, in the underlying phenomenon). This study focuses on data describing movements of multiple discrete entities that change their positions in space while preserving their integrity and identity. We define the possible types of patterns in such movement data on the basis of an abstract model of the data as a mathematical function that maps entities and times onto spatial positions. Then, we look for data transformations, computations, and visualization techniques that can facilitate the detection of these types of patterns and are suitable for very large data sets, possibly too large for a computer's memory. Under such constraints, visualization is applied to data that have previously been aggregated and generalized by means of database operations and/or computational techniques.
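The aggregate-then-visualize idea can be sketched as follows. The function and the uniform-grid scheme are invented for illustration; the paper's own aggregation runs through database operations, but the principle of reducing raw position records before display is the same.

```python
import numpy as np

def aggregate_moves(positions, cell=1.0):
    """Aggregate raw (x, y) position records into counts per grid cell,
    the kind of generalization that lets visualization cope with data
    too large to hold or draw point-by-point."""
    cells = np.floor(np.asarray(positions) / cell).astype(int)
    uniq, counts = np.unique(cells, axis=0, return_counts=True)
    return {tuple(c): int(n) for c, n in zip(uniq, counts)}

# abstract model: (entity, time) -> position; here just the positions
positions = [(0.2, 0.7), (0.4, 0.9), (1.5, 0.1),
             (1.6, 0.2), (0.3, 0.8), (2.9, 2.1)]
grid = aggregate_moves(positions, cell=1.0)
print(grid[(0, 0)])   # 3 records fell into the origin cell
```

A density map drawn from `grid` replaces millions of individual points with one value per cell, and the cell size is the knob trading detail against scalability.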
Overview of ImageCLEF lifelog 2017: lifelog retrieval and summarization
Despite the increasing number of successful related workshops and panels, lifelogging has rarely been the subject of a rigorous comparative benchmarking exercise. Following the success of the new lifelog evaluation task at NTCIR-12, the first ImageCLEF 2017 LifeLog task aims to bring lifelogging to the attention of a wide audience and to promote research into some of the key challenges of the coming years. The ImageCLEF 2017 LifeLog task aims to be a comparative evaluation framework for information access and retrieval systems operating over personal lifelog data. Two subtasks were available to participants; all tasks use a single mixed-modality data source from three lifeloggers for a period of about one month each. The data contains a large collection of wearable camera images, an XML description of the semantic locations, as well as the physical activities of the lifeloggers. Additional visual concept information was also provided by exploiting the Caffe CNN-based visual concept detector. For the two subtasks, 51 topics were chosen based on the real interests of the lifeloggers. In this first year, three groups participated in the task, submitting 19 runs across all subtasks, and all participants also provided working notes papers. In general, the groups' performance is very good across the tasks, and there are interesting insights into these very relevant challenges.
Visualization of Time-Series Data in Parameter Space for Understanding Facial Dynamics
Over the past decade, computer scientists and psychologists have made great efforts to collect and analyze facial dynamics data that exhibit different expressions and emotions. Such data are commonly captured as videos and transformed into feature-based time series prior to any analysis. However, analytical tasks such as expression classification have been hindered by the lack of understanding of the complex data space and the associated algorithm space. Conventional graph-based time-series visualization has also been found inadequate to support such tasks. In this work, we adopt a visual analytics approach by visualizing the correlation between the algorithm space and our goal of classifying facial dynamics. We transform multiple feature-based time series for each expression in measurement space to a multi-dimensional representation in parameter space. This enables us to utilize parallel coordinates visualization to gain an understanding of the algorithm space, providing a fast and cost-effective means to support the design of analytical algorithms.
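The mapping from measurement space (a time series per expression) to a point in parameter space can be sketched like this. The four summary statistics chosen below are illustrative assumptions, not the paper's actual parameters, and the series values are made up.

```python
import numpy as np

def to_parameter_space(series):
    """Map a feature-based time series (measurement space) to a
    fixed-length vector (parameter space): one parallel-coordinates
    axis per statistic. The statistics here are illustrative."""
    s = np.asarray(series, dtype=float)
    return np.array([s.mean(),                       # overall level
                     s.std(),                        # variability
                     s.max() - s.min(),              # dynamic range
                     float(np.argmax(s)) / (len(s) - 1)])  # peak timing

# two hypothetical expression time series
smile = [0.1, 0.3, 0.8, 0.9, 0.7]
neutral = [0.2, 0.2, 0.3, 0.2, 0.2]
points = np.vstack([to_parameter_space(smile),
                    to_parameter_space(neutral)])
print(points.shape)   # (2, 4): two items, four parallel axes
```

Each row becomes one polyline in a parallel-coordinates plot, so expressions that differ in range or peak timing separate visually even when their raw time series are hard to compare.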