The LSST Data Mining Research Agenda
We describe features of the LSST science database that are amenable to
scientific data mining, object classification, outlier identification, anomaly
detection, image quality assurance, and survey science validation. The data
mining research agenda includes: scalability (at petabyte scales) of existing
machine learning and data mining algorithms; development of grid-enabled
parallel data mining algorithms; designing a robust system for brokering
classifications from the LSST event pipeline (which may produce 10,000 or more
event alerts per night); multi-resolution methods for exploration of petascale
databases; indexing of multi-attribute multi-dimensional astronomical databases
(beyond spatial indexing) for rapid querying of petabyte databases; and more.
Comment: 5 pages, presented at the "Classification and Discovery in Large
Astronomical Surveys" meeting, Ringberg Castle, 14-17 October, 200
Quantitative Approach on Parallel Coordinates and Scatter Plots for Multidimensional-Data Visual Analytics
Parallel coordinates and scatter plots are two well-known visualization techniques for multidimensional data analytics and are often used together to make exploration of such data more flexible. Existing approaches mostly address qualitative issues and single-attribute comparison, and may face statistical challenges when quantitative analysis is required. This paper introduces a new quantitative approach for the visual enhancement of parallel coordinates and scatter plots in terms of multiple-attribute comparison. The method is based on the visual integration of interactive stacked bars and visual queries on parallel axes and scatter charts. The parallel coordinates serve as a context view, while the scatter charts provide focus details. Using the technique, users can not only analyze multivariate data quantitatively but also flexibly compare multiple target attributes. Further investigation is also supported for a deeper understanding of the desired information. The characteristics and usefulness of the approach are demonstrated via a case study with two typical use cases.
Timbral Data Sonification from Parallel Attribute Graphs
Parallel coordinate plotting is an established data visualization technique that provides a means for graphing and exploring multidimensional relational datasets on a two-dimensional display. Each vertical axis represents the range of values for one attribute, and each data tuple appears as a connected path traveling left-to-right across the plot, connecting attribute values for that tuple on the vertical axes. Parallel coordinate plots look like time-domain audio signal waveforms, and they can be translated into audio signals through straightforward mapping algorithms. This study looks at three data sonification algorithms, sonification being the mapping of data into sounds for perceptual exploration, similar to uses of data visualization. Sound-response survey results and subsequent analyses reveal that the most direct method for mapping parallel coordinates of data tuples to audio waveforms is the most accurate for generating sounds that listeners can use to classify data. Future work has begun on improving the accuracy of this audio waveform-based, timbral approach to classifying data.
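The idea of reading a parallel-coordinate path as one cycle of a waveform can be illustrated with a minimal sketch. This is not one of the three algorithms evaluated in the study, whose details the abstract does not give; it assumes that each attribute is rescaled to [-1, 1] over its axis range and that neighbouring axis values are joined by linear interpolation. The function names and the `samples_per_segment` parameter are hypothetical.

```python
from typing import List, Sequence


def normalize(value: float, lo: float, hi: float) -> float:
    """Rescale value from the axis range [lo, hi] to [-1.0, 1.0].

    A constant attribute (hi == lo) is mapped to 0.0 (an assumption).
    """
    if hi == lo:
        return 0.0
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def tuple_to_waveform(
    values: Sequence[float],
    mins: Sequence[float],
    maxs: Sequence[float],
    samples_per_segment: int = 4,
) -> List[float]:
    """Turn one data tuple into one waveform cycle.

    Each attribute becomes a point on its vertical axis; consecutive
    points are joined by straight lines, sampled at a fixed rate.
    """
    points = [normalize(v, lo, hi) for v, lo, hi in zip(values, mins, maxs)]
    wave: List[float] = []
    for a, b in zip(points, points[1:]):
        for k in range(samples_per_segment):
            t = k / samples_per_segment
            wave.append(a + (b - a) * t)  # linear interpolation from a to b
    wave.append(points[-1])  # close the path on the last axis
    return wave


# A tuple with three attributes, each ranging over [0, 10].
cycle = tuple_to_waveform([0, 10, 5], [0, 0, 0], [10, 10, 10], samples_per_segment=2)
```

Repeating such a cycle at an audio sample rate would produce a tone whose timbre depends on the shape of the path; how the study's own mappings handle amplitude, pitch, and duration is not stated in the abstract.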
Visual and interactive exploration of point data
Point data, such as Unit Postcodes (UPC), can provide very detailed information at fine
scales of resolution. For instance, socio-economic attributes are commonly assigned to
UPC. Hence, they can be represented as points and observed at the postcode level.
Using UPC as a common field allows the concatenation of variables from disparate data
sources that can potentially support sophisticated spatial analysis. However, visualising
UPC in urban areas has at least three limitations. First, at small scales UPC occurrences
can be very dense making their visualisation as points difficult. On the other hand,
patterns in the associated attribute values are often hardly recognisable at large scales.
Second, UPC can be used as a common field to allow the concatenation of highly
multivariate data sets with an associated postcode. Finally, socio-economic variables
assigned to UPC (such as the ones used here) can be non-Normal in their distributions
as a result of a large presence of zero values and high variances which constrain their
analysis using traditional statistics.
This paper discusses a Point Visualisation Tool (PVT), a proof-of-concept system
developed to visually explore point data. Various well-known visualisation techniques
were implemented to enable their interactive and dynamic interrogation. PVT provides
multiple representations of point data to facilitate the understanding of the relations
between attributes or variables as well as their spatial characteristics. Brushing between
alternative views is used to link several representations of a single attribute, as well as
to simultaneously explore more than one variable. PVT’s functionality shows how the
use of visual techniques embedded in an interactive environment enables the exploration
of large amounts of multivariate point data.
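Brushing between linked views can be sketched in a few lines. This is not PVT's implementation, which the abstract does not describe; it assumes that every view draws the same list of records and shares a set of selected record indices. The field names (`postcode`, `income`, `households`) and the function names are hypothetical.

```python
from typing import Dict, List, Set

Record = Dict[str, float]


def brush(records: List[Record], attribute: str, lo: float, hi: float) -> Set[int]:
    """Return indices of records whose attribute lies in the brushed range [lo, hi]."""
    return {i for i, r in enumerate(records) if lo <= r[attribute] <= hi}


def highlight_flags(records: List[Record], selected: Set[int]) -> List[bool]:
    """Per-record highlight flags that any linked view can use when redrawing."""
    return [i in selected for i in range(len(records))]


# Hypothetical point data: one record per postcode point.
points = [
    {"postcode": 1, "income": 0.0, "households": 12.0},
    {"postcode": 2, "income": 25.0, "households": 40.0},
    {"postcode": 3, "income": 60.0, "households": 8.0},
]

# Brushing the income axis in one view selects records 1 and 2 ...
selected = brush(points, "income", 20.0, 100.0)
# ... and every other view (map, scatter plot) highlights the same records.
flags = highlight_flags(points, selected)
```

Sharing indices rather than copies of the data keeps the views consistent: a selection made on one attribute is immediately visible in views of other attributes and in the spatial view.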