Using treemaps for variable selection in spatio-temporal visualisation
We demonstrate and reflect upon the use of enhanced treemaps that incorporate spatial and temporal ordering for exploring a large multivariate spatio-temporal data set. The resulting data-dense views summarise and simultaneously present hundreds of space-, time-, and variable-constrained subsets of a large multivariate data set in a structure that facilitates their meaningful comparison and supports visual analysis. Interactive techniques allow localised patterns to be explored and subsets of interest selected and compared with the spatial aggregate. Spatial variation is considered through interactive raster maps and high-resolution local road maps. The techniques are developed in the context of 42.2 million records of vehicular activity in a 98 km² area of central London and informally evaluated through a design used in the exploratory visualisation of this data set. The main advantages of our technique are the means to simultaneously display hundreds of summaries of the data and to interactively browse hundreds of variable combinations with ordering and symbolism that are consistent and appropriate for space- and time-based variables. These capabilities are difficult to achieve in the case of spatio-temporal data with categorical attributes using existing geovisualisation methods. We acknowledge limitations in the treemap representation but enhance the cognitive plausibility of this popular layout through our two-dimensional ordering algorithm and interactions. Patterns that are expected (e.g. more traffic in central London), interesting (e.g. the spatial and temporal distribution of particular vehicle types) and anomalous (e.g. low speeds on particular road sections) are detected at various scales and locations using the approach. In many cases, anomalies identify biases that may have implications for future use of the data set for analyses and applications.
Ordered treemaps appear to have potential as interactive interfaces for variable selection in spatio-temporal visualisation. Information Visualization (2008) 7, 210-224. doi: 10.1057/palgrave.ivs.950018
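As a rough illustration of how a treemap can keep its cells in a meaningful order, the sketch below uses the plain slice-and-dice layout, which divides a rectangle into strips proportional to each weight while preserving the input order. This is not the two-dimensional ordering algorithm of the paper; the function name and the hourly counts are hypothetical.

```python
def slice_and_dice(weights, x, y, w, h, horizontal=True):
    """Split the rectangle (x, y, w, h) into strips proportional to weights.

    Strips are emitted in input order, so an ordered variable
    (for example, hour of day) stays ordered on screen.
    """
    total = sum(weights)
    rects = []
    offset = 0.0  # fraction of the rectangle already consumed
    for weight in weights:
        share = weight / total
        if horizontal:
            rects.append((x + offset * w, y, share * w, h))
        else:
            rects.append((x, y + offset * h, w, share * h))
        offset += share
    return rects


# Hypothetical record counts for four consecutive time slots.
counts = [10, 30, 20, 40]
layout = slice_and_dice(counts, 0.0, 0.0, 100.0, 50.0)
for rect in layout:
    print(rect)
```

Nesting the same call inside each strip, alternating `horizontal`, gives a hierarchical treemap in which every level retains its ordering.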
Re-mining item associations: methodology and a case study in apparel retailing
Association mining is the conventional data mining technique for analyzing market basket data and it reveals the positive and negative associations between items. While being an integral part of transaction data, pricing and time information have not been integrated into market basket analysis in earlier studies. This paper proposes a new approach to mine price, time and domain related attributes through re-mining of association mining results. The underlying factors behind positive and negative relationships can be characterized and described through this second data mining stage. The applicability of the methodology is demonstrated through the analysis of data coming from a large apparel retail chain, and its algorithmic complexity is analyzed in comparison to the existing techniques.
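A minimal sketch of the first stage only, assuming that positive and negative associations are read off the lift of each item pair (lift above 1 suggests a positive association, below 1 a negative one). The measure used in the paper may differ, and the baskets and the function name `pair_lift` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair of items seen."""
    n = len(baskets)
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    lifts = {}
    for a, b in combinations(sorted(item_count), 2):
        support_ab = pair_count[(a, b)] / n
        expected = (item_count[a] / n) * (item_count[b] / n)
        lifts[(a, b)] = support_ab / expected
    return lifts


# Hypothetical apparel transactions.
baskets = [
    ["shirt", "tie"],
    ["shirt", "tie"],
    ["shirt", "jeans"],
    ["jeans"],
]
lift = pair_lift(baskets)
for pair, value in sorted(lift.items()):
    print(pair, round(value, 3))
```

In a re-mining step, each pair and its lift could then be joined with price and time attributes and passed to a second analysis.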
Conditional t-SNE: Complementary t-SNE embeddings through factoring out prior information
Dimensionality reduction and manifold learning methods such as t-Distributed
Stochastic Neighbor Embedding (t-SNE) are routinely used to map
high-dimensional data into a 2-dimensional space to visualize and explore the
data. However, two dimensions are typically insufficient to capture all
structure in the data, the salient structure is often already known, and it is
not obvious how to extract the remaining information in a similarly effective
manner. To fill this gap, we introduce conditional t-SNE (ct-SNE), a
generalization of t-SNE that discounts prior information from the embedding in
the form of labels. To achieve this, we propose a conditioned version of the
t-SNE objective, obtaining a single, integrated, and elegant method. ct-SNE has
one extra parameter over t-SNE; we investigate its effects and show how to
efficiently optimize the objective. Factoring out prior knowledge allows
complementary structure to be captured in the embedding, providing new
insights. Qualitative and quantitative empirical results on synthetic and
(large) real data show ct-SNE is effective and achieves its goal.
Beyond Classification: Latent User Interests Profiling from Visual Contents Analysis
User preference profiling is an important task in modern online social
networks (OSN). With the proliferation of image-centric social platforms, such
as Pinterest, visual contents have become one of the most informative data
streams for understanding user preferences. Traditional approaches usually
treat visual content analysis as a general classification problem where one or
more labels are assigned to each image. Although such an approach simplifies
the process of image analysis, it misses the rich context and visual cues that
play an important role in people's perception of images. In this paper, we
explore the possibilities of learning a user's latent visual preferences
directly from image contents. We propose a distance metric learning method
based on Deep Convolutional Neural Networks (CNN) to directly extract
similarity information from visual contents and use the derived distance metric
to mine individual users' fine-grained visual preferences. Through our
preliminary experiments using data from 5,790 Pinterest users, we show that
even for the images within the same category, each user possesses distinct and
individually-identifiable visual preferences that are consistent over their
lifetime. Our results underscore the untapped potential of finer-grained visual
preference profiling in understanding users' preferences.
Comment: 2015 IEEE 15th International Conference on Data Mining Workshop
Fusion of Learned Multi-Modal Representations and Dense Trajectories for Emotional Analysis in Videos
When designing a video affective content analysis algorithm, one of the most important steps is the selection of discriminative features for the effective representation of video segments. The majority of existing affective content analysis methods either use low-level audio-visual features or generate handcrafted higher level representations based on these low-level features. We propose in this work to use deep learning methods, in particular convolutional neural networks (CNNs), in order to automatically learn and extract mid-level representations from raw data. To this end, we exploit the audio and visual modalities of videos by employing Mel-Frequency Cepstral Coefficients (MFCC) and color values in the HSV color space. We also incorporate dense trajectory based motion features in order to further enhance the performance of the analysis. By means of multi-class support vector machines (SVMs) and fusion mechanisms, music video clips are classified into one of four affective categories representing the four quadrants of the Valence-Arousal (VA) space. Results obtained on a subset of the DEAP dataset show (1) that higher level representations perform better than low-level features, and (2) that incorporating motion information leads to a notable performance gain, independently of the chosen representation.
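The four target categories can be pictured as a simple split of the Valence-Arousal plane. The sketch below assumes ratings on a 1 to 9 scale with a midpoint of 5 as the threshold; the threshold, the label names and the function `va_quadrant` are assumptions for illustration and are not taken from the paper.

```python
def va_quadrant(valence, arousal, midpoint=5.0):
    """Map a (valence, arousal) rating to one of four quadrant labels.

    The midpoint threshold and the label strings are illustrative assumptions.
    """
    high_valence = valence >= midpoint
    high_arousal = arousal >= midpoint
    if high_valence and high_arousal:
        return "HVHA"  # high valence, high arousal
    if high_valence:
        return "HVLA"  # high valence, low arousal
    if high_arousal:
        return "LVHA"  # low valence, high arousal
    return "LVLA"      # low valence, low arousal


# Hypothetical ratings for four clips.
ratings = [(7.5, 8.0), (6.0, 2.5), (3.0, 7.0), (2.0, 1.5)]
labels = [va_quadrant(v, a) for v, a in ratings]
print(labels)
```

Labels produced this way could serve as the class targets for a multi-class classifier trained on the extracted features.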
Fault prediction in aircraft engines using Self-Organizing Maps
Aircraft engines are designed to be used for several decades. Their
maintenance is a challenging and costly task, for obvious safety reasons. The
goal is to ensure proper operation of the engines, in all conditions, with a
zero probability of failure, while taking into account aging. The fact that the
same engine is sometimes used on several aircraft has to be taken into account
too. The maintenance can be improved if an efficient procedure for the
prediction of failures is implemented. The primary source of information on the
health of the engines comes from measurements taken during flights. Several variables
such as the core speed, the oil pressure and quantity, the fan speed, etc. are
measured, together with environmental variables such as the outside
temperature, altitude, aircraft speed, etc. In this paper, we describe the
design of a procedure aiming at visualizing successive data measured on
aircraft engines. The data are multi-dimensional measurements on the engines,
which are projected on a self-organizing map in order to allow us to follow the
trajectories of these data over time. The trajectories consist of a succession
of points on the map, each of them corresponding to the two-dimensional
projection of the multi-dimensional vector of engine measurements. Analyzing
the trajectories aims at visualizing any deviation from a normal behavior,
making it possible to anticipate an operation failure.
Comment: Paper presented at the 7th International Workshop WSOM 09, St
Augustine, Florida, USA, June 200
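The projection step described above can be sketched as follows, assuming a map that has already been trained: each measurement vector is assigned to the grid unit whose prototype is nearest, and the sequence of units forms the trajectory. The 2x2 codebook, the two normalized variables and the function names are hypothetical; training the map is not shown.

```python
def best_matching_unit(codebook, vector):
    """Return the grid position whose prototype is closest to vector."""
    best_position, best_distance = None, float("inf")
    for position, prototype in codebook.items():
        # Squared Euclidean distance is enough to find the nearest unit.
        distance = sum((p - v) ** 2 for p, v in zip(prototype, vector))
        if distance < best_distance:
            best_position, best_distance = position, distance
    return best_position


def trajectory(codebook, measurements):
    """Project successive measurement vectors onto the map."""
    return [best_matching_unit(codebook, m) for m in measurements]


# Hypothetical pre-trained prototypes, keyed by (row, column) on the map.
codebook = {
    (0, 0): (0.1, 0.1),
    (0, 1): (0.1, 0.9),
    (1, 0): (0.9, 0.1),
    (1, 1): (0.9, 0.9),
}
# Hypothetical normalized measurements from four successive flights.
flights = [(0.2, 0.1), (0.2, 0.2), (0.7, 0.3), (0.95, 0.8)]
path = trajectory(codebook, flights)
print(path)
```

A trajectory that drifts away from the units usually occupied by healthy engines is the kind of deviation the visualization is meant to expose.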