    New insights into the suitability of the third dimension for visualizing multivariate/multidimensional data: a study based on loss of quality quantification

    Most visualization techniques have traditionally used two-dimensional rather than three-dimensional representations to visualize multidimensional and multivariate data. This article proposes a way to demonstrate the underlying superiority of three-dimensional over two-dimensional representation, based on the inevitable quality degradation produced when the dimensionality of the data is reduced. The problem is tackled with two approaches: a visual one and an analytical one. First, a set of statistical tests (point classification, distance perception, and outlier identification) using two-dimensional and three-dimensional visualization is carried out on a group of 40 users. The results indicate that including a third dimension improves accuracy; however, they do not allow definitive conclusions to be drawn on the superiority of three-dimensional representation. Therefore, a deeper study based on an analytical approach is proposed. The aim is to quantify the real loss of quality produced when the data are visualized in two-dimensional and three-dimensional spaces, relative to the original data dimensionality, and to analyze the difference between the two. To achieve this, a recently proposed methodology is used. The results of the analytical approach show that the loss of quality reaches significantly high values only when switching from three-dimensional to two-dimensional representation. The considerable quality degradation suffered in two-dimensional visualization strongly suggests the suitability of the third dimension for visualizing data.
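The idea of quantifying loss of quality relative to the original dimensionality can be illustrated with a minimal sketch. It is not the methodology used in the article, which the abstract does not specify: the normalized-stress measure, the synthetic data, and the `keep_first_axes` helper (a hypothetical stand-in for a real dimensionality reduction method) are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def normalized_stress(original, projected):
    """Relative distance distortion; 0 means every pairwise distance is kept."""
    d_high = pairwise_distances(original)
    d_low = pairwise_distances(projected)
    num = sum((h - l) ** 2 for h, l in zip(d_high, d_low))
    den = sum(h ** 2 for h in d_high)
    return math.sqrt(num / den)


def keep_first_axes(points, k):
    """Hypothetical stand-in for a DR method: keep only the first k coordinates."""
    return [p[:k] for p in points]


random.seed(0)
# Synthetic 5-dimensional data (an assumption; the article uses other data).
data = [tuple(random.gauss(0, 1) for _ in range(5)) for _ in range(60)]

loss_3d = normalized_stress(data, keep_first_axes(data, 3))
loss_2d = normalized_stress(data, keep_first_axes(data, 2))
print(f"loss in 3D: {loss_3d:.3f}, loss in 2D: {loss_2d:.3f}")
```

With this particular stand-in, dropping a further coordinate can only shrink pairwise distances, so the two-dimensional loss is never smaller than the three-dimensional one. A real DR method would need to be evaluated empirically.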

    A methodology to compare dimensionality reduction algorithms in terms of loss of quality

    Dimensionality Reduction (DR) is attracting more attention as a result of the increasing need to handle huge amounts of data effectively. DR methods reduce the number of initial features considerably, until a subset is found that preserves the original properties of the data. However, their use entails an inherent loss of quality that is likely to affect how the data are understood during analysis. Because each method behaves differently, this loss of quality could be decisive when selecting a DR method. In this paper, we propose a methodology for analyzing and comparing different DR methods with regard to the loss of quality they produce. The methodology uses the concept of preservation of geometry (quality assessment criteria) to assess the loss of quality. Experiments were carried out on 12 real-world datasets using the most well-known DR algorithms and quality assessment criteria from the literature. The results obtained so far show that it is possible to establish a procedure for selecting the most appropriate DR method in terms of minimum loss of quality. The experiments have also highlighted some interesting relationships between the quality assessment criteria. Finally, the methodology makes it possible to choose the target dimensionality that gives rise to a minimum loss of quality.
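Selecting a DR method by minimum loss of quality can be sketched as follows. This is only an illustration under stated assumptions: the k-nearest-neighbour preservation score stands in for the paper's quality assessment criteria (which the abstract does not list), the three entries in `candidates` are hypothetical toy projections rather than real DR algorithms, and the data are synthetic.

```python
import math
import random


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])


def neighbourhood_loss(original, embedded, k=5):
    """1 minus the mean fraction of each point's k neighbours kept by the embedding."""
    n = len(original)
    kept = sum(
        len(knn_indices(original, i, k) & knn_indices(embedded, i, k)) / k
        for i in range(n)
    )
    return 1.0 - kept / n


# Hypothetical stand-ins for DR methods: each maps a 4-D point to 2 coordinates.
candidates = {
    "first_two_axes": lambda p: (p[0], p[1]),
    "last_two_axes": lambda p: (p[-2], p[-1]),
    "axis_sums": lambda p: (p[0] + p[1], p[2] + p[3]),
}

random.seed(1)
# Synthetic 4-D data whose variance is concentrated in the first two axes.
data = [
    (random.gauss(0, 5), random.gauss(0, 5), random.gauss(0, 0.1), random.gauss(0, 0.1))
    for _ in range(80)
]

# Score every candidate with the same criterion, then pick the smallest loss.
losses = {
    name: neighbourhood_loss(data, [project(p) for p in data])
    for name, project in candidates.items()
}
best = min(losses, key=losses.get)
for name, loss in sorted(losses.items(), key=lambda item: item[1]):
    print(f"{name}: {loss:.3f}")
print("lowest loss:", best)
```

The same loop could be run over several target dimensionalities instead of several methods, which is one way to read the abstract's final point about choosing the dimensionality with minimum loss.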

    Faithful visualization and dimensionality reduction on graphics processing unit

    Information visualization is the process of transforming data, information and knowledge into a geometric representation in order to reveal otherwise unseen information. Dimensionality reduction (DR) is one of the strategies used to visualize high-dimensional data sets by projecting them onto a low-dimensional space where they can be visualized directly. The problem with DR is that the straightforward relationship between the original high-dimensional data sets and the low-dimensional space is lost, which leaves the colours of the visualization without meaning. A new nonlinear DR method called faithful stochastic proximity embedding (FSPE) is proposed in this thesis to visualize more complex data sets. The proposed method relies on the low-dimensional space rather than the high-dimensional data sets to overcome the main shortcomings of DR, by eliminating false neighbour points and preserving the neighbourhood relation to the true neighbours. The visualization produced by the proposed method displays faithful, useful and meaningful colours, in which the objects of the image can be easily distinguished. The experiments conducted indicate that FSPE is more accurate than many dimension reduction methods because it prevents false neighbourhood errors in the results as far as possible. In addition, we have demonstrated that FSPE plays an important role in enhancing the low-dimensional spaces produced by other DR methods. Choosing the points with the worst efficiency to update the rest of the points has helped to improve the visualization. The results showed that the proposed method plays a significant role in increasing the trustworthiness of the visualization by retrieving most of the local neighbourhood points that were missed during the projection process. The sequential dimensionality reduction (SDR) method is the second method proposed in this thesis. It redefines the problem of DR as a sequence of multiple DR problems, each of which reduces the dimensionality by a small amount. It maintains and preserves the relations among neighbouring points in the low-dimensional space. The results showed the accuracy of the proposed SDR, which leads to a better visualization with minimal false colours compared with direct projection by the DR method; these results are confirmed by comparing our method with 21 other methods. Although there are many measurement metrics, our proposed point-wise correlation metric is the better choice. In this metric, we evaluate the efficiency of each point in the visualization to generate a grey-scale efficiency image. This type of image gives more detail than an evaluation expressed as one single value, and the user can recognize the location of both the false and the true points. We compared the results of our proposed methods (FSPE and SDR) with many other dimension reduction methods in four scenarios: (1) unfolding curved cylinder data sets; (2) projecting human face data sets into two dimensions; (3) classifying connected networks; and (4) visualizing remote sensing imagery data sets. The results showed that our methods are able to produce good visualizations by preserving the corresponding colour distances between the visualization and the original data sets. The proposed methods are implemented on the graphics processing unit (GPU) to visualize different data sets. The benefit of a parallel implementation is that results are obtained in as short a time as possible. The results showed that the compute unified device architecture (CUDA) implementations of FSPE and SDR are faster than their sequential counterparts on the central processing unit (CPU) in performing floating-point operations, especially for large data sets. The GPU is also better suited to implementing the metric measurement methods because they involve a large amount of computation. We illustrated that this massive speed-up requires a parallel structure that is suitable for running on a GPU.
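The per-point evaluation described above can be sketched roughly as follows. The abstract does not give the definition of the point-wise correlation metric, so this sketch makes several assumptions: each point is scored by the Pearson correlation between its distances to all other points before and after projection, scores in [-1, 1] are mapped linearly to 8-bit grey levels, and the "embedding" is a hypothetical one that simply drops the third coordinate.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant vector; treated as 0 in this sketch
    return cov / math.sqrt(var_x * var_y)


def pointwise_correlation(original, embedded):
    """One score per point: how well its distances to all other points are kept."""
    scores = []
    for i in range(len(original)):
        d_high = [math.dist(original[i], q) for j, q in enumerate(original) if j != i]
        d_low = [math.dist(embedded[i], q) for j, q in enumerate(embedded) if j != i]
        scores.append(pearson(d_high, d_low))
    return scores


def to_grey_levels(scores):
    """Map scores in [-1, 1] linearly to grey levels 0 (unfaithful) to 255 (faithful)."""
    return [round(255 * max(0.0, min(1.0, (s + 1) / 2))) for s in scores]


random.seed(2)
data = [tuple(random.gauss(0, 1) for _ in range(3)) for _ in range(40)]
flat = [p[:2] for p in data]  # hypothetical embedding: drop the third coordinate

scores = pointwise_correlation(data, flat)
grey = to_grey_levels(scores)
worst = min(range(len(scores)), key=scores.__getitem__)
print(f"least faithful point: index {worst}, score {scores[worst]:.3f}")
```

Because every point is scored independently of the others, a computation of this shape parallelizes naturally, which is consistent with the abstract's remark that the metric methods suit a GPU.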