33 research outputs found

    A Stable Greedy Insertion Treemap Algorithm for Software Evolution Visualization

    Get PDF
    Computing treemap layouts for time-dependent (dynamic) trees is an open problem in information visualization. In particular, the constraints of spatial quality (cell aspect ratio) and stability (small treemap changes mandated by given tree-data changes) are hard to satisfy simultaneously. Most existing treemap methods focus on spatial quality, but are not inherently designed to address stability. We propose here a new treemapping method that aims to jointly optimize both these constraints. Our method is simple to implement, generic (handles any types of dynamic hierarchies), and fast. We compare our method with 14 state of the art treemaping algorithms using four quality metrics, over 28 dynamic hierarchies extracted from evolving software codebases. The comparison shows that our proposal jointly optimizes spatial quality and stability better than existing methods

    Visualization of dynamic multidimensional and hierarchical datasets

    Get PDF
    When it comes to tools and techniques designed to help understanding complex abstract data, visualization methods play a prominent role. They enable human operators to lever age their pattern finding, outlier detection, and questioning abilities to visually reason about a given dataset. Many methods exist that create suitable and useful visual represen tations of static abstract, non-spatial, data. However, for temporal abstract, non-spatial, datasets, in which the data changes and evolves through time, far fewer visualization tech niques exist. This thesis focuses on the particular cases of temporal hierarchical data representation via dynamic treemaps, and temporal high-dimensional data visualization via dynamic projec tions. We tackle the joint question of how to extend projections and treemaps to stably, accurately, and scalably handle temporal multivariate and hierarchical data. The literature for static visualization techniques is rich and the state-of-the-art methods have proven to be valuable tools in data analysis. Their temporal/dynamic counterparts, however, are not as well studied, and, until recently, there were few hierarchical and high-dimensional methods that explicitly took into consideration the temporal aspect of the data. In addi tion, there are few or no metrics to assess the quality of these temporal mappings, and even fewer comprehensive benchmarks to compare these methods. This thesis addresses the abovementioned shortcomings. For both dynamic treemaps and dynamic projections, we propose ways to accurately measure temporal stability; we eval uate existing methods considering the tradeoff between stability and visual quality; and we propose new methods that strike a better balance between stability and visual quality than existing state-of-the-art techniques. We demonstrate our methods with a wide range of real-world data, including an application of our new dynamic projection methods to support the analysis and classification of hyperkinetic movement disorder data.Quando se trata de ferramentas e técnicas projetadas para ajudar na compreensão dados abstratos complexos, métodos de visualização desempenham um papel proeminente. Eles permitem que os operadores humanos alavanquem suas habilidades de descoberta de padrões, detecção de valores discrepantes, e questionamento visual para a raciocinar sobre um determinado conjunto de dados. Existem muitos métodos que criam representações visuais adequadas e úteis de para dados estáticos, abstratos, e não-espaciais. No entanto, para dados temporais, abstratos, e não-espaciais, isto é, dados que mudam e evoluem no tempo, existem poucas técnicas apropriadas. Esta tese concentra-se nos casos específicos de representação temporal de dados hierárquicos por meio de treemaps dinâmicos, e visualização temporal de dados de alta dimen sionalidade via projeções dinâmicas. Nós abordar a questão conjunta de como estender projeções e treemaps de forma estável, precisa e escalável para lidar com conjuntos de dados hierárquico-temporais e multivariado-temporais. Em ambos os casos, a literatura para técnicas estáticas é rica e os métodos estado da arte provam ser ferramentas valiosas em análise de dados. Suas contrapartes temporais/dinâmicas, no entanto, não são tão bem estudadas e, até recentemente, existiam poucos métodos hierárquicos e de alta dimensão que explicitamente levavam em consideração o aspecto temporal dos dados. Além disso, existiam poucas métricas para avaliar a qualidade desses mapeamentos visuais temporais, e ainda menos benchmarks abrangentes para comparação esses métodos. Esta tese aborda as deficiências acima mencionadas para treemaps dinâmicos e projeções dinâmicas. Propomos maneiras de medir com precisão a estabilidade temporal; avalia mos os métodos existentes, considerando o compromisso entre estabilidade e qualidade visual; e propomos novos métodos que atingem um melhor equilíbrio entre estabilidade e a qualidade visual do que as técnicas estado da arte atuais. Demonstramos nossos mé todos com uma ampla gama de dados do mundo real, incluindo uma aplicação de nossos novos métodos de projeção dinâmica para apoiar a análise e classificação dos dados de transtorno de movimentos

    Visualization of dynamic multidimensional and hierarchical datasets

    Get PDF
    When it comes to tools and techniques designed to help understanding complex abstract data, visualization methods play a prominent role. They enable human operators to lever age their pattern finding, outlier detection, and questioning abilities to visually reason about a given dataset. Many methods exist that create suitable and useful visual represen tations of static abstract, non-spatial, data. However, for temporal abstract, non-spatial, datasets, in which the data changes and evolves through time, far fewer visualization tech niques exist. This thesis focuses on the particular cases of temporal hierarchical data representation via dynamic treemaps, and temporal high-dimensional data visualization via dynamic projec tions. We tackle the joint question of how to extend projections and treemaps to stably, accurately, and scalably handle temporal multivariate and hierarchical data. The literature for static visualization techniques is rich and the state-of-the-art methods have proven to be valuable tools in data analysis. Their temporal/dynamic counterparts, however, are not as well studied, and, until recently, there were few hierarchical and high-dimensional methods that explicitly took into consideration the temporal aspect of the data. In addi tion, there are few or no metrics to assess the quality of these temporal mappings, and even fewer comprehensive benchmarks to compare these methods. This thesis addresses the abovementioned shortcomings. For both dynamic treemaps and dynamic projections, we propose ways to accurately measure temporal stability; we eval uate existing methods considering the tradeoff between stability and visual quality; and we propose new methods that strike a better balance between stability and visual quality than existing state-of-the-art techniques. We demonstrate our methods with a wide range of real-world data, including an application of our new dynamic projection methods to support the analysis and classification of hyperkinetic movement disorder data.Quando se trata de ferramentas e técnicas projetadas para ajudar na compreensão dados abstratos complexos, métodos de visualização desempenham um papel proeminente. Eles permitem que os operadores humanos alavanquem suas habilidades de descoberta de padrões, detecção de valores discrepantes, e questionamento visual para a raciocinar sobre um determinado conjunto de dados. Existem muitos métodos que criam representações visuais adequadas e úteis de para dados estáticos, abstratos, e não-espaciais. No entanto, para dados temporais, abstratos, e não-espaciais, isto é, dados que mudam e evoluem no tempo, existem poucas técnicas apropriadas. Esta tese concentra-se nos casos específicos de representação temporal de dados hierárquicos por meio de treemaps dinâmicos, e visualização temporal de dados de alta dimen sionalidade via projeções dinâmicas. Nós abordar a questão conjunta de como estender projeções e treemaps de forma estável, precisa e escalável para lidar com conjuntos de dados hierárquico-temporais e multivariado-temporais. Em ambos os casos, a literatura para técnicas estáticas é rica e os métodos estado da arte provam ser ferramentas valiosas em análise de dados. Suas contrapartes temporais/dinâmicas, no entanto, não são tão bem estudadas e, até recentemente, existiam poucos métodos hierárquicos e de alta dimensão que explicitamente levavam em consideração o aspecto temporal dos dados. Além disso, existiam poucas métricas para avaliar a qualidade desses mapeamentos visuais temporais, e ainda menos benchmarks abrangentes para comparação esses métodos. Esta tese aborda as deficiências acima mencionadas para treemaps dinâmicos e projeções dinâmicas. Propomos maneiras de medir com precisão a estabilidade temporal; avalia mos os métodos existentes, considerando o compromisso entre estabilidade e qualidade visual; e propomos novos métodos que atingem um melhor equilíbrio entre estabilidade e a qualidade visual do que as técnicas estado da arte atuais. Demonstramos nossos mé todos com uma ampla gama de dados do mundo real, incluindo uma aplicação de nossos novos métodos de projeção dinâmica para apoiar a análise e classificação dos dados de transtorno de movimentos

    Space Partitioning Schemes and Algorithms for Generating Regular and Spiral Treemaps

    Full text link
    Treemaps have been widely applied to the visualization of hierarchical data. A treemap takes a weighted tree and visualizes its leaves in a nested planar geometric shape, with sub-regions partitioned such that each sub-region has an area proportional to the weight of its associated leaf nodes. Efficiently generating visually appealing treemaps that also satisfy other quality criteria is an interesting problem that has been tackled from many directions. We present an optimization model and five new algorithms for this problem, including two divide and conquer approaches and three spiral treemap algorithms. Our optimization model is able to generate superior treemaps that could serve as a benchmark for comparing the quality of more computationally efficient algorithms. Our divide and conquer and spiral algorithms either improve the performance of their existing counterparts with respect to aspect ratio and stability or perform competitively. Our spiral algorithms also expand their applicability to a wider range of input scenarios. Four of these algorithms are computationally efficient as well with quasilinear running times and the last algorithm achieves a cubic running time. A full version of this paper with all appendices, data, and source codes is available at \anonymizeOSF{\OSFSupplementText}

    行列技術を用いた動的ネットワーク可視化

    Get PDF
    学位の種別: 課程博士審査委員会委員 : (主査)東京大学教授 大澤 幸生, 東京大学教授 青山 和浩, 東京大学教授 和泉 潔, 東京大学准教授 森 純一郎, 首都大学東京教授 高間 康史University of Tokyo(東京大学

    Using machine learning to support better and intelligent visualisation for genomic data

    Get PDF
    Massive amounts of genomic data are created for the advent of Next Generation Sequencing technologies. Great technological advances in methods of characterising the human diseases, including genetic and environmental factors, make it a great opportunity to understand the diseases and to find new diagnoses and treatments. Translating medical data becomes more and more rich and challenging. Visualisation can greatly aid the processing and integration of complex data. Genomic data visual analytics is rapidly evolving alongside with advances in high-throughput technologies such as Artificial Intelligence (AI), and Virtual Reality (VR). Personalised medicine requires new genomic visualisation tools, which can efficiently extract knowledge from the genomic data effectively and speed up expert decisions about the best treatment of an individual patient’s needs. However, meaningful visual analysis of such large genomic data remains a serious challenge. Visualising these complex genomic data requires not only simply plotting of data but should also lead to better decisions. Machine learning has the ability to make prediction and aid in decision-making. Machine learning and visualisation are both effective ways to deal with big data, but they focus on different purposes. Machine learning applies statistical learning techniques to automatically identify patterns in data to make highly accurate prediction, while visualisation can leverage the human perceptual system to interpret and uncover hidden patterns in big data. Clinicians, experts and researchers intend to use both visualisation and machine learning to analyse their complex genomic data, but it is a serious challenge for them to understand and trust machine learning models in the serious medical industry. The main goal of this thesis is to study the feasibility of intelligent and interactive visualisation which combined with machine learning algorithms for medical data analysis. A prototype has also been developed to illustrate the concept that visualising genomics data from childhood cancers in meaningful and dynamic ways could lead to better decisions. Machine learning algorithms are used and illustrated during visualising the cancer genomic data in order to provide highly accurate predictions. This research could open a new and exciting path to discovery for disease diagnostics and therapies

    Multi-Level Trace Abstraction, Linking and Display

    Get PDF
    RÉSUMÉ Certains types de problèmes logiciels et de bogues ne peuvent être identifiées et résolus que lors de l'analyse de l'exécution des applications. L'analyse d'exécution (et le débogage) des systèmes parallèles et distribués est très difficile en utilisant uniquement le code source et les autres artefacts logiciels statiques. L'analyse dynamique par trace d'exécution est de plus en plus utilisée pour étudier le comportement d'un système. Les traces d'exécution contiennent généralement une grande quantité d'information sur l'exécution du système, par exemple quel processus/module interagit avec quels autres processus/modules, ou encore quel fichier est touché par celui-ci, et ainsi de suite. Les traces au niveau du système d'exploitation sont un des types de données des plus utiles et efficaces qui peuvent être utilisés pour détecter des problèmes d'exécution complexes. En effet, ils contiennent généralement des informations détaillées sur la communication inter-processus, sur l'utilisation de la mémoire, le système de fichiers, les appels système, la couche réseau, les blocs de disque, etc. Cette information peut être utilisée pour raisonner sur le comportement d'exécution du système et investiguer les bogues ainsi que les problèmes d'exécution. D'un autre côté, les traces d'exécution peuvent rapidement avoir une très grande taille en peu de temps à cause de la grande quantité d'information qu'elles contiennent. De plus, les traces contiennent généralement des données de bas niveau (appels système, interruptions, etc ) pour lesquelles l'analyse et la compréhension du contexte requièrent des connaissances poussées dans le domaine des systèmes d'exploitation. Très souvent, les administrateurs système et analystes préfèrent des données de plus haut niveau pour avoir une idée plus générale du comportement du système, contrairement aux traces noyau dont le niveau d'abstraction est très bas. Pour pouvoir générer efficacement des analyses de plus haut niveau, il est nécessaire de développer des algorithmes et des outils efficaces pour analyser les traces noyau et mettre en évidence les événements les plus pertinents. Le caractère expressif des événements de trace permet aux analystes de raisonner sur l'exécution du système à des niveaux plus élevés, pour découvrir le sens de l'exécution différents endroits, et détecter les comportements problématiques et inattendus. Toutefois, pour permettre une telle analyse, un outil de visualisation supplémentaire est nécessaire pour afficher les événements abstraits à de plus hauts niveaux d'abstraction. Cet outil peut permettre aux utilisateurs de voir une liste des problèmes détectés, de les suivre dans les couches de plus bas niveau (ex. directement dans la trace détaillée) et éventuellement de découvrir les raisons des problèmes détectés. Dans cette thèse, un cadre d'application est présenté pour relever ces défis : réduire la taille d'une trace ainsi que sa complexité, générer plusieurs niveaux d'événements abstraits pour les organiser d'une facon hiérarchique et les visualiser à de multiples niveaux, et enfin permettre aux utilisateurs d'effectuer une analyse verticale et à plusieurs niveaux d'abstraction, conduisant à une meilleure connaissance et compréhension de l'exécution du système. Le cadre d'application proposé est étudié en deux grandes parties : d'abord plusieurs niveaux d'abstraction de trace, et ensuite l'organisation de la trace à plusieurs niveaux et sa visualisation. La première partie traite des techniques utilisées pour les données de trace abstraites en utilisant soit les traces des événements contenus, un ensemble prédéfini de paramètres et de mesures, ou une structure basée l'abstraction pour extraire les ressources impliquées dans le système. La deuxième partie, en revanche, indique l'organisation hiérarchique des événements abstraits générés, l'établissement de liens entre les événements connexes, et enfin la visualisation en utilisant une vue de la chronologie avec échelle ajustable. Cette vue affiche les événements à différents niveaux de granularité, et permet une navigation hiérarchique à travers différentes couches d'événements de trace, en soutenant la mise a l'échelle sémantique. Grâce à cet outil, les utilisateurs peuvent tout d'abord avoir un aperçu de l'exécution, contenant un ensemble de comportements de haut niveau, puis peuvent se déplacer dans la vue et se concentrer sur une zone d'intérêt pour obtenir plus de détails sur celle-ci. L'outil proposé synchronise et coordonne les différents niveaux de la vue en établissant des liens entre les données, structurellement ou sémantiquement. La liaison structurelle utilise la délimitation par estampilles de temps des événements pour lier les données, tandis que le second utilise une pré-analyse des événements de trace pour trouver la pertinence entre eux ainsi que pour les lier. Lier les événements relie les informations de différentes couches qui appartiennent théoriquement à la même procédure ou spécifient le même comportement. Avec l'utilisation de la liaison de la correspondance, des événements dans une couche peuvent être analysés par rapport aux événements et aux informations disponibles dans d'autres couches, ce qui conduit à une analyse à plusieurs niveaux et la compréhension de l'exécution du système sous-jacent. Les exemples et les résultats expérimentaux des techniques d'abstraction et de visualisation proposés sont présentes dans cette thèse qui prouve l'efficacité de l'approche. Dans ce projet, toutes les évaluations et les expériences ont été menées sur la base des événements de trace au niveau du système d'exploitation recueillies par le traceur Linux Trace Toolkit Next Generation ( LTTng ). LTTng est un outil libre, léger et à faible impact qui fournit des informations d'exécution détaillées à partir des différents modules du système sous-jacent, tant au niveau du noyau Linux que de l'espace utilisateur. LTTng fonctionne en instrumentant le noyau et les applications utilisateur en insérant quelques points de traces à des endroits différents.----------ABSTRACT Some problems and bugs can only be identified and resolved using runtime application behavior analysis. Runtime analysis of multi-threaded and distributed systems is very difficult, almost impossible, by only analyzing the source code and other static software artifacts. Therefore, dynamic analysis through execution traces is increasingly used to study system runtime behavior. Execution traces usually contain large amounts of valuable information about the system execution, e.g., which process/module interacts with which other processes/modules, which file is touched by which process/module, which function is called by which process/module/function and so on. Operating system level traces are among the most useful and effective information sources that can be used to detect complex bugs and problems. Indeed, they contain detailed information about inter-process communication, memory usage, file system, system calls, networking, disk blocks, etc. This information can be used to understand the system runtime behavior, and to identify a large class of bugs, problems, and misbehavior. However, execution traces may become large, even within a few seconds or minutes of execution, making the analysis difficult. Moreover, traces are often filled with low-level data (system calls, interrupts, etc.) so that people need a complete understanding of the domain knowledge to analyze these data. It is often preferable for analysts to look at relatively abstract and high-level events, which are more readable and representative than the original trace data, and reveal the same behavior but at higher levels of granularity. However, to achieve such high-level data, effective algorithms and tools must be developed to process trace events, remove less important ones, highlight only necessary data, generalize trace events, and finally aggregate and group similar and related events. The expressive nature of the synthetic events allows analysts to reason about system execution at higher levels, to uncover execution behavior in different areas, and detect its problematic and unexpected aspects. However, to allow such analysis, an additional visualization tool may be required to display abstract events at different levels and explore them easily. This tool may enable users to see a list of detected problems, follow the problems in the detailed levels (e.g., within the raw trace events), analyze, and possibly discover the reasons for the detected problems. In this thesis, a framework is presented to address those challenges: to reduce the execution trace size and complexity, to generate several levels of abstract events, to organize the data in a hierarchy, to visualize them at multiple levels, and finally to enable users to perform a top-down and multiscale analysis over trace data, leading to a better understanding and comprehension of underlying system execution. The proposed framework is studied in two major parts: multi-level trace abstraction, and multi-level trace organization and visualization. The first part discusses the techniques used to abstract out trace data using either the trace events content, a predefined set of metrics and measures, or structure-based abstraction to extract the resources involved in the system execution. The second part determines the hierarchical organization of the generated abstract events, establishes links between the related events, and finally visualizes events using a zoomable timeline view. This view displays the events at different granularity levels, and enables a hierarchical navigation through different layers of trace events by supporting the semantic zooming. Using this tool, users can first see an overview of the execution, and then can pan around the view, and focus and zoom on any area of interest for more details and insight. The proposed view synchronizes and coordinates the different view levels by establishing links between data, structurally or semantically. The structural linking uses bounding times-tamps of the events to link the data, while the latter uses a pre-analysis of the trace events to find their relevance, and to link them together. Establishing Links connects the information that are conceptually related together ( e.g., events belong to the same process or specify the same behavior), so that events in one layer can be analyzed with respect to events and information in other layers, leading to a multi-level analysis and a better comprehension of the underlying system execution. In this project, all evaluations and experiments were conducted on operating system level traces obtained with the Linux Trace Toolkit Next Generation (LTTng). LTTng is a lightweight and low-impact open source tracing tool that provides valuable runtime information from the various modules of the underlying system at both the Linux kernel and user-space levels. LTTng works by instrumenting the (kernel and user-space) applications, with statically inserted tracepoints, and by generating log entries at runtime each time a tracepoint is hit
    corecore