3 research outputs found

    Preserving Command Line Workflow for a Package Management System Using ASCII DAG Visualization

    Package managers provide ease of access to applications by removing the time-consuming and sometimes completely prohibitive barrier of successfully building, installing, and maintaining the software for a system. A package dependency graph contains the dependencies between all packages required to build and run the target software. Package management system developers, package maintainers, and users may consult the dependency graph when a simple listing is insufficient for their analyses. However, users working in a remote command line environment must disrupt their workflow to visualize dependency graphs in graphical programs, possibly needing to move files between devices or incur forwarding lag. Such is the case for users of Spack, an open source package management system originally developed to ease the complex builds required by supercomputing environments. To preserve the command line workflow of Spack, we develop an interactive ASCII visualization for its dependency graphs. Through interviews with Spack maintainers, we identify user goals and corresponding visual tasks for dependency graphs. We evaluate the use of our visualization through a command line-centered study, comparing it to the system's two existing approaches. We observe that despite the limitations of the ASCII representation, our visualization is preferred by participants when approached from a command line interface workflow.

    Funding: U.S. Department of Energy by Lawrence Livermore National Laboratory [DE-AC52-07NA27344, LLNL-JRNL-746358].

    This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries.
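
    To make the idea of a terminal-only dependency view concrete, the following is a minimal sketch that prints a dependency graph as indented ASCII text. It is not the visualization described in the abstract and not Spack's own output: it simply unrolls the graph into a tree, so a shared dependency is printed once under each package that needs it. The function name `render_ascii_tree` and the package names in the example are hypothetical.

```python
from typing import Dict, List, Set


def render_ascii_tree(graph: Dict[str, List[str]], root: str) -> str:
    """Render the dependencies reachable from `root` as an ASCII tree.

    `graph` maps a package name to the names of its direct dependencies.
    Shared dependencies are repeated under every parent (tree unrolling).
    """
    lines = [root]

    def walk(node: str, prefix: str, on_path: Set[str]) -> None:
        deps = sorted(graph.get(node, []))
        for i, dep in enumerate(deps):
            last = i == len(deps) - 1
            lines.append(prefix + ("`-- " if last else "|-- ") + dep)
            if dep in on_path:
                # Guard against cycles: a well-formed dependency graph is
                # acyclic, but malformed input should not recurse forever.
                continue
            walk(dep, prefix + ("    " if last else "|   "), on_path | {dep})

    walk(root, "", {root})
    return "\n".join(lines)


# Hypothetical example graph: both libraries depend on `libcommon`.
example = {
    "app": ["libfoo", "libbar"],
    "libfoo": ["libcommon"],
    "libbar": ["libcommon"],
}
print(render_ascii_tree(example, "app"))
```

    Because the output is plain text, it can be read over a remote shell without forwarding a graphical program, which is the workflow constraint the abstract describes. A tree unrolling grows quickly when many packages share dependencies, which is one reason a dedicated graph layout may be preferable.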

    Mathematical optimization for the visualization of complex datasets

    This PhD dissertation focuses on developing new Mathematical Optimization models and solution approaches that help to gain insight into complex data structures arising in Information Visualization. The approaches developed in this thesis merge concepts from Multivariate Data Analysis and Mathematical Optimization, bridging theoretical mathematics with real-life problems. The usefulness of Information Visualization lies in its power to improve interpretability and decision making about the unknown phenomena described by raw data, as fully discussed in Chapter 1. In particular, this thesis studies datasets involving frequency distributions and proximity relations, which may even vary over time. Frameworks to visualize this information, which make use of Mixed Integer (Non)linear Programming and Difference of Convex tools, are formally proposed. Algorithmic approaches such as Large Neighborhood Search and the Difference of Convex Algorithm enable us to develop matheuristics to handle such models. More specifically, Chapter 2 addresses the problem of visualizing a frequency distribution and an adjacency relation attached to a set of individuals. This information is represented using a rectangular map, i.e., a subdivision of a rectangle into rectangular portions so that their areas reflect the frequencies, and the adjacencies between portions represent the adjacencies between the individuals. The visualization problem is formulated as a Mixed Integer Linear Programming model, and a matheuristic that has this model at its heart is proposed. Chapter 3 generalizes the model presented in the previous chapter by developing a visualization framework that simultaneously handles the representation of a frequency distribution and a dissimilarity relation.
This framework consists of a partition of a given rectangle into piecewise rectangular portions so that the areas of the regions represent the frequencies and the distances between them represent the dissimilarities. This visualization problem is formally stated as a Mixed Integer Nonlinear Programming model, which is solved by means of a matheuristic based on Large Neighborhood Search. In contrast to the previous chapters, in which a partition of the visualization region is sought, Chapter 4 addresses the problem of visualizing a set of individuals, to which a dissimilarity measure and a frequency distribution are attached, without necessarily covering the visualization region. In this visualization problem, individuals are depicted as convex bodies whose areas are proportional to the given frequencies. The aim is to determine the location of the convex bodies in the visualization region. To solve this problem, which generalizes standard Multidimensional Scaling, Difference of Convex tools are used. In Chapter 5, the model stated in the previous chapter is extended to the dynamic case, namely considering that frequencies and dissimilarities are observed along a set of time periods. The solution approach combines Difference of Convex techniques with Nonconvex Quadratic Binary Optimization. All the approaches presented are tested on real datasets. Finally, Chapter 6 closes this thesis with general conclusions and future lines of research.

    Premio Extraordinario de Doctorado U
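
    As a rough illustration of the rectangular-map idea from Chapter 2, where areas reflect frequencies, the sketch below splits a rectangle into vertical strips whose areas are proportional to the given frequencies. This is a simple strip layout swapped in for illustration, not the thesis's Mixed Integer Linear Programming model, and it makes no attempt to honor a prescribed adjacency relation. The function name `strip_layout` and the example data are hypothetical.

```python
from typing import Dict, Tuple

# A rectangle is (x, y, width, height), with (x, y) its lower-left corner.
Rect = Tuple[float, float, float, float]


def strip_layout(frequencies: Dict[str, float],
                 width: float = 1.0,
                 height: float = 1.0) -> Dict[str, Rect]:
    """Partition a width-by-height rectangle into vertical strips.

    Each strip's area is proportional to the frequency of its individual.
    Strips are placed left to right in the iteration order of `frequencies`.
    """
    total = sum(frequencies.values())
    if total <= 0:
        raise ValueError("frequencies must sum to a positive value")

    rects: Dict[str, Rect] = {}
    x = 0.0
    for name, freq in frequencies.items():
        strip_width = width * freq / total  # share of the total width
        rects[name] = (x, 0.0, strip_width, height)
        x += strip_width
    return rects


# Hypothetical example: "b" is three times as frequent as "a".
layout = strip_layout({"a": 1, "b": 3}, width=4.0, height=1.0)
for name, (x, y, w, h) in layout.items():
    print(f"{name}: x={x}, y={y}, w={w}, h={h}, area={w * h}")
```

    In a strip layout only consecutive strips share a border, so an arbitrary adjacency relation generally cannot be represented. That limitation hints at why the abstract frames the problem as an optimization model rather than a fixed layout rule.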