5,609 research outputs found

    Interactive Visualization of the Largest Radioastronomy Cubes

    Full text link
    3D visualization is an important data analysis and knowledge discovery tool, however, interactive visualization of large 3D astronomical datasets poses a challenge for many existing data visualization packages. We present a solution to interactively visualize larger-than-memory 3D astronomical data cubes by utilizing a heterogeneous cluster of CPUs and GPUs. The system partitions the data volume into smaller sub-volumes that are distributed over the rendering workstations. A GPU-based ray casting volume rendering is performed to generate images for each sub-volume, which are composited to generate the whole volume output, and returned to the user. Datasets including the HI Parkes All Sky Survey (HIPASS - 12 GB) southern sky and the Galactic All Sky Survey (GASS - 26 GB) data cubes were used to demonstrate our framework's performance. The framework can render the GASS data cube with a maximum render time < 0.3 second with 1024 x 1024 pixels output resolution using 3 rendering workstations and 8 GPUs. Our framework will scale to visualize larger datasets, even of Terabyte order, if proper hardware infrastructure is available.Comment: 15 pages, 12 figures, Accepted New Astronomy July 201

    Hillview:A trillion-cell spreadsheet for big data

    Get PDF
    Hillview is a distributed spreadsheet for browsing very large datasets that cannot be handled by a single machine. As a spreadsheet, Hillview provides a high degree of interactivity that permits data analysts to explore information quickly along many dimensions while switching visualizations on a whim. To provide the required responsiveness, Hillview introduces visualization sketches, or vizketches, as a simple idea to produce compact data visualizations. Vizketches combine algorithmic techniques for data summarization with computer graphics principles for efficient rendering. While simple, vizketches are effective at scaling the spreadsheet by parallelizing computation, reducing communication, providing progressive visualizations, and offering precise accuracy guarantees. Using Hillview running on eight servers, we can navigate and visualize datasets of tens of billions of rows and trillions of cells, much beyond the published capabilities of competing systems

    Slicer

    Get PDF
    Explorative data visualization is a widespread tool for gaining insights from datasets. Investigating data in linked visualizations lets users explore potential relationships in their data at will. Furthermore, this type of analysis does not require any technical knowledge, widening the userbase from developers to anyone. Implementing explorative data visualizations in web browsers makes data analysis accessible to anyone with a PC. In addition to accessibility, the available types of visualizations and their interactive latency are essential for the utility of data exploration. Available visualizations limit the number of datasets eligible for use in the application, and latency limits how much exploring the users are willing to do. Existing solutions often do all the computation involved in either the client application or on a backend server. However, using the client limits performance and data size since hardware resources in web browsers are scarce, and sending large datasets over a network is not feasible. Whereas server-based computation often comes with high requirements for server hardware and is limited by network latency and bandwidth on each interaction. This thesis presents Slicer, a framework for creating explorative data visualizations in web browsers. Applications can be created with minimal developer effort, requiring only a description of the visualizations. Slicer implements bar charts and choropleth maps. The visualizations are linked and can be filtered either by brushing or clicking on single targets. To overcome the hurdles of pure client- and server-reliant solutions, Slicer uses a hybrid approach, where prioritized interactions are handled client-side. Recognizing that different types of interactions have different latency thresholds, we trade the cost of switching views for low latency on filtering. To achieve real-time filtering performance, we follow the principle that the chosen resolution of the visualizations, not data size, should limit interactive scalability. We describe use of data tiles accommodating more interactions than shown in earlier work, using an approach based on delta differencing, which ensures constant time complexity when filtering. For computing data tiles, we present techniques for efficient computation on consumer hardware. Our results show that Slicer can offer real-time interactivity on latency-sensitive interactions regardless of data size, averaging above 150Hz on a consumer laptop. For less sensitive interactions, acceptable latency is shown for datasets with tens of millions of records, depending on the resolution of the visualizations

    The archive solution for distributed workflow management agents of the CMS experiment at LHC

    Full text link
    The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as central HDFS and Hadoop Spark cluster. The system leverages modern technologies such as a document oriented database and the Hadoop eco-system to provide the necessary flexibility to reliably process, store, and aggregate O\mathcal{O}(1M) documents on a daily basis. We describe the data transformation, the short and long term storage layers, the query language, along with the aggregation pipeline developed to visualize various performance metrics to assist CMS data operators in assessing the performance of the CMS computing system.Comment: This is a pre-print of an article published in Computing and Software for Big Science. The final authenticated version is available online at: https://doi.org/10.1007/s41781-018-0005-

    End-User Visualization and Manipulation of Distributed Aggregate Data

    Get PDF
    Aggregate visualization and manipulation enables the viewing and interaction of dynamically changing data sets in a graphically meaningful way. However, off-the-shelf applications typically provide only limited ways to view static aggregates and generally do not support manipulation of aggregate data through the resulting visualization. To be fully dynamic, an aggregate visualization should be customizable to suit the individual&apos;s needs and should allow end-users to modify the data through direct manipulation. This paper describes a software system that empowers end-users to create interactive aggregate visualizations through a visual language interface. Included are mechanisms for specifying how aggregate data is processed from multiple sources of a distributed application, providing functionality similar to project, select, join, and cross product of relational databases. This approach gives end-users the power to create customized, interactive visualizations of dynamically changing ..

    MOLNs: A cloud platform for interactive, reproducible and scalable spatial stochastic computational experiments in systems biology using PyURDME

    Full text link
    Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools, a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments
    corecore