303 research outputs found

    Doctor of Philosophy

    Get PDF
    Dataflow pipeline models are widely used in visualization systems. Despite recent advancements in parallel architecture, most systems still support only a single CPU or a small collection of CPUs, such as an SMP workstation. Even systems that are specifically tuned for parallel visualization provide execution models that support only data-parallelism, ignoring task-parallelism and pipeline-parallelism. With the recent popularization of machines equipped with multicore CPUs and multiple GPUs, these visualization systems fall ever further behind in reaching maximum efficiency. On the other hand, several libraries exist that can schedule program execution on multiple CPUs and/or multiple GPUs. However, because executing a task graph differs from executing a pipeline, and because their APIs are considerably low-level, integrating these run-time libraries into current visualization systems remains a challenge. Thus, there is a need for a redesigned dataflow architecture that fully supports and exploits the power of highly parallel machines in large-scale visualization. The new design must be able to schedule execution on heterogeneous platforms while supporting arbitrarily large datasets through the use of streaming data structures. The primary goal of this dissertation is to develop a parallel dataflow architecture for streaming large-scale visualizations. The framework supports platforms ranging from multicore processors to clusters consisting of thousands of CPUs and GPUs. We achieve this by introducing the notions of Virtual Processing Elements and Task-Oriented Modules, along with a highly customizable scheduler that dynamically controls the assignment of tasks to elements. This creates an intuitive way to maintain multiple CPU/GPU kernels while still providing coherency and synchronization across module executions. We have implemented these techniques in HyperFlow, which consists of an API with all the basic dataflow constructs described in the dissertation and a distributed run-time library that can deploy those pipelines on multicore, multi-GPU and cluster-based platforms.
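    A minimal sketch of the scheduling idea the abstract describes: tasks flow through a shared queue and are claimed dynamically by workers bound to heterogeneous devices. The names VirtualPE and run_pipeline and the pull-based policy are illustrative assumptions, not HyperFlow's actual API.

```python
# Sketch: dynamic assignment of pipeline tasks to heterogeneous
# "virtual processing elements" (VPEs). Names and policy are
# illustrative assumptions, not HyperFlow's actual API.
import queue
import threading

class VirtualPE(threading.Thread):
    """A worker bound to one device (e.g. a CPU core or a GPU)."""
    def __init__(self, device, tasks):
        super().__init__(daemon=True)
        self.device = device
        self.tasks = tasks          # shared task queue

    def run(self):
        while True:
            task = self.tasks.get()
            if task is None:        # poison pill: shut down
                break
            task(self.device)       # execute the module on this device
            self.tasks.task_done()

def run_pipeline(modules, devices):
    """Dynamically assign pipeline modules to available devices."""
    tasks = queue.Queue()
    workers = [VirtualPE(d, tasks) for d in devices]
    for w in workers:
        w.start()
    for m in modules:               # task-, data- and pipeline-parallel
        tasks.put(m)                # work all flow through one queue
    tasks.join()
    for _ in workers:
        tasks.put(None)

if __name__ == "__main__":
    stages = [lambda dev, i=i: print(f"stage {i} on {dev}") for i in range(6)]
    run_pipeline(stages, devices=["cpu:0", "cpu:1", "gpu:0"])
```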

    MOLNs: A cloud platform for interactive, reproducible and scalable spatial stochastic computational experiments in systems biology using PyURDME

    Full text link
    Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools, a complex software stack, and large, scalable compute and data analysis resources, owing to the high computational cost of Monte Carlo workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This creates a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for the development of sharable and reproducible distributed parallel computational experiments.
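    A generic sketch of the embarrassingly parallel ensemble pattern that MOLNs distributes across cloud workers: many independent stochastic realizations run in parallel and their results are aggregated. This is not the PyURDME API; the toy birth-death process stands in for one reaction-diffusion realization.

```python
# Sketch of a distributed Monte Carlo ensemble: independent stochastic
# realizations in parallel, aggregated afterwards. Not the PyURDME API;
# realization() is a hypothetical stand-in for one simulation run.
import random
from multiprocessing import Pool

def realization(seed):
    """One Monte Carlo realization: a toy birth-death process."""
    rng = random.Random(seed)
    n = 10                              # initial molecule count
    for _ in range(1000):               # fixed number of SSA-like steps
        n += 1 if rng.random() < 0.5 else -1
        n = max(n, 0)
    return n

if __name__ == "__main__":
    with Pool() as pool:                # MOLNs would spread this work
        counts = pool.map(realization, range(500))  # over cloud workers
    print("mean copy number:", sum(counts) / len(counts))
```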

    Bigdata architecture for large-scale scientific computing

    No full text
    Today, the scientific community relies heavily on simulations to test theories and to understand physical phenomena. Simulation is, however, limited by two important factors: the number of elements used and the number of time-steps that are computed and stored. Both limits are constrained by hardware capabilities (computation nodes and/or storage). From this observation arises the VELaSSCo project. Its goal is to design, implement and deploy a platform to store data for DEM (Discrete Element Method) and FEM (Finite Element Method) simulations. These simulations can produce huge amounts of data, both in the number of elements (particles in DEM) that are computed and in the number of time-steps processed. The VELaSSCo platform addresses this problem by providing a framework that fulfils the application needs and runs on any available hardware. The platform is composed of different software modules: a Hadoop distribution and several specific plug-ins. The plug-ins are designed to handle the data produced by the simulations. The output of the platform is designed to fit the requirements of available visualization software.
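    A hedged sketch of the kind of plug-in computation such a platform runs over simulation output: a Hadoop Streaming job (mapper and reducer in one script, selected by argument) that averages a per-particle quantity for each DEM time-step. The input line format "timestep particle_id value" is an assumption for illustration, not the VELaSSCo schema.

```python
# Hadoop Streaming-style job over simulation output. Assumed input
# format: "timestep particle_id value" (one record per line); this is
# an illustrative schema, not the platform's actual one.
import sys

def mapper(stream):
    """Emit (timestep, value) pairs, tab-separated."""
    for line in stream:
        timestep, _particle, value = line.split()
        print(f"{timestep}\t{value}")

def reducer(stream):
    """Average values per timestep; Hadoop delivers input sorted by key."""
    current, total, count = None, 0.0, 0
    for line in stream:
        key, value = line.rstrip("\n").split("\t")
        if key != current and current is not None:
            print(f"{current}\t{total / count}")
            total, count = 0.0, 0
        current = key
        total += float(value)
        count += 1
    if current is not None:
        print(f"{current}\t{total / count}")

if __name__ == "__main__":
    (mapper if sys.argv[1] == "map" else reducer)(sys.stdin)
```

    Such a script would typically be launched through Hadoop's streaming jar, with the map and reduce commands passed via -mapper and -reducer; the exact invocation depends on the cluster setup.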

    Leading Undergraduate Students to Big Data Generation

    Get PDF
    People are facing a flood of data today. Data are being collected at unprecedented scale in many areas, such as networking, image processing, virtualization, scientific computation, and algorithms. Such huge collections are now called Big Data, an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications. In this article, the authors present a unique approach that uses a network simulator and image processing tools to train students' abilities to learn, analyze, manipulate, and apply Big Data, thereby developing both their hands-on skills with Big Data and their critical thinking abilities. The authors used a novel image-based rendering algorithm with user intervention to generate a realistic 3D virtual world. The learning outcomes are significant.

    Doctor of Philosophy

    Get PDF
    Interactive editing and manipulation of digital media is a fundamental component of digital content creation. One medium in particular, digital imagery, has seen a recent increase in the popularity of its large or even massive image formats. Unfortunately, current systems and techniques are rarely concerned with scalability or usability for these large images. Moreover, processing massive (or even large) imagery is assumed to be an off-line, automatic process, although many problems associated with these datasets require human intervention for high-quality results. This dissertation details how to design interactive image techniques that scale. In particular, massive imagery is typically constructed as a seamless mosaic of many smaller images. The focus of this work is the creation of new technologies to enable user interaction in the formation of these large mosaics. While an interactive system for all stages of the mosaic creation pipeline is a long-term research goal, this dissertation concentrates on the last phase of the pipeline: the composition of registered images into a seamless composite. The work detailed in this dissertation provides the technologies to fully realize interactive editing in mosaic composition on image collections ranging from the very small to the massive in scale.
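    A minimal sketch of the final compositing step the dissertation targets: blending two registered, overlapping tiles with linear feathering. Production systems use gradient-domain methods and out-of-core tiling for massive mosaics; this only illustrates the seam-blending idea under those simplifying assumptions.

```python
# Feathered blending of two registered grayscale tiles that share a
# fixed number of overlapping columns. A toy stand-in for the seamless
# compositing step; real mosaic systems use gradient-domain blending.
import numpy as np

def feather_blend(left, right, overlap):
    """Blend two HxW tiles whose last/first `overlap` columns coincide."""
    h, w = left.shape
    alpha = np.linspace(1.0, 0.0, overlap)          # weight for left tile
    out = np.zeros((h, w * 2 - overlap), dtype=left.dtype)
    out[:, :w - overlap] = left[:, :w - overlap]
    out[:, w:] = right[:, overlap:]
    seam = alpha * left[:, w - overlap:] + (1 - alpha) * right[:, :overlap]
    out[:, w - overlap:w] = seam                    # cross-fade the seam
    return out

tile_a = np.full((4, 8), 0.2)
tile_b = np.full((4, 8), 0.8)
print(feather_blend(tile_a, tile_b, overlap=3).round(2))
```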

    Optimization of Real-World MapReduce Applications With Flame-MR: Practical Use Cases

    Get PDF
    Apache Hadoop is a widely used MapReduce framework for storing and processing large amounts of data. However, it presents performance issues that hinder its use in many practical cases. Although existing alternatives like Spark or Hama can outperform Hadoop, they require rewriting the source code of the applications due to API incompatibilities. This paper studies the use of Flame-MR, an in-memory processing architecture for MapReduce applications, to improve the performance of real-world use cases in a transparent way while keeping application compatibility. Flame-MR adapts to the characteristics of the workloads, efficiently managing custom data formats and iterative computations while also reducing workload imbalance. The experimental evaluation, conducted on high performance clusters and the Microsoft Azure cloud, shows that Flame-MR clearly outperforms Hadoop. In most cases, Flame-MR reduces execution times by more than half.
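    A hedged illustration of why in-memory execution helps the iterative workloads Flame-MR targets: the dataset stays resident across map/reduce rounds instead of being re-read from HDFS each iteration. This is a pure-Python sketch of the execution model only, not Flame-MR's Java API; the toy 1-D k-means is a stand-in for an iterative MapReduce job.

```python
# In-memory iterative map/reduce sketch: the data list stays resident
# across rounds, the pattern Flame-MR exploits. Toy model only, not
# Flame-MR's actual (Hadoop-compatible) Java API.
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn):
    groups = defaultdict(list)
    for rec in records:
        for key, value in map_fn(rec):                  # map phase
            groups[key].append(value)
    return {k: reduce_fn(vs) for k, vs in groups.items()}  # reduce phase

data = list(range(1, 11))
centroids = [2.0, 8.0]                   # toy 1-D k-means
for _ in range(5):                       # data stays in memory each round
    assign = map_reduce(
        data,
        lambda x: [(min(centroids, key=lambda c: abs(c - x)), x)],
        lambda xs: sum(xs) / len(xs),
    )
    centroids = sorted(assign.values())
    print(centroids)
```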

    Hillview: A trillion-cell spreadsheet for big data

    Get PDF
    Hillview is a distributed spreadsheet for browsing very large datasets that cannot be handled by a single machine. As a spreadsheet, Hillview provides a high degree of interactivity that permits data analysts to explore information quickly along many dimensions while switching visualizations on a whim. To provide the required responsiveness, Hillview introduces visualization sketches, or vizketches, as a simple idea for producing compact data visualizations. Vizketches combine algorithmic techniques for data summarization with computer graphics principles for efficient rendering. While simple, vizketches are effective at scaling the spreadsheet by parallelizing computation, reducing communication, providing progressive visualizations, and offering precise accuracy guarantees. Using Hillview running on eight servers, we can navigate and visualize datasets of tens of billions of rows and trillions of cells, far beyond the published capabilities of competing systems.
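    A hedged sketch of the vizketch idea: each worker reduces its data shard to a small, mergeable summary (here a fixed-bin histogram), partial summaries merge cheaply and associatively, and only the tiny merged result is rendered. Hillview's actual sketches also carry accuracy guarantees and run progressively; this illustrates mergeability only.

```python
# Mergeable-summary sketch: per-shard histograms combine into one small
# result that drives the visualization. Illustrative only; not
# Hillview's actual sketch interface.
import random

BINS, LO, HI = 10, 0.0, 1.0

def sketch(shard):
    """Map one data shard to a fixed-size histogram summary."""
    hist = [0] * BINS
    for x in shard:
        i = min(int((x - LO) / (HI - LO) * BINS), BINS - 1)
        hist[i] += 1
    return hist

def merge(h1, h2):
    """Combine two partial summaries; associative and commutative."""
    return [a + b for a, b in zip(h1, h2)]

shards = [[random.random() for _ in range(10_000)] for _ in range(8)]
merged = sketch(shards[0])
for s in shards[1:]:                 # in Hillview this runs per server
    merged = merge(merged, sketch(s))
for i, c in enumerate(merged):       # "render" as a textual bar chart
    print(f"{LO + i * (HI - LO) / BINS:.1f} {'#' * (c // 500)}")
```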

    Beyond visualization: designing interfaces to contextualize geospatial data

    Get PDF
    Thesis (S.M.) -- Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2013. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 71-74). The growing sensor data collections about our environment have the potential to drastically change our perception of the fragile world we live in. To make sense of such data, we commonly use visualization techniques, enabling public discourse and analysis. This thesis describes the design and implementation of a series of interactive systems that integrate geospatial sensor data visualization and terrain models with various user interface modalities in an educational context, supporting data analysis and knowledge building through part-digital, part-physical rendering. The main contribution of this thesis is a concrete application scenario and initial prototype of a "Designed Environment" in which we can explore the relationship between the surface of Japan's islands, the tension that originates in the fault lines along the seafloor off its east coast, and the resulting natural disasters. The system can import geospatial data from a multitude of sources on the "Spatial Web", bringing us one step closer to a tangible "dashboard of the Earth." Samuel Luescher. S.M.

    Conditional Adversarial Networks for Multimodal Photo-Realistic Point Cloud Rendering

    Get PDF
    We investigate whether conditional generative adversarial networks (C-GANs) are suitable for point cloud rendering. For this purpose, we created a dataset containing approximately 150,000 pairs, each consisting of a point cloud rendering and the corresponding camera image. The dataset was recorded using our mobile mapping system, with capture dates spread across one year. Our model learns to predict realistic-looking images from point cloud data alone. We show that this approach can be used to colourize point clouds without the use of any camera images. Additionally, we show that by parameterizing the recording date, we can predict realistic-looking views for different seasons from identical input point clouds.
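    A hedged pix2pix-style sketch of the conditioning idea: the generator receives the rendered point-cloud image plus an extra channel encoding the normalized recording date, and learns to predict the camera image. The architectures, layer sizes, and loss weighting below are placeholders, not the paper's actual model.

```python
# Conditional-GAN training step where the recording date is injected as
# an extra input channel. Toy architecture; placeholders throughout,
# not the paper's model.
import torch
import torch.nn as nn

class G(nn.Module):                      # toy generator
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),  # 3-ch render + 1-ch date
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),  # predicted RGB image
        )
    def forward(self, render, date):
        b, _, h, w = render.shape
        date_ch = date.view(b, 1, 1, 1).expand(b, 1, h, w)
        return self.net(torch.cat([render, date_ch], dim=1))

class D(nn.Module):                      # toy conditional discriminator
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 3, padding=1),  # PatchGAN-style logits
        )
    def forward(self, render, img):
        return self.net(torch.cat([render, img], dim=1))

g, d = G(), D()
opt_g = torch.optim.Adam(g.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(d.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

render = torch.rand(2, 3, 64, 64)        # point-cloud rendering (input)
photo = torch.rand(2, 3, 64, 64)         # matching camera image (target)
date = torch.rand(2)                     # day of year, normalized to [0, 1]

fake = g(render, date)                   # one adversarial step
real_logits = d(render, photo)
fake_logits = d(render, fake.detach())
d_loss = bce(real_logits, torch.ones_like(real_logits)) + \
         bce(fake_logits, torch.zeros_like(fake_logits))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

fake_logits = d(render, fake)
g_loss = bce(fake_logits, torch.ones_like(fake_logits)) + \
         (fake - photo).abs().mean()     # adversarial + L1 term
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

    Varying the date input at inference time is what would let such a model, under these assumptions, render the same point cloud as different seasons.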