15 research outputs found

    Dynamic Prefetching of Data Tiles for Interactive Visualization

    In this paper, we present ForeCache, a general-purpose tool for exploratory browsing of large datasets. ForeCache utilizes a client-server architecture, where the user interacts with a lightweight client-side interface to browse datasets, and the data to be browsed is retrieved from a DBMS running on a back-end server. We assume a detail-on-demand browsing paradigm, and optimize the back-end support for this paradigm by inserting a separate middleware layer in front of the DBMS. To improve response times, the middleware layer fetches data ahead of the user as she explores a dataset. We consider two different mechanisms for prefetching: (a) learning what to fetch from the user's recent movements, and (b) using data characteristics (e.g., histograms) to find data similar to what the user has viewed in the past. We incorporate these mechanisms into a single prediction engine that adjusts its prediction strategies over time, based on changes in the user's behavior. We evaluated our prediction engine with a user study, and found that our dynamic prefetching strategy provides: (1) significant improvements in overall latency when compared with non-prefetching systems (430% improvement); and (2) substantial improvements in both prediction accuracy (25% improvement) and latency (88% improvement) relative to existing prefetching techniques.
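As a rough illustration of mechanism (a) only, the sketch below ranks candidate tiles by how often each pan direction appears in the user's recent history. It is a minimal sketch under stated assumptions, not ForeCache's prediction engine: the `MOVES` vocabulary, the `(x, y)` tile coordinates, and the function `predict_tiles` are all hypothetical names introduced here, and zooming is ignored.

```python
from collections import Counter

# Hypothetical move vocabulary for a tile grid: (dx, dy) pan steps.
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


def predict_tiles(current, recent_moves, k=2):
    """Return up to k tile coordinates to prefetch, ranked by how often
    each move appears in the user's recent history."""
    counts = Counter(recent_moves)
    ranked = [move for move, _ in counts.most_common(k)]
    x, y = current
    return [(x + MOVES[m][0], y + MOVES[m][1]) for m in ranked]


# The user has mostly panned right, so the tile to the right ranks first.
tiles = predict_tiles((5, 5), ["right", "right", "down", "right"])
```

A frequency count is the simplest possible predictor; the abstract describes an engine that additionally uses data characteristics and adapts its strategy over time, which this sketch does not attempt.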

    Data-driven Neuroscience: Enabling Breakthroughs Via Innovative Data Management

    Scientists in all disciplines increasingly rely on simulations to develop a better understanding of the subject they are studying. For example, the neuroscientists we collaborate with in the Blue Brain project have started to simulate the brain on a supercomputer. The level of detail of their models is unprecedented, as they model details at the subcellular level (e.g., the neurotransmitter). This level of detail, however, also leads to a true data deluge, and the neuroscientists have few tools to analyze the data efficiently. This demonstration showcases three innovative spatial management solutions that have substantial impact on computational neuroscience and other disciplines in that they make it possible to build, analyze and simulate bigger and more detailed models. More particularly, we visualize the novel query execution strategy of FLAT, an index for the scalable and efficient execution of range queries on increasingly detailed spatial models. FLAT is used to build and analyze models of the brain. We furthermore demonstrate how SCOUT uses previous query results to prefetch spatial data with high accuracy and therefore speeds up the analysis of spatial models. Finally, we demonstrate TOUCH, a novel in-memory spatial join that speeds up the model building process.

    SCOUT: Prefetching for Latent Structure Following Queries

    Today's scientists are quickly moving from in vitro to in silico experimentation: they no longer analyze natural phenomena in a petri dish, but instead they build models and simulate them. Managing and analyzing the massive amounts of data involved in simulations is a major task. Yet they lack the tools to efficiently work with data of this size. One problem many scientists share is the analysis of the massive spatial models they build. For several types of analysis they need to interactively follow the structures in the spatial model, e.g., the arterial tree, neuron fibers, etc., and issue range queries along the way. Each query takes a long time to execute, and the total time for executing a sequence of queries significantly delays data analysis. Prefetching the spatial data reduces the response time considerably, but known approaches do not prefetch with high accuracy. We develop SCOUT, a structure-aware method for prefetching data along interactive spatial query sequences. SCOUT uses an approximate graph model of the structures involved in past queries and attempts to identify what particular structure the user follows. Our experiments with neuroscience data show that SCOUT prefetches with an accuracy from 71% to 92%, which translates to a speedup of 4x-15x. SCOUT also improves the prefetching accuracy on datasets from other scientific domains, such as medicine and biology.

    JPIP proxy server with prefetching strategies based on user-navigation model and semantic map

    The efficient transmission of large-resolution images and, in particular, the interactive transmission of images in a client-server scenario, is an important aspect of many applications. Among the current image compression standards, JPEG2000 excels in its interactive transmission capabilities. In general, three mechanisms are employed to optimize the transmission of images when using the JPEG2000 Interactive Protocol (JPIP): 1) packet re-sequencing at the server; 2) prefetching at the client; and 3) proxy servers along the network infrastructure. To avoid network congestion, prefetching mechanisms are not commonly employed when many clients within a local area network (LAN) browse images from a remote server. Aiming to maximize the responsiveness of all the clients within a LAN, this work proposes the use of prefetching strategies at the proxy server rather than at the clients. The main insight behind the proposed prefetching strategies is a user-navigation model and a semantic map that predict the future requests of the clients. Experimental results indicate that the introduction of these strategies into a JPIP proxy server notably enhances the browsing experience of the end-users.

    Application of machine learning techniques to the management and optimization of tile caches for accelerating map services in spatial data infrastructures

    The rapid growth in the use of map services over the Web has created a need for increasingly scalable services. In response, tile-based map services have emerged as a scalable alternative to traditional map services, since they allow caching mechanisms to operate, or even allow the service to be delivered from a collection of pre-generated images. However, the storage requirements and start-up time of these services are often prohibitive when the cartography to be served covers a large geographic area at a high number of scales. For this reason, these services are usually offered through partial caches that contain only a subset of the cartography. Guaranteeing an acceptable Quality of Service (QoS) requires suitable policies for maintaining and managing these map caches: 1) strategies for the initial population, or seeding, of the cache; 2) algorithms for dynamic loading in response to user requests; and 3) cache replacement policies. However, few such strategies are specific to map services. Most of the strategies applied to these services are taken from other domains, such as traditional Web proxies, and do not take into account the spatial component of the map objects they manage. This thesis addresses that shortcoming by designing new algorithms specific to this application domain that optimize the performance of map services. Given the large number of objects managed by these caches and their heterogeneity in terms of layers, scales of representation, etc., an effort has been made to ensure that the designed strategies are automatic or semi-automatic, requiring as little human intervention as possible.
Accordingly, two novel strategies have been proposed for the initial population of a map cache. One uses a descriptive model built from the logs of past requests to the service. The other is based on a predictive model that identifies the geographic phenomena driving user requests, parameterized either through an OLS (Ordinary Least Squares) regression analysis or through an intelligent system based on neural networks. Important contributions have also been made regarding the replacement strategies for these caches. On the one hand, an intelligent system based on neural networks has been proposed, which estimates future access popularity from certain properties of the objects it manages: recency of reference, frequency of reference, and the size of the referenced tile. On the other hand, a strategy named Spatial-LFU has been proposed; it is a variant of the Perfect-LFU strategy, simplified by exploiting the spatial correlation that exists between requests. Departamento de Teoría de la Señal y Comunicaciones e Ingeniería Telemátic
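The abstract does not spell out how Spatial-LFU simplifies Perfect-LFU, so the following is only one plausible reading, offered as an assumption: neighbouring tiles share a single frequency counter, which shrinks the statistics that must be kept while still retaining counts after eviction, as Perfect-LFU does. The class `BlockLFUCache`, the `(z, x, y)` tile key, and the `block` parameter are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


class BlockLFUCache:
    """LFU-style tile cache where tiles in the same block share a counter.

    Assumption: this grouping is an illustrative stand-in for the spatial
    correlation mentioned in the abstract, not the thesis's algorithm.
    """

    def __init__(self, capacity, block=4):
        self.capacity = capacity
        self.block = block
        # Counters persist after eviction, in the spirit of Perfect-LFU.
        self.counts = defaultdict(int)
        self.tiles = {}

    def _region(self, key):
        z, x, y = key
        return (z, x // self.block, y // self.block)

    def get(self, key, load):
        """Return the tile for key, calling load(key) on a miss."""
        self.counts[self._region(key)] += 1
        if key in self.tiles:
            return self.tiles[key]
        if len(self.tiles) >= self.capacity:
            # Evict the cached tile whose region has been requested least.
            victim = min(self.tiles, key=lambda k: self.counts[self._region(k)])
            del self.tiles[victim]
        self.tiles[key] = load(key)
        return self.tiles[key]


cache = BlockLFUCache(capacity=2)
for _ in range(3):
    cache.get((0, 0, 0), load=str)
cache.get((0, 8, 8), load=str)   # a tile in a rarely requested region
cache.get((0, 1, 1), load=str)   # neighbour of the popular tile
```

With these requests the tile at `(0, 8, 8)` is evicted, because the newly requested neighbour inherits the popularity of its block.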

    Cartography

    The terrestrial space is the place of interaction of natural and social systems. Cartography is an essential tool for understanding the complexity of these systems, their interaction and evolution. This gives cartography an important place in the modern world. The book presents several contributions from different areas and activities, showing the importance of cartography to the perception and organization of the territory. Whether learning from the past or understanding the present, the use of cartography is presented as a way of looking at almost all themes of knowledge.

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
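To make the idea of strip partitioning concrete, the sketch below advances the discrete Game of Life one generation by splitting the grid into horizontal strips, each of which needs only one boundary row from each neighbouring strip. This is a single-process illustration under stated assumptions, not the authors' MR streaming algorithm: the function names, the `strip_height` parameter, and the dead-cell boundary are choices made here, and in a MapReduce setting each strip would be handled by a separate task.

```python
def step_strip(strip, above, below):
    """Advance one strip by one generation; above/below are boundary rows."""
    rows = [above] + strip + [below]
    width = len(strip[0])
    out = []
    for r in range(1, len(rows) - 1):
        new_row = []
        for c in range(width):
            # Count the eight neighbours, treating cells off the sides as dead.
            n = sum(
                rows[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr or dc) and 0 <= c + dc < width
            )
            alive = rows[r][c]
            new_row.append(1 if n == 3 or (alive and n == 2) else 0)
        out.append(new_row)
    return out


def step(grid, strip_height=2):
    """Advance the whole grid by processing independent horizontal strips."""
    dead = [0] * len(grid[0])
    result = []
    for top in range(0, len(grid), strip_height):
        strip = grid[top:top + strip_height]
        bottom = top + len(strip)
        above = grid[top - 1] if top > 0 else dead
        below = grid[bottom] if bottom < len(grid) else dead
        result.extend(step_strip(strip, above, below))
    return result


blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
next_gen = step(blinker)
```

Because each strip depends only on its own rows plus two boundary rows, the amount of data exchanged between tasks grows with the grid width rather than with the number of cells, which is presumably what makes the partitioning attractive.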

    Fundamentals

    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters.