562 research outputs found

    Computer-language based data prefetching techniques

    Get PDF
    Data prefetching has long been used as a technique to improve access times to persistent data. It is based on retrieving data records from persistent storage to main memory before the records are needed. Data prefetching has been applied to a wide variety of persistent storage systems, from file systems to Relational Database Management Systems and NoSQL databases, with the aim of reducing access times to the data maintained by the system and thus improve the execution times of the applications using this data. However, most existing solutions to data prefetching have been based on information that can be retrieved from the storage system itself, whether in the form of heuristics based on the data schema or data access patterns detected by monitoring access to the system. There are multiple disadvantages of these approaches in terms of the rigidity of the heuristics they use, the accuracy of the predictions they make and / or the time they need to make these predictions, a process often performed while the applications are accessing the data and causing considerable overhead. In light of the above, this thesis proposes two novel approaches to data prefetching based on predictions made by analyzing the instructions and statements of the computer languages used to access persistent data. The proposed approaches take into consideration how the data is accessed by the higher-level applications, make accurate predictions and are performed without causing any additional overhead. The first of the proposed approaches aims at analyzing instructions of applications written in object-oriented languages in order to prefetch data from Persistent Object Stores. The approach is based on static code analysis that is done prior to the application execution and hence does not add any overhead. It also includes various strategies to deal with cases that require runtime information unavailable prior to the execution of the application. We integrate this analysis approach into an existing Persistent Object Store and run a series of extensive experiments to measure the improvement obtained by prefetching the objects predicted by the approach. The second approach analyzes statements and historic logs of the declarative query language SPARQL in order to prefetch data from RDF Triplestores. The approach measures two types of similarity between SPARQL queries in order to detect recurring query patterns in the historic logs. Afterwards, it uses the detected patterns to predict subsequent queries and launch them before they are requested to prefetch the data needed by them. Our evaluation of the proposed approach shows that it high-accuracy prediction and can achieve a high cache hit rate when caching the results of the predicted queries.Precargar datos ha sido una de las técnicas más comunes para mejorar los tiempos de acceso a datos persistentes. Esta técnica se basa en predecir los registros de datos que se van a acceder en el futuro y cargarlos del almacenimiento persistente a la memoria con antelación a su uso. Precargar datos ha sido aplicado en multitud de sistemas de almacenimiento persistente, desde sistemas de ficheros a bases de datos relacionales y NoSQL, con el objetivo de reducir los tiempos de acceso a los datos y por lo tanto mejorar los tiempos de ejecución de las aplicaciones que usan estos datos. Sin embargo, la mayoría de los enfoques existentes utilizan predicciones basadas en información que se encuentra dentro del mismo sistema de almacenimiento, ya sea en forma de heurísticas basadas en el esquema de los datos o patrones de acceso a los datos generados mediante la monitorización del acceso al sistema. Estos enfoques presentan varias desventajas en cuanto a la rigidez de las heurísticas usadas, la precisión de las predicciones generadas y el tiempo que necesitan para generar estas predicciones, un proceso que se realiza con frecuencia mientras las aplicaciones acceden a los datos y que puede tener efectos negativos en el tiempo de ejecución de estas aplicaciones. En vista de lo anterior, esta tesis presenta dos enfoques novedosos para precargar datos basados en predicciones generadas por el análisis de las instrucciones y sentencias del lenguaje informático usado para acceder a los datos persistentes. Los enfoques propuestos toman en consideración cómo las aplicaciones acceden a los datos, generan predicciones precisas y mejoran el rendimiento de las aplicaciones sin causar ningún efecto negativo. El primer enfoque analiza las instrucciones de applicaciones escritas en lenguajes de programación orientados a objetos con el fin de precargar datos de almacenes de objetos persistentes. El enfoque emplea análisis estático de código hecho antes de la ejecución de las aplicaciones, y por lo tanto no afecta negativamente el rendimiento de las mismas. El enfoque también incluye varias estrategias para tratar casos que requieren información de runtime no disponible antes de ejecutar las aplicaciones. Además, integramos este enfoque en un almacén de objetos persistentes y ejecutamos una serie extensa de experimentos para medir la mejora de rendimiento que se puede obtener utilizando el enfoque. Por otro lado, el segundo enfoque analiza las sentencias y logs del lenguaje declarativo de consultas SPARQL para precargar datos de triplestores de RDF. Este enfoque aplica dos medidas para calcular la similtud entre las consultas del lenguaje SPARQL con el objetivo de detectar patrones recurrentes en los logs históricos. Posteriormente, el enfoque utiliza los patrones detectados para predecir las consultas siguientes y precargar con antelación los datos que necesitan. Nuestra evaluación muestra que este enfoque produce predicciones de alta precisión y puede lograr un alto índice de aciertos cuando los resultados de las consultas predichas se guardan en el caché.Postprint (published version

    CAPre: Code-Analysis based Prefetching for Persistent object stores

    Get PDF
    Data prefetching aims to improve access times to data storage systems by predicting data records that are likely to be accessed by subsequent requests and retrieving them into a memory cache before they are needed. In the case of Persistent Object Stores, previous approaches to prefetching have been based on predictions made through analysis of the store’s schema, which generates rigid predictions, or monitoring access patterns to the store while applications are executed, which introduces memory and/or computation overhead. In this paper, we present CAPre, a novel prefetching system for Persistent Object Stores based on static code analysis of object-oriented applications. CAPre generates the predictions at compile-time and does not introduce any overhead to the application execution. Moreover, CAPre is able to predict large amounts of objects that will be accessed in the near future, thus enabling the object store to perform parallel prefetching if the objects are distributed, in a much more aggressive way than in schema-based prediction algorithms. We integrate CAPre into a distributed Persistent Object Store and run a series of experiments that show that it can reduce the execution time of applications from 9% to over 50%, depending on the nature of the application and its persistent data model.This work has been supported by the European Union’s Horizon 2020 research and innovation program under the BigStorage European Training Network (ETN) (grant H2020-MSCA-ITN-2014- 642963), the Spanish Ministry of Science and Innovation (contract TIN2015-65316) and the Generalitat de Catalunya, Spain (contract 2014-SGR-1051).Peer ReviewedPostprint (author's final draft

    Computer-language based data prefetching techniques

    Get PDF
    Data prefetching has long been used as a technique to improve access times to persistent data. It is based on retrieving data records from persistent storage to main memory before the records are needed. Data prefetching has been applied to a wide variety of persistent storage systems, from file systems to Relational Database Management Systems and NoSQL databases, with the aim of reducing access times to the data maintained by the system and thus improve the execution times of the applications using this data. However, most existing solutions to data prefetching have been based on information that can be retrieved from the storage system itself, whether in the form of heuristics based on the data schema or data access patterns detected by monitoring access to the system. There are multiple disadvantages of these approaches in terms of the rigidity of the heuristics they use, the accuracy of the predictions they make and / or the time they need to make these predictions, a process often performed while the applications are accessing the data and causing considerable overhead. In light of the above, this thesis proposes two novel approaches to data prefetching based on predictions made by analyzing the instructions and statements of the computer languages used to access persistent data. The proposed approaches take into consideration how the data is accessed by the higher-level applications, make accurate predictions and are performed without causing any additional overhead. The first of the proposed approaches aims at analyzing instructions of applications written in object-oriented languages in order to prefetch data from Persistent Object Stores. The approach is based on static code analysis that is done prior to the application execution and hence does not add any overhead. It also includes various strategies to deal with cases that require runtime information unavailable prior to the execution of the application. We integrate this analysis approach into an existing Persistent Object Store and run a series of extensive experiments to measure the improvement obtained by prefetching the objects predicted by the approach. The second approach analyzes statements and historic logs of the declarative query language SPARQL in order to prefetch data from RDF Triplestores. The approach measures two types of similarity between SPARQL queries in order to detect recurring query patterns in the historic logs. Afterwards, it uses the detected patterns to predict subsequent queries and launch them before they are requested to prefetch the data needed by them. Our evaluation of the proposed approach shows that it high-accuracy prediction and can achieve a high cache hit rate when caching the results of the predicted queries.Precargar datos ha sido una de las técnicas más comunes para mejorar los tiempos de acceso a datos persistentes. Esta técnica se basa en predecir los registros de datos que se van a acceder en el futuro y cargarlos del almacenimiento persistente a la memoria con antelación a su uso. Precargar datos ha sido aplicado en multitud de sistemas de almacenimiento persistente, desde sistemas de ficheros a bases de datos relacionales y NoSQL, con el objetivo de reducir los tiempos de acceso a los datos y por lo tanto mejorar los tiempos de ejecución de las aplicaciones que usan estos datos. Sin embargo, la mayoría de los enfoques existentes utilizan predicciones basadas en información que se encuentra dentro del mismo sistema de almacenimiento, ya sea en forma de heurísticas basadas en el esquema de los datos o patrones de acceso a los datos generados mediante la monitorización del acceso al sistema. Estos enfoques presentan varias desventajas en cuanto a la rigidez de las heurísticas usadas, la precisión de las predicciones generadas y el tiempo que necesitan para generar estas predicciones, un proceso que se realiza con frecuencia mientras las aplicaciones acceden a los datos y que puede tener efectos negativos en el tiempo de ejecución de estas aplicaciones. En vista de lo anterior, esta tesis presenta dos enfoques novedosos para precargar datos basados en predicciones generadas por el análisis de las instrucciones y sentencias del lenguaje informático usado para acceder a los datos persistentes. Los enfoques propuestos toman en consideración cómo las aplicaciones acceden a los datos, generan predicciones precisas y mejoran el rendimiento de las aplicaciones sin causar ningún efecto negativo. El primer enfoque analiza las instrucciones de applicaciones escritas en lenguajes de programación orientados a objetos con el fin de precargar datos de almacenes de objetos persistentes. El enfoque emplea análisis estático de código hecho antes de la ejecución de las aplicaciones, y por lo tanto no afecta negativamente el rendimiento de las mismas. El enfoque también incluye varias estrategias para tratar casos que requieren información de runtime no disponible antes de ejecutar las aplicaciones. Además, integramos este enfoque en un almacén de objetos persistentes y ejecutamos una serie extensa de experimentos para medir la mejora de rendimiento que se puede obtener utilizando el enfoque. Por otro lado, el segundo enfoque analiza las sentencias y logs del lenguaje declarativo de consultas SPARQL para precargar datos de triplestores de RDF. Este enfoque aplica dos medidas para calcular la similtud entre las consultas del lenguaje SPARQL con el objetivo de detectar patrones recurrentes en los logs históricos. Posteriormente, el enfoque utiliza los patrones detectados para predecir las consultas siguientes y precargar con antelación los datos que necesitan. Nuestra evaluación muestra que este enfoque produce predicciones de alta precisión y puede lograr un alto índice de aciertos cuando los resultados de las consultas predichas se guardan en el caché

    BeeSpace Navigator: exploratory analysis of gene function using semantic indexing of biological literature

    Get PDF
    With the rapid decrease in cost of genome sequencing, the classification of gene function is becoming a primary problem. Such classification has been performed by human curators who read biological literature to extract evidence. BeeSpace Navigator is a prototype software for exploratory analysis of gene function using biological literature. The software supports an automatic analogue of the curator process to extract functions, with a simple interface intended for all biologists. Since extraction is done on selected collections that are semantically indexed into conceptual spaces, the curation can be task specific. Biological literature containing references to gene lists from expression experiments can be analyzed to extract concepts that are computational equivalents of a classification such as Gene Ontology, yielding discriminating concepts that differentiate gene mentions from other mentions. The functions of individual genes can be summarized from sentences in biological literature, to produce results resembling a model organism database entry that is automatically computed. Statistical frequency analysis based on literature phrase extraction generates offline semantic indexes to support these gene function services. The website with BeeSpace Navigator is free and open to all; there is no login requirement at www.beespace.illinois.edu for version 4. Materials from the 2010 BeeSpace Software Training Workshop are available at www.beespace.illinois.edu/bstwmaterials.php

    A review of information modelling systems in the built environment

    Get PDF
    The built environment can be described to constitute the surrounding and existing elements created by humans. The systems for modelling information related to the built environment are numerous. Their development are based on varying assumptions and tailored to the various domains in which they are deployed. The functions of these systems are sometimes similar or overlap and they tend to end up with similar acronyms thereby creating confusion to stakeholders in the built environment. As such, stakeholders also find it difficult to choose systems best suited for their needs among the numerous existing ones. A comprehensive record of systems in the built environment with clear definitions of their functions and areas of overlap is therefore necessary to straighten up such confusion and provide requisite understanding among stakeholders. A literature review of information modelling systems in the built environment is therefore proposed. The review examines systems in key sectors of the built environment such the Architectural, Engineering, Construction, Geography and Urban Planning. We conclude that stakeholders should give strong consideration to interoperability needs along the supply chain in which they work while deciding on the choice of information modelling systems to procure

    Cognitive Task Planning for Smart Industrial Robots

    Get PDF
    This research work presents a novel Cognitive Task Planning framework for Smart Industrial Robots. The framework makes an industrial mobile manipulator robot Cognitive by applying Semantic Web Technologies. It also introduces a novel Navigation Among Movable Obstacles algorithm for robots navigating and manipulating inside a firm. The objective of Industrie 4.0 is the creation of Smart Factories: modular firms provided with cyber-physical systems able to strong customize products under the condition of highly flexible mass-production. Such systems should real-time communicate and cooperate with each other and with humans via the Internet of Things. They should intelligently adapt to the changing surroundings and autonomously navigate inside a firm while moving obstacles that occlude free paths, even if seen for the first time. At the end, in order to accomplish all these tasks while being efficient, they should learn from their actions and from that of other agents. Most of existing industrial mobile robots navigate along pre-generated trajectories. They follow ectrified wires embedded in the ground or lines painted on th efloor. When there is no expectation of environment changes and cycle times are critical, this planning is functional. When workspaces and tasks change frequently, it is better to plan dynamically: robots should autonomously navigate without relying on modifications of their environments. Consider the human behavior: humans reason about the environment and consider the possibility of moving obstacles if a certain goal cannot be reached or if moving objects may significantly shorten the path to it. This problem is named Navigation Among Movable Obstacles and is mostly known in rescue robotics. This work transposes the problem on an industrial scenario and tries to deal with its two challenges: the high dimensionality of the state space and the treatment of uncertainty. The proposed NAMO algorithm aims to focus exploration on less explored areas. For this reason it extends the Kinodynamic Motion Planning by Interior-Exterior Cell Exploration algorithm. The extension does not impose obstacles avoidance: it assigns an importance to each cell by combining the efforts necessary to reach it and that needed to free it from obstacles. The obtained algorithm is scalable because of its independence from the size of the map and from the number, shape, and pose of obstacles. It does not impose restrictions on actions to be performed: the robot can both push and grasp every object. Currently, the algorithm assumes full world knowledge but the environment is reconfigurable and the algorithm can be easily extended in order to solve NAMO problems in unknown environments. The algorithm handles sensor feedbacks and corrects uncertainties. Usually Robotics separates Motion Planning and Manipulation problems. NAMO forces their combined processing by introducing the need of manipulating multiple objects, often unknown, while navigating. Adopting standard precomputed grasps is not sufficient to deal with the big amount of existing different objects. A Semantic Knowledge Framework is proposed in support of the proposed algorithm by giving robots the ability to learn to manipulate objects and disseminate the information gained during the fulfillment of tasks. The Framework is composed by an Ontology and an Engine. The Ontology extends the IEEE Standard Ontologies for Robotics and Automation and contains descriptions of learned manipulation tasks and detected objects. It is accessible from any robot connected to the Cloud. It can be considered a data store for the efficient and reliable execution of repetitive tasks; and a Web-based repository for the exchange of information between robots and for the speed up of the learning phase. No other manipulation ontology exists respecting the IEEE Standard and, regardless the standard, the proposed ontology differs from the existing ones because of the type of features saved and the efficient way in which they can be accessed: through a super fast Cascade Hashing algorithm. The Engine lets compute and store the manipulation actions when not present in the Ontology. It is based on Reinforcement Learning techniques that avoid massive trainings on large-scale databases and favors human-robot interactions. The overall system is flexible and easily adaptable to different robots operating in different industrial environments. It is characterized by a modular structure where each software block is completely reusable. Every block is based on the open-source Robot Operating System. Not all industrial robot controllers are designed to be ROS-compliant. This thesis presents the method adopted during this research in order to Open Industrial Robot Controllers and create a ROS-Industrial interface for them

    “AccessBIM” - A Model of Environmental Characteristics for Vision Impaired Indoor Navigation and Way Finding

    Get PDF
    The complexity of modern indoor environments has made navigation difficult for individuals with vision impairment. Hence, this thesis presents the AccessBIM framework, which is an optimized database that’s facilitates generation of a real-time floor plan with path determination. The AccessBIM framework has the potential to play an integral role in improving the independence and quality of life for people with vision impairment whilst also decreasing the cost to the community related to caretakers

    Book Reviews

    Get PDF
    corecore