9 research outputs found

    Accelerated Data Delivery Architecture

    Get PDF
    This paper introduces the Accelerated Data Delivery Architecture (ADDA). ADDA establishes a framework to distribute transactional data and control consistency to achieve fast access to data, distributed scalability and non-blocking concurrency control by using a clean declarative interface. It is designed to be used with web-based business applications. This framework uses a combination of traditional Relational Database Management System (RDBMS) combined with a distributed Not Only SQL (NoSQL) database and a browser-based database. It uses a single physical and conceptual database schema designed for a standard RDBMS driven application. The design allows the architect to assign consistency levels to entities which determine the storage location and query methodology. The implementation of these levels is flexible and requires no database schema changes in order to change the level of an entity. Also, a data leasing system to enforce concurrency control in a non-blocking manner is employed for critical data items. The system also ensures that all data is available for query from the RDBMS server. This means that the system can have the performance advantages of a DDBMS system and the ACID qualities of a single-site RDBMS system without the complex design considerations of traditional DDBMS systems

    Predicting access to persistent objects through static code analysis

    Get PDF
    In this paper, we present a fully-automatic, high-accuracy approach to predict access to persistent objects through static code analysis of object-oriented applications. The most widely-used previous technique uses a simple heuristic to make the predictions while approaches that offer higher accuracy are based on monitoring application execution. These approaches add a non-negligible overhead to the application’s execution time and/or consume a considerable amount of memory. By contrast, we demonstrate in our experimental study that our proposed approach offers better accuracy than the most common technique used to predict access to persistent objects, and makes the predictions farther in advance, without performing any analysis during application executionThis work has been supported by the European Union’s Horizon 2020 research and innovation program (grant H2020-MSCA-ITN-2014-642963), the Spanish Government (grant SEV2015-0493 of the Severo Ochoa Program), the Spanish Ministry of Science and Innovation (contract TIN2015-65316) and Generalitat de Catalunya (contract 2014-SGR-1051). The authors would also like to thank Alex Barceló for his feedback on the formalization included in this paper.Peer ReviewedPostprint (author's final draft

    CAPre: Code-Analysis based Prefetching for Persistent object stores

    Get PDF
    Data prefetching aims to improve access times to data storage systems by predicting data records that are likely to be accessed by subsequent requests and retrieving them into a memory cache before they are needed. In the case of Persistent Object Stores, previous approaches to prefetching have been based on predictions made through analysis of the store’s schema, which generates rigid predictions, or monitoring access patterns to the store while applications are executed, which introduces memory and/or computation overhead. In this paper, we present CAPre, a novel prefetching system for Persistent Object Stores based on static code analysis of object-oriented applications. CAPre generates the predictions at compile-time and does not introduce any overhead to the application execution. Moreover, CAPre is able to predict large amounts of objects that will be accessed in the near future, thus enabling the object store to perform parallel prefetching if the objects are distributed, in a much more aggressive way than in schema-based prediction algorithms. We integrate CAPre into a distributed Persistent Object Store and run a series of experiments that show that it can reduce the execution time of applications from 9% to over 50%, depending on the nature of the application and its persistent data model.This work has been supported by the European Union’s Horizon 2020 research and innovation program under the BigStorage European Training Network (ETN) (grant H2020-MSCA-ITN-2014- 642963), the Spanish Ministry of Science and Innovation (contract TIN2015-65316) and the Generalitat de Catalunya, Spain (contract 2014-SGR-1051).Peer ReviewedPostprint (author's final draft

    Computer-language based data prefetching techniques

    Get PDF
    Data prefetching has long been used as a technique to improve access times to persistent data. It is based on retrieving data records from persistent storage to main memory before the records are needed. Data prefetching has been applied to a wide variety of persistent storage systems, from file systems to Relational Database Management Systems and NoSQL databases, with the aim of reducing access times to the data maintained by the system and thus improve the execution times of the applications using this data. However, most existing solutions to data prefetching have been based on information that can be retrieved from the storage system itself, whether in the form of heuristics based on the data schema or data access patterns detected by monitoring access to the system. There are multiple disadvantages of these approaches in terms of the rigidity of the heuristics they use, the accuracy of the predictions they make and / or the time they need to make these predictions, a process often performed while the applications are accessing the data and causing considerable overhead. In light of the above, this thesis proposes two novel approaches to data prefetching based on predictions made by analyzing the instructions and statements of the computer languages used to access persistent data. The proposed approaches take into consideration how the data is accessed by the higher-level applications, make accurate predictions and are performed without causing any additional overhead. The first of the proposed approaches aims at analyzing instructions of applications written in object-oriented languages in order to prefetch data from Persistent Object Stores. The approach is based on static code analysis that is done prior to the application execution and hence does not add any overhead. It also includes various strategies to deal with cases that require runtime information unavailable prior to the execution of the application. We integrate this analysis approach into an existing Persistent Object Store and run a series of extensive experiments to measure the improvement obtained by prefetching the objects predicted by the approach. The second approach analyzes statements and historic logs of the declarative query language SPARQL in order to prefetch data from RDF Triplestores. The approach measures two types of similarity between SPARQL queries in order to detect recurring query patterns in the historic logs. Afterwards, it uses the detected patterns to predict subsequent queries and launch them before they are requested to prefetch the data needed by them. Our evaluation of the proposed approach shows that it high-accuracy prediction and can achieve a high cache hit rate when caching the results of the predicted queries.Precargar datos ha sido una de las técnicas más comunes para mejorar los tiempos de acceso a datos persistentes. Esta técnica se basa en predecir los registros de datos que se van a acceder en el futuro y cargarlos del almacenimiento persistente a la memoria con antelación a su uso. Precargar datos ha sido aplicado en multitud de sistemas de almacenimiento persistente, desde sistemas de ficheros a bases de datos relacionales y NoSQL, con el objetivo de reducir los tiempos de acceso a los datos y por lo tanto mejorar los tiempos de ejecución de las aplicaciones que usan estos datos. Sin embargo, la mayoría de los enfoques existentes utilizan predicciones basadas en información que se encuentra dentro del mismo sistema de almacenimiento, ya sea en forma de heurísticas basadas en el esquema de los datos o patrones de acceso a los datos generados mediante la monitorización del acceso al sistema. Estos enfoques presentan varias desventajas en cuanto a la rigidez de las heurísticas usadas, la precisión de las predicciones generadas y el tiempo que necesitan para generar estas predicciones, un proceso que se realiza con frecuencia mientras las aplicaciones acceden a los datos y que puede tener efectos negativos en el tiempo de ejecución de estas aplicaciones. En vista de lo anterior, esta tesis presenta dos enfoques novedosos para precargar datos basados en predicciones generadas por el análisis de las instrucciones y sentencias del lenguaje informático usado para acceder a los datos persistentes. Los enfoques propuestos toman en consideración cómo las aplicaciones acceden a los datos, generan predicciones precisas y mejoran el rendimiento de las aplicaciones sin causar ningún efecto negativo. El primer enfoque analiza las instrucciones de applicaciones escritas en lenguajes de programación orientados a objetos con el fin de precargar datos de almacenes de objetos persistentes. El enfoque emplea análisis estático de código hecho antes de la ejecución de las aplicaciones, y por lo tanto no afecta negativamente el rendimiento de las mismas. El enfoque también incluye varias estrategias para tratar casos que requieren información de runtime no disponible antes de ejecutar las aplicaciones. Además, integramos este enfoque en un almacén de objetos persistentes y ejecutamos una serie extensa de experimentos para medir la mejora de rendimiento que se puede obtener utilizando el enfoque. Por otro lado, el segundo enfoque analiza las sentencias y logs del lenguaje declarativo de consultas SPARQL para precargar datos de triplestores de RDF. Este enfoque aplica dos medidas para calcular la similtud entre las consultas del lenguaje SPARQL con el objetivo de detectar patrones recurrentes en los logs históricos. Posteriormente, el enfoque utiliza los patrones detectados para predecir las consultas siguientes y precargar con antelación los datos que necesitan. Nuestra evaluación muestra que este enfoque produce predicciones de alta precisión y puede lograr un alto índice de aciertos cuando los resultados de las consultas predichas se guardan en el caché.Postprint (published version

    Computer-language based data prefetching techniques

    Get PDF
    Data prefetching has long been used as a technique to improve access times to persistent data. It is based on retrieving data records from persistent storage to main memory before the records are needed. Data prefetching has been applied to a wide variety of persistent storage systems, from file systems to Relational Database Management Systems and NoSQL databases, with the aim of reducing access times to the data maintained by the system and thus improve the execution times of the applications using this data. However, most existing solutions to data prefetching have been based on information that can be retrieved from the storage system itself, whether in the form of heuristics based on the data schema or data access patterns detected by monitoring access to the system. There are multiple disadvantages of these approaches in terms of the rigidity of the heuristics they use, the accuracy of the predictions they make and / or the time they need to make these predictions, a process often performed while the applications are accessing the data and causing considerable overhead. In light of the above, this thesis proposes two novel approaches to data prefetching based on predictions made by analyzing the instructions and statements of the computer languages used to access persistent data. The proposed approaches take into consideration how the data is accessed by the higher-level applications, make accurate predictions and are performed without causing any additional overhead. The first of the proposed approaches aims at analyzing instructions of applications written in object-oriented languages in order to prefetch data from Persistent Object Stores. The approach is based on static code analysis that is done prior to the application execution and hence does not add any overhead. It also includes various strategies to deal with cases that require runtime information unavailable prior to the execution of the application. We integrate this analysis approach into an existing Persistent Object Store and run a series of extensive experiments to measure the improvement obtained by prefetching the objects predicted by the approach. The second approach analyzes statements and historic logs of the declarative query language SPARQL in order to prefetch data from RDF Triplestores. The approach measures two types of similarity between SPARQL queries in order to detect recurring query patterns in the historic logs. Afterwards, it uses the detected patterns to predict subsequent queries and launch them before they are requested to prefetch the data needed by them. Our evaluation of the proposed approach shows that it high-accuracy prediction and can achieve a high cache hit rate when caching the results of the predicted queries.Precargar datos ha sido una de las técnicas más comunes para mejorar los tiempos de acceso a datos persistentes. Esta técnica se basa en predecir los registros de datos que se van a acceder en el futuro y cargarlos del almacenimiento persistente a la memoria con antelación a su uso. Precargar datos ha sido aplicado en multitud de sistemas de almacenimiento persistente, desde sistemas de ficheros a bases de datos relacionales y NoSQL, con el objetivo de reducir los tiempos de acceso a los datos y por lo tanto mejorar los tiempos de ejecución de las aplicaciones que usan estos datos. Sin embargo, la mayoría de los enfoques existentes utilizan predicciones basadas en información que se encuentra dentro del mismo sistema de almacenimiento, ya sea en forma de heurísticas basadas en el esquema de los datos o patrones de acceso a los datos generados mediante la monitorización del acceso al sistema. Estos enfoques presentan varias desventajas en cuanto a la rigidez de las heurísticas usadas, la precisión de las predicciones generadas y el tiempo que necesitan para generar estas predicciones, un proceso que se realiza con frecuencia mientras las aplicaciones acceden a los datos y que puede tener efectos negativos en el tiempo de ejecución de estas aplicaciones. En vista de lo anterior, esta tesis presenta dos enfoques novedosos para precargar datos basados en predicciones generadas por el análisis de las instrucciones y sentencias del lenguaje informático usado para acceder a los datos persistentes. Los enfoques propuestos toman en consideración cómo las aplicaciones acceden a los datos, generan predicciones precisas y mejoran el rendimiento de las aplicaciones sin causar ningún efecto negativo. El primer enfoque analiza las instrucciones de applicaciones escritas en lenguajes de programación orientados a objetos con el fin de precargar datos de almacenes de objetos persistentes. El enfoque emplea análisis estático de código hecho antes de la ejecución de las aplicaciones, y por lo tanto no afecta negativamente el rendimiento de las mismas. El enfoque también incluye varias estrategias para tratar casos que requieren información de runtime no disponible antes de ejecutar las aplicaciones. Además, integramos este enfoque en un almacén de objetos persistentes y ejecutamos una serie extensa de experimentos para medir la mejora de rendimiento que se puede obtener utilizando el enfoque. Por otro lado, el segundo enfoque analiza las sentencias y logs del lenguaje declarativo de consultas SPARQL para precargar datos de triplestores de RDF. Este enfoque aplica dos medidas para calcular la similtud entre las consultas del lenguaje SPARQL con el objetivo de detectar patrones recurrentes en los logs históricos. Posteriormente, el enfoque utiliza los patrones detectados para predecir las consultas siguientes y precargar con antelación los datos que necesitan. Nuestra evaluación muestra que este enfoque produce predicciones de alta precisión y puede lograr un alto índice de aciertos cuando los resultados de las consultas predichas se guardan en el caché

    Path and cache conscious prefetching (PCCP)

    No full text
    Main memory cache performance continues to play an important role in determining the overall performance of object-oriented, object-relational and XML databases. An effective method of improving main memory cache performance is to prefetch or pre-load pages in advance to their usage, in anticipation of main memory cache misses. In this paper we describe a framework for creating prefetching algorithms with the novel features of path and cache consciousness. Path consciousness refers to the use of short sequences of object references at key points in the reference trace to identify paths of navigation. Cache consciousness refers to the use of historical page access knowledge to guess which pages are likely to be main memory cache resident most of the time and then assumes these pages do not exist in the context of prefetching. We have conducted a number of experiments comparing our approach against four highly competitive prefetching algorithms. The results shows our approach outperforms existing prefetching techniques in some situations while performing worse in others. We provide guidelines as to when our algorithm should be used and when others maybe more desirable

    Path and cache conscious prefetching (PCCP)

    No full text

    Noname manuscript No. (will be inserted by the editor) Path and Cache Conscious Prefetching (PCCP)

    No full text
    Abstract Main memory cache performance continues to play an important role in determining the overall performance of object-oriented, object-relational and XML databases. An effective method of improving main memory cache performance is to prefetch or pre-load pages in advance to their usage, in anticipation of main memory cache misses. In this paper we describe a framework for creating prefetching algorithms with the novel features of path and cache consciousness. Path consciousness refers to the use of short sequences of object references at key points in the reference trace to identify paths of navigation. Cache consciousness refers to the use of historical page access knowledge to guess which pages are likely to be main memory cache resident most of the time and then assumes these pages do not exist in the context of prefetching. We have conducted a number of experiments comparing our approach against four highly competitive prefetching algorithms. The results shows our approach outperforms existing prefetching techniques in some situations while performing worse in others. We provide guidelines as to when our algorithm should be used and when others maybe more desirable
    corecore