36 research outputs found

    Identifying New Directions in Database Performance Tuning

    Database performance tuning is a complex and active research area. Enterprise relational database management systems still rely on the same set-based relational concepts that defined early data management products, and the disparity between the object-oriented application development model and the relational database model, known as the object-relational impedance mismatch problem, is commonly addressed by techniques such as object-relational mapping (ORM). However, this has resulted in generally poor query performance for the SQL generated by object-oriented applications and a poor fit with cost-based optimisation algorithms, raising questions about whether the relational model needs to better adapt to ORM-generated queries. This paper discusses developments in database performance optimisation and seeks to demonstrate that current database performance tuning approaches need re-examination. Proposals for further work include exploring concepts such as dynamic schema redefinition; query analysis and optimisation modelling driven by machine learning; and augmentation or replacement of the cost-based optimiser model.
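One concrete way the mismatch surfaces is in plan caching: an engine that keys its plan cache on the raw SQL text compiles a fresh plan for every literal variant an ORM emits. A minimal sketch, assuming a hypothetical `orders` table (real engines hash internally; SHA-256 is purely illustrative):

```python
import hashlib

def plan_cache_key(sql: str) -> str:
    """Engines commonly key the plan cache on a hash of the query text
    (SHA-256 here stands in for the engine's internal hash)."""
    return hashlib.sha256(sql.encode()).hexdigest()[:12]

# An ORM that inlines literals yields a distinct cache entry per value,
# so each variant is parsed and optimised from scratch.
q17 = "SELECT * FROM orders WHERE customer_id = 17"
q42 = "SELECT * FROM orders WHERE customer_id = 42"
print(plan_cache_key(q17) == plan_cache_key(q42))  # False: two plans compiled

# A parameterised query keeps the text, and hence the key, stable for
# every bound value, so a single cached plan serves them all.
parameterised = "SELECT * FROM orders WHERE customer_id = ?"
keys = {plan_cache_key(parameterised) for _ in (17, 42, 99)}
print(len(keys))  # 1: one cached plan regardless of the bound value
```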

    Construction and Performance Analysis of a Groomed Polarity Lexicon Derived from Product Review Source Datasets

    Using a large, publicly available dataset [1], we extract over 51 million product reviews. We split each review comment into words, associate each word with the review score, and store the resulting 3.7 billion word–score pairs in a relational database. We cleanse the data, grooming the dataset against a standard English dictionary, and create an aggregation model based on word count distributions across review scores. This yields a model dataset of words, each associated with an overall positive or negative polarity sentiment score based on star rating, which we correct and normalise across the set. To test the efficacy of the dataset for sentiment classification, we ingest a secondary cross-domain public dataset containing freeform text and perform sentiment analysis against it. We then compare our model's performance against human classification performance by enlisting human volunteers to rate the same data samples. We find our model emulates human judgement reasonably well, reaching correct conclusions in 56% of cases, albeit with significant variance when classifying at a coarse grain. At the fine grain, we find our model can track human judgement to within a 7% margin in some cases. We consider potential improvements to our method, further applications, and the limitations of the lexicon-based approach in cross-domain, big data environments.
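The aggregation step described above, grooming words against a dictionary and normalising their mean star rating into a polarity score, can be sketched as follows. This is a toy illustration: the centring on the 3-star midpoint and the scaling to [-1, 1] are assumptions, not the paper's exact normalisation:

```python
from collections import defaultdict

def build_polarity_lexicon(reviews, dictionary):
    """Aggregate per-word star ratings into a normalised polarity score.

    reviews: iterable of (text, stars) pairs, stars in 1..5.
    dictionary: set of accepted English words used to groom the lexicon.
    """
    totals = defaultdict(lambda: [0.0, 0])  # word -> [sum of stars, count]
    for text, stars in reviews:
        for word in text.lower().split():
            if word in dictionary:  # grooming step: drop non-dictionary tokens
                totals[word][0] += stars
                totals[word][1] += 1
    # Centre on the 3-star midpoint and scale to [-1, 1] (assumed scheme).
    return {w: (s / n - 3.0) / 2.0 for w, (s, n) in totals.items()}

sample = [("great phone great battery", 5), ("terrible battery", 1)]
lexicon = build_polarity_lexicon(sample, {"great", "terrible", "battery", "phone"})
print(lexicon["great"])     # 1.0
print(lexicon["terrible"])  # -1.0
print(lexicon["battery"])   # 0.0 (seen in both a 5-star and a 1-star review)
```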

    Building accountability into the Internet of Things: the IoT Databox model

    This paper outlines the IoT Databox model as a means of making the Internet of Things (IoT) accountable to individuals. Accountability is key to building consumer trust and is mandated by the European Union’s General Data Protection Regulation (GDPR). We focus here on the ‘external’ data subject accountability requirement specified by the GDPR, and how meeting this requirement turns on surfacing the invisible actions and interactions of connected devices and the social arrangements in which they are embedded. The IoT Databox model is proposed as an in-principle means of enabling accountability and of providing individuals with the mechanisms needed to build trust into the IoT.

    Genetic mechanisms of critical illness in COVID-19.

    Host-mediated lung inflammation is present [1], and drives mortality [2], in the critical illness caused by coronavirus disease 2019 (COVID-19). Host genetic variants associated with critical illness may identify mechanistic targets for therapeutic development [3]. Here we report the results of the GenOMICC (Genetics Of Mortality In Critical Care) genome-wide association study in 2,244 critically ill patients with COVID-19 from 208 UK intensive care units. We have identified and replicated the following new genome-wide significant associations: on chromosome 12q24.13 (rs10735079, P = 1.65 × 10⁻⁸) in a gene cluster that encodes antiviral restriction enzyme activators (OAS1, OAS2 and OAS3); on chromosome 19p13.2 (rs74956615, P = 2.3 × 10⁻⁸) near the gene that encodes tyrosine kinase 2 (TYK2); on chromosome 19p13.3 (rs2109069, P = 3.98 × 10⁻¹²) within the gene that encodes dipeptidyl peptidase 9 (DPP9); and on chromosome 21q22.1 (rs2236757, P = 4.99 × 10⁻⁸) in the interferon receptor gene IFNAR2. We identified potential targets for repurposing of licensed medications: using Mendelian randomization, we found evidence that low expression of IFNAR2, or high expression of TYK2, is associated with life-threatening disease; and transcriptome-wide association in lung tissue revealed that high expression of the monocyte-macrophage chemotactic receptor CCR2 is associated with severe COVID-19. Our results identify robust genetic signals relating to key host antiviral defence mechanisms and mediators of inflammatory organ damage in COVID-19. Both mechanisms may be amenable to targeted treatment with existing drugs. However, large-scale randomized clinical trials will be essential before any change to clinical practice.

    Development of a Dynamic Design Framework for Relational Database Performance Optimisation

    Relational Database Management Systems (RDBMSs) are advanced software packages responsible for providing storage and access to relational databases: data stores in which data is arranged in schemas of interlinked tables, each table constituted of columns and rows, and each intersection containing a data point. This project considers the impact that the ever-increasing demand in data volume, velocity and variety, combined with changes in query methodology and the uptake of object-relational mapping frameworks driven by modern object-oriented application programming practices, has had upon the effectiveness of the relational database query optimiser; in particular, this research examines the emergence of object-relational impedance mismatch and the corresponding effect on query processing efficiency within the database engine. Firstly, this research reconsiders the query parsing and caching mechanisms within current RDBMSs and notes their deficiencies in query plan re-use. An alternative mechanism for query representation is presented, representing queries as multidimensional structures which are computable, comparable, and reducible to hashes. It is shown how this representation can be used to improve plan re-use and increase the efficiency of the query optimiser. Secondly, the construction of these multidimensional representations in real time is demonstrated, using weighted k-means clustering with self-adjusting weights and k to predict superior sub-schema selection, including the application of queries to an alternative sub-schema of the data, reducing resource consumption and improving query execution times. This is validated against a real data set and performance is tested at scale. It was found that this clustering approach provided the relational database query optimiser with an increasing degree of accuracy and reliability in query classification, with an improvement in query execution time demonstrated at scale, against lifelike database queries, ranging from 6.2% to 20.6%.
Finally, a novel method of dynamic schema redefinition is presented. This process defines, creates and destroys sub-schemas, maps queries to their sub-schema variants, and keeps track of performance metrics, self-adjusting the current library of alternative schema representations available. This is defined theoretically against the backdrop of the relational algebra and ZFC axiomatic set theory.
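The sub-schema selection step can be illustrated with a weighted nearest-centroid assignment, the classification step that underlies k-means. The feature set, weights, and sub-schema names below are hypothetical; in the full system the weights and k are self-adjusting rather than fixed:

```python
import math

def nearest_subschema(query_vec, centroids, weights):
    """Assign a query's feature vector to the sub-schema whose centroid
    is nearest under a weighted Euclidean distance."""
    def dist(c):
        return math.sqrt(sum(w * (q - x) ** 2
                             for w, q, x in zip(weights, query_vec, c)))
    return min(centroids, key=lambda name: dist(centroids[name]))

# Hypothetical 3-feature vectors: (tables joined, predicates, aggregates).
centroids = {
    "wide_denormalised": (1.0, 2.0, 0.0),   # suits simple lookups
    "star_aggregate":    (4.0, 1.0, 3.0),   # suits analytic rollups
}
weights = (1.0, 0.5, 2.0)  # self-adjusting in the full system; fixed here

print(nearest_subschema((4, 2, 2), centroids, weights))  # star_aggregate
```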

    A Novel Method for Calculating Query Hashes for Improved Query Grouping in Relational Database Management Systems

    Database queries are stored and compared by relational database management systems as hashes, or short unique representations, of the original query text. This leads to cache misses and increased resource consumption by database engines when queries differing only in non-syntactic detail, or queries which are relationally equivalent, are presented to the query parser. We propose a new method of structural query decomposition, transforming database queries into multidimensional adjacency cubes (MACs), allowing the codification of queries by structure rather than content as currently implemented. We build and test our solution, demonstrating superior query hash grouping to that currently offered by a leading relational database platform, and consider the applications of this new technique.
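The benefit of grouping queries by structure rather than text can be approximated with a simple canonicalisation pass before hashing. This is only a crude stand-in for the multidimensional adjacency cube representation, but it shows why structural grouping collapses non-syntactic variants into a single cache entry:

```python
import hashlib
import re

def structural_hash(sql: str) -> str:
    """Hash the *shape* of a query rather than its text: lower-case,
    replace literals with placeholders, collapse whitespace. A crude
    stand-in for the paper's multidimensional adjacency cubes."""
    s = sql.lower()
    s = re.sub(r"'[^']*'", "?", s)      # string literals
    s = re.sub(r"\b\d+\b", "?", s)      # numeric literals
    s = re.sub(r"\s+", " ", s).strip()  # whitespace and case differences
    return hashlib.sha256(s.encode()).hexdigest()[:12]

a = "SELECT name FROM users WHERE id = 7"
b = "select name  from users\nwhere id = 99"
print(structural_hash(a) == structural_hash(b))  # True: one group, one plan
```

A text-based hash would place `a` and `b` in separate cache entries; the structural hash groups them, so the parser sees one query shape instead of one per literal value.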

    The Impact of Object-Relational Mapping Frameworks on Relational Query Performance

    This paper considers the impact of object-relational mapping (ORM) tools on relational database query performance. ORM tools are widely used to address the object-relational impedance mismatch problem but can have negative performance consequences. We first set out the background of the problem, detailing the growth of ORM tools against a backdrop of a changing application development landscape, then demonstrate examples of undesirable query performance patterns resulting from the application of these tools using a leading application stack. We review selected literature for prior research into this problem and summarise the findings. Finally, we conclude by suggesting future research directions to help mitigate the issue.
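A representative undesirable pattern is the classic N+1 query sequence that lazy-loading ORMs can emit: one query for the parent rows, then one more per row for the children. The sketch below counts the statements issued using SQLite's trace callback; the `orders`/`items` schema is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE items  (order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO items  VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

executed = []                            # record every statement issued
conn.set_trace_callback(executed.append)

# ORM-style lazy loading: one query for the parents, then one per parent.
for (oid,) in conn.execute("SELECT id FROM orders").fetchall():
    conn.execute("SELECT sku FROM items WHERE order_id = ?", (oid,)).fetchall()
n_plus_one = len(executed)               # 1 + N statements, N = 3 parent rows

executed.clear()
# Hand-written equivalent: a single join, a single round trip.
conn.execute(
    "SELECT o.id, i.sku FROM orders o JOIN items i ON i.order_id = o.id"
).fetchall()
single = len(executed)

print(n_plus_one, single)  # 4 1
```

With three parent rows the difference is trivial; with thousands, the per-row round trips dominate execution time.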

    Investigating the Effects of Object-Relational Impedance Mismatch on the Efficiency of Object-Relational Mapping Frameworks

    The object-relational impedance mismatch (ORIM) problem characterises the differences between the object-oriented and relational approaches to data access. Queries generated by object-relational mapping (ORM) frameworks are designed to overcome ORIM difficulties, but can cause performance concerns in environments which use object-oriented paradigms. The aim of this paper is twofold: first, to present a survey of database practitioners on the effectiveness of ORM tools; second, to investigate experimentally the extent of operational concerns by comparing ORM-generated query performance with SQL query performance on a benchmark data set. The results show there are perceived difficulties in tuning ORM tools and distrust around their effectiveness. Through experimental testing, these views are validated by demonstrating that ORMs exhibit performance issues to the detriment of the query and of the overall scalability of the ORM-led approach. Future work on establishing a system to support the query optimiser when parsing and preparing ORM-generated queries is outlined.
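An experimental comparison of this kind typically rests on a small timing harness that executes each query form repeatedly against the benchmark data. A minimal sketch, in which the nested "ORM-style" query shape and the table are illustrative assumptions rather than the paper's actual benchmark:

```python
import sqlite3
import time

def mean_runtime(conn, sql, params=(), repeats=200):
    """Average wall-clock seconds per execution: a crude harness of the
    kind used to compare ORM-generated SQL with a hand-written form."""
    start = time.perf_counter()
    for _ in range(repeats):
        conn.execute(sql, params).fetchall()
    return (time.perf_counter() - start) / repeats

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(i, i * 2) for i in range(1000)])

# Hypothetical ORM-generated shape (redundant nesting) vs hand-written SQL.
orm_sql = "SELECT * FROM (SELECT * FROM t) sub WHERE sub.id = ?"
hand_sql = "SELECT * FROM t WHERE id = ?"

# Absolute timings vary by machine, and some engines flatten the nested
# form entirely; the harness only makes the two forms comparable.
print(f"ORM-style:    {mean_runtime(conn, orm_sql, (500,)):.2e} s")
print(f"Hand-written: {mean_runtime(conn, hand_sql, (500,)):.2e} s")
```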

    Analysis of two-prong events in pp interactions at 205 GeV/c: Separation of elastic and inelastic events

    Two-prong events seen in a 205 GeV/c proton beam bubble chamber experiment at NAL are analysed and separated into elastic and inelastic contributions. The total elastic cross section is 6.92 ± 0.44 mb. The data are consistent with the CERN ISR break in dσ/dt at |t| ≈ 0.1–0.2. The two-prong inelastic cross section is 2.85 ± 0.26 mb. About 77% of it is diffractive.

    An ontology-based approach to sensor-mission assignment

    Effective deployment of limited and constrained intelligence, surveillance and reconnaissance (ISR) resources is seen as a key issue in modern network-centric joint-forces operations. The aim of our work is to enable proactive and reactive deployment of sensors and other information sources to best support the objectives of a task (or mission) being undertaken. In this paper, we consider one aspect of the deployment problem: proactive assignment of sensors and sources to mission tasks. We view this sub-problem as a matchmaking activity: matching the ISR requirements of tasks to the ISR-providing capabilities of available sensors and sources, and the platforms that carry them. A key issue is that of defining sufficiently rich representations of these various elements (missions, tasks, ISR requirements, ISR capabilities, sensors, sources, and platforms) to support the matchmaking activity. We argue for an approach based on the use of ontologies: formal models of the various elements that can be used with deductive reasoning mechanisms to produce matches that are logically sound. We introduce a new ontology based on the military Missions and Means Framework (MMF), and show that the matchmaking activity is necessarily multidimensional in nature. We indicate how our approach builds on previous work in representing sensors and sources for various purposes, and highlight the role of current Web standards in providing an engineering foundation for our approach.
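At its simplest, the matchmaking activity amounts to checking that a sensor's advertised capabilities cover a task's ISR requirements. The sketch below uses plain set inclusion as a stand-in for the ontology-backed deductive reasoning described in the paper; the task and sensor names are hypothetical:

```python
def match(tasks, sensors):
    """Naive capability matchmaking: a sensor satisfies a task when its
    capability set covers the task's ISR requirements. Set inclusion
    stands in for ontology-based deductive reasoning."""
    return {t: [s for s, caps in sensors.items() if reqs <= caps]
            for t, reqs in tasks.items()}

tasks = {"route_surveillance": {"imagery", "night"}}
sensors = {"uav_eo_ir": {"imagery", "night", "video"},
           "acoustic_array": {"acoustic"}}

print(match(tasks, sensors))  # {'route_surveillance': ['uav_eo_ir']}
```

An ontology replaces the flat capability labels with a class hierarchy, so that, for example, a requirement for "imagery" can be satisfied by a sensor asserting a more specific capability such as "infrared imagery" via subsumption reasoning.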