
    VerdictDB: Universalizing Approximate Query Processing

    Despite 25 years of research in academia, approximate query processing (AQP) has had little industrial adoption. One of the major causes of this slow adoption is the reluctance of traditional vendors to make radical changes to their legacy codebases, and the preoccupation of newer vendors (e.g., SQL-on-Hadoop products) with implementing standard features. Additionally, the few AQP engines that are available are each tied to a specific platform and require users to completely abandon their existing databases---an unrealistic expectation given the infancy of AQP technology. Therefore, we argue that a universal solution is needed: a database-agnostic approximation engine that will widen the reach of this emerging technology across various platforms. Our proposal, called VerdictDB, uses a middleware architecture that requires no changes to the backend database and thus can work with all off-the-shelf engines. Operating at the driver level, VerdictDB intercepts analytical queries issued to the database and rewrites each into another query that, if executed by any standard relational engine, will yield sufficient information for computing an approximate answer. VerdictDB uses the returned result set to compute an approximate answer and error estimates, which are then passed on to the user or application. However, lack of access to the query execution layer introduces significant challenges in terms of generality, correctness, and efficiency. This paper shows how VerdictDB overcomes these challenges and delivers up to 171× speedup (18.45× on average) for a variety of existing engines, such as Impala, Spark SQL, and Amazon Redshift, while incurring less than 2.6% relative error. VerdictDB is open-sourced under the Apache License. Comment: Extended technical report of the paper that appeared in Proceedings of the 2018 International Conference on Management of Data, pp. 1461-1476, ACM, 2018.
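
    As a rough, hedged illustration of the driver-level rewriting described above (a minimal sketch, not VerdictDB's actual rewrite rules), the Python snippet below redirects a COUNT(*) query to a pre-built uniform-random sample table, scales the result back up, and attaches a simple binomial error estimate. The table names sales and sales_sample_1pct, the sampling ratio, and the helper functions are all hypothetical.

        # Illustrative sketch of driver-level AQP query rewriting; not VerdictDB's
        # actual implementation. Assumes a pre-built 1% uniform-random sample table
        # named `sales_sample_1pct`; every identifier here is hypothetical.

        SAMPLING_RATIO = 0.01  # fraction of rows kept in the sample table

        def rewrite_count_query(original_sql: str) -> str:
            """Point a COUNT(*) query at the sample table instead of the full table.

            The rewritten query is still plain SQL, so any off-the-shelf engine
            (Impala, Spark SQL, Redshift, ...) can execute it unchanged.
            """
            return original_sql.replace("FROM sales", "FROM sales_sample_1pct")

        def approximate_count(sample_count: int) -> tuple[float, float]:
            """Scale the sample count up and attach a rough standard error.

            Under uniform row sampling with probability p, the full-table count
            estimate is k / p and its standard error is sqrt(k * (1 - p)) / p.
            """
            scale = 1.0 / SAMPLING_RATIO
            estimate = sample_count * scale
            std_error = scale * (sample_count * (1.0 - SAMPLING_RATIO)) ** 0.5
            return estimate, std_error

        # The driver intercepts the query, runs the rewritten SQL on the backend,
        # and converts the returned sample count into an approximate answer.
        rewritten = rewrite_count_query("SELECT COUNT(*) FROM sales WHERE region = 'EU'")
        est, err = approximate_count(sample_count=12_340)  # value returned by the engine
        print(rewritten)
        print(f"approximate count = {est:.0f} +/- {err:.0f}")

    In a full system, SUM and AVG queries would presumably need similar rewrites that also return enough information (for example, per-group variances) to compute error bounds, which is the "sufficient information" the abstract refers to.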

    Database Learning: Toward a Database that Becomes Smarter Every Time

    In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the following observation: the answer to each query reveals some degree of knowledge about the answer to another query, because their answers stem from the same underlying distribution that has produced the entire dataset. Exploiting and refining this knowledge should allow us to answer queries more analytically, rather than by reading enormous amounts of raw data. Also, processing more queries should continuously enhance our knowledge of the underlying distribution, and hence lead to increasingly faster response times for future queries. We call this novel idea---learning from past query answers---Database Learning. We exploit the principle of maximum entropy to produce answers, which in expectation are guaranteed to be more accurate than existing sample-based approximations. Empowered by this idea, we build a query engine on top of Spark SQL, called Verdict. We conduct extensive experiments on real-world query traces from a large customer of a major database vendor. Our results demonstrate that Verdict supports 73.7% of these queries, speeding them up by up to 23.0x for the same accuracy level compared to existing AQP systems. Comment: This manuscript is an extended report of the work published in the ACM SIGMOD conference, 2017.
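
    The idea of learning from past query answers can be pictured with a much simpler stand-in than the paper's maximum-entropy formulation: a precision-weighted (inverse-variance) combination of a belief inferred from earlier queries and a fresh sample-based estimate. This is only a hedged sketch; the weighting rule and all numbers below are illustrative assumptions, not Verdict's actual model.

        # Toy illustration of "learning from past query answers": combine a belief
        # derived from earlier, related queries with a fresh sample-based estimate.
        # Inverse-variance weighting is used as a simple stand-in for the paper's
        # maximum-entropy formulation; all numbers are invented.

        def combine(prior_mean: float, prior_var: float,
                    sample_mean: float, sample_var: float) -> tuple[float, float]:
            """Precision-weighted combination of two independent estimates.

            The combined variance is never larger than either input variance,
            mirroring the claim that answers improve on purely sample-based ones.
            """
            w_prior = 1.0 / prior_var
            w_sample = 1.0 / sample_var
            mean = (w_prior * prior_mean + w_sample * sample_mean) / (w_prior + w_sample)
            var = 1.0 / (w_prior + w_sample)
            return mean, var

        # Belief about AVG(revenue) inferred from past queries vs. a fresh sample:
        mean, var = combine(prior_mean=102.0, prior_var=4.0,
                            sample_mean=108.0, sample_var=9.0)
        print(f"combined answer = {mean:.2f}, variance = {var:.2f}")  # tighter than either input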

    Transcript copy number estimation using a mouse whole-genome oligonucleotide microarray

    The ability to quantitatively measure the expression of all genes in a given tissue or cell with a single assay is an exciting promise of gene-expression profiling technology. An in situ-synthesized 60-mer oligonucleotide microarray designed to detect transcripts from all mouse genes was validated, together with a set of exogenous RNA controls derived from the yeast genome (made freely available without restriction), which allow quantitative estimation of absolute endogenous transcript abundance.
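
    To picture how exogenous spike-in controls can anchor absolute abundance estimates, the hedged sketch below fits a log-log calibration curve to hypothetical control intensities and inverts it for an endogenous probe. This is a generic calibration illustration, not the authors' exact procedure; the concentrations and intensities are invented.

        # Generic spike-in calibration sketch (not the authors' exact method):
        # fit a log-log curve mapping probe intensity to known spike-in
        # concentration, then invert it for endogenous transcripts.
        import numpy as np

        # Known spike-in amounts (copies per cell equivalent) and the measured
        # intensities of the exogenous yeast-derived control probes (invented).
        spike_copies = np.array([1, 3, 10, 30, 100, 300], dtype=float)
        spike_intensity = np.array([55, 160, 520, 1500, 5200, 14800], dtype=float)

        # Fit log(intensity) = a * log(copies) + b.
        a, b = np.polyfit(np.log(spike_copies), np.log(spike_intensity), deg=1)

        def estimate_copies(intensity: float) -> float:
            """Invert the calibration curve for an endogenous probe intensity."""
            return float(np.exp((np.log(intensity) - b) / a))

        print(f"estimated copy number at intensity 2000: {estimate_copies(2000.0):.1f}")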

    Homozygosity for a missense mutation in the 67 kDa isoform of glutamate decarboxylase in a family with autosomal recessive spastic cerebral palsy: parallels with Stiff-Person Syndrome and other movement disorders

    Background: Cerebral palsy (CP) is a heterogeneous group of neurological disorders of movement and/or posture, with an estimated incidence of 1 in 1000 live births. Non-progressive forms of symmetrical, spastic CP have been identified, which show a Mendelian autosomal recessive pattern of inheritance. We recently described the mapping of a recessive spastic CP locus to a 5 cM chromosomal region located at 2q24-31.1, in rare consanguineous families. Methods: Here we present data that refine this locus to a 0.5 cM region, flanked by the microsatellite markers D2S2345 and D2S326. The minimal region contains the candidate gene GAD1, which encodes a glutamate decarboxylase isoform (GAD67) involved in conversion of the amino acid and excitatory neurotransmitter glutamate to the inhibitory neurotransmitter γ-aminobutyric acid (GABA). Results: A novel missense mutation in GAD67 was detected, which segregated with CP in affected individuals. Conclusions: This result is interesting because auto-antibodies to GAD67, and to the more widely studied GAD65 homologue encoded by the GAD2 gene, are described in patients with Stiff-Person Syndrome (SPS), epilepsy, cerebellar ataxia and Batten disease. Given the presence of anti-GAD antibodies in SPS and the recognised excitotoxicity of glutamate in various contexts, further investigation of the possibility that variation in the GAD1 sequence, potentially affecting glutamate/GABA ratios, underlies this form of spastic CP seems merited.

    Developing core sets for persons following amputation based on the International Classification of Functioning, Disability and Health as a way to specify functioning

    Amputation is a common late-stage sequela of peripheral vascular disease and diabetes, or a consequence of accidental trauma, civil unrest and landmines. The resulting functional impairments affect many facets of life, including but not limited to mobility, activities of daily living, body image and sexuality. Classification, measurement and comparison of the consequences of amputation have been impeded by the limited availability of internationally, multiculturally standardized instruments in the amputee setting. The introduction of the International Classification of Functioning, Disability and Health (ICF) by the World Health Assembly in May 2001 provides a globally accepted framework and classification system to describe, assess and compare function and disability. In order to facilitate the use of the ICF in everyday clinical practice and research, ICF core sets have been developed that focus on specific aspects of function typically associated with a particular disability. The objective of this paper is to outline the development process for the ICF core sets for persons following amputation. The ICF core sets are designed to translate the benefits of the ICF into clinical routine. They will be defined at a consensus conference that will integrate evidence from preparatory studies, namely: (a) a systematic literature review of the outcome measures used in clinical trials and observational studies, (b) semi-structured patient interviews, (c) an internet-based survey of international experts, and (d) cross-sectional, multi-center studies of clinical applicability. To validate the ICF core sets, field-testing will follow. Invitation for participation: the development of ICF core sets is an inclusive and open process, and anyone who wishes to actively participate is invited to do so.

    Ethnic differences in the distribution of normally formed singleton stillbirths

    Summary: The normally formed singleton stillbirth deliveries occurring in Dudley Road Hospital in 1979, 1980 and 1981 were classified according to the primary aetiology. There was a higher than normal stillbirth rate in the Indian group, which was almost entirely accounted for by the increased number of stillbirths falling into the 'intrauterine death before labour' group.

    Approaching an investigation of multi-dimensional inequality through the lenses of variety in models of capitalism

    After an overview of the state of poverty and inequality in the world, and of the contradictions economic theory has run into in this field after decades of globalization and in the midst of a persisting global crisis, Sections 2 and 3 outline the rationale for our theoretical analysis, underlining two main aspects. First, Section 2 recalls the reasons that make inequality a multidimensional phenomenon, while Section 3 explores why the models-of-capitalism theory is relevant for studying multidimensional inequality. These sections emphasise that inequality is a multidimensional and cumulative phenomenon and should not be conceived of only as the result of the processes of personal and functional distribution of income and wealth, which are themselves intrinsically multidimensional. The basic idea is that institutions, the web of relations among them and their interaction with the economic structure define the model of capitalism that characterises a specific country, and this, in turn, affects the level and dynamics of inequality. This approach is consistent with the sociological approach of Rehbein and Souza (2014), based on the analytical framework developed by Pierre Bourdieu. Section 4 outlines the rationale for our empirical analysis, applying the notion of institutional complementarity and examining the relationship between institutional complementarity, models of capitalism and inequality. Refining Amable's (2003) analysis, we provide empirical evidence on the relationship between inequality in income distribution and models of capitalism. Additionally, based on a cluster analysis (see the sketch below), we identify six different models of capitalism in a sample of OECD countries, provide preliminary evidence on the different levels of inequality that characterise each model, and suggest that no evidence supports the idea that a single model of capitalism is taking shape in this sphere in the EU. Section 5 gives some hints on issues in the search for a new interpretation capable of tying together the process of increasing inequality, the notion of symbolic violence and the models-of-capitalism theory. The last section presents conclusions useful for carrying forward our research agenda.
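
    As a hedged sketch of the kind of cluster analysis mentioned above (the indicator names, country data and even the choice of six clusters are assumptions for illustration, not the paper's data or exact method), one could group countries on standardized institutional indicators like this:

        # Illustrative clustering of countries on institutional indicators
        # (hypothetical data; the paper's actual variables and method may differ).
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(0)

        # Hypothetical country-level indicators: employment protection, union
        # density, public social spending, financial market depth, etc.
        countries = [f"country_{i:02d}" for i in range(24)]
        indicators = rng.normal(size=(len(countries), 5))

        # Standardize so each indicator contributes comparably, then fit k-means
        # with six clusters, matching the number of models reported in the paper.
        X = StandardScaler().fit_transform(indicators)
        labels = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(X)

        for country, label in zip(countries, labels):
            print(country, "-> model", label)

        # Within each resulting cluster one could then compare inequality measures
        # (e.g. Gini coefficients) to see how inequality differs across models.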

    Critical reflections on the benefits of ICT in education

    In both schools and homes, information and communication technologies (ICT) are widely seen as enhancing learning, a hope that has fuelled their rapid diffusion and adoption throughout developed societies. But they are not yet so embedded in the social practices of everyday life as to be taken for granted, with schools proving slower to change their lesson plans than they were to fit computers into the classroom. This article examines two possible explanations: first, that convincing evidence of improved learning outcomes remains surprisingly elusive, and second, that it remains unresolved whether ICT should be conceived of as supporting the delivery of traditional pedagogy or a radically different vision based on soft skills and new digital literacies. The difficulty of establishing traditional benefits, and the uncertainty over pursuing alternative benefits, raise fundamental questions about whether society really desires a transformed, technologically mediated relationship between teacher and learner.