    The heat of atomization of sulfur trioxide, SO$_3$ - a benchmark for computational thermochemistry

    Calibration ab initio (direct coupled cluster) calculations including basis set extrapolation, relativistic effects, inner-shell correlation, and an anharmonic zero-point energy, predict the total atomization energy at 0 K of SO$_3$ to be 335.96 (observed 335.92 $\pm$ 0.19) kcal/mol. Inner polarization functions make very large (40 kcal/mol with $spd$, 10 kcal/mol with $spdfg$ basis sets) contributions to the SCF part of the binding energy. The molecule presents an unusual hurdle for less computationally intensive theoretical thermochemistry methods and is proposed as a benchmark for them. A slight modification of Weizmann-1 (W1) theory is proposed that appears to significantly improve performance for second-row compounds. Comment: Chem. Phys. Lett., in press

    Big data space fungus

    Bulkloading and Maintaining XML Documents

    The popularity of XML as an exchange and storage format brings about massive amounts of documents to be stored, maintained and analyzed -- a challenge that traditionally has been tackled with Database Management Systems (DBMS). To open up the content of XML documents to analysis with declarative query languages, efficient bulk loading techniques are necessary. Database technology has traditionally offered support for these tasks, yet it falls short of providing efficient automation techniques for the challenges that large collections of XML data raise. As a storage back-end, many applications rely on relational databases, which are designed for large data volumes. This paper studies bulk load and update algorithms for XML data stored in relational format and outlines opportunities and problems. We investigate both (1) bulk insertion and deletion and (2) updates in the form of edit scripts, which rely heavily on pointer-chasing techniques that are often considered orthogonal to the algebraic operations relational databases are optimized for. To get the most out of relational database systems, we show that one should make careful use of edit scripts and replace them with bulk operations if more than a very small portion of the database is updated. We implemented our ideas on top of the Monet Database System and benchmarked their performance.

    Memory aware query scheduling in a database cluster

    Query throughput is one of the primary optimization goals in interactive web-based information systems, which must perform well enough to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications: they are of lower complexity and almost exclusively read-only. The architecture we propose here is specifically tailored to take advantage of these query characteristics. It is based on a large parallel shared-nothing database cluster where each node runs a separate server with a fully replicated copy of the database. A query is assigned to and entirely executed on a single node, avoiding network contention or synchronization effects. However, the actual key to enhanced throughput is a resource-efficient scheduling of the arriving queries. We develop a simple and robust scheduling scheme that takes the data currently resident in memory at each server into account and trades off memory re-use and execution time, reordering queries as necessary. Our experimental evaluation demonstrates the scheme's effectiveness when scaling the system beyond hundreds of nodes, showing super-linear speedup.

    Scalable storage for a DBMS using transparent distribution

    Scalable Distributed Data Structures (SDDSs) provide self-managing and self-organizing data storage of potentially unbounded size. This stands in contrast to common distribution schemas deployed in conventional distributed DBMS. SDDSs, however, have mostly been used in synthetic scenarios to investigate their properties. In this paper we concentrate on the integration of the LH* SDDS into our efficient and extensible DBMS, called Monet. We show that this merge permits processing very large sets of distributed data. In our implementation we extended the relational algebra interpreter in such a way that access to data, whether it is distributed or locally stored, is transparent to the user. The on-the-fly optimization of operations --- heavily used in Monet --- to deploy different strategies and scenarios inside the primary operators associated with an SDDS adds self-adaptiveness to the query system; it dynamically adapts itself to unforeseen situations. We illustrate the performance efficiency by experiments on a network of workstations. The transparent integration of SDDSs opens new perspectives for very large self-managing database systems.

    $\Omega$-storage: a self organizing multi-attribute storage technique for very large main memories

    Main memory is continuously improving both in price and capacity. With this come new storage problems as well as new directions of usage. Just before the millennium, several main memory database systems are becoming commercially available. The hot areas include boosting the performance of web-enabled systems, such as search engines and auctioning systems. We present a novel data storage structure -- the $\Omega$-storage structure, a high performance data structure allowing automatically indexed storage of very large amounts of multi-attribute data. The experiments show excellent performance for point retrieval, and highly efficient pruning for pattern searches. It provides the balanced storage previously achieved by random kd-trees, but avoids their increased pattern match search times by an effective assignment of bits of attributes. Moreover, it avoids the sensitivity of the kd-tree to insert orders.

    The Charm Content of W+1 Jet Events as a Probe of the Strange Quark Distribution Function

    We investigate the prospects for measuring the strange quark distribution function of the proton in associated $W$ plus charm quark production at the Tevatron. The $W+c$ quark signal produced by strange quark -- gluon fusion, $sg\rightarrow W^-c$ and $\bar sg\rightarrow W^+\bar c$, is approximately 5\% of the inclusive $W+1$ jet cross section for jets with a transverse momentum $p_T(j)>10$~GeV. We study the sensitivity of the $W$ plus charm quark cross section to the parametrization of the strange quark distribution function, and evaluate the various background processes. Strategies to identify charm quarks in CDF and D\O\ are discussed. For a charm tagging efficiency of about 10\% and an integrated luminosity of 30~pb$^{-1}$ or more, it should be possible to constrain the strange quark distribution function from $W+c$ production at the Tevatron. Comment: submitted to Phys. Lett. B, Latex, 12 pages + 4 postscript figures encoded with uufile, FSU-HEP-930812, MAD/TH/93-6, MAD/PH/788. A postscript file with text and embedded figures is available via anonymous ftp at hepsg1.physics.fsu.edu, file is /pub/keller/fsu-hep-930812.p

    Stethoscope: A platform for interactive visual analysis of query execution plans

    Searching for the performance bottleneck in an execution trace is an error-prone and time-consuming activity. Existing tools offer some comfort by providing a visual representation of the trace for analysis. In this paper we present Stethoscope, an interactive visual tool to inspect and analyze columnar database query performance, both online and offline. Its unique interactive animated interface capitalizes on the large dataflow graph representation of a query execution plan, augmented with query execution trace information. We demonstrate features of Stethoscope for both online and offline analysis of long-running queries. It helps in understanding where time goes, how optimizers perform, and how parallel processing on multi-core systems is exploited.