The heat of atomization of sulfur trioxide, SO3 - a benchmark for computational thermochemistry
Calibration ab initio (direct coupled cluster) calculations including basis
set extrapolation, relativistic effects, inner-shell correlation, and an
anharmonic zero-point energy, predict the total atomization energy at 0 K of
SO3 to be 335.96 (observed 335.92±0.19) kcal/mol. Inner polarization
functions make very large (40 kcal/mol with , 10 kcal/mol with
basis sets) contributions to the SCF part of the binding energy. The molecule
presents an unusual hurdle for less computationally intensive theoretical
thermochemistry methods and is proposed as a benchmark for them. A slight
modification of Weizmann-1 (W1) theory is proposed that appears to
significantly improve performance for second-row compounds. Comment: Chem. Phys. Lett., in press
Bulkloading and Maintaining XML Documents
The popularity of XML as an exchange and storage format brings about massive amounts of documents to be stored, maintained and analyzed -- a challenge that traditionally has been tackled with Database Management Systems (DBMS). To open up the content of XML documents to analysis with declarative query languages, efficient bulk loading techniques are necessary.
Database technology has traditionally offered support for these tasks, yet it falls short of providing efficient automation techniques for the challenges that large collections of XML data raise. As a storage back-end, many applications rely on relational databases, which are designed for large data volumes. This paper studies bulk load and update algorithms for XML data stored in relational format and outlines opportunities and problems. We investigate both (1) bulk insertion and deletion and (2) updates in the form of edit scripts, which rely heavily on pointer-chasing techniques that are often considered orthogonal to the algebraic operations relational databases are optimized for. To get the most out of relational database systems, we show that one should make careful use of edit scripts and replace them with bulk operations if more than a very small portion of the database is updated.
We implemented our ideas on top of the Monet Database System and benchmarked their performance.
Memory aware query scheduling in a database cluster
Query throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications: they are of lower complexity and almost exclusively read-only. The architecture we propose here is specifically tailored to take advantage of these query characteristics. It is based on a large parallel shared-nothing database cluster where each node runs a separate server with a fully replicated copy of the database. A query is assigned to and entirely executed on a single node, avoiding network contention and synchronization effects. However, the actual key to enhanced throughput is a resource-efficient scheduling of the arriving queries. We develop a simple and robust scheduling scheme that takes the data currently resident in memory at each server into account and trades off memory re-use and execution time, reordering queries as necessary. Our experimental evaluation demonstrates its effectiveness when scaling the system beyond hundreds of nodes, showing super-linear speedup.
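The trade-off between memory re-use and execution time described in this abstract can be illustrated with a small sketch. It is not the scheduling scheme of the paper: the `Node` class, the `schedule` function, the LRU eviction policy and the cost constants `LOAD_COST` and `EXEC_COST` are all hypothetical, chosen only to show how a dispatcher might weigh a node's queue backlog against the data it already holds in memory.

```python
from dataclasses import dataclass, field

# Hypothetical cost model: loading one relation from disk is assumed to be
# five times as expensive as executing a query on memory-resident data.
LOAD_COST = 5.0
EXEC_COST = 1.0


@dataclass
class Node:
    name: str
    capacity: int                                  # relations that fit in memory
    resident: list = field(default_factory=list)   # LRU order, most recent last
    busy_until: float = 0.0                        # time when queued work finishes

    def missing(self, relations):
        """Relations the query needs that are not in this node's memory."""
        return [r for r in relations if r not in self.resident]

    def touch(self, relations):
        """Mark relations as most recently used, evicting the oldest if needed."""
        for r in relations:
            if r in self.resident:
                self.resident.remove(r)
            self.resident.append(r)
        del self.resident[:max(0, len(self.resident) - self.capacity)]


def schedule(query_relations, nodes, now=0.0):
    """Assign the query to the node with the earliest estimated finish time."""
    def finish_time(node):
        start = max(now, node.busy_until)
        return start + LOAD_COST * len(node.missing(query_relations)) + EXEC_COST

    best = min(nodes, key=finish_time)
    best.busy_until = finish_time(best)   # computed before memory state changes
    best.touch(query_relations)
    return best
```

In this sketch a query returns to a node that already holds its data unless that node's backlog outweighs the load cost elsewhere, which is one simple way to express the re-use versus execution-time trade-off.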
Scalable storage for a DBMS using transparent distribution
Scalable Distributed Data Structures (SDDSs) provide a self-managing and self-organizing data storage of potentially unbounded size. This stands in contrast to common distribution schemas deployed in conventional distributed DBMS. SDDSs, however, have mostly been used in synthetic scenarios to investigate their properties. In this paper we concentrate on the integration of the LH* SDDS into our efficient and extensible DBMS, called Monet. We show that this merge permits processing very large sets of distributed data. In our implementation we extended the relational algebra interpreter in such a way that access to data, whether it is distributed or locally stored, is transparent to the user. The on-the-fly optimization of operations --- heavily used in Monet --- to deploy different strategies and scenarios inside the primary operators associated with an SDDS adds self-adaptiveness to the query system; it dynamically adapts itself to unforeseen situations. We illustrate the performance efficiency by experiments on a network of workstations. The transparent integration of SDDSs opens new perspectives for very large self-managing database systems.
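LH* builds on linear hashing, so a single-process sketch of the linear hashing addressing rule may help readers unfamiliar with it. This is not the distributed LH* protocol and not the Monet integration: there are no clients, servers or message forwarding here, and the class name `LinearHashFile`, its methods and the overflow threshold are hypothetical, written only to show how a file grows one bucket at a time.

```python
class LinearHashFile:
    """Single-process linear hashing over integer keys (illustrative only)."""

    def __init__(self, bucket_capacity=2):
        self.level = 0            # current file level i
        self.split = 0            # index n of the next bucket to split
        self.buckets = [[]]       # the file always holds 2**level + split buckets
        self.capacity = bucket_capacity

    def address(self, key):
        """Bucket index for a key: h_i(key), or h_{i+1}(key) if already split."""
        a = key % (2 ** self.level)
        if a < self.split:
            a = key % (2 ** (self.level + 1))
        return a

    def insert(self, key):
        bucket = self.buckets[self.address(key)]
        bucket.append(key)
        if len(bucket) > self.capacity:   # an overflow triggers one split
            self._split()

    def _split(self):
        """Split bucket `split`, moving part of its keys to a new last bucket."""
        old = self.buckets[self.split]
        new_mod = 2 ** (self.level + 1)
        self.buckets[self.split] = [k for k in old if k % new_mod == self.split]
        self.buckets.append([k for k in old if k % new_mod != self.split])
        self.split += 1
        if self.split == 2 ** self.level:  # every bucket of this level is split
            self.level += 1
            self.split = 0
```

The bucket that is split is the one at the split pointer, not necessarily the one that overflowed, which is what lets the file grow gradually without a central directory.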
-storage : a self organizing multi-attribute storage technique for very large main memories
Main memory is continuously improving both in price and capacity. With this come new storage problems as well as new directions of usage. Just before the millennium, several main memory database systems are becoming commercially available. The hot areas include boosting the performance of web-enabled systems, such as search engines and auctioning systems. We present a novel data storage structure -- the -storage structure, a high performance data structure, allowing automatically indexed storage of very large amounts of multi-attribute data. The experiments show excellent performance for point retrieval, and highly efficient pruning for pattern searches. It provides the balanced storage previously achieved by random kd-trees, but avoids their increased pattern match search times by an effective assignment of bits of attributes. Moreover, it avoids the sensitivity of the kd-tree to insert orders.
The Charm Content of W+1 Jet Events as a Probe of the Strange Quark Distribution Function
We investigate the prospects for measuring the strange quark distribution
function of the proton in associated W plus charm quark production at the
Tevatron. The quark signal produced by strange quark -- gluon fusion,
and , is approximately 5%
of the inclusive jet cross section for jets with a transverse momentum
~GeV. We study the sensitivity of the W plus charm quark cross
section to the parametrization of the strange quark distribution function, and
evaluate the various background processes. Strategies to identify charm quarks
in CDF and DØ are discussed. For a charm tagging efficiency of about 10%
and an integrated luminosity of 30 pb^-1 or more, it should be possible to
constrain the strange quark distribution function from W plus charm production at the
Tevatron. Comment: submitted to Phys. Lett. B, Latex, 12 pages + 4 postscript figures
encoded with uufile, FSU-HEP-930812, MAD/TH/93-6, MAD/PH/788. A postscript
file with text and embedded figures is available via anonymous ftp at
hepsg1.physics.fsu.edu, file is /pub/keller/fsu-hep-930812.p
Stethoscope: A platform for interactive visual analysis of query execution plans
Searching for the performance bottleneck in an execution trace is an error-prone and time-consuming activity. Existing tools offer some comfort by providing a visual representation of a trace for analysis. In this paper we present Stethoscope, an interactive visual tool to inspect and analyze columnar database query performance, both online and offline. Its unique interactive animated interface capitalizes on
the large data-flow graph representation of a query execution plan, augmented with query execution trace information. We demonstrate features of Stethoscope for both online and offline analysis of long-running queries. It helps in understanding where time goes, how optimizers perform, and how
parallel processing on multi-core systems is exploited.