30 research outputs found
RProtoBuf: Efficient Cross-Language Data Serialization in R
Modern data collection and analysis pipelines often involve a sophisticated mix of applications written in general purpose and specialized programming languages. Many formats commonly used to import and export data between different programs or systems, such as CSV or JSON, are verbose, inefficient, not type-safe, or tied to a specific programming language. Protocol Buffers are a popular method of serializing structured data between applications while remaining independent of programming languages or operating systems. They offer a unique combination of features, performance, and maturity that seems particularly well suited for data-driven applications and numerical computing. The RProtoBuf package provides a complete interface to Protocol Buffers from the R environment for statistical computing. This paper outlines the general class of data serialization requirements for statistical computing, describes the implementation of the RProtoBuf package, and illustrates its use with example applications in large-scale data collection pipelines and web services.
Probing the evolution of Stark wave packets by a weak half cycle pulse
We probe the dynamic evolution of a Stark wave packet in cesium using weak
half-cycle pulses (HCPs). The state-selective field ionization (SSFI) spectra
taken as a function of HCP delay reveal wave packet dynamics such as Kepler
beats, Stark revivals and fractional revivals. A quantum-mechanical simulation
explains the results as multi-mode interference induced by the HCP.
Comment: 4 pages, incl. 3 figures, submitted to PR
Author Roger Bivand [cre, aut], Nicholas Lewin-
Description Set of tools for manipulating and reading geographic data, in particular ESRI shapefiles; C code used from shapelib. It includes binary access to GSHHG shoreline files. The package also provides interface wrappers for exchanging spatial objects with packages such as PBSmapping, spatstat, maps, RArcInfo, Stata tmap, WinBUGS, Mondrian, and others.
License GPL (>= 2
Ab initio van der Waals interactions in simulations of water alter structure from mainly tetrahedral to high-density-like
The structure of liquid water at ambient conditions is studied in ab initio
molecular dynamics simulations using van der Waals (vdW) density-functional
theory, i.e. using the new exchange-correlation functionals optPBE-vdW and
vdW-DF2. Inclusion of the more isotropic vdW interactions counteracts highly
directional hydrogen-bonds, which are enhanced by standard functionals. This
brings about a softening of the microscopic structure of water, as seen from
the broadening of angular distribution functions and, in particular, from the
much lower and broader first peak in the oxygen-oxygen pair-correlation
function (PCF), indicating loss of structure in the outer solvation shells. In
combination with softer non-local correlation terms, as in the new
parameterization of vdW-DF, inclusion of vdW interactions is shown to shift the
balance of resulting structures from open tetrahedral to more close-packed. The
resulting O-O PCF shows some resemblance with experiment for high-density water
(A. K. Soper and M. A. Ricci, Phys. Rev. Lett., 84:2881, 2000), but not
directly with experiment for ambient water. However, an O-O PCF consisting of a
linear combination of 70% from vdW-DF2 and 30% from experiment on low-density
liquid water reproduces near-quantitatively the experimental O-O PCF for
ambient water, indicating consistency with a two-liquid model with fluctuations
between high- and low-density regions.
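The linear combination described in this abstract can be written out as a short formula. The symbols below are notation assumed here for illustration, not taken from the paper: $g_{\mathrm{OO}}^{\mathrm{vdW\text{-}DF2}}$ is the simulated O-O PCF, $g_{\mathrm{OO}}^{\mathrm{LDL}}$ the experimental PCF for low-density liquid water, and $g_{\mathrm{OO}}^{\mathrm{amb}}$ the experimental PCF for ambient water.

```latex
\[
  g_{\mathrm{OO}}^{\mathrm{mix}}(r)
    = 0.7\, g_{\mathrm{OO}}^{\mathrm{vdW\text{-}DF2}}(r)
    + 0.3\, g_{\mathrm{OO}}^{\mathrm{LDL}}(r)
    \approx g_{\mathrm{OO}}^{\mathrm{amb}}(r)
\]
```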
RProtoBuf
Modern data collection and analysis pipelines often involve a sophisticated mix of applications written in general purpose and specialized programming languages. Many formats commonly used to import and export data between different programs or systems, such as CSV or JSON, are verbose, inefficient, not type-safe, or tied to a specific programming language. Protocol Buffers are a popular method of serializing structured data between applications while remaining independent of programming languages or operating systems. They offer a unique combination of features, performance, and maturity that seems particularly well suited for data-driven applications and numerical computing. The RProtoBuf package provides a complete interface to Protocol Buffers from the R environment for statistical computing. This paper outlines the general class of data serialization requirements for statistical computing, describes the implementation of the RProtoBuf package, and illustrates its use with example applications in large-scale data collection pipelines and web services.
Uncertainty in aggregate estimates from sampled distributed traces
Tracing mechanisms in distributed systems give important insight into system properties and are usually sampled to control overhead. At Google, Dapper [8] is the always-on system for distributed tracing and performance analysis, and it samples fractions of all RPC traffic. Due to difficult implementation, excessive data volume, or a lack of perfect foresight, there are times when system quantities of interest have not been measured directly, and Dapper samples can be aggregated to estimate those quantities in the short or long term. Here we find unbiased variance estimates of linear statistics over RPCs, taking into account all layers of sampling that occur in Dapper, and allowing us to quantify the sampling uncertainty in the aggregate estimates. We apply this methodology to the problem of assigning jobs and data to Google datacenters, using estimates of the resulting cross-datacenter traffic as an optimization criterion, and also to the detection of change points in access patterns to certain data partitions.
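As a rough illustration of what an unbiased variance estimate for a linear statistic over sampled records can look like, the sketch below uses the textbook Horvitz-Thompson estimator under a single layer of independent (Poisson) sampling with known inclusion probabilities. This is an assumption made for illustration only: the abstract refers to several layers of sampling in Dapper, which this sketch does not model, and the function names `ht_total`, `ht_variance`, and `exact_expectations` are hypothetical.

```python
from itertools import product


def ht_total(values, probs):
    """Horvitz-Thompson estimate of a population total from the sampled values.

    values: measurements of the sampled records (e.g. bytes per RPC).
    probs:  known inclusion probability of each sampled record.
    """
    return sum(y / p for y, p in zip(values, probs))


def ht_variance(values, probs):
    """Unbiased estimate of Var(ht_total), assuming records are sampled independently."""
    return sum((1.0 - p) / p**2 * y**2 for y, p in zip(values, probs))


def exact_expectations(population, probs):
    """Average both estimators over every possible sample of a small population.

    Enumerates all 2^n include/exclude patterns, weighting each by its
    probability, so the results are exact expectations rather than simulations.
    """
    mean_total = 0.0
    mean_var_est = 0.0
    for mask in product([0, 1], repeat=len(population)):
        weight = 1.0
        for keep, p in zip(mask, probs):
            weight *= p if keep else 1.0 - p
        ys = [y for y, keep in zip(population, mask) if keep]
        ps = [p for p, keep in zip(probs, mask) if keep]
        mean_total += weight * ht_total(ys, ps)
        mean_var_est += weight * ht_variance(ys, ps)
    return mean_total, mean_var_est
```

For a toy population such as `[3.0, 5.0, 8.0]` with inclusion probabilities `[0.5, 0.25, 0.1]`, `exact_expectations` should return the true total and the true sampling variance of the estimator, which is what unbiasedness means in this single-layer setting.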