Estimators of Fractal Dimension: Assessing the Roughness of Time Series and Spatial Data
The fractal or Hausdorff dimension is a measure of roughness (or smoothness)
for time series and spatial data. The graph of a smooth, differentiable surface
indexed in [0,1]^d has topological and fractal dimension d. If the surface is
nondifferentiable and rough, the fractal dimension takes values between the
topological dimension, d, and d + 1. We review and assess
estimators of fractal dimension by their large sample behavior under infill
asymptotics, in extensive finite sample simulation studies, and in a data
example on Arctic sea-ice profiles. For time series or line transect data,
box-count, Hall--Wood, semi-periodogram, discrete cosine transform and wavelet
estimators are studied along with variation estimators with power indices 2
(variogram) and 1 (madogram), all implemented in the R package fractaldim.
Considering both efficiency and robustness, we recommend the use of the
madogram estimator, which can be interpreted as a statistically more efficient
version of the Hall--Wood estimator. For two-dimensional lattice data, we
propose robust transect estimators that use the median of variation estimates
along rows and columns. Generally, the link between power variations of index p
for stochastic processes and the Hausdorff dimension of their sample paths
appears to be particularly robust and inclusive when p = 1.
Comment: Published at http://dx.doi.org/10.1214/11-STS370 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
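The variation estimators mentioned in the abstract admit a very compact implementation. The sketch below is a minimal illustration, not the fractaldim package's code: for a time series with fractal dimension D, the madogram (power index p = 1) satisfies V_1(l) = mean |x[t+l] - x[t]| ~ l^(2-D), so an ordinary least-squares fit of log V_1(l) against log l yields the estimate D_hat = 2 - slope. The function name and lag range are choices made here for illustration.

```python
import numpy as np

def madogram_fd(x, max_lag=10):
    """Madogram estimate of the fractal dimension of a time series.

    For each lag l, compute V_1(l) = mean |x[t+l] - x[t]| (the p = 1
    variation), then regress log V_1(l) on log l.  Since V_1(l) ~ l^(2-D),
    the estimate is D_hat = 2 - slope.  A smooth series gives D near 1;
    a very rough one gives D near 2.
    """
    x = np.asarray(x, dtype=float)
    lags = np.arange(1, max_lag + 1)
    v = np.array([np.mean(np.abs(x[l:] - x[:-l])) for l in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(v), 1)
    return 2.0 - slope
```

For a perfectly smooth series such as a straight line, V_1(l) = l exactly, so the fit recovers D = 1; for white noise, V_1(l) is roughly constant in l and the estimate approaches 2, the rough extreme.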
Using Labeled Data to Evaluate Change Detectors in a Multivariate Streaming Environment
We consider the problem of detecting changes in a multivariate data stream. A change detector is defined by a detection algorithm and an alarm threshold. A detection algorithm maps the stream of input vectors into a univariate detection stream. The detector signals a change when the detection stream exceeds the chosen alarm threshold. We consider two aspects of the problem: (1) setting the alarm threshold and (2) measuring/comparing the performance of detection algorithms. We assume we are given a segment of the stream where changes of interest are marked. We present evidence that, without such marked training data, it might not be possible to accurately estimate the false alarm rate for a given alarm threshold. Commonly used approaches assume the data stream consists of independent observations, an implausible assumption given the time series nature of the data. Lack of independence can lead to estimates that are badly biased. Marked training data can also be used for realistic comparison of detection algorithms. We define a version of the receiver operating characteristic curve adapted to the change detection problem and propose a block bootstrap for comparing such curves. We illustrate the proposed methodology using multivariate data derived from an image stream.
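The framework in this abstract (detection algorithm plus alarm threshold) can be made concrete with a deliberately simple placeholder algorithm; the sketch below is an illustration of the setup, not the paper's method. It maps a multivariate stream into a univariate detection stream by measuring the distance between the mean of a short recent window and the mean of a longer reference window preceding it, and signals a change wherever that stream exceeds the threshold. All names and window sizes are choices made here.

```python
import numpy as np

def detection_stream(stream, ref_win=20, test_win=5):
    """Map a multivariate stream (n_samples, n_dims) to a univariate
    detection stream.  At each step t, compute the Euclidean distance
    between the mean of the last test_win observations and the mean of
    the ref_win observations preceding them.  This is one (simple)
    detection algorithm; the framework accepts any such mapping."""
    stream = np.asarray(stream, dtype=float)
    n = len(stream)
    det = np.zeros(n)
    for t in range(ref_win + test_win, n):
        ref = stream[t - ref_win - test_win : t - test_win].mean(axis=0)
        recent = stream[t - test_win : t].mean(axis=0)
        det[t] = np.linalg.norm(recent - ref)
    return det

def alarms(det, threshold):
    """Indices where the detector signals a change: the detection
    stream exceeds the chosen alarm threshold."""
    return np.flatnonzero(det > threshold)
```

Choosing the threshold is exactly the problem the abstract raises: because consecutive values of the detection stream are strongly dependent, the false alarm rate at a given threshold cannot be read off from independence-based calculations, which motivates the marked training data and block bootstrap.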
Interpretation of North Pacific Variability as a Short- and Long-Memory Process*
A major difficulty in investigating the nature of interdecadal variability of climatic time series is their shortness. An approach to this problem is through comparison of models. In this paper we contrast a first-order autoregressive (AR(1)) model with a fractionally differenced (FD) model as applied to the winter-averaged sea level pressure time series for the Aleutian low (the North Pacific (NP) index), and the Sitka winter air temperature record. Both models fit the same number of parameters. The AR(1) model is a ‘short memory’ model in that it has a rapidly decaying autocovariance sequence, whereas an FD model exhibits ‘long memory’ because its autocovariance sequence decays more slowly. Statistical tests cannot distinguish the superiority of one model over the other when fit with 100 NP or 146 Sitka data points. The FD model does equally well for short-term prediction and has potentially important implications for long-term behavior. In particular, the zero crossings of the FD model tend to be further apart, so they have more of a ‘regime’-like character; a quarter-century interval between zero crossings is four times more likely with the FD than the AR(1) model. The long memory parameter δ for the FD model can be used as a characterization of regime-like behavior. The estimated δs for the NP index (spanning 100 years) and the Sitka time series (168 years) are virtually identical, and their size implies moderate long memory behavior. Although the NP index and the Sitka series have broadband low-frequency variability and modest long memory behavior, temporal irregularities in their zero crossings are still prevalent. Comparison of the FD and AR(1) models indicates that regime-like behavior cannot be ruled out for North Pacific processes.
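The short- versus long-memory contrast can be made concrete through the two models' autocorrelation functions. As an illustrative sketch (d below plays the role of the abstract's δ; the function names are choices made here), an AR(1) has rho(k) = phi^k, which decays exponentially, while an FD(d) process has rho(k) = prod_{i=1..k} (d + i - 1) / (i - d), which decays hyperbolically, like k^(2d-1):

```python
def fd_acf(d, k):
    """Lag-k autocorrelation of a fractionally differenced FD(d)
    process: rho(k) = prod_{i=1..k} (d + i - 1) / (i - d).
    Decays hyperbolically (~ k^(2d-1)): 'long memory'."""
    rho = 1.0
    for i in range(1, k + 1):
        rho *= (d + i - 1) / (i - d)
    return rho

def ar1_acf(phi, k):
    """Lag-k autocorrelation of an AR(1) process: rho(k) = phi^k.
    Decays exponentially: 'short memory'."""
    return phi ** k
```

Matching the two models at lag 1 (phi = d / (1 - d), e.g. phi ≈ 0.43 for d = 0.3), the FD autocorrelation at lag 25 is many orders of magnitude larger than the AR(1) value, which is the sense in which an FD process retains memory over decades and its zero crossings tend to fall further apart.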
The clustering of massive galaxies at z~0.5 from the first semester of BOSS data
We calculate the real- and redshift-space clustering of massive galaxies at
z~0.5 using the first semester of data from the Baryon Oscillation Spectroscopic
Survey (BOSS). We study the correlation functions of a sample of 44,000 massive
galaxies in the redshift range 0.4<z<0.7. We present a halo-occupation
distribution modeling of the clustering results and discuss the implications
for the manner in which massive galaxies at z~0.5 occupy dark matter halos. The
majority of our galaxies are central galaxies living in halos of mass
10^{13}Msun/h, but 10% are satellites living in halos 10 times more massive.
These results are broadly in agreement with earlier investigations of massive
galaxies at z~0.5. The inferred large-scale bias (b~2) and relatively high
number density (nbar=3e-4 h^3 Mpc^{-3}) imply that BOSS galaxies are excellent
tracers of large-scale structure, suggesting BOSS will enable a wide range of
investigations on the distance scale, the growth of large-scale structure,
massive galaxy evolution, and other topics.
Comment: 11 pages, 12 figures, matches version accepted by ApJ
Decoherence, the measurement problem, and interpretations of quantum mechanics
Environment-induced decoherence and superselection have been a subject of
intensive research over the past two decades, yet their implications for the
foundational problems of quantum mechanics, most notably the quantum
measurement problem, have remained a matter of great controversy. This paper is
intended to clarify key features of the decoherence program, including its more
recent results, and to investigate their application and consequences in the
context of the main interpretive approaches to quantum mechanics.
Comment: 41 pages. Final published version.
Cosmological parameters from SDSS and WMAP
We measure cosmological parameters using the three-dimensional power spectrum
P(k) from over 200,000 galaxies in the Sloan Digital Sky Survey (SDSS) in
combination with WMAP and other data. Our results are consistent with a
``vanilla'' flat adiabatic Lambda-CDM model without tilt (n=1), running tilt,
tensor modes or massive neutrinos. Adding SDSS information more than halves the
WMAP-only error bars on some parameters, tightening 1 sigma constraints on the
Hubble parameter from h~0.74+0.18-0.07 to h~0.70+0.04-0.03, on the matter
density from Omega_m~0.25+/-0.10 to Omega_m~0.30+/-0.04 (1 sigma) and on
neutrino masses from <11 eV to <0.6 eV (95%). SDSS helps even more when
dropping prior assumptions about curvature, neutrinos, tensor modes and the
equation of state. Our results are in substantial agreement with the joint
analysis of WMAP and the 2dF Galaxy Redshift Survey, which is an impressive
consistency check with independent redshift survey data and analysis
techniques. In this paper, we place particular emphasis on clarifying the
physical origin of the constraints, i.e., what we do and do not know when using
different data sets and prior assumptions. For instance, dropping the
assumption that space is perfectly flat, the WMAP-only constraint on the
measured age of the Universe tightens from t0~16.3+2.3-1.8 Gyr to
t0~14.1+1.0-0.9 Gyr by adding SDSS and SN Ia data. Including tensors, running
tilt, neutrino mass and equation of state in the list of free parameters, many
constraints are still quite weak, but future cosmological measurements from
SDSS and other sources should allow these to be substantially tightened.
Comment: Minor revisions to match accepted PRD version. SDSS data and ppt
figures available at http://www.hep.upenn.edu/~max/sdsspars.htm
The Seventh Data Release of the Sloan Digital Sky Survey
This paper describes the Seventh Data Release of the Sloan Digital Sky Survey
(SDSS), marking the completion of the original goals of the SDSS and the end of
the phase known as SDSS-II. It includes 11663 deg^2 of imaging data, with most
of the roughly 2000 deg^2 increment over the previous data release lying in
regions of low Galactic latitude. The catalog contains five-band photometry for
357 million distinct objects. The survey also includes repeat photometry over
250 deg^2 along the Celestial Equator in the Southern Galactic Cap. A
coaddition of these data goes roughly two magnitudes fainter than the main
survey. The spectroscopy is now complete over a contiguous area of 7500 deg^2
in the Northern Galactic Cap, closing the gap that was present in previous data
releases. There are over 1.6 million spectra in total, including 930,000
galaxies, 120,000 quasars, and 460,000 stars. The data release includes
improved stellar photometry at low Galactic latitude. The astrometry has all
been recalibrated with the second version of the USNO CCD Astrograph Catalog
(UCAC-2), reducing the rms statistical errors at the bright end to 45
milli-arcseconds per coordinate. A systematic error in bright galaxy photometry
is less severe than previously reported for the majority of galaxies. Finally,
we describe a series of improvements to the spectroscopic reductions, including
better flat-fielding and improved wavelength calibration at the blue end,
better processing of objects with extremely strong narrow emission lines, and
an improved determination of stellar metallicities. (Abridged)
Comment: 20 pages, 10 embedded figures. Accepted to ApJS after minor
corrections