    Power Spectrum Estimators For Large CMB Datasets

    Forthcoming high-resolution observations of the Cosmic Microwave Background (CMB) radiation will generate datasets many orders of magnitude larger than have been obtained to date. The size and complexity of such datasets present a very serious challenge to analysing them with existing or anticipated computers. Here we present an investigation of the currently favored algorithm for obtaining the power spectrum from a sky-temperature map: the quadratic estimator. We show that, whilst improving on direct evaluation of the likelihood function, current implementations still inherently scale as the equivalent of the cube of the number of pixels or worse, and demonstrate the critical importance of choosing the right implementation for a particular dataset. Comment: 8 pages LaTeX, no figures, corrected misaligned columns in table.

    Stylistic Fingerprints, POS-tags and Inflected Languages: A Case Study in Polish

    In stylometric investigations, frequencies of the most frequent words (MFWs) and character n-grams outperform other style-markers, even if their performance varies significantly across languages. In inflected languages, word endings play a prominent role, and hence different word forms cannot be recognized using generic text tokenization. Countless inflected word forms make frequencies sparse, making most statistical procedures complicated. Presumably, applying one of the NLP techniques, such as lemmatization and/or parsing, might increase the performance of classification. The aim of this paper is to examine the usefulness of grammatical features (as assessed via POS-tag n-grams) and lemmatized forms in recognizing authorial profiles, in order to address the underlying issue of the degree of freedom of choice within lexis and grammar. Using a corpus of Polish novels, we performed a series of supervised authorship attribution benchmarks, in order to compare the classification accuracy for different types of lexical and syntactic style-markers. Even if the performance of POS-tags as well as lemmatized forms was consistently worse than that of lexical markers, the difference was not substantial and never exceeded ca. 15%.

    Modeling the dynamics of language change: logistic regression, Piotrowski's law, and a handful of examples in Polish

    The study discusses modeling diachronic processes by logistic regression. The phenomenon of nonlinear changes in language was first observed by Raimund Piotrowski (hence labelled as Piotrowski's law), even if actual linguistic evidence usually speaks against using the notion of a "law" in this context. In our study, we apply logistic regression models to 9 changes which occurred between the 15th and 18th centuries in the Polish language. The attested course of the majority of these changes closely follows the expected values, which suggests that language change might indeed resemble a nonlinear phase change scenario. We also extend Piotrowski's original approach by proposing polynomial logistic regression for those cases which can hardly be described by its standard version. Also, we propose to consider individual language change cases jointly, in order to inspect their possible collinearity or, more likely, their different dynamics as a function of time. Last but not least, we evaluate our results by testing the influence of the subcorpus size on the model's goodness-of-fit.
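As an illustration of the kind of model the abstract above describes, the following is a minimal sketch of fitting a logistic curve p(t) = 1 / (1 + exp(-(a + b t))) to the share of an innovative form over time. It is not the authors' code: the function name `fit_logistic`, the centred time scale in centuries, and the synthetic counts are all assumptions made for the example.

```python
import numpy as np

def fit_logistic(t, k, n, n_iter=25):
    """Fit p(t) = 1 / (1 + exp(-(a + b*t))) by Newton's method (IRLS).

    t: time points (assumed centred and scaled, e.g. centuries from a midpoint)
    k: occurrences of the new form at each time point
    n: total occurrences (new + old form) at each time point
    Returns the array [a, b].
    """
    t = np.asarray(t, dtype=float)
    k = np.asarray(k, dtype=float)
    n = np.asarray(n, dtype=float)
    X = np.column_stack([np.ones_like(t), t])   # design matrix: intercept, time
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))     # current fitted proportions
        grad = X.T @ (k - n * p)                # gradient of the binomial log-likelihood
        weights = n * p * (1.0 - p)
        hess = X.T @ (weights[:, None] * X)     # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Synthetic (not attested) data: expected counts from a known curve.
t = np.linspace(-2.0, 2.0, 9)                   # hypothetical: centuries from 1600
n = np.full_like(t, 100.0)
k = n / (1.0 + np.exp(-(0.5 + 2.0 * t)))
a, b = fit_logistic(t, k, n)
midpoint = -a / b                               # time at which the new form reaches 50%
```

Under this parameterization, `b` controls how fast the change proceeds and `-a / b` gives the time at which old and new forms are equally frequent. A polynomial variant of the kind the abstract mentions would add further columns (for example `t**2`) to the design matrix.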

    Markov Chain Beam Randomization: a study of the impact of PLANCK beam measurement errors on cosmological parameter estimation

    We introduce a new method to propagate uncertainties in the beam shapes used to measure the cosmic microwave background to cosmological parameters determined from those measurements. The method, which we call Markov Chain Beam Randomization (MCBR), randomly samples from a set of templates or functions that describe the beam uncertainties. The method is much faster than direct numerical integration over systematic 'nuisance' parameters, and is not restricted to simple, idealized cases as is analytic marginalization. It does not assume the data are normally distributed, and does not require Gaussian priors on the specific systematic uncertainties. We show that MCBR properly accounts for and provides the marginalized errors of the parameters. The method can be generalized and used to propagate any systematic uncertainties for which a set of templates is available. We apply the method to the Planck satellite, and consider future experiments. Beam measurement errors should have a small effect on cosmological parameters as long as the beam fitting is performed after removal of 1/f noise. Comment: 17 pages, 23 figures, revised version with improved explanation of the MCBR and overall wording. Accepted for publication in Astronomy and Astrophysics (to appear in the Planck pre-launch special issue).

    Optimized Large-Scale CMB Likelihood And Quadratic Maximum Likelihood Power Spectrum Estimation

    We revisit the problem of exact CMB likelihood and power spectrum estimation with the goal of minimizing computational cost through linear compression. This idea was originally proposed for CMB purposes by Tegmark et al. (1997), and here we develop it into a fully working computational framework for large-scale polarization analysis, adopting WMAP as a worked example. We compare five different linear bases (pixel space, harmonic space, noise covariance eigenvectors, signal-to-noise covariance eigenvectors and signal-plus-noise covariance eigenvectors) in terms of compression efficiency, and find that the computationally most efficient basis is the signal-to-noise eigenvector basis, which is closely related to the Karhunen-Loeve and Principal Component transforms, in agreement with previous suggestions. For this basis, the information in 6836 unmasked WMAP sky map pixels can be compressed into a smaller set of 3102 modes, with a maximum error increase of any single multipole of 3.8% at $\ell \le 32$, and a maximum shift in the mean values of a joint distribution of an amplitude--tilt model of $0.006\sigma$. This compression reduces the computational cost of a single likelihood evaluation by a factor of 5, from 38 to 7.5 CPU seconds, and it also results in a more robust likelihood by implicitly regularizing nearly degenerate modes. Finally, we use the same compression framework to formulate a numerically stable and computationally efficient variation of the Quadratic Maximum Likelihood implementation that requires less than 3 GB of memory and 2 CPU minutes per iteration for $\ell \le 32$, rendering low-$\ell$ QML CMB power spectrum analysis fully tractable on a standard laptop. Comment: 13 pages, 13 figures, accepted by ApJ.
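The signal-to-noise eigenvector basis named in the abstract above can be illustrated with a generic Karhunen-Loeve style compression. The sketch below is not the paper's implementation: the function name `sn_compress`, the choice to keep a fixed number of modes, and the random test covariances are assumptions for the example. It whitens the signal covariance S by the noise covariance N, diagonalizes the result, and keeps the modes with the highest signal-to-noise eigenvalues.

```python
import numpy as np

def sn_compress(S, N, n_keep):
    """Return a projection onto the n_keep highest signal-to-noise eigenmodes.

    S: signal covariance (symmetric, positive semi-definite)
    N: noise covariance (symmetric, positive definite)
    Returns (P, snr): P has shape (n_keep, n_pix) and snr holds the kept
    eigenvalues in descending order. In the compressed basis the noise
    covariance P N P^T is the identity and P S P^T is diag(snr).
    """
    L = np.linalg.cholesky(N)                 # N = L L^T
    L_inv = np.linalg.inv(L)
    whitened = L_inv @ S @ L_inv.T            # noise-whitened signal covariance
    evals, evecs = np.linalg.eigh(whitened)   # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_keep]  # indices of the largest eigenvalues
    P = evecs[:, order].T @ L_inv
    return P, evals[order]

# Hypothetical toy covariances: 6 "pixels", rank-3 signal, diagonal noise.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 3))
S = A @ A.T
N = np.diag(rng.uniform(0.5, 2.0, size=6))
P, snr = sn_compress(S, N, n_keep=3)
d = rng.normal(size=6)                        # stand-in data vector
d_compressed = P @ d                          # 3 numbers instead of 6
```

Because the compressed covariance is diagonal plus identity, a Gaussian likelihood evaluated in this basis needs only operations on `n_keep` modes, which is the general reason such a compression can reduce cost. How the authors chose their cutoff is not stated in the abstract.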

    2-Point Correlations in the COBE DMR 4-Year Anisotropy Maps

    The 2-point temperature correlation function is evaluated from the 4-year COBE DMR microwave anisotropy maps. We examine the 2-point function, which is the Legendre transform of the angular power spectrum, and show that the data are statistically consistent from channel to channel and frequency to frequency. The most likely quadrupole normalization is computed for a scale-invariant power-law spectrum of CMB anisotropy, using a variety of data combinations. For a given data set, the normalization inferred from the 2-point data is consistent with that inferred by other methods. The smallest and largest normalizations deduced from any data combination are 16.4 and 19.6 μK respectively, with a value ~18 μK generally preferred. Comment: Submitted to ApJ Letters.
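The relation the abstract above mentions, that the 2-point function is the Legendre transform of the angular power spectrum, is the standard sum C(θ) = Σ_ℓ (2ℓ+1)/(4π) C_ℓ P_ℓ(cos θ). A minimal sketch follows, assuming no beam or pixel window function and no monopole or dipole removal, neither of which is specified here; the function name `two_point_from_cl` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def two_point_from_cl(cl, theta):
    """Evaluate C(theta) = sum_l (2l+1)/(4 pi) * C_l * P_l(cos theta).

    cl: power spectrum values C_0, C_1, ..., C_lmax
    theta: angular separation(s) in radians
    """
    cl = np.asarray(cl, dtype=float)
    ells = np.arange(cl.size)
    coeffs = (2 * ells + 1) / (4 * np.pi) * cl   # Legendre-series coefficients
    return legendre.legval(np.cos(theta), coeffs)

# Toy spectrum with power only in the quadrupole (l = 2).
cl = np.array([0.0, 0.0, 1.0])
theta = np.linspace(0.0, np.pi, 5)
corr = two_point_from_cl(cl, theta)
```

For a pure quadrupole the result is proportional to P_2(cos θ), so the correlation equals 5/(4π) at zero separation and changes sign at intermediate angles.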

    Power Spectrum of Primordial Inhomogeneity Determined from the 4-Year COBE DMR Sky Maps

    The method of Fourier analysis and power spectrum estimation of the cosmic microwave background anisotropy on an incompletely sampled sky developed by Gorski (1994) has been applied to the high-latitude portion of the 4-year COBE DMR 31.5, 53 and 90 GHz sky maps. Likelihood analysis using newly constructed Galaxy cuts (extended beyond |b| = 20° to excise the known foreground emission) and simultaneously correcting for the faint high-latitude galactic foreground emission is conducted on the DMR sky maps pixelized in both ecliptic and galactic coordinates. The Bayesian power spectrum estimation from the foreground-corrected 4-year COBE DMR data renders n ~ 1.2 ± 0.3, and Q_{rms-PS} ~ 15.3^{+3.7}_{-2.8} μK (projections of the two-parameter likelihood). These results are consistent with the Harrison-Zel'dovich n = 1 model of amplitude Q_{rms-PS} ~ 18 μK detected with significance exceeding 14σ (dQ/Q < 0.07). (A small power spectrum amplitude drop below the published 2-year results is predominantly due to the application of the new, extended Galaxy cuts.) Comment: 9 pages of text in LaTeX, 1 postscript Table, 4 postscript figures (2 color plates), submitted to The Astrophysical Journal (Letters).

    Probing non-Gaussianities in the CMB on an incomplete sky using surrogates

    We demonstrate the feasibility of generating surrogates by Fourier-based methods for an incomplete data set. This is performed for the case of a CMB analysis, where astrophysical foreground emission, mainly present in the Galactic plane, is a major challenge. The shuffling of the Fourier phases for generating surrogates is enabled by transforming the spherical harmonics into a new set of basis functions that are orthonormal on the cut sky. The results show that non-Gaussianities and hemispherical asymmetries in the CMB, as identified in several former investigations, can still be detected even when the complete Galactic plane (|b| < 30°) is removed. We conclude that the Galactic plane cannot be the dominant source for these anomalies. The results point towards a violation of statistical isotropy. Comment: 9 pages, 13 figures, accepted by Physical Review