Principal Components of CMB non-Gaussianity
The skew-spectrum statistic introduced by Munshi & Heavens (2010) has
recently been used in studies of non-Gaussianity from diverse cosmological data
sets including the detection of primary and secondary non-Gaussianity of Cosmic
Microwave Background (CMB) radiation. Extending previous work, focussed on
independent estimation, here we deal with the question of joint estimation of
multiple skew-spectra from the same or correlated data sets. We consider the
optimum skew-spectra for various models of primordial non-Gaussianity as well
as secondary bispectra that originate from the cross-correlation of secondaries
and lensing of CMB: coupling of lensing with the Integrated Sachs-Wolfe (ISW)
effect, coupling of lensing with thermal Sunyaev-Zeldovich (tSZ), as well as
from unresolved point-sources (PS). For joint estimation of various types of
non-Gaussianity, we use principal component analysis (PCA) to construct the
linear combinations of amplitudes of the various models of non-Gaussianity
that can be estimated from CMB maps. The bias induced in the estimation of
primordial non-Gaussianity due to
secondary non-Gaussianity is evaluated. The PCA approach allows one to infer
approximate (but generally accurate) constraints using CMB data sets on any
reasonably smooth model by use of a lookup table and performing a simple
computation. This principle is validated by computing constraints on the DBI
bispectrum using a PCA analysis of the standard templates.
Comment: 17 pages, 5 figures, 4 tables. Matches published version.
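The joint-estimation idea above can be sketched in miniature: diagonalize the joint Fisher (covariance) matrix of several bispectrum amplitude estimators, so that its eigenvectors give the independently measured linear combinations. The 4x4 matrix below is a synthetic illustration, not the paper's actual Fisher matrix for the primordial, ISW-lensing, tSZ-lensing, and point-source templates.

```python
import numpy as np

# Synthetic stand-in for the joint Fisher matrix of four bispectrum
# amplitude estimators (values are illustrative only).
F = np.array([
    [1.00, 0.60, 0.30, 0.10],
    [0.60, 1.00, 0.40, 0.20],
    [0.30, 0.40, 1.00, 0.15],
    [0.10, 0.20, 0.15, 1.00],
])

# Eigenvectors of F define the linear combinations of template amplitudes
# that decorrelate; eigenvalues measure how well each combination is
# constrained.
eigvals, eigvecs = np.linalg.eigh(F)
order = np.argsort(eigvals)[::-1]              # best-constrained first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

principal_combination = eigvecs[:, 0]          # best-measured combination
```

In the PCA-lookup-table spirit of the abstract, constraints on a new smooth model follow from projecting its template onto these principal components.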
Analyzing big time series data in solar engineering using features and PCA
In solar engineering, we encounter big time series data such as the satellite-derived irradiance data and string-level measurements from a utility-scale photovoltaic (PV) system. While storing and hosting big data are certainly possible using today’s data storage technology, it is challenging to effectively and efficiently visualize and analyze the data. We consider a data analytics algorithm to mitigate some of these challenges in this work. The algorithm computes a set of generic and/or application-specific features to characterize the time series, and subsequently uses principal component analysis to project these features onto a two-dimensional space. As each time series can be represented by features, it can be treated as a single data point in the feature space, allowing many operations to become more amenable. Three applications are discussed within the overall framework, namely (1) PV system type identification, (2) monitoring network design, and (3) anomalous string detection. The proposed framework can be easily translated to many other solar engineering applications.
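The feature-then-PCA pipeline described above can be sketched as follows. The irradiance series are synthetic, and the three features (mean, standard deviation, lag-1 autocorrelation) are illustrative stand-ins for the generic/application-specific features a real study would compute.

```python
import numpy as np

# Synthetic stand-ins for four measured time series (e.g. irradiance or
# string-level currents); each becomes one point in feature space.
rng = np.random.default_rng(0)
series = [rng.normal(loc=mu, scale=s, size=500)
          for mu, s in [(500, 50), (520, 60), (100, 10), (110, 12)]]

def features(x):
    """Three illustrative features: mean, std, lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float)
    lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
    return np.array([x.mean(), x.std(), lag1])

X = np.vstack([features(s) for s in series])     # one row per time series
Xc = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize features

# PCA via SVD of the standardized feature matrix; keep two components so
# that each time series maps to a single 2-D point for visualization.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
coords2d = Xc @ Vt[:2].T
```

Clustering or outlier flags (e.g. for anomalous strings) can then operate on the 2-D points instead of the raw series.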
Unsupervised Learning via Mixtures of Skewed Distributions with Hypercube Contours
Mixture models whose components have skewed hypercube contours are developed
via a generalization of the multivariate shifted asymmetric Laplace density.
Specifically, we develop mixtures of multiple scaled shifted asymmetric Laplace
distributions. The component densities have two unique features: they include a
multivariate weight function, and the marginal distributions are also
asymmetric Laplace. We use these mixtures of multiple scaled shifted asymmetric
Laplace distributions for clustering applications, but they could equally well
be used in the supervised or semi-supervised paradigms. The
expectation-maximization algorithm is used for parameter estimation and the
Bayesian information criterion is used for model selection. Simulated and real
data sets are used to illustrate the approach and, in some cases, to visualize
the skewed hypercube structure of the components.
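The EM-plus-BIC workflow described above can be illustrated with a one-dimensional two-component Gaussian mixture as a simple stand-in for the multiple scaled shifted asymmetric Laplace components (the actual densities are considerably more involved); the data are synthetic.

```python
import numpy as np

# Synthetic data from two well-separated components.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])

def em_two_gaussians(x, iters=100):
    w = np.array([0.5, 0.5])        # mixing weights
    mu = np.array([-1.0, 1.0])      # component means
    sig = np.array([1.0, 1.0])      # component standard deviations
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        dens = (np.exp(-0.5 * ((x[:, None] - mu) / sig) ** 2)
                / (sig * np.sqrt(2 * np.pi)))
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and standard deviations
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sig = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    dens = (np.exp(-0.5 * ((x[:, None] - mu) / sig) ** 2)
            / (sig * np.sqrt(2 * np.pi)))
    loglik = np.log((w * dens).sum(axis=1)).sum()
    return w, mu, sig, loglik

w, mu, sig, loglik = em_two_gaussians(x)
# BIC = k*ln(n) - 2*loglik with k = 5 free parameters here; when comparing
# candidate models, the one with the smaller BIC is selected.
bic = 5 * np.log(len(x)) - 2 * loglik
```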
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.
Comment: 61 pages.
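The most basic exploratory tool of the field reviewed above is the circular (directional) mean, computed by averaging unit vectors rather than raw angles; the wind-direction-style angles below are illustrative.

```python
import numpy as np

# Four synthetic directions clustered around north (0 degrees).
theta = np.deg2rad([350.0, 5.0, 10.0, 355.0])

C, S = np.cos(theta).mean(), np.sin(theta).mean()
mean_direction = np.arctan2(S, C)      # circular mean, in radians
resultant_length = np.hypot(C, S)      # concentration measure in [0, 1]

# A naive arithmetic mean of the same angles gives 180 degrees, the
# exact opposite of the true mean direction near 0 degrees, which is why
# directional data need their own methodology.
```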
Mixtures of Skew-t Factor Analyzers
In this paper, we introduce a mixture of skew-t factor analyzers as well as a
family of mixture models based thereon. The mixture of skew-t distributions
model that we use arises as a limiting case of the mixture of generalized
hyperbolic distributions. Like their Gaussian and t-distribution analogues, our
mixture of skew-t factor analyzers is very well-suited to the model-based
clustering of high-dimensional data. Imposing constraints on components of the
decomposed covariance parameter results in the development of eight flexible
models. The alternating expectation-conditional maximization algorithm is used
for model parameter estimation and the Bayesian information criterion is used
for model selection. The models are applied to both real and simulated data,
giving superior clustering results compared to a well-established family of
Gaussian mixture models.
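The parsimony idea behind the family of constrained models described above can be sketched generically: in a factor analyzer each component scale matrix is decomposed as Sigma_g = Lambda_g Lambda_g' + Psi_g, and constraining the loadings Lambda and/or the noise term Psi across components shrinks the parameter count that enters the BIC. The bookkeeping below is a generic illustration, not the paper's exact count for its eight models.

```python
import numpy as np

p, q, G = 10, 3, 2    # observed dimension, latent factors, components

def scale_params(shared_loadings, shared_noise, isotropic_noise):
    """Count free scale parameters under the given constraints."""
    lam = p * q - q * (q - 1) // 2          # free entries in one Lambda
    psi = 1 if isotropic_noise else p       # isotropic vs diagonal Psi
    return ((1 if shared_loadings else G) * lam
            + (1 if shared_noise else G) * psi)

def bic(loglik, n_params, n_obs):
    # Bayesian information criterion; smaller is better.
    return -2.0 * loglik + n_params * np.log(n_obs)

most_constrained = scale_params(True, True, True)    # everything shared
unconstrained = scale_params(False, False, False)    # everything free
```

At comparable likelihoods, the more constrained model wins on BIC, which is how such a family trades flexibility for parsimony.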
Dealing with temporal inconsistency in automated computer forensic profiling
Computer profiling is the automated forensic examination of a computer system in order to provide a human investigator with a characterisation of the activities that have taken place on that system. As part of this process, the logical components of the computer system (components such as users, files and applications) are enumerated, and the relationships between them discovered and reported. This information is enriched with traces of historical activity drawn from system logs and from evidence of events found in the computer file system. A potential problem with the use of such information is that some of it may be inconsistent and contradictory, thus compromising its value. This work examines the impact of temporal inconsistency in such information and discusses two types of temporal inconsistency that may arise (inconsistency arising out of the normal errant behaviour of a computer system, and inconsistency arising out of deliberate tampering by a suspect), along with techniques for dealing with inconsistencies of the latter kind. We examine the impact of deliberate tampering through experiments conducted with prototype computer profiling software. Based on the results of these experiments, we discuss techniques which can be employed in computer profiling to deal with such temporal inconsistencies.
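One simple kind of temporal-inconsistency check of the sort discussed above can be sketched as a timestamp-ordering test: a file whose recorded modification time precedes its creation time is a candidate indicator of clock tampering. The events and field names here are hypothetical, not taken from the profiling tool in the abstract.

```python
from datetime import datetime

# Hypothetical enumerated file-system events (names and values invented
# for illustration).
events = [
    {"file": "report.doc", "created": datetime(2023, 5, 1, 9, 0),
     "modified": datetime(2023, 5, 1, 10, 30)},
    {"file": "notes.txt", "created": datetime(2023, 5, 2, 14, 0),
     "modified": datetime(2023, 5, 1, 8, 0)},   # modified before created
]

def inconsistent(e):
    """Flag an event whose modification time precedes its creation time."""
    return e["modified"] < e["created"]

suspicious = [e["file"] for e in events if inconsistent(e)]
```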
Architecture of a network-in-the-loop environment for characterizing AC power system behavior
This paper describes the method by which a large hardware-in-the-loop environment has been realized for three-phase AC power systems. The environment allows an entire laboratory power-network topology (generators, loads, controls, protection devices, and switches) to be placed in the loop of a large power-network simulation. The system is realized by using a real-time power-network simulator, which interacts with the hardware via the indirect control of a large synchronous generator and by measuring currents flowing from its terminals. These measured currents are injected into the simulation via current sources to close the loop. This paper describes the system architecture and, most importantly, the calibration methodologies which have been developed to overcome measurement and loop latencies. In particular, a new "phase advance" calibration removes the requirement to add unwanted components into the simulated network to compensate for loop delay. The results of early commissioning experiments are demonstrated. The present system performance limits under transient conditions (approximately 0.25 Hz/s and 30 V/s to contain peak phase- and voltage-tracking errors within 5° and 1%) are defined mainly by the controllability of the synchronous generator.
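The "phase advance" idea described above can be sketched numerically: if the measurement/injection loop has a known latency tau, the injected current can be advanced in phase by omega*tau so that the closed-loop waveform lines up with the ideal one. The latency value below is an assumption for illustration, not the commissioned system's measured figure.

```python
import numpy as np

f = 50.0                   # grid frequency, Hz
tau = 2e-3                 # total loop latency, s (assumed value)
omega = 2 * np.pi * f

phase_advance = omega * tau            # radians to advance the injection

t = np.linspace(0.0, 0.04, 1000)       # two cycles at 50 Hz
i_delayed = np.sin(omega * (t - tau))                    # injected late
i_corrected = np.sin(omega * (t - tau) + phase_advance)  # compensated
ideal = np.sin(omega * t)              # what the loop should inject
```

With the advance applied, the corrected injection coincides with the ideal waveform, so no artificial compensating components need to be added to the simulated network.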