
    Electrostatic Field Classifier for Deficient Data

    This paper investigates the suitability of recently developed models based on physical field phenomena for classification problems with incomplete datasets. An original approach to exploiting incomplete training data with missing features and labels, making extensive use of an electrostatic-charge analogy, is proposed. Classification of incomplete patterns is investigated using a local dimensionality-reduction technique that aims to exploit all available information rather than estimate the missing values. The performance of all proposed methods is tested on a number of benchmark datasets across a wide range of missing-data scenarios and compared with that of standard techniques. Several modifications of the original electrostatic field classifier, aimed at improving speed and robustness in higher-dimensional spaces, are also discussed.
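    The charge analogy can be illustrated with a minimal sketch (not the authors' actual classifier, which also handles missing features and labels): each training point carries a unit "charge" of its class, and a query point is assigned the class exerting the strongest aggregate inverse-square pull.

```python
def field_classify(query, train, eps=1e-9):
    """Assign `query` the label whose training points exert the
    strongest aggregate inverse-square 'electrostatic' pull.
    `train` is a list of (point, label) pairs."""
    pull = {}
    for point, label in train:
        d2 = sum((a - b) ** 2 for a, b in zip(query, point)) + eps
        pull[label] = pull.get(label, 0.0) + 1.0 / d2
    return max(pull, key=pull.get)

# Two tight clusters; a query near the first is pulled to class "A".
train = [((0.0, 0.0), "A"), ((0.2, 0.1), "A"),
         ((1.0, 1.0), "B"), ((0.9, 1.2), "B")]
print(field_classify((0.1, 0.1), train))  # -> A
```

    Unlike nearest-neighbour voting, every training point contributes, with influence decaying smoothly with squared distance.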

    X-ray Lighthouses of the High-Redshift Universe. II. Further Snapshot Observations of the Most Luminous z>4 Quasars with Chandra

    We report on Chandra observations of a sample of 11 optically luminous (M_B < -28.5) quasars at z = 3.96-4.55 selected from the Palomar Digital Sky Survey and the Automatic Plate Measuring Facility Survey. These are among the most luminous z > 4 quasars known and hence represent ideal witnesses of the end of the "dark age". Nine quasars are detected by Chandra, with ~2-57 counts in the observed 0.5-8 keV band. These detections increase the number of X-ray detected AGN at z > 4 to ~90; overall, Chandra has detected ~85% of the high-redshift quasars observed with snapshot (few-kilosecond) observations. PSS 1506+5220, one of the two X-ray undetected quasars, displays a number of notable features in its rest-frame ultraviolet spectrum, the most prominent being broad, deep SiIV and CIV absorption lines. The average optical-to-X-ray spectral index for the present sample (alpha_ox = -1.88+/-0.05) is steeper than that typically found for z > 4 quasars but consistent with the value expected from the known dependence of this spectral index on quasar luminosity. We present joint X-ray spectral fitting for a sample of 48 radio-quiet quasars in the redshift range 3.99-6.28 for which Chandra observations are available. The X-ray spectrum (~870 counts) is well parameterized by a power law with Gamma = 1.93+0.10/-0.09 in the rest-frame ~2-40 keV band, and a tight upper limit of N_H ~ 5x10^21 cm^-2 is obtained on any average intrinsic X-ray absorption. There is no indication of any significant evolution in the X-ray properties of quasars between redshifts zero and six, suggesting that the physical processes of accretion onto massive black holes have not changed over the bulk of cosmic time. Comment: 15 pages, 7 figures, accepted for publication in A
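    The optical-to-X-ray spectral index quoted above is a standard quantity: the slope of a nominal power law connecting the rest-frame 2500 A and 2 keV flux densities. A minimal sketch of the conventional formula (the flux-density ratio below is an illustrative number chosen to reproduce the quoted mean, not a measurement from the paper):

```python
import math

def alpha_ox(f_2kev, f_2500):
    """Optical-to-X-ray spectral index: slope of a nominal power law
    between the rest-frame 2500 A and 2 keV flux densities (per Hz).
    The constant 0.3838 is 1 / log10(nu_2keV / nu_2500A)."""
    return 0.3838 * math.log10(f_2kev / f_2500)

# A 2keV-to-2500A flux-density ratio of about 10^-4.9 reproduces the
# sample mean quoted above:
print(round(alpha_ox(10 ** -4.9, 1.0), 2))  # -> -1.88
```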

    Statistical significance of communities in networks

    Nodes in real-world networks are usually organized in local modules. These groups, called communities, are intuitively defined as subgraphs with a larger density of internal connections than of external links. In this work, we introduce a new measure aimed at quantifying the statistical significance of single communities. Extreme and order statistics are used to predict the statistics associated with individual clusters in random graphs. These distributions allow us to define the significance of a community as the probability that a generic clustering algorithm finds such a group in a random graph. The method is successfully applied to real-world networks to evaluate the significance of their communities. Comment: 9 pages, 8 figures, 2 tables. The software to calculate the C-score can be found at http://filrad.homelinux.org/cscor
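    The null-model idea can be illustrated with a naive binomial baseline (deliberately simpler than the paper's C-score, which applies extreme/order statistics over the best group a clustering algorithm could find): in an Erdős–Rényi graph matched to the observed edge density, the internal edge count of a fixed k-node set is binomial, and its upper tail gives a p-value.

```python
from math import comb

def community_pvalue(k, m_internal, n, m_total):
    """Upper-tail probability that a fixed k-node set in G(n, p) has at
    least m_internal internal edges, with p matched to the observed
    edge density m_total / C(n, 2).  A naive single-set null; the
    paper's measure corrects for the search over candidate groups."""
    p = m_total / comb(n, 2)      # observed edge density
    pairs = comb(k, 2)            # possible internal edges
    return sum(comb(pairs, i) * p ** i * (1 - p) ** (pairs - i)
               for i in range(m_internal, pairs + 1))

# A 10-node group with 30 of its 45 possible internal edges, inside a
# sparse 200-node, 600-edge graph, is wildly unlikely by chance:
print(f"{community_pvalue(10, 30, 200, 600):.1e}")
```

    The single-set p-value is optimistic: a clustering algorithm effectively searches many candidate groups, which is exactly why the paper resorts to extreme-value corrections.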

    The Time Machine: A Simulation Approach for Stochastic Trees

    In this paper we consider a simulation technique for stochastic trees. One of the most important tasks in computational genetics is the calculation, and subsequent maximization, of the likelihood function associated with such models, typically via importance sampling (IS) and sequential Monte Carlo (SMC) techniques. The approach proceeds by simulating the tree backward in time, from the observed data to a most recent common ancestor (MRCA). In many cases, however, the computational time and the variance of the estimators are too high for standard approaches to be useful. We propose to stop the simulation early, which yields biased estimates of the likelihood surface. The bias is investigated from a theoretical point of view, and results from simulation studies illustrate the balance between loss of accuracy, savings in computing time, and variance reduction. Comment: 22 pages, 5 figures
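    The effect of truncation can be sketched on the simplest possible case, a neutral Kingman coalescent with no data attached (an illustrative toy, not the paper's IS/SMC machinery): stopping after a fixed number of coalescence events, before the MRCA is reached, biases the simulated tree height downward.

```python
import random

def coalescent_height(n, max_events=None):
    """Height of a Kingman coalescent tree simulated backward from n
    lineages.  With max_events set, the simulation stops early before
    the MRCA is reached, so the returned height is downward biased."""
    k, t, events = n, 0.0, 0
    while k > 1 and (max_events is None or events < max_events):
        t += random.expovariate(k * (k - 1) / 2)  # pairwise-merger rate
        k -= 1
        events += 1
    return t

random.seed(1)
full = sum(coalescent_height(20) for _ in range(2000)) / 2000
stopped = sum(coalescent_height(20, max_events=15) for _ in range(2000)) / 2000
print(full, stopped)  # truncation biases the height estimate downward
```

    The gap is large here because the final few mergers dominate the height: the last coalescence alone has mean waiting time 1, while the first fifteen together contribute comparatively little.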

    Pareto versus lognormal: a maximum entropy test

    It is commonly found that distributions that appear lognormal over a broad range switch to a power-law (Pareto) distribution in the last few percentiles. The distributions of many physical, natural, and social quantities (earthquake size, species abundance, income and wealth, as well as file, city, and firm sizes) display this structure. We present a test, based on maximum entropy, for the occurrence of power-law tails in statistical distributions. This methodology makes it possible to identify the true data-generating process even when it is neither lognormal nor Pareto. The maximum entropy approach is then compared with other widely used methods and applied to different levels of aggregation of complex systems. Our results support the theory that distributions with a lognormal body and a Pareto tail can be generated as mixtures of lognormally distributed units.
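    For the tail-estimation side of the problem, a common baseline (not the paper's maximum-entropy test, just the standard diagnostic it would be compared against) is the Hill estimator of the Pareto exponent, computed from the top k order statistics.

```python
import math
import random

def hill_tail_index(data, k):
    """Hill estimator of the Pareto tail exponent from the k largest
    observations.  A standard tail diagnostic used here as a baseline;
    it is NOT the maximum-entropy test of the paper."""
    top = sorted(data, reverse=True)[:k + 1]
    return k / sum(math.log(x / top[k]) for x in top[:k])

random.seed(0)
sample = [random.paretovariate(2.0) for _ in range(5000)]  # true alpha = 2
print(round(hill_tail_index(sample, 500), 2))  # close to 2
```

    On mixed lognormal-body/Pareto-tail data, the Hill estimate depends strongly on the cutoff k, which is one motivation for tests that identify the generating process rather than presuppose it.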

    Testing linear hypotheses in high-dimensional regressions

    For a multivariate linear model, Wilks' likelihood ratio test (LRT) constitutes one of the cornerstone tools. However, the computation of its quantiles under the null or the alternative requires complex analytic approximations and, more importantly, these distributional approximations are feasible only for a moderate dimension of the dependent variable, say p <= 20. On the other hand, assuming that the data dimension p as well as the number q of regression variables are fixed while the sample size n grows, several asymptotic approximations have been proposed in the literature for Wilks' Lambda, including the widely used chi-square approximation. In this paper, we consider necessary modifications to Wilks' test in a high-dimensional context, specifically assuming a high data dimension p and a large sample size n. Based on recent random matrix theory, the correction we propose to Wilks' test is asymptotically Gaussian under the null, and simulations demonstrate that the corrected LRT has very satisfactory size and power, certainly in the large-p, large-n context, but also for moderately large data dimensions such as p = 30 or p = 50. As a byproduct, we explain why the standard chi-square approximation fails for high-dimensional data. We also introduce a new procedure for the classical multiple-sample significance test in MANOVA which is valid for high-dimensional data. Comment: Accepted 02/2012 for publication in "Statistics". 20 pages, 2 figures and 2 tables
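    A minimal numerical sketch of the objects involved, assuming NumPy: Wilks' Lambda for a two-group one-way MANOVA, together with Bartlett's classical chi-square approximation, the fixed-p approximation whose breakdown for growing p motivates the paper's correction. The data are synthetic and only illustrate the computation.

```python
import numpy as np

def wilks_lambda(groups):
    """Wilks' Lambda = det(E) / det(E + H) for one-way MANOVA, where E
    is the within-group and H the between-group scatter matrix."""
    grand = np.vstack(groups).mean(axis=0)
    E = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    H = sum(len(g) * np.outer(g.mean(axis=0) - grand, g.mean(axis=0) - grand)
            for g in groups)
    return np.linalg.det(E) / np.linalg.det(E + H)

rng = np.random.default_rng(0)
g1 = rng.normal(0.0, 1.0, size=(40, 3))
g2 = rng.normal(0.5, 1.0, size=(40, 3))  # mean shifted by 0.5 per coordinate
lam = wilks_lambda([g1, g2])

# Bartlett's classical chi-square approximation (p = 3, 2 groups,
# n = 80): the statistic is compared with chi2 on p*(groups-1) = 3 df.
n, p, g = 80, 3, 2
stat = -(n - 1 - (p + g) / 2) * np.log(lam)
print(round(float(lam), 3), round(float(stat), 2))
```

    For fixed small p this approximation is accurate; the paper's point is that when p grows with n it drifts badly, and a random-matrix-based Gaussian recentering is needed.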

    Quantum theory of incompatible observations

    The maximum likelihood principle is shown to be the best measure for relating experimental data to the predictions of quantum theory. Comment: 3 pages

    Quantum homodyne tomography with a priori constraints

    I present a novel algorithm for reconstructing the Wigner function from homodyne statistics. The proposed method, based on maximum-likelihood estimation, is capable of compensating for detection losses in a numerically stable way. Comment: 4 pages, REVTeX, 2 figures

    Iterative algorithm for reconstruction of entangled states

    An iterative algorithm is proposed for the reconstruction of an unknown quantum state from the results of incompatible measurements. It consists of an expectation-maximization step followed by a unitary transformation of the eigenbasis of the density matrix. The procedure has been applied to the reconstruction of an entangled photon pair. Comment: 4 pages, no figures; some formulations changed, a minor mistake corrected
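    The expectation-maximization step mentioned above is the well-known R-rho-R iteration from quantum state tomography; a single-qubit sketch under that assumption follows (the interleaved eigenbasis rotations of the full algorithm are omitted, and the projectors and frequencies are an illustrative example, not the paper's two-photon data).

```python
import numpy as np

def iterate_rho(projectors, freqs, steps=200):
    """Expectation-maximization core of the reconstruction:
    rho <- R rho R / tr(R rho R), with R = sum_j (f_j / p_j) Pi_j,
    where f_j are measured relative frequencies and p_j = tr(rho Pi_j)
    are the probabilities predicted by the current estimate."""
    d = projectors[0].shape[0]
    rho = np.eye(d, dtype=complex) / d          # maximally mixed start
    for _ in range(steps):
        probs = [np.trace(rho @ P).real for P in projectors]
        R = sum(f / p * P for f, p, P in zip(freqs, probs, projectors))
        rho = R @ rho @ R
        rho /= np.trace(rho)
    return rho

# Single-qubit example: sigma_z and sigma_x basis projectors with
# relative frequencies consistent with the pure state |0>.
proj = [np.array([[1, 0], [0, 0]], complex),       # |0><0|
        np.array([[0, 0], [0, 1]], complex),       # |1><1|
        np.array([[.5, .5], [.5, .5]], complex),   # |+><+|
        np.array([[.5, -.5], [-.5, .5]], complex)] # |-><-|
rho = iterate_rho(proj, [1.0, 0.0, 0.5, 0.5])
print(np.round(rho.real, 3))  # converges toward |0><0|
```

    Each iteration preserves positivity and unit trace of the density matrix by construction, which is the main appeal of this scheme over linear-inversion tomography.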

    Host Galaxy Evolution in Radio-Loud AGN

    We investigate the luminosity evolution of the host galaxies of radio-loud AGN through Hubble Space Telescope imaging of 72 BL Lac objects, including new STIS imaging of nine z > 0.6 BL Lacs. With their intrinsically low accretion rates and strongly beamed jets, BL Lacs provide a unique opportunity to probe host-galaxy evolution independent of the biases and ambiguities implicit in quasar studies. We find that the host galaxies of BL Lacs evolve strongly, consistent with passive evolution from a period of active star formation in the range 0.5 <~ z <~ 2.5, and inconsistent with either passive evolution from a high formation redshift or a non-evolving population. This evolution is broadly consistent with that observed in the hosts of other radio-loud AGN, and inconsistent with the flatter luminosity evolution of quiescent early types and radio-quiet hosts. This indicates that active star formation, and hence galaxy interactions, are associated with the formation of radio-loud AGN, and that these host galaxies preferentially accrete less material after their formation epoch than galaxies without powerful radio jets. We discuss possible explanations for the link between merger history and the incidence of a radio jet. Comment: 37 pages, 8 figures, accepted for publication in ApJ; for full PDF incl. figures see http://www.ph.unimelb.edu.au/~modowd/papers/odowdurry2005.pd