Edge-weighting of gene expression graphs
In recent years, considerable research effort has been directed at microarray technologies and their role in providing simultaneous information on expression profiles for thousands of genes. These data, when subjected to clustering and classification procedures, can assist in identifying patterns and providing insight into biological processes. Graphical representations can be used to understand the properties of complex gene expression datasets. Intuitively, the data can be represented as a bipartite graph, with weighted edges corresponding to gene-sample node pairs in the dataset. Biologically meaningful subgraphs can be sought, but performance can be influenced both by the search algorithm and by the graph-weighting scheme, and both merit rigorous investigation. In this paper, we focus on edge-weighting schemes for bipartite graphical representations of gene expression. Two novel methods are presented: the first is based on empirical evidence; the second on a geometric distribution. The schemes are compared on several real datasets, assessing efficiency of performance based on four essential properties: robustness to noise and missing values, discrimination, parameter influence on scheme efficiency, and reusability. Recommendations and limitations are briefly discussed.
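As a rough illustration of the bipartite representation described in this abstract, the sketch below turns an expression matrix into gene-sample edges. The function name `bipartite_edges` and the use of the raw expression value as the edge weight are assumptions made for illustration; the abstract does not specify the two proposed weighting schemes, and they are not reproduced here.

```python
from typing import Dict, List, Optional, Tuple


def bipartite_edges(
    genes: List[str],
    samples: List[str],
    expression: List[List[Optional[float]]],
) -> Dict[Tuple[str, str], float]:
    """Build weighted gene-sample edges from an expression matrix.

    expression[i][j] is the measured value for genes[i] in samples[j];
    None marks a missing value, for which no edge is created.
    The weight is simply the raw value (a placeholder scheme, an assumption).
    """
    edges: Dict[Tuple[str, str], float] = {}
    for i, gene in enumerate(genes):
        for j, sample in enumerate(samples):
            value = expression[i][j]
            if value is None:
                continue  # missing measurement: leave this pair unconnected
            edges[(gene, sample)] = float(value)
    return edges


example = bipartite_edges(
    ["g1", "g2"],
    ["s1", "s2"],
    [[1.0, None], [0.5, 2.0]],
)
```

Every edge joins one gene node to one sample node, so the graph is bipartite by construction; a different weighting scheme would only change how `value` is mapped to a weight.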
The Cluster Distribution as a Test of Dark Matter Models. IV: Topology and Geometry
We study the geometry and topology of the large-scale structure traced by
galaxy clusters in numerical simulations of a box of side 320 Mpc, and
compare them with available data on real clusters. The simulations we use are
generated by the Zel'dovich approximation, using the same methods as we have
used in the first three papers in this series. We consider the following models
to see if there are measurable differences in the topology and geometry of the
superclustering they produce: (i) the standard CDM model (SCDM); (ii) a CDM
model with (OCDM); (iii) a CDM model with a `tilted' power
spectrum having (TCDM); (iv) a CDM model with a very low Hubble
constant, (LOWH); (v) a model with mixed CDM and HDM (CHDM); (vi) a
flat low-density CDM model with and a non-zero cosmological
term (CDM). We analyse these models using a variety of
statistical tests based on the analysis of: (i) the Euler-Poincaré
characteristic; (ii) percolation properties; (iii) the Minimal Spanning Tree
construction. Taking all these tests together we find that the best fitting
model is CDM and, indeed, the others do not appear to be consistent
with the data. Our results demonstrate that despite their biased and extremely
sparse sampling of the cosmological density field, it is possible to use
clusters to probe subtle statistical diagnostics of models which go far beyond
the low-order correlation functions usually applied to study superclustering.
Comment: 17 pages, 7 postscript figures, uses mn.sty, MNRAS in press
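One of the diagnostics named in this abstract, the Minimal Spanning Tree, can be sketched for a small set of cluster positions as follows. This is a generic Prim's-algorithm sketch under stated assumptions: the function name `minimal_spanning_tree` is hypothetical, distances are plain Euclidean, and the periodic boundaries of a simulation box are ignored. It is not the authors' pipeline.

```python
import math
from typing import List, Sequence, Tuple


def minimal_spanning_tree(
    points: Sequence[Sequence[float]],
) -> List[Tuple[int, int, float]]:
    """Return MST edges (parent index, child index, length) via Prim's algorithm.

    Uses Euclidean distance and O(n^2) time; no periodic wrapping (assumption).
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # shortest known link from the tree to each point
    best_parent = [-1] * n       # tree point that realises that link
    best_dist[0] = 0.0           # grow the tree from point 0
    edges: List[Tuple[int, int, float]] = []
    for _ in range(n):
        # Pick the closest point that is not yet part of the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u, best_dist[u]))
        # Update the cheapest links of the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges


tree = minimal_spanning_tree([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)])
```

Statistics such as the distribution of edge lengths in `tree` are the kind of quantity that can then be compared between simulated and observed cluster samples.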
Inferring Mechanisms for Global Constitutional Progress
Constitutions help define domestic political orders, but are known to be
influenced by two international mechanisms: one that reflects global temporal
trends in legal development, and another that reflects international network
dynamics such as shared colonial history. We introduce the provision space; the
growing set of all legal provisions existing in the world's constitutions over
time. Through this we uncover a third mechanism influencing constitutional
change: hierarchical dependencies between legal provisions, under which the
adoption of essential, fundamental provisions precedes more advanced
provisions. This third mechanism appears to play an especially important role
in the emergence of new political rights, and may therefore provide a useful
roadmap for advocates of those rights. We further characterise each legal
provision in terms of the strength of these mechanisms.
Energy Correlation Functions for Jet Substructure
We show how generalized energy correlation functions can be used as a
powerful probe of jet substructure. These correlation functions are based on
the energies and pair-wise angles of particles within a jet, with (N+1)-point
correlators sensitive to N-prong substructure. Unlike many previous jet
substructure methods, these correlation functions do not require the explicit
identification of subjet regions. In addition, the correlation functions are
better probes of certain soft and collinear features that are masked by other
methods. We present three Monte Carlo case studies to illustrate the utility of
these observables: 2-point correlators for quark/gluon discrimination, 3-point
correlators for boosted W/Z/Higgs boson identification, and 4-point correlators
for boosted top quark identification. For quark/gluon discrimination, the
2-point correlator is particularly powerful, as can be understood via a
next-to-leading logarithmic calculation. For boosted 2-prong resonances the
benefit depends on the mass of the resonance.
Comment: 45 pages, 28 figures, update to JHEP version, some minor typos fixed,
added discussion at end of section
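To make the idea of an N-point correlator built from energies and pairwise angles concrete, here is a minimal sketch. It assumes the hadron-collider form, in which each term multiplies the transverse momenta of N particles by the product of their pairwise angular distances raised to an exponent `beta`; the particle layout `(pt, rapidity, phi)`, the function names, and the default `beta` are illustrative choices, not details taken from the abstract.

```python
import itertools
import math
from typing import Sequence, Tuple

Particle = Tuple[float, float, float]  # (pt, rapidity, phi), an assumed layout


def delta_r(a: Particle, b: Particle) -> float:
    """Angular distance in the rapidity-azimuth plane, with phi wrapped."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a[1] - b[1], dphi)


def ecf(particles: Sequence[Particle], n: int, beta: float = 1.0) -> float:
    """N-point energy correlation function over all N-particle subsets."""
    total = 0.0
    for combo in itertools.combinations(particles, n):
        pt_product = math.prod(p[0] for p in combo)
        angle_product = math.prod(
            delta_r(a, b) for a, b in itertools.combinations(combo, 2)
        )
        total += pt_product * angle_product ** beta
    return total


jet = [(10.0, 0.0, 0.0), (20.0, 0.0, 0.3)]
ecf1 = ecf(jet, 1)   # sum of transverse momenta
ecf2 = ecf(jet, 2)   # single pair: pt_1 * pt_2 * R_12
ecf3 = ecf(jet, 3)   # no triplets exist in a two-particle jet
```

Because the sum runs over all subsets of particles, no subjets need to be identified first, which matches the property emphasised in the abstract. The brute-force loop scales combinatorially with `n` and is meant only for small examples.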
Classification methods for Hilbert data based on surrogate density
Unsupervised and supervised classification approaches for Hilbert random
curves are studied. Both rest on the use of a surrogate of the probability
density which is defined, in a distribution-free mixture context, from an
asymptotic factorization of the small-ball probability. That surrogate density
is estimated by a kernel approach from the principal components of the data.
The focus is on the illustration of the classification algorithms and the
computational implications, with particular attention to the tuning of the
parameters involved. Some asymptotic results are sketched. Applications on
simulated and real datasets show how the proposed methods work.
Comment: 33 pages, 11 figures, 6 tables
Tracking down hyper-boosted top quarks
The identification of hadronically decaying heavy states, such as vector
bosons, the Higgs, or the top quark, produced with large transverse boosts has
been and will continue to be a central focus of the jet physics program at the
Large Hadron Collider (LHC). At a future hadron collider working at an
order-of-magnitude larger energy than the LHC, these heavy states would be
easily produced with transverse boosts of several TeV. At these energies, their
decay products will be separated by angular scales comparable to individual
calorimeter cells, making the current jet substructure identification
techniques for hadronic decay modes not directly employable. In addition, at
the high energy and luminosity projected at a future hadron collider, there
will be numerous sources for contamination including initial- and final-state
radiation, underlying event, or pile-up which must be mitigated. We propose a
simple strategy to tag such "hyper-boosted" objects that defines jets with
radii that scale inversely with their transverse boost and combines
the standard calorimetric information with charged track-based observables. By
means of a fast detector simulation, we apply it to top quark identification
and demonstrate that our method efficiently discriminates hadronically decaying
top quarks from light QCD jets up to transverse boosts of 20 TeV. Our results
open the way to tagging heavy objects with energies in the multi-TeV range at
present and future hadron colliders.
Comment: 19 pages + appendices, 17 figures; v2: added references, updated
cross section table
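The scaling of the jet radius with transverse boost described in this abstract can be illustrated with a one-line relation. The sketch assumes the boost is approximated by pT / m, so the radius is proportional to m / pT; the function name `dynamic_radius` and the proportionality constant `scale` are hypothetical, since the abstract does not give the constant.

```python
def dynamic_radius(pt: float, mass: float, scale: float = 4.0) -> float:
    """Jet radius shrinking inversely with transverse boost.

    Boost is approximated as pt / mass, so R = scale * mass / pt.
    `scale` is a hypothetical tuning constant, not a value from the source.
    """
    if pt <= 0.0 or mass <= 0.0:
        raise ValueError("pt and mass must be positive")
    return scale * mass / pt


# Doubling the transverse momentum halves the radius (units: GeV, assumed).
r_low = dynamic_radius(pt=5000.0, mass=173.0)
r_high = dynamic_radius(pt=10000.0, mass=173.0)
```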