Network Model Selection Using Task-Focused Minimum Description Length
Networks are fundamental models for data used in practically every
application domain. In most instances, several implicit or explicit choices
about the network definition impact the translation of underlying data to a
network representation, and the subsequent question(s) about the underlying
system being represented. Users of downstream network data may not even be
aware of these choices or their impacts. We propose a task-focused network
model selection methodology which addresses several key challenges. Our
approach constructs network models from underlying data and uses minimum
description length (MDL) criteria for selection. Our methodology measures
efficiency, a general and comparable measure of the network's performance on a
local (i.e., node-level) predictive task of interest. Selection on efficiency
favors parsimonious (e.g. sparse) models to avoid overfitting and can be
applied across arbitrary tasks and representations. We show stability,
sensitivity, and significance testing in our methodology.
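The abstract does not define efficiency precisely, so the following is only a minimal sketch under stated assumptions: efficiency is taken here as correct node-level predictions per bit of a naive edge-list encoding, and the function names (`description_length_bits`, `efficiency`, `select_model`) are hypothetical, not the authors' implementation. It illustrates how selecting on such a ratio can favor a sparse network over a denser one that predicts somewhat better.

```python
import math


def description_length_bits(edges, n_nodes):
    """Naive encoding cost (an assumption): two node ids per edge,
    each id stored in ceil(log2(n_nodes)) bits."""
    bits_per_id = max(1, math.ceil(math.log2(n_nodes)))
    return len(edges) * 2 * bits_per_id


def efficiency(correct_predictions, edges, n_nodes):
    """Correct predictions per bit of encoding (assumed definition)."""
    bits = description_length_bits(edges, n_nodes)
    return correct_predictions / bits if bits else 0.0


def select_model(candidates, n_nodes):
    """candidates maps a model name to (edges, correct_predictions).
    Returns the name with the highest efficiency."""
    return max(
        candidates,
        key=lambda name: efficiency(candidates[name][1], candidates[name][0], n_nodes),
    )


# Toy data (hypothetical): 8 nodes, so each node id costs 3 bits.
dense_edges = [(i, j) for i in range(8) for j in range(i + 1, 8)][:20]
sparse_edges = [(0, 1), (1, 2), (2, 3), (4, 5), (6, 7)]
candidates = {
    "dense": (dense_edges, 30),    # 30 correct / 120 bits
    "sparse": (sparse_edges, 12),  # 12 correct / 30 bits
}
best = select_model(candidates, n_nodes=8)
```

In this toy case the dense network makes more correct predictions in absolute terms, but the sparse one achieves more per bit and is the one selected.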
Combined flavor symmetry violation and lepton number violation in neutrino physics
Heavy singlet neutrinos admit Majorana masses which are not possible for the
Standard Model particles. This suggests new possibilities for generating the
masses and mixing angles of light neutrinos. We present a model of neutrino
physics which combines the source of lepton number violation with the flavor
symmetry responsible for the hierarchy in the charged lepton and quark sector.
This is accomplished by giving the scalar field effecting the lepton number
violation a nonzero charge under the horizontal flavor symmetry. We find an
economical model which is consistent with the measured values of the
atmospheric and solar neutrino mass-squares and mixing angles.
Comment: 6 pages, no figures (published version
Network Model Selection for Task-Focused Attributed Network Inference
Networks are models representing relationships between entities. Often these
relationships are explicitly given, or we must learn a representation which
generalizes and predicts observed behavior in underlying individual data (e.g.
attributes or labels). Whether given or inferred, choosing the best
representation affects subsequent tasks and questions on the network. This work
focuses on model selection to evaluate network representations from data,
with an emphasis on fundamental predictive tasks on networks. We present a modular
methodology using general, interpretable network models, task neighborhood
functions found across domains, and several criteria for robust model
selection. We demonstrate our methodology on three online user activity
datasets and show that network model selection for the appropriate network task
vs. an alternate task increases performance by an order of magnitude in our
experiments.
Interjet Energy Flow/Event Shape Correlations
We identify a class of perturbatively computable measures of interjet energy
flow, which can be associated with well-defined color flow at short distances.
As an illustration, we calculate correlations between event shapes and the flow
of energy, Q_Omega, into an interjet angular region, Omega, in high-energy
two-jet e^+e^- annihilation events. Laplace transforms with respect to the
event shapes suppress states with radiation at intermediate energy scales, so
that we may compute systematically logarithms of interjet energy flow. This
method provides a set of predictions on energy radiated between jets, as a
function of event shape and of the choice of the region Omega in which the
energy is measured. Non-global logarithms appear as corrections. We apply our
method to a continuous class of event shapes.
Comment: 9 pages, 5 figures. Based on talk given by C.F. Berger at TH-2002,
International Conference on Theoretical Physics, Theme 2: "QCD, Hadron
dynamics, etc.", Paris, France, 2002. Slight changes to text, reference adde
Coulomb interacting Dirac fermions in disordered graphene
We study interacting Dirac quasiparticles in disordered graphene and find
that an interplay between the unscreened Coulomb interactions and
pseudo-relativistic quasiparticle kinematics can be best revealed in the
ballistic regime, whereas in the diffusive limit the behavior is qualitatively
(albeit not quantitatively) similar to that of the ordinary 2DEG with
parabolic dispersion. We calculate the quasiparticle width and density of
states that can be probed by photoemission, tunneling, and magnetization
measurements.
Comment: LaTeX, 4 page
Dynamical electroweak symmetry breaking with superheavy quarks and 2+1 composite Higgs model
Recently, a new class of models describing the quark mass hierarchy has been
introduced. In this class, while the t quark plays a minor role in electroweak
symmetry breaking (EWSB), it is crucial in providing the quark mass hierarchy.
In this paper, we analyze the dynamics of a particular model in this class, in
which the b' and t' quarks of the fourth family are mostly responsible for
dynamical EWSB. The low energy effective theory in this model is derived. It
has a clear signature, a 2 + 1 structure of composite Higgs doublets: two
nearly degenerate \Phi_{b'} and \Phi_{t'}, and a heavier top-Higgs resonance
\Phi_t \sim \bar{t}_{R}(t,b)_L. The properties of these composites are
described in detail, and it is shown that the model satisfies the electroweak
precision data constraints. The signatures of these composites at the Large
Hadron Collider are briefly discussed.
Comment: 17 pages, 3 figures; v.2: references and clarifications added: PRD
versio
Entropy-scaling search of massive biological data
Many datasets exhibit a well-defined structure that can be exploited to
design faster search tools, but it is not always clear when such acceleration
is possible. Here, we introduce a framework for similarity search based on
characterizing a dataset's entropy and fractal dimension. We prove that
searching scales in time with metric entropy (number of covering hyperspheres),
if the fractal dimension of the dataset is low, and scales in space with the
sum of metric entropy and information-theoretic entropy (randomness of the
data). Using these ideas, we present accelerated versions of standard tools,
with no loss in specificity and little loss in sensitivity, for use in three
domains---high-throughput drug screening (Ammolite, 150x speedup), metagenomics
(MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search
(esFragBag, 10x speedup of FragBag). Our framework can be used to achieve
"compressive omics," and the general theory can be readily applied to data
science problems outside of biology.
Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
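The coarse-then-fine idea behind the framework can be sketched roughly as follows. This is an illustrative toy under stated assumptions (Euclidean points, a greedy cover, and the hypothetical helper names `build_cover` and `range_search`), not the code of Ammolite, MICA, or esFragBag. The data are covered with balls of a fixed radius; a query first compares against ball centers only, then scans the members of the balls that the triangle inequality cannot rule out.

```python
import math


def build_cover(points, cover_radius):
    """Greedy cover: a point joins the first center within cover_radius,
    otherwise it becomes a new center. Returns a list of (center, members)."""
    clusters = []
    for p in points:
        for center, members in clusters:
            if math.dist(center, p) <= cover_radius:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters


def range_search(clusters, cover_radius, query, r):
    """Return all points within distance r of query.
    Coarse step: by the triangle inequality, any such point lies in a
    cluster whose center is within r + cover_radius of the query.
    Fine step: scan only the members of those clusters."""
    hits = []
    for center, members in clusters:
        if math.dist(center, query) <= r + cover_radius:
            hits.extend(p for p in members if math.dist(p, query) <= r)
    return hits


# Toy data (hypothetical): three well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 10.0)]
clusters = build_cover(points, cover_radius=1.0)
query = (0.0, 0.05)
found = sorted(range_search(clusters, 1.0, query, r=0.2))
brute = sorted(p for p in points if math.dist(p, query) <= 0.2)
```

On this toy data the clustered search returns the same points as a brute-force scan while comparing the query against only three centers before the fine step. The number of centers plays the role of the covering count that the abstract calls metric entropy.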