Statistical inference with anchored Bayesian mixture of regressions models: A case study analysis of allometric data
We present a case study in which we use a mixture of regressions model to
improve on an ill-fitting simple linear regression model relating log brain
mass to log body mass for 100 placental mammalian species. The slope of this
regression model is of particular scientific interest because it corresponds to
a constant that governs a hypothesized allometric power law relating brain mass
to body mass. A specific line of investigation is to determine whether the
regression parameters vary across subgroups of related species.
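For reference, the link between the power law and the log-log regression can be written out as follows. The symbols $a$ and $b$ are notation assumed here for illustration; the abstract does not name them.

```latex
\[
  \text{brain mass} = a\,(\text{body mass})^{b}
  \quad\Longleftrightarrow\quad
  \log(\text{brain mass}) = \log a + b\,\log(\text{body mass}),
\]
```

Under this reading, the regression slope is the exponent $b$ and the intercept is $\log a$.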
We model these data using an anchored Bayesian mixture of regressions model,
which modifies the standard Bayesian Gaussian mixture by pre-assigning small
subsets of observations to given mixture components with probability one. These
observations (called anchor points) break the relabeling invariance typical of
exchangeable model specifications (the so-called label-switching problem). A
careful choice of which observations to pre-classify to which mixture
components is key to the specification of a well-fitting anchor model.
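The anchoring idea can be illustrated with a minimal sketch. Note the assumptions: the abstract describes a Bayesian model, whereas the sketch below swaps in a plain EM algorithm for a mixture of simple linear regressions, purely to show how fixing a few observations to a component with probability one removes the labeling ambiguity. The function name `anchored_em`, the synthetic data, and the choice of anchors are all hypothetical and are not taken from the article.

```python
import numpy as np

def anchored_em(x, y, anchors, k=2, n_iter=200):
    """EM for a k-component mixture of simple linear regressions.

    anchors maps observation index -> component index. Those observations
    keep responsibility one for their component in every iteration
    (an illustrative stand-in for the anchored Bayesian model).
    """
    n = len(x)
    X = np.column_stack([np.ones(n), x])          # intercept + slope design
    onehot = np.eye(k)
    resp = np.full((n, k), 1.0 / k)               # start from uniform labels
    for i, c in anchors.items():
        resp[i] = onehot[c]                       # anchors break the symmetry
    beta = np.zeros((k, 2))
    sigma2 = np.ones(k)
    for _ in range(n_iter):
        # M-step: weighted least squares and weighted variance per component.
        for c in range(k):
            w = resp[:, c]
            XtW = X.T * w
            beta[c] = np.linalg.solve(XtW @ X, XtW @ y)
            resid = y - X @ beta[c]
            sigma2[c] = max(np.sum(w * resid ** 2) / w.sum(), 1e-8)
        pi = resp.mean(axis=0)
        # E-step: posterior component probabilities, computed in log space.
        resid = y[:, None] - X @ beta.T
        log_dens = (np.log(pi) - 0.5 * np.log(2 * np.pi * sigma2)
                    - 0.5 * resid ** 2 / sigma2)
        log_dens -= log_dens.max(axis=1, keepdims=True)
        resp = np.exp(log_dens)
        resp /= resp.sum(axis=1, keepdims=True)
        for i, c in anchors.items():
            resp[i] = onehot[c]                   # re-impose the anchors
    return beta, sigma2, pi, resp

# Synthetic two-line data (hypothetical, not the allometric data set).
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 80)
z = np.repeat([0, 1], 40)                         # true component labels
y = np.where(z == 0, 0.5 * x, 10 + 1.5 * x) + rng.normal(0, 0.1, 80)
anchors = {0: 0, 1: 0, 2: 0, 40: 1, 41: 1, 42: 1}  # three anchors per line
beta, sigma2, pi, resp = anchored_em(x, y, anchors)
print(beta)
```

Because the anchored rows are never relabeled, component 0 is always the one containing observations 0 to 2, so the fitted coefficients can be compared across runs without a relabeling step.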
In the article we compare three strategies for the selection of anchor
points. The first assumes that the underlying mixture of regressions model
holds and assigns anchor points to different components to maximize the
information about their labeling. The second makes no assumption about the
relationship between x and y and instead identifies anchor points using a
bivariate Gaussian mixture model. The third strategy begins with the assumption
that there is only one mixture regression component and identifies anchor
points that are representative of a clustering structure based on case-deletion
importance sampling weights. We compare the performance of the three strategies
on the allometric data set and use auxiliary taxonomic information about the
species to evaluate the model-based classifications estimated from these
models.
Pentapods with Mobility 2
In this paper we give a full classification of all pentapods with mobility 2,
where neither all platform anchor points nor all base anchor points are located
on a line. Therefore this paper solves the famous Borel-Bricard problem for
2-dimensional motions, apart from the excluded case of five collinear points with
spherical trajectories. But even for this special case we present three new
types as a side-result. Based on our study of pentapods, we also give a
complete list of all non-architecturally singular hexapods with 2-dimensional
self-motions.
Comment: 18 pages, 5 figures
Chromatin loop anchors are associated with genome instability in cancer and recombination hotspots in the germline
Background: Chromatin loops form a basic unit of interphase nuclear organization, with chromatin loop anchor points providing contacts between regulatory regions and promoters. However, the mutational landscape at these anchor points remains under-studied. Here, we describe the unusual patterns of somatic mutations and germline variation associated with loop anchor points and explore the underlying features influencing these patterns.
Results: Analyses of whole genome sequencing datasets reveal that anchor points are strongly depleted for single nucleotide variants (SNVs) in tumours. Despite low SNV rates in their genomic neighbourhood, anchor points emerge as sites of evolutionary innovation, showing enrichment for structural variant (SV) breakpoints and a peak of SNVs at focal CTCF sites within the anchor points. Both CTCF-bound and non-CTCF anchor points harbour an excess of SV breakpoints in multiple tumour types and are prone to double-strand breaks in cell lines. Common fragile sites, which are hotspots for genome instability, also show elevated numbers of intersecting loop anchor points. Recurrently disrupted anchor points are enriched for genes with functions in cell cycle transitions and regions associated with predisposition to cancer. We also discover a novel class of CTCF-bound anchor points which overlap meiotic recombination hotspots and are enriched for the core PRDM9 binding motif, suggesting that the anchor points have been foci for diversity generated during recent human evolution.
Conclusions: We suggest that the unusual chromatin environment at loop anchor points underlies the elevated rates of variation observed, marking them as sites of regulatory importance but also genomic fragility.
No Fuss Distance Metric Learning using Proxies
We address the problem of distance metric learning (DML), defined as learning
a distance consistent with a notion of semantic similarity. Traditionally, for
this problem supervision is expressed in the form of sets of points that follow
an ordinal relationship -- an anchor point is similar to a set of positive
points, and dissimilar to a set of negative points, and a loss defined
over these distances is minimized. While the specifics of the optimization
differ, in this work we collectively call this type of supervision Triplets and
all methods that follow this pattern Triplet-Based methods. These methods are
challenging to optimize. A main issue is the need for finding informative
triplets, which is usually achieved by a variety of tricks such as increasing
the batch size, hard or semi-hard triplet mining, etc. Even with these tricks,
the convergence rate of such methods is slow. In this paper we propose to
optimize the triplet loss on a different space of triplets, consisting of an
anchor data point and similar and dissimilar proxy points which are learned as
well. These proxies approximate the original data points, so that a triplet
loss over the proxies is a tight upper bound of the original loss. This
proxy-based loss is empirically better behaved. As a result, the proxy-loss
improves on state-of-the-art results for three standard zero-shot learning
datasets, by up to 15 percentage points, while converging three times as fast
as other triplet-based losses.
Comment: To be presented in ICCV 201
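The proxy idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not give the exact loss, so the code uses one common NCA-style formulation with squared Euclidean distance and one proxy per class, and it evaluates the loss for fixed proxies rather than learning them. The function name `proxy_nca_loss` and the toy values are hypothetical.

```python
import numpy as np

def proxy_nca_loss(embeddings, labels, proxies):
    """Mean NCA-style loss of anchor points against class proxies.

    Assumes one proxy per class and at least two classes. For each anchor:
    loss = d(anchor, own proxy) + log sum over other proxies of exp(-d).
    """
    # Squared Euclidean distance from every anchor to every proxy: (n, k).
    diff = embeddings[:, None, :] - proxies[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    rows = np.arange(embeddings.shape[0])
    pos = dist[rows, labels]                      # distance to positive proxy
    neg_mask = np.ones_like(dist, dtype=bool)
    neg_mask[rows, labels] = False                # drop the positive proxy
    # Stable log-sum-exp over the negative proxies only.
    neg_logits = np.where(neg_mask, -dist, -np.inf)
    m = neg_logits.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(neg_logits - m).sum(axis=1))
    return float(np.mean(pos + lse))

# Toy check (hypothetical values): anchors sit close to their own proxy.
proxies = np.array([[0.0, 0.0], [10.0, 0.0]])
embeddings = np.array([[0.1, 0.0], [9.9, 0.0]])
good = proxy_nca_loss(embeddings, np.array([0, 1]), proxies)
bad = proxy_nca_loss(embeddings, np.array([1, 0]), proxies)  # labels swapped
print(good, bad)
```

Each anchor is compared against a small set of proxies rather than against mined triplets of data points, which is the property the abstract credits for the better-behaved optimization. In training, the proxies would be parameters updated together with the embedding.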
