Dimensionality Reduction for k-Means Clustering and Low Rank Approximation
We show how to approximate a data matrix with a much smaller
sketch that can be used to solve a general class of
constrained k-rank approximation problems to within error.
Importantly, this class of problems includes k-means clustering and
unconstrained low rank approximation (i.e. principal component analysis). By
reducing data points to just dimensions, our methods generically
accelerate any exact, approximate, or heuristic algorithm for these ubiquitous
problems.
For k-means dimensionality reduction, we provide relative
error results for many common sketching techniques, including random row
projection, column selection, and approximate SVD. For approximate principal
component analysis, we give a simple alternative to known algorithms that has
applications in the streaming setting. Additionally, we extend recent work on
column-based matrix reconstruction, giving column subsets that not only 'cover'
a good subspace for A, but can be used directly to compute this
subspace.
Finally, for k-means clustering, we show how to achieve a
approximation by Johnson-Lindenstrauss projecting data points to just dimensions. This gives the first result that leverages the
specific structure of k-means to achieve dimension independent of input size
and sublinear in k.
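
As a rough illustration of the Johnson-Lindenstrauss idea mentioned in the abstract above, the following sketch projects synthetic points with a dense Gaussian map and compares the k-means cost of one fixed clustering before and after projection. The helper names (`kmeans_cost`, `jl_project`), the synthetic data, and the target dimension of 64 are all assumptions chosen for the example; they are not taken from the paper, and checking a single fixed partition is much weaker than the guarantee the abstract describes.

```python
import numpy as np

def kmeans_cost(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for c in range(k):
        members = points[labels == c]
        if len(members) > 0:
            cost += np.sum((members - members.mean(axis=0)) ** 2)
    return cost

def jl_project(points, target_dim, rng):
    """Dense Gaussian random projection, scaled so squared norms are
    preserved in expectation (one common Johnson-Lindenstrauss construction)."""
    d = points.shape[1]
    projection = rng.standard_normal((d, target_dim)) / np.sqrt(target_dim)
    return points @ projection

rng = np.random.default_rng(0)
n, d, k, target_dim = 300, 500, 5, 64  # illustrative sizes only (assumption)

# Synthetic data: k well-separated Gaussian clusters with known labels.
centers = rng.standard_normal((k, d)) * 3.0
labels = rng.integers(0, k, size=n)
points = centers[labels] + rng.standard_normal((n, d))

sketch = jl_project(points, target_dim, rng)
cost_ratio = kmeans_cost(sketch, labels, k) / kmeans_cost(points, labels, k)
print(f"projected / original k-means cost: {cost_ratio:.3f}")
```

Because the projection is linear, the centroid of each projected cluster is the projection of the original centroid, so the two costs are directly comparable. Any k-means solver can then be run on `sketch` instead of `points`.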
Uniform Sampling for Matrix Approximation
Random sampling has become a critical tool in solving massive matrix
problems. For linear regression, a small, manageable set of data rows can be
randomly selected to approximate a tall, skinny data matrix, improving
processing time significantly. For theoretical performance guarantees, each row
must be sampled with probability proportional to its statistical leverage
score. Unfortunately, leverage scores are difficult to compute.
A simple alternative is to sample rows uniformly at random. While this often
works, uniform sampling will eliminate critical row information for many
natural instances. We take a fresh look at uniform sampling by examining what
information it does preserve. Specifically, we show that uniform sampling
yields a matrix that, in some sense, well approximates a large fraction of the
original. While this weak form of approximation is not enough for solving
linear regression directly, it is enough to compute a better approximation.
This observation leads to simple iterative row sampling algorithms for matrix
approximation that run in input-sparsity time and preserve row structure and
sparsity at all intermediate steps. In addition to an improved understanding of
uniform sampling, our main proof introduces a structural result of independent
interest: we show that every matrix can be made to have low coherence by
reweighting a small subset of its rows.
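
To make the contrast between leverage-score sampling and uniform sampling concrete, here is a minimal sketch on synthetic data. It computes exact leverage scores with a thin QR factorization, which is the expensive step the abstract above calls difficult, and it does not implement the paper's iterative algorithm. The function names, the matrix sizes, and the single planted high-leverage row are assumptions made for the example.

```python
import numpy as np

def leverage_scores(a):
    """Row leverage scores of a full-column-rank matrix: squared row norms
    of the orthonormal factor Q from a thin QR factorization."""
    q, _ = np.linalg.qr(a)  # reduced QR: q has shape (n, d)
    return np.sum(q ** 2, axis=1)

def sample_rows(a, probs, num_samples, rng):
    """Sample rows with replacement and rescale so that the sampled
    Gram matrix is an unbiased estimate of a.T @ a."""
    idx = rng.choice(a.shape[0], size=num_samples, replace=True, p=probs)
    scale = 1.0 / np.sqrt(num_samples * probs[idx])
    return a[idx] * scale[:, None]

def relative_gram_error(a, sampled):
    """Spectral-norm error of the sampled Gram matrix, relative to a.T @ a."""
    gram = a.T @ a
    return np.linalg.norm(sampled.T @ sampled - gram, 2) / np.linalg.norm(gram, 2)

rng = np.random.default_rng(0)
n, d, num_samples = 2000, 10, 400  # illustrative sizes only (assumption)
a = rng.standard_normal((n, d))
a[0] = 0.0
a[0, 0] = 100.0  # planted row that dominates one direction (high leverage)

scores = leverage_scores(a)
lev_sample = sample_rows(a, scores / scores.sum(), num_samples, rng)
unif_sample = sample_rows(a, np.full(n, 1.0 / n), num_samples, rng)

lev_err = relative_gram_error(a, lev_sample)
unif_err = relative_gram_error(a, unif_sample)
print(f"leverage-score sampling error: {lev_err:.3f}")
print(f"uniform sampling error:        {unif_err:.3f}")
```

The planted row carries most of the information about one direction, so uniform sampling either misses it or over-weights it when it happens to be drawn, while leverage-score sampling picks it up with high probability.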
A New Green Salamander in the Southern Appalachians: Evolutionary History of Aneides aeneus and Implications for Management and Conservation with the Description of a Cryptic Micro-endemic Species (vol 107, pg 748, 2019)
Cambrian suspension-feeding tubicolous hemichordates
The combination of a meager fossil record of vermiform enteropneusts and their disparity with the tubicolous pterobranchs renders early hemichordate evolution conjectural. The middle Cambrian Oesia disjuncta from the Burgess Shale has been compared to annelids, tunicates and chaetognaths, but on the basis of abundant new material is now identified as a primitive hemichordate.
Calcium channel blockers and breast cancer incidence: An updated systematic review and meta-analysis of the evidence
Controversy exists regarding the potential association between taking calcium channel blockers (CCBs) and the development of breast cancer. As a positive association would have important public health implications due to the widespread use of CCBs, this study aimed to incorporate new evidence to determine whether an association is likely to exist. We searched MEDLINE, EMBASE and the Cochrane Library to 28 June 2016 for relevant literature. References and citing articles were checked and authors contacted as necessary. Two authors independently selected articles and extracted data. Twenty-nine studies were reviewed; 26 were non-randomised studies (NRS). Meta-analysis of study data where adjustment for ‘confounding by indication’ was judged to be present suggests that an association, if any, is likely to be modest in magnitude (pooled odds/risk ratio 1.09 (95% confidence interval (CI) 1.03–1.15, I² = 0%, 8 sub-studies); pooled hazard ratio 0.99 (95% CI 0.94–1.03, I² = 35%, 9 sub-studies)). There are credible study data showing an increased relative risk with long-term use of CCBs, but the results of our meta-analysis and of meta-regression of log relative risk against minimum follow-up time are mixed. The current summative evidence does not support a clear association between taking CCBs and developing breast cancer. However, uncertainty remains, especially for long-term use, and any association might not be uniform between different populations and/or breast cancer sub-types. We thus recommend further NRS in settings where CCB use is highly prevalent and population-based cancer, prescription and health-registries exist, to resolve this continuing uncertainty. PROSPERO, CRD42015026712.
Deeper, Wider, Sharper: Next-Generation Ground-Based Gravitational-Wave Observations of Binary Black Holes
Next-generation observations will revolutionize our understanding of binary
black holes and will detect new sources, such as intermediate-mass black holes.
Primary science goals include: Discover binary black holes throughout the
observable Universe; Reveal the fundamental properties of black holes; Uncover
the seeds of supermassive black holes.
Comment: 14 pages, 3 figures, White Paper submitted to Astro2020 (2020
Astronomy and Astrophysics Decadal Survey) by GWIC 3G Science Case Team
(GWIC: Gravitational Wave International Committee)
Aneurysmal degeneration of the superficial femoral artery after remote endarterectomy
Superficial femoral artery reocclusion is the most common complication of remote endarterectomy with the Mollring device. We present the first reported case of a male patient who developed aneurysmal degeneration of the superficial femoral artery after a previous left common femoral endarterectomy and superficial femoral remote endarterectomy with popliteal stenting. He underwent thrombolysis with subsequent percutaneous transluminal angioplasty after developing acute left lower extremity ischemia. At 12-month follow-up, he was free of claudication symptoms. This case illustrates the need for close surveillance and discusses possible treatment options for patients with this rare complication.
Impact of Cosmic Rays on Thermal Instability in the Circumgalactic Medium
Large reservoirs of cold (~10⁴ K) gas exist out to and beyond the virial radius in the circumgalactic medium (CGM) of all types of galaxies. Photoionization modeling suggests that cold CGM gas has significantly lower densities than expected by theoretical predictions based on thermal pressure equilibrium with hot CGM gas. In this work, we investigate the impact of cosmic-ray physics on the formation of cold gas via thermal instability. We use idealized three-dimensional magnetohydrodynamic simulations to follow the evolution of thermally unstable gas in a gravitationally stratified medium. We find that cosmic-ray pressure lowers the density and increases the size of cold gas clouds formed through thermal instability. We develop a simple model for how the cold cloud sizes and the relative densities of cold and hot gas depend on cosmic-ray pressure. Cosmic-ray pressure can help counteract gravity to keep cold gas in the CGM for longer, thereby increasing the predicted cold mass fraction and decreasing the predicted cold gas inflow rates. Efficient cosmic-ray transport, by streaming or diffusion, redistributes cosmic-ray pressure from the cold gas to the background medium, resulting in cold gas properties that are in between those predicted by simulations with inefficient transport and simulations without cosmic rays. We show that cosmic rays can significantly reduce galactic accretion rates and resolve the tension between theoretical models and observational constraints on the properties of cold CGM gas.