Spectrally approximating large graphs with smaller graphs
How does coarsening affect the spectrum of a general graph? We provide
conditions such that the principal eigenvalues and eigenspaces of a coarsened
and original graph Laplacian matrices are close. The achieved approximation is
shown to depend on standard graph-theoretic properties, such as the degree and
eigenvalue distributions, as well as on the ratio between the coarsened and
actual graph sizes. Our results carry implications for learning methods that
utilize coarsening. For the particular case of spectral clustering, they imply
that coarse eigenvectors can be used to derive good quality assignments even
without refinement---this phenomenon was previously observed, but lacked formal
justification. Comment: 22 pages, 10 figures
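To make the notion of a coarsened graph Laplacian concrete, here is a minimal sketch that contracts groups of vertices and sums the weights of the edges between groups. Contraction is only one common way to coarsen a graph and is assumed here for illustration; the abstract does not say which construction the paper analyzes. The toy graph, the `groups` assignment, and the function names are hypothetical.

```python
def laplacian(n, edges):
    """Dense combinatorial Laplacian L = D - W of a weighted undirected graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L


def coarsen(edges, groups):
    """Contract each vertex group into a single coarse vertex.

    groups[i] is the index of the coarse vertex that original vertex i joins.
    Weights of edges between two groups are summed; edges inside a group vanish.
    """
    coarse = {}
    for u, v, w in edges:
        a, b = groups[u], groups[v]
        if a != b:
            key = (min(a, b), max(a, b))
            coarse[key] = coarse.get(key, 0.0) + w
    n_coarse = max(groups) + 1
    coarse_edges = [(a, b, w) for (a, b), w in sorted(coarse.items())]
    return n_coarse, coarse_edges


# Hypothetical toy graph: two triangles joined by one edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]
groups = [0, 0, 1, 2, 3, 3]      # merge vertices {0, 1} and {4, 5}

n_c, edges_c = coarsen(edges, groups)
L = laplacian(6, edges)          # original 6 x 6 Laplacian
L_c = laplacian(n_c, edges_c)    # coarsened 4 x 4 Laplacian
ratio = n_c / 6                  # coarsened size relative to the original size
```

The eigenvalues of `L` and `L_c` could then be compared with any eigensolver to see how much of the spectrum survives at a given size ratio.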
Classification via Incoherent Subspaces
This article presents a new classification framework that can extract
individual features per class. The scheme is based on a model of incoherent
subspaces, each one associated to one class, and a model on how the elements in
a class are represented in this subspace. After the theoretical analysis, an
alternating projection algorithm to find such a collection is developed. The
classification performance and speed of the proposed method are tested on the AR
and YaleB databases and compared to those of Fisher's LDA and a recent approach
based on on minimisation. Finally connections of the presented scheme
to already existing work are discussed and possible ways of extensions are
pointed out. Comment: 22 pages, 2 figures, 4 tables
The 2014 American State Litter Scorecard FINAL: USA's Dirtiest & Cleanest States Includes Statistics and Charts
A NEW State Litter "Scorecard" is released for the 2014 American Society for Public Administration (ASPA) Conference. Every three years, the Scorecard approximates each state's overall public spaces environmental quality through tried-and-true, hard-to-publicly obtain objective and subjective measures, resulting in a total overall jurisdictional score. Readers gain a realistic "picture" of "what's going on" within one or all of the 50 states. Illegal littering and dumping, found frequently on or near transportation paths, creates danger to public safety and health, with 800+ Americans dying each year by vehicle collisions with unmoved roadway debris. Because policy makers, public administrators and citizens are ever more involved in effectuating "green" outcomes, satisfactory public spaces waste removals are vital. Since 2008, major publications (the Boston Globe; TRAVEL+LEISURE; National Cooperative Highway Research Program's "Reducing Litter on Roadsides" Journal) have referred to the Scorecard, an ever valuable, trusted standard for improving debris/litter abatement in states and localities
Reverberant Audio Source Separation via Sparse and Low-Rank Modeling
The performance of audio source separation from underdetermined convolutive
mixture assuming known mixing filters can be significantly improved by using an
analysis sparse prior optimized by a reweighting l1 scheme and a wideband
data-fidelity term, as demonstrated by a recent article. In this letter, we show
that the performance can be improved even more significantly by exploiting a
low-rank prior on the source spectrograms. We present a new algorithm to
estimate the sources based on i) an analysis sparse prior, ii) a reweighting
scheme so as to increase the sparsity, iii) a wideband data-fidelity term in a
constrained form, and iv) a low-rank constraint on the source spectrograms.
Evaluation on reverberant music mixtures shows that the resulting algorithm
improves state-of-the-art methods by more than 2 dB of signal-to-distortion
ratio
Compressive Embedding and Visualization using Graphs
Visualizing high-dimensional data has been a focus in data analysis
communities for decades, which has led to the design of many algorithms, some
of which are now considered references (such as t-SNE). In our era
of overwhelming data volumes, the scalability of such methods has become more
and more important. In this work, we present a method that makes it possible to apply any
visualization or embedding algorithm on very large datasets by considering only
a fraction of the data as input and then extending the information to all data
points using a graph encoding its global similarity. We show that in most
cases, using only samples is sufficient to diffuse the
information to all data points. In addition, we propose quantitative
methods to measure the quality of embeddings and demonstrate the validity of
our technique on both synthetic and real-world datasets
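The idea of embedding only a subset of points and then spreading the result over a similarity graph can be sketched as follows. This is an assumed stand-in, not the paper's algorithm: it keeps the coordinates of the sampled nodes fixed and repeatedly sets every other node to the weighted mean of its neighbours (a harmonic extension). The function name and the toy path graph are hypothetical, and every node is assumed to have at least one neighbour.

```python
def extend_from_samples(neighbors, known, n_iter=2000):
    """Propagate values from sampled nodes to all nodes of a graph.

    neighbors: dict mapping node -> list of (neighbor, weight) pairs
    known:     dict mapping sampled node -> its fixed coordinate
    Unsampled nodes are repeatedly replaced by the weighted mean of their
    neighbours, while sampled nodes keep their given value.
    """
    values = {v: known.get(v, 0.0) for v in neighbors}
    for _ in range(n_iter):
        for v, nbrs in neighbors.items():
            if v in known:
                continue
            total = sum(w for _, w in nbrs)
            values[v] = sum(w * values[u] for u, w in nbrs) / total
    return values


# Hypothetical example: a path graph 0-1-2-3-4 where only the two end
# points were embedded (one coordinate each) by the expensive algorithm.
path = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(1, 1.0), (3, 1.0)],
    3: [(2, 1.0), (4, 1.0)],
    4: [(3, 1.0)],
}
coords = extend_from_samples(path, {0: 0.0, 4: 4.0})
```

For a multi-dimensional embedding, the same routine could be applied to each coordinate separately.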
Principal Patterns on Graphs: Discovering Coherent Structures in Datasets
Graphs are now ubiquitous in almost every field of research. Recently, new
research areas devoted to the analysis of graphs and data associated to their
vertices have emerged. Focusing on dynamical processes, we propose a fast,
robust and scalable framework for retrieving and analyzing recurring patterns
of activity on graphs. Our method relies on a novel type of multilayer graph
that encodes the spreading or propagation of events between successive time
steps. We demonstrate the versatility of our method by applying it to three
different real-world examples. Firstly, we study how a rumor spreads on a social
network. Secondly, we reveal congestion patterns of pedestrians in a train
station. Finally, we show how patterns of audio playlists can be used in a
recommender system. In each example, relevant information previously hidden in
the data is extracted in a very efficient manner, emphasizing the scalability
of our method. With a parallel implementation scaling linearly with the size of
the dataset, our framework easily handles millions of nodes on a single
commodity server
Low-Rank Matrices on Graphs: Generalized Recovery & Applications
Many real world datasets subsume a linear or non-linear low-rank structure in
a very low-dimensional space. Unfortunately, one often has very little or no
information about the geometry of the space, resulting in a highly
under-determined recovery problem. Under certain circumstances,
state-of-the-art algorithms provide an exact recovery for linear low-rank
structures, but at the expense of poorly scalable algorithms that use the nuclear
norm. However, the case of non-linear structures remains unresolved. We revisit
the problem of low-rank recovery from a totally different perspective,
involving graphs which encode pairwise similarity between the data samples and
features. Surprisingly, our analysis confirms that it is possible to recover
many approximate linear and non-linear low-rank structures with recovery
guarantees with a set of highly scalable and efficient algorithms. We call such
data matrices \textit{Low-Rank matrices on graphs} and show that many real
world datasets satisfy this assumption approximately due to underlying
stationarity. Our detailed theoretical and experimental analysis unveils the
power of the simple, yet very novel recovery framework \textit{Fast Robust PCA
on Graphs}
Fast Approximate Spectral Clustering for Dynamic Networks
Spectral clustering is a widely studied problem, yet its complexity is
prohibitive for dynamic graphs of even modest size. We claim that it is
possible to reuse information of past cluster assignments to expedite
computation. Our approach builds on a recent idea of sidestepping the main
bottleneck of spectral clustering, i.e., computing the graph eigenvectors, by
using fast Chebyshev graph filtering of random signals. We show that the
proposed algorithm achieves clustering assignments with quality approximating
that of spectral clustering and that it can yield significant complexity
benefits when the graph dynamics are appropriately bounded
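Chebyshev graph filtering of random signals, as mentioned above, can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: it expands a low-pass response in Chebyshev polynomials and applies it to random signals using only matrix-vector products, so no eigenvectors are computed. The toy graph, the hand-picked `cutoff`, the polynomial order, and all function names are hypothetical.

```python
import math
import random


def matvec(L, x):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in L]


def cheby_coeffs(h, lmax, order, n_quad=200):
    """Chebyshev coefficients of h on [0, lmax] via Chebyshev-Gauss quadrature."""
    coeffs = []
    for k in range(order + 1):
        s = 0.0
        for j in range(n_quad):
            theta = math.pi * (j + 0.5) / n_quad
            s += h(lmax / 2.0 * (math.cos(theta) + 1.0)) * math.cos(k * theta)
        coeffs.append(2.0 * s / n_quad)
    return coeffs


def cheby_filter(L, x, coeffs, lmax):
    """Compute (c_0 / 2) x + sum_{k>=1} c_k T_k(M) x, where M = (2 / lmax) L - I."""
    def shifted(v):
        Lv = matvec(L, v)
        return [2.0 / lmax * a - b for a, b in zip(Lv, v)]

    t_prev = list(x)        # T_0(M) x
    t_curr = shifted(x)     # T_1(M) x
    y = [0.5 * coeffs[0] * a + coeffs[1] * b for a, b in zip(t_prev, t_curr)]
    for c in coeffs[2:]:
        s = shifted(t_curr)
        t_next = [2.0 * a - b for a, b in zip(s, t_prev)]   # three-term recurrence
        y = [yi + c * ti for yi, ti in zip(y, t_next)]
        t_prev, t_curr = t_curr, t_next
    return y


# Hypothetical toy graph: two triangles joined by a single edge.
n = 6
L = [[0.0] * n for _ in range(n)]
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    L[u][u] += 1.0
    L[v][v] += 1.0
    L[u][v] -= 1.0
    L[v][u] -= 1.0

lmax = 2.0 * max(L[i][i] for i in range(n))   # twice the max degree bounds the spectrum
cutoff = 1.0                                  # assumed, hand-picked low-pass cutoff
coeffs = cheby_coeffs(lambda lam: 1.0 if lam <= cutoff else 0.0, lmax, 30)

rng = random.Random(0)
signals = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(4)]
filtered = [cheby_filter(L, s, coeffs, lmax) for s in signals]
features = [[f[i] for f in filtered] for i in range(n)]   # one feature row per node
```

The rows of `features` could then be handed to a standard clustering routine such as k-means in place of the Laplacian eigenvectors.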