Similarity-Aware Spectral Sparsification by Edge Filtering
In recent years, spectral graph sparsification techniques that can compute
ultra-sparse graph proxies have been extensively studied for accelerating
various numerical and graph-related applications. Prior nearly-linear-time
spectral sparsification methods first extract a low-stretch spanning tree from
the original graph to form the backbone of the sparsifier, and then recover
small portions of spectrally-critical off-tree edges to the spanning tree to
significantly improve the approximation quality. However, it is not clear how
many off-tree edges should be recovered for achieving a desired spectral
similarity level within the sparsifier. Motivated by recent graph signal
processing techniques, this paper proposes a similarity-aware spectral graph
sparsification framework that leverages efficient spectral off-tree edge
embedding and filtering schemes to construct spectral sparsifiers with
guaranteed spectral similarity (relative condition number) level. An iterative
graph densification scheme is introduced to facilitate efficient and effective
filtering of off-tree edges for highly ill-conditioned problems. The proposed
method has been validated using various kinds of graphs obtained from public
domain sparse matrix collections relevant to VLSI CAD, finite element analysis,
as well as social and data networks frequently studied in many machine learning
and data mining applications.
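The tree-plus-off-tree-edges construction described in this abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the paper's method: a maximum-weight spanning tree (Kruskal) stands in for a low-stretch spanning tree, and the edge stretch (edge weight times the resistance of the tree path between its endpoints) stands in for the paper's spectral embedding and filtering of off-tree edges. The function names and the parameter `k` are hypothetical.

```python
from collections import defaultdict


def spanning_tree(n, edges):
    """Split edges into a maximum-weight spanning tree and off-tree edges.

    Assumption: Kruskal's algorithm is used as a stand-in for a
    low-stretch spanning tree construction.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, off_tree = [], []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
        else:
            off_tree.append((u, v, w))
    return tree, off_tree


def tree_resistance(tree, src, dst):
    """Sum of 1/w along the unique tree path from src to dst."""
    adj = defaultdict(list)
    for u, v, w in tree:
        adj[u].append((v, 1.0 / w))
        adj[v].append((u, 1.0 / w))
    stack = [(src, -1, 0.0)]
    while stack:
        node, prev, dist = stack.pop()
        if node == dst:
            return dist
        for nxt, r in adj[node]:
            if nxt != prev:
                stack.append((nxt, node, dist + r))
    raise ValueError("dst is not reachable from src in the tree")


def sparsify(n, edges, k):
    """Return the tree backbone plus the k off-tree edges of largest stretch."""
    tree, off_tree = spanning_tree(n, edges)
    ranked = sorted(
        off_tree,
        key=lambda e: e[2] * tree_resistance(tree, e[0], e[1]),
        reverse=True,
    )
    return tree + ranked[:k]
```

In this sketch `k` is fixed by the caller. The question the abstract raises is precisely how many off-tree edges are needed to reach a target spectral similarity, which this illustration does not answer.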
Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification
Eigenvalue decomposition of Laplacian matrices for large nearest-neighbor (NN) graphs is the major computational bottleneck in spectral clustering (SC). To fundamentally address this computational challenge in SC, we propose a scalable spectral sparsification framework that enables the construction of nearly-linear-sized ultra-sparse NN graphs with guaranteed preservation of key eigenvalues and eigenvectors of the original Laplacian. The proposed method is based on the latest theoretical results in spectral graph theory and thus can be applied to robustly handle general undirected graphs. By leveraging a nearly-linear time spectral graph topology sparsification phase and a subgraph scaling phase via stochastic gradient descent (SGD) iterations, our approach allows computing tree-like NN graphs that can serve as high-quality proxies of the original NN graphs, leading to highly-scalable and accurate SC of large data sets. Our extensive experimental results on a variety of public domain data sets show dramatically improved performance when compared with state-of-the-art SC methods.
HIGH-PERFORMANCE SPECTRAL METHODS FOR COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS
Recent research shows that by leveraging the key spectral properties of eigenvalues and eigenvectors of graph Laplacians, more efficient algorithms can be developed for tackling many graph-related computing tasks. In this dissertation, spectral methods are utilized for achieving faster algorithms in the applications of very-large-scale integration (VLSI) computer-aided design (CAD).
First, a scalable algorithmic framework is proposed for effective-resistance preserving spectral reduction of large undirected graphs. The proposed method allows computing much smaller graphs while preserving the key spectral (structural) properties of the original graph. Our framework is built upon the following three key components: a spectrum-preserving node aggregation and reduction scheme, a spectral graph sparsification framework with iterative edge weight scaling, as well as effective-resistance preserving post-scaling and iterative solution refinement schemes. We show that the resultant spectrally-reduced graphs can robustly preserve the first few nontrivial eigenvalues and eigenvectors of the original graph Laplacian and thus allow for developing highly-scalable spectral graph partitioning and circuit simulation algorithms.
Based on the framework of the spectral graph reduction, a Sparsified graph-theoretic Algebraic Multigrid (SAMG) is proposed for solving large Symmetric Diagonally Dominant (SDD) matrices. The proposed SAMG framework allows efficient construction of nearly-linear sized graph Laplacians for coarse-level problems while maintaining good spectral approximation during the AMG setup phase by leveraging a scalable spectral graph sparsification engine. Our experimental results show that the proposed method can offer more scalable performance than existing graph-theoretic AMG solvers for solving large SDD matrices in integrated circuit (IC) simulations, 3D-IC thermal analysis, image processing, finite element analysis as well as data mining and machine learning applications.
Finally, the spectral methods are applied to power grid and thermal integrity verification applications. This dissertation introduces a vectorless power grid and thermal integrity verification framework that allows computing worst-case voltage drop or thermal profiles across the entire chip under a set of local and global workload (power density) constraints. To address the computational challenges introduced by the large 3D mesh-structured thermal grids, we apply the spectral graph reduction approach for highly-scalable vectorless thermal (or power grid) verification of large chip designs. The effectiveness and efficiency of our approach have been demonstrated through extensive experiments.
Exploring the Long Tail
The migration of datasets online has created a near-infinite inventory for big name retailers such as Amazon and Netflix, giving rise to recommendation systems to assist users in navigating the massive catalog. This has also allowed for the possibility of retailers storing much less popular, uncommon items which would not appear in a more traditional brick-and-mortar setting due to the cost of storage. Nevertheless, previous work has highlighted the profit potential which lies in the so-called "long tail" of niche, unpopular items. Unfortunately, due to the limited amount of data in this subset of the inventory, recommendation systems often struggle to make useful suggestions within the long tail, leaving them prone to a popularity bias.
Our work explores different approaches which recommendation systems typically employ and evaluates the performance of each approach on various subsets of the Netflix Prize data to determine where each approach performs best. We survey collaborative filtering approaches, content-based filtering approaches, and hybrid mechanisms utilizing both of the previous methods. We analyze their behavior on the most popular items, the least popular items, and a composite of the two subsets, and we judge their performance based on the quality of the clusters they produce.
Unsupervised Domain Adaptation using Graph Transduction Games
Unsupervised domain adaptation (UDA) amounts to assigning class labels to the
unlabeled instances of a dataset from a target domain, using labeled instances
of a dataset from a related source domain. In this paper, we propose to cast
this problem in a game-theoretic setting as a non-cooperative game and
introduce a fully automated iterative algorithm for UDA based on graph
transduction games (GTG). The main advantages of this approach are its
principled foundation, guaranteed convergence of the iterative algorithm to a
Nash equilibrium (which corresponds to a consistent labeling condition) and
soft labels quantifying the uncertainty of the label assignment process. We
also investigate the beneficial effect of using pseudo-labels from linear
classifiers to initialize the iterative process. The performance of the
resulting methods is assessed on publicly available object recognition
benchmark datasets involving both shallow and deep features. Results of
experiments demonstrate the suitability of the proposed game-theoretic approach
for solving UDA tasks.
Comment: Oral IJCNN 201
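The iterative labeling process described in this abstract can be illustrated with a small sketch. The abstract does not state the update rule, so this assumes the discrete replicator-dynamics update commonly used with graph transduction games: each data point is a player, each class label is a strategy, and the payoff of a label is the similarity-weighted support it receives from the other points. The function name `gtg_labels` and its parameters are hypothetical, and unlabeled points start from whatever distribution the caller supplies (uniform here, or pseudo-labels from a classifier).

```python
def gtg_labels(similarity, init, labeled, iters=100, tol=1e-9):
    """Run replicator dynamics over soft label distributions.

    similarity: n x n nonnegative symmetric matrix (list of lists)
    init:       n x c matrix, each row a probability distribution over labels
    labeled:    set of indices (source instances) whose rows stay fixed
    Assumption: discrete replicator dynamics is used as the update rule.
    """
    n, c = len(init), len(init[0])
    x = [row[:] for row in init]
    for _ in range(iters):
        new_x = []
        change = 0.0
        for i in range(n):
            if i in labeled:
                new_x.append(x[i][:])
                continue
            # Expected payoff of each label for player i against all others.
            payoff = [
                sum(similarity[i][j] * x[j][h] for j in range(n) if j != i)
                for h in range(c)
            ]
            avg = sum(x[i][h] * payoff[h] for h in range(c))
            if avg == 0.0:
                # No support from any neighbor: leave the row unchanged.
                new_x.append(x[i][:])
                continue
            row = [x[i][h] * payoff[h] / avg for h in range(c)]
            change += sum(abs(a - b) for a, b in zip(row, x[i]))
            new_x.append(row)
        x = new_x  # synchronous update of all players
        if change < tol:
            break
    return x


# Two labeled source points (0 and 1) and two unlabeled target points (2 and 3).
S = [
    [0.0, 0.1, 0.9, 0.1],
    [0.1, 0.0, 0.1, 0.9],
    [0.9, 0.1, 0.0, 0.1],
    [0.1, 0.9, 0.1, 0.0],
]
start = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.5, 0.5]]
soft = gtg_labels(S, start, labeled={0, 1})
```

Each returned row remains a probability distribution, so it can be read as the soft label the abstract mentions; taking the arg-max of a row gives a hard label.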
Algorithms and Software for the Analysis of Large Complex Networks
The work presented intersects three main areas, namely graph algorithmics, network science and applied software engineering. Each computational method discussed relates to one of the main tasks of data analysis: to extract structural features from network data, such as methods for community detection; or to transform network data, such as methods to sparsify a network and reduce its size while keeping essential properties; or to realistically model networks through generative models.