Improving Spectral Clustering Using Spectrum-Preserving Node Reduction
Spectral clustering is one of the most popular clustering methods. However,
the high computational cost due to the involved eigen-decomposition procedure
can immediately hinder its applications in large-scale tasks. In this paper we
use spectrum-preserving node reduction to accelerate eigen-decomposition and
generate concise representations of data sets. Specifically, we create a small
number of pseudo-nodes based on spectral similarity. Then, a standard spectral
clustering algorithm is performed on the smaller node set. Finally, each data
point in the original data set is assigned to the same cluster as its representative
pseudo-node. The proposed framework runs in nearly-linear time. Meanwhile, the
clustering accuracy can be significantly improved by mining concise
representations. The experimental results show dramatically improved clustering
performance when compared with state-of-the-art methods.
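
The last step of the pipeline described above, propagating cluster labels from pseudo-nodes back to the original data points, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `propagate_labels`, `point_to_pseudo`, and `pseudo_labels` are hypothetical, and both the spectrum-preserving reduction and the clustering of the pseudo-nodes are assumed to have been computed already.

```python
def propagate_labels(point_to_pseudo, pseudo_labels):
    """Give each original point the cluster label of its representative pseudo-node.

    point_to_pseudo: list where entry i is the pseudo-node id of data point i
                     (assumed output of the node-reduction step).
    pseudo_labels:   dict mapping pseudo-node id -> cluster label
                     (assumed output of spectral clustering on the reduced set).
    """
    return [pseudo_labels[p] for p in point_to_pseudo]


# Hypothetical toy input: 6 data points reduced to 3 pseudo-nodes.
point_to_pseudo = [0, 0, 1, 2, 2, 1]
pseudo_labels = {0: "A", 1: "B", 2: "A"}

labels = propagate_labels(point_to_pseudo, pseudo_labels)
print(labels)
```

Because the expensive eigen-decomposition only touches the pseudo-nodes, the cost of this final mapping is linear in the number of original points.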
High Performance Spectral Methods for Graph-Based Machine Learning
Graphs play a critical role in machine learning and data mining. The success of graph-based machine learning algorithms depends heavily on the quality of the underlying graphs. Desired graphs should have two characteristics: 1) they should capture the underlying structures of the data sets well; 2) they should be sparse enough that downstream algorithms can be run on them efficiently.
This dissertation first studies the application of a two-phase spectrum-preserving spectral sparsification method that enables the construction of very sparse sparsifiers with guaranteed preservation of the original graph spectra for spectral clustering. Experiments show that the computational challenge due to the eigen-decomposition procedure in spectral clustering can be fundamentally addressed.
We then propose a highly scalable spectral graph learning approach, GRASPEL. GRASPEL can learn high-quality graphs from high-dimensional input data. Compared with prior state-of-the-art graph learning and construction methods, GRASPEL leads to substantially improved algorithm performance.
Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification
Eigenvalue decomposition of Laplacian matrices for large nearest-neighbor (NN) graphs is the major computational bottleneck in spectral clustering (SC). To fundamentally address this computational challenge in SC, we propose a scalable spectral sparsification framework that enables the construction of nearly-linear-sized ultra-sparse NN graphs with guaranteed preservation of key eigenvalues and eigenvectors of the original Laplacian. The proposed method is based on the latest theoretical results in spectral graph theory and thus can be applied to robustly handle general undirected graphs. By leveraging a nearly-linear time spectral graph topology sparsification phase and a subgraph scaling phase via stochastic gradient descent (SGD) iterations, our approach allows computing tree-like NN graphs that can serve as high-quality proxies of the original NN graphs, leading to highly scalable and accurate SC of large data sets. Our extensive experimental results on a variety of public domain data sets show dramatically improved performance when compared with state-of-the-art SC methods.
Does Forest Industries in China Become Cleaner? A Prospective of Embodied Carbon Emission
Forests and the forest products industry contribute to climate change mitigation by sequestering carbon from the atmosphere and storing it in biomass, and by fabricating products that substitute for other, more greenhouse-gas-emission-intensive materials and energy. This study investigates primary wood-working industries (panel, furniture, pulp and paper) in order to determine the development of carbon emissions in China during the last two decades. The input–output approach is used and the factors driving the changes in CO2 emissions are analyzed by Index Decomposition Analysis–Log Mean Divisia Index (LMDI). The results show that carbon emissions in forest product industries have been declining during the last twenty years and that the driving factors of this change are the energy intensity of production and economic input, both of which have changed dramatically.
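
As a rough illustration of how an additive LMDI decomposition attributes a change in emissions to individual factors, the sketch below assumes a hypothetical identity in which each sector's emissions equal output × energy intensity × emission factor. The factor names, the sector names, and all numbers are made-up toy values, not data or the exact identity from the study. Each factor's effect is the logarithmic mean of the sector's emissions in the two periods multiplied by the log ratio of that factor, summed over sectors.

```python
import math


def log_mean(a, b):
    """Logarithmic mean L(a, b) of two positive numbers."""
    if math.isclose(a, b):
        return a
    return (a - b) / (math.log(a) - math.log(b))


def sector_emissions(factors):
    """Emissions of one sector under the assumed identity: product of its factors."""
    return math.prod(factors.values())


def lmdi_additive(base, target):
    """Additive LMDI-I: per-factor contributions to the total change in emissions.

    base / target: {sector: {factor_name: value}} for the two periods.
    """
    effects = {}
    for sector, f0 in base.items():
        ft = target[sector]
        weight = log_mean(sector_emissions(ft), sector_emissions(f0))
        for name in f0:
            effects[name] = effects.get(name, 0.0) + weight * math.log(ft[name] / f0[name])
    return effects


# Hypothetical toy numbers (NOT from the study).
base = {
    "panel": {"output": 100.0, "energy_intensity": 2.0, "emission_factor": 0.5},
    "paper": {"output": 200.0, "energy_intensity": 1.5, "emission_factor": 0.4},
}
target = {
    "panel": {"output": 150.0, "energy_intensity": 1.2, "emission_factor": 0.5},
    "paper": {"output": 260.0, "energy_intensity": 1.0, "emission_factor": 0.4},
}

effects = lmdi_additive(base, target)
delta = sum(map(sector_emissions, target.values())) - sum(map(sector_emissions, base.values()))
print(effects, delta)
```

A useful property of this form is that the per-factor effects add up to the total change with no residual term, which is why the sum of `effects` matches `delta` up to floating-point error.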
Modularity-Guided Graph Topology Optimization And Self-Boosting Clustering
Existing modularity-based community detection methods attempt to find
community memberships that maximize modularity in a fixed
graph topology. In this work, we propose to optimize the graph topology through
the modularity maximization process. We introduce a modularity-guided graph
optimization approach for learning a sparse, high-modularity graph from
algorithmically generated clustering results by iteratively pruning edges between
two distant clusters. To the best of our knowledge, this represents the first
attempt to use modularity to guide graph topology learning. Extensive
experiments conducted on various real-world data sets show that our method
outperforms the state-of-the-art graph construction methods by a large margin.
Our experiments show that as modularity increases, the accuracy of
graph-based clustering algorithms increases as well, demonstrating the
validity of modularity theory through numerical experimental results on
real-world data sets. From a clustering perspective, our method can also be seen
as a self-boosting clustering method.
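
The link between pruning inter-cluster edges and modularity can be illustrated with a small sketch. It assumes an unweighted, undirected graph and uses the standard Newman modularity, Q = Σ_c [e_c/m − (d_c/2m)²], where e_c is the number of edges inside cluster c, d_c is the total degree of cluster c, and m is the number of edges. The function names are hypothetical, and the pruning is a simplification of the approach described above: it removes every inter-cluster edge in a single pass instead of iteratively selecting edges between distant clusters.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Newman modularity of a partition of an unweighted, undirected graph."""
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the same cluster
    degree = defaultdict(int)  # total degree per cluster
    for u, v in edges:
        degree[labels[u]] += 1
        degree[labels[v]] += 1
        if labels[u] == labels[v]:
            intra[labels[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def prune_inter_cluster(edges, labels):
    """Simplified pruning: keep only edges whose endpoints share a cluster label."""
    return [(u, v) for u, v in edges if labels[u] == labels[v]]


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

q_before = modularity(edges, labels)
pruned = prune_inter_cluster(edges, labels)
q_after = modularity(pruned, labels)
print(q_before, q_after)
```

On this toy graph, removing the bridge edge raises the modularity of the same partition, which is the effect the pruning step relies on.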