Low rank methods for optimizing clustering
Complex optimization models and problems in machine learning often concentrate most of their information in a low-rank subspace. By carefully exploiting these low-rank structures in clustering problems, we find new optimization approaches that reduce the memory and computational cost.
We discuss two cases where this arises. First, we consider the NEO-K-Means (Non-Exhaustive, Overlapping K-Means) objective as a way to address overlap and outliers in an integrated fashion. Optimizing this discrete objective is NP-hard, and even though there is a convex relaxation of the objective, straightforward convex optimization approaches are too expensive for large datasets. We exploit low-rank structures in the solution matrix of the convex formulation and use a low-rank factorization of the solution matrix directly as a practical alternative. The resulting optimization problem is non-convex, but it has far fewer variables and can be locally optimized using an augmented Lagrangian method. In addition, we consider two fast multiplier methods to accelerate the convergence of the augmented Lagrangian scheme: a proximal method of multipliers and an alternating direction method of multipliers. For the proximal augmented Lagrangian, we show a convergence result for the non-convex case with bound-constrained subproblems. When clustering performance is evaluated on real-world datasets, this technique effectively recovers ground-truth clusters and finds cohesive overlapping communities in real-world networks.
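The low-rank factorization idea above can be illustrated on a toy problem. The sketch below is not the paper's NEO-K-Means formulation: it replaces the full solution matrix X by a rank-r factor Y with X = Y Yᵀ (Burer-Monteiro style), and enforces a single trace constraint with an augmented Lagrangian whose multiplier is updated in an outer loop. All sizes, step sizes, and penalty weights are illustrative.

```python
import numpy as np

# Toy instance: maximize trace(Y^T A Y) subject to ||Y||_F^2 = k,
# optimizing over a rank-r factor Y instead of the full matrix X = Y Y^T.
# The actual NEO-K-Means convex formulation has additional bound constraints.
rng = np.random.default_rng(0)
A = np.diag([3.0, 2.0, 1.0])   # small symmetric test matrix
k, r = 1.0, 2                  # trace budget and factor rank (hypothetical)

Y = 0.1 * rng.standard_normal((3, r))
lam, beta = 0.0, 10.0          # multiplier and penalty weight

for outer in range(30):
    for inner in range(200):   # inner gradient descent on the aug. Lagrangian
        c = np.sum(Y * Y) - k  # constraint violation trace(Y Y^T) - k
        grad = -2.0 * A @ Y + 2.0 * (lam + beta * c) * Y
        Y -= 0.005 * grad
    lam += beta * (np.sum(Y * Y) - k)   # multiplier update

obj = float(np.trace(Y.T @ A @ Y))
print(obj, np.sum(Y * Y))
```

At a stationary point A Y = (lam + beta·c) Y with c ≈ 0, so the multiplier converges toward the leading eigenvalue and the objective toward k times that eigenvalue.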
The second case is where the low-rank structure appears in the objective function. Inspired by low rank matrix completion techniques, we propose a low rank symmetric matrix completion scheme to approximate a kernel matrix. For the kernel k-means problem, we show empirically that the clustering performance with the approximation is comparable to that of full kernel k-means.
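A minimal sketch of symmetric low-rank completion, assuming a synthetically low-rank matrix in place of a real kernel: fit a symmetric factorization G Gᵀ to a random subset of observed entries by gradient descent on the masked squared error. Sizes, rank, and learning rate are illustrative, not the scheme's actual parameters.

```python
import numpy as np

# Approximate a symmetric low-rank matrix K from a subset of its entries
# by fitting G G^T to the observed entries with gradient descent.
rng = np.random.default_rng(1)
n, r = 30, 3
B = rng.standard_normal((n, r))
K = B @ B.T                        # ground-truth rank-r symmetric matrix

mask = rng.random((n, n)) < 0.8
mask = mask & mask.T               # symmetric observation pattern (~64%)

G = 0.1 * rng.standard_normal((n, r))
for it in range(3000):
    E = mask * (G @ G.T - K)       # residual on observed entries only
    G -= 0.002 * (4.0 * E @ G)     # gradient of ||mask * (G G^T - K)||_F^2

err_obs = np.linalg.norm(mask * (G @ G.T - K)) / np.linalg.norm(mask * K)
print(err_obs)
```

Because the factor G has only n·r entries, both the memory footprint and the per-step cost are far below those of forming the full n×n matrix.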
Large-Scale Sensor Network Localization via Rigid Subnetwork Registration
In this paper, we describe an algorithm for sensor network localization (SNL)
that proceeds by dividing the whole network into smaller subnetworks,
localizing them in parallel using a fast and accurate algorithm, and finally
registering the localized subnetworks in a global coordinate system. We
demonstrate that this divide-and-conquer algorithm can be used to scale
existing high-precision SNL algorithms, which could otherwise only be applied
to small-to-medium sized networks, up to large-scale networks. The main
contribution of this paper concerns the final registration phase. In
particular, we consider a least-squares formulation of the registration problem
(both with and without anchor constraints) and demonstrate how this otherwise
non-convex problem can be relaxed into a tractable convex program. We provide
some preliminary simulation results for large-scale SNL demonstrating that the
proposed registration algorithm (together with an accurate localization scheme)
offers a good tradeoff between run time and accuracy.
Comment: 5 pages, 8 figures, 1 table. To appear in Proc. IEEE International
Conference on Acoustics, Speech, and Signal Processing, April 19-24, 201
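The pairwise building block of rigid registration, aligning one subnetwork's local coordinates to global coordinates through shared (overlapping) nodes, has a closed-form orthogonal Procrustes / Kabsch solution, sketched below. The paper's registration phase goes further, jointly registering all subnetworks via a convex relaxation; this sketch only shows the two-frame case with synthetic data.

```python
import numpy as np

# Recover the rigid motion (rotation R, translation t) mapping local
# coordinates X of the shared nodes to their global coordinates Y.
rng = np.random.default_rng(2)
X = rng.standard_normal((6, 2))          # local coordinates of shared nodes

theta = 0.7                              # hypothetical ground-truth motion
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
t_true = np.array([1.5, -2.0])
Y = X @ R_true.T + t_true                # the same nodes in global coordinates

Xc, Yc = X - X.mean(0), Y - Y.mean(0)    # center both point sets
U, _, Vt = np.linalg.svd(Xc.T @ Yc)      # SVD of the cross-covariance
R = Vt.T @ U.T
if np.linalg.det(R) < 0:                 # guard against a reflection
    Vt[-1] *= -1
    R = Vt.T @ U.T
t = Y.mean(0) - X.mean(0) @ R.T

print(np.allclose(X @ R.T + t, Y))
```

With noisy distance-based localizations the same formula gives the least-squares rigid alignment over the shared nodes rather than an exact match.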
Overlapping community detection in massive social networks
Massive social networks have become increasingly popular in recent years. Community detection is one of the most important techniques for the analysis of such complex networks. A community is a set of cohesive vertices that has more connections inside the set than outside. In many social and information networks, these communities naturally overlap. For instance, in a social network, each vertex in a graph corresponds to an individual who usually participates in multiple communities. In this thesis, we propose scalable overlapping community detection algorithms that effectively identify high quality overlapping communities in various real-world networks.
We first develop an efficient overlapping community detection algorithm using a seed set expansion approach. The key idea of this algorithm is to find good seeds and then greedily expand these seeds using a personalized PageRank clustering scheme. Experimental results show that our algorithm significantly outperforms other state-of-the-art overlapping community detection methods in terms of run time, cohesiveness of communities, and ground-truth accuracy.
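The seed-expansion pipeline can be sketched in miniature: compute an approximate personalized PageRank vector with the push procedure of Andersen, Chung, and Lang, then take the prefix of the degree-normalized sweep order with the lowest conductance. The graph, seed, and tolerances below are illustrative, not the paper's benchmarks.

```python
from collections import defaultdict

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),   # community A
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),   # community B
         (3, 4)]                                            # bridge edge
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
deg = {u: len(adj[u]) for u in adj}

def ppr_push(seed, alpha=0.15, eps=1e-5):
    """Approximate personalized PageRank via residual pushing."""
    p, r = defaultdict(float), defaultdict(float)
    r[seed] = 1.0
    queue = [seed]
    while queue:
        u = queue.pop()
        if r[u] <= eps * deg[u]:
            continue
        ru = r[u]
        p[u] += alpha * ru                   # settle mass at u
        r[u] = (1 - alpha) * ru / 2          # keep half the rest
        share = (1 - alpha) * ru / (2 * deg[u])
        for v in adj[u]:                     # spread the other half
            r[v] += share
            if r[v] > eps * deg[v]:
                queue.append(v)
        if r[u] > eps * deg[u]:
            queue.append(u)
    return p

def sweep_cut(p):
    """Return the best-conductance prefix of the degree-normalized order."""
    order = sorted(p, key=lambda u: p[u] / deg[u], reverse=True)
    total_vol = sum(deg.values())
    best, best_phi, S = None, float("inf"), set()
    for u in order:
        S.add(u)
        cut = sum(1 for a in S for b in adj[a] if b not in S)
        vol = sum(deg[a] for a in S)
        denom = min(vol, total_vol - vol)
        if denom == 0:
            break
        phi = cut / denom
        if phi < best_phi:
            best, best_phi = set(S), phi
    return best

community = sweep_cut(ppr_push(0))
print(community)
```

Seeding at vertex 0 expands across its clique but stops at the single bridge edge, recovering the seed's community.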
To develop more principled methods, we formulate the overlapping community detection problem as a non-exhaustive, overlapping graph clustering problem where clusters are allowed to overlap with each other, and some nodes are allowed to be outside of any cluster. To tackle this non-exhaustive, overlapping clustering problem, we propose a simple and intuitive objective function that captures the issues of overlap and non-exhaustiveness in a unified manner. To optimize the objective, we develop not only fast iterative algorithms but also more sophisticated algorithms using a low-rank semidefinite programming technique. Our experimental results show that the new objective and the algorithms are effective in finding ground-truth clusterings that have varied overlap and non-exhaustiveness.
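The non-exhaustive, overlapping assignment step can be sketched as follows, in the spirit of the iterative algorithm: given cluster means, make (1+α)n assignments in total while leaving up to βn points unassigned. The helper name, toy data, and parameters are illustrative, not the thesis's exact procedure.

```python
import numpy as np

def neo_assign(X, means, alpha, beta):
    """One non-exhaustive, overlapping assignment pass (illustrative)."""
    n, k = len(X), len(means)
    D = ((X[:, None, :] - means[None, :, :]) ** 2).sum(-1)  # squared distances
    total = int(round((1 + alpha) * n))     # total number of assignments
    keep = n - int(round(beta * n))         # points guaranteed an assignment
    U = np.zeros((n, k), dtype=int)

    # 1) assign the `keep` closest points to their nearest cluster;
    #    the farthest beta*n points may remain unassigned (outliers)
    closest = D.argmin(1)
    for i in np.argsort(D.min(1))[:keep]:
        U[i, closest[i]] = 1

    # 2) spend the remaining assignments on the cheapest unused
    #    (point, cluster) pairs, which creates overlap
    pairs = [(D[i, c], i, c) for i in range(n) for c in range(k) if U[i, c] == 0]
    for _, i, c in sorted(pairs)[: total - keep]:
        U[i, c] = 1
    return U

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],     # blob A
              [5.0, 0.0], [5.1, 0.0], [5.0, 0.1], [5.1, 0.1],     # blob B
              [2.5, 0.05],                                         # overlap
              [50.0, 50.0]])                                       # outlier
means = np.array([[0.05, 0.05], [5.05, 0.05]])
U = neo_assign(X, means, alpha=0.1, beta=0.1)
print(U.sum(1))
```

On this toy data the distant point ends up in no cluster and the midpoint ends up in both, which is exactly the non-exhaustive, overlapping behavior the objective is designed to capture.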
We extend our non-exhaustive, overlapping clustering techniques to co-clustering, where the goal is to simultaneously identify a clustering of the rows as well as the columns of a data matrix. As an example application, consider recommender systems where users have ratings on items. This can be represented by a bipartite graph where users and items are denoted by two different types of nodes, and the ratings are denoted by weighted edges between the users and the items. In this case, co-clustering is a simultaneous clustering of users and items. We propose a new co-clustering objective function and an efficient co-clustering algorithm that is able to identify overlapping clusters as well as outliers on both types of nodes in the bipartite graph. We show that our co-clustering algorithm effectively captures the underlying co-clustering structure of the data, which boosts the performance of standard one-dimensional clustering.
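As background for the bipartite setting, a standard co-clustering baseline (not the thesis's objective) is spectral co-clustering in the style of Dhillon's bipartite spectral graph partitioning: normalize the user-item matrix by row and column degrees, take its second singular vectors, and split rows and columns jointly by sign. The toy matrix is illustrative; a weak cross edge keeps the spectrum non-degenerate.

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 0.2, 0.0],   # weak cross edge breaks degeneracy
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])  # 5 users x 4 items

d1 = A.sum(1)                          # row (user) degrees
d2 = A.sum(0)                          # column (item) degrees
An = A / np.sqrt(d1)[:, None] / np.sqrt(d2)[None, :]
U, S, Vt = np.linalg.svd(An)

# For k = 2 co-clusters, the second left/right singular vectors embed rows
# and columns in a shared 1-D space; the sign splits the co-clusters.
z_rows = U[:, 1] / np.sqrt(d1)
z_cols = Vt[1] / np.sqrt(d2)
row_labels = (z_rows > 0).astype(int)
col_labels = (z_cols > 0).astype(int)
print(row_labels, col_labels)
```

Users and their preferred items land in the same co-cluster, which is the "simultaneous clustering of users and items" described above; the thesis's method additionally allows overlap and outliers, which this baseline does not.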
Finally, we study the design of parallel data-driven algorithms, which enables us to further increase the scalability of our overlapping community detection algorithms. Using PageRank as a model problem, we look at three algorithm design axes: work activation, data access pattern, and scheduling, and we investigate the impact of different algorithm design choices. Using these design axes, we design and test a variety of PageRank implementations, finding that data-driven, push-based algorithms achieve significantly better scalability than standard PageRank implementations. The design choices affect both single-threaded performance and parallel scalability. The lessons learned from this study not only guide efficient implementations of many graph mining algorithms but also provide a framework for designing new scalable algorithms, especially for large-scale community detection.
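The data-driven, push-based idea can be sketched for sequential PageRank: rather than sweeping every vertex each iteration, only vertices holding enough residual are activated, and each activation settles its residual locally and pushes discounted mass to out-neighbors. The graph and tolerances below are illustrative; the result is checked against plain power iteration.

```python
# Push-based (data-driven) PageRank versus standard power iteration.
out = {0: [1, 2], 1: [2], 2: [0]}   # small directed graph, no dangling nodes
n, d = len(out), 0.85

def pagerank_push(tol=1e-10):
    p = {u: 0.0 for u in out}
    r = {u: (1 - d) / n for u in out}   # initial residual = teleport mass
    active = list(out)
    while active:
        u = active.pop()
        if r[u] <= tol:
            continue
        ru, r[u] = r[u], 0.0
        p[u] += ru                       # settle the residual at u
        for v in out[u]:                 # push discounted mass downstream
            r[v] += d * ru / len(out[u])
            if r[v] > tol:
                active.append(v)
    return p

def pagerank_power(iters=200):
    p = {u: 1.0 / n for u in out}
    for _ in range(iters):
        nxt = {u: (1 - d) / n for u in out}
        for u in out:
            for v in out[u]:
                nxt[v] += d * p[u] / len(out[u])
        p = nxt
    return p

pp, pw = pagerank_push(), pagerank_power()
print(pp, pw)
```

Work activation here is the residual threshold, and the `active` worklist is the scheduling hook; the parallel variants studied in the thesis vary exactly these axes.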
Linear Precoding in Cooperative MIMO Cellular Networks with Limited Coordination Clusters
In a cooperative multiple-antenna downlink cellular network, maximization of
a concave function of user rates is considered. A new linear precoding
technique called soft interference nulling (SIN) is proposed, which performs at
least as well as zero-forcing (ZF) beamforming. All base stations share channel
state information, but each user's message is only routed to those that
participate in the user's coordination cluster. SIN precoding is particularly
useful when clusters of limited sizes overlap in the network, in which case
traditional techniques such as dirty paper coding or ZF do not directly apply.
The SIN precoder is computed by solving a sequence of convex optimization
problems. SIN under partial network coordination can outperform ZF under full
network coordination at moderate SNRs. Under overlapping coordination clusters,
SIN precoding achieves considerably higher throughput compared to myopic ZF,
especially when the clusters are large.
Comment: 13 pages, 5 figures
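The zero-forcing baseline that SIN is compared against can be sketched directly: with full coordination, the channel-inverting precoder removes all inter-user interference. SIN itself is computed by solving a sequence of convex programs and is not reproduced here; the dimensions below are illustrative.

```python
import numpy as np

# Zero-forcing (ZF) precoding for a toy multi-antenna downlink.
rng = np.random.default_rng(3)
K, M = 3, 4          # 3 single-antenna users, 4 total transmit antennas
H = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))

# ZF precoder: right pseudo-inverse of the channel, so that H @ W = I
W = H.conj().T @ np.linalg.inv(H @ H.conj().T)
W /= np.linalg.norm(W)            # normalize total transmit power to 1

Heff = H @ W                      # effective channel seen by the users
print(np.round(np.abs(Heff), 6))
```

The effective channel is diagonal, i.e. each user sees only its own stream; ZF needs the full channel matrix, which is why it does not directly apply under overlapping, limited-size coordination clusters.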