Search CORE

16 research outputs found

SIAM Data Mining Brings It to Annual Meeting

Author: Bhowmick Sanjukta
Buluç Aydın
Caceres Rajmonda
Crouser R. Jordan
Gadepally Vijay
Kepner Jeremy
Miller Ben
Webster Jennifer
Publication venue: Smith ScholarWorks
Publication date: 01/01/2017
Field of study

The Data Mining Activity Group is one of SIAM\u27s most vibrant and dynamic activity groups. To better share our enthusiasm for data mining with the broader SIAM community, our activity group organized six minisymposia at the 2016 Annual Meeting. These minisymposia included 48 talks organized by 11 SIAM members on - GraphBLAS (Aydın Buluç) - Algorithms and statistical methods for noisy network analysis (Sanjukta Bhowmick & Ben Miller) - Inferring networks from non-network data (Rajmonda Caceres, Ivan Brugere & Tanya Y. Berger-Wolf) - Visual analytics (Jordan Crouser) - Mining in graph data (Jennifer Webster, Mahantesh Halappanavar & Emilie Hogan) - Scientific computing and big data (Vijay Gadepally) These minisymposia were well received by the broader SIAM community, and below are some of the key highlights

Smith College: Smith ScholarWorks

Graph-based Semi-Supervised & Active Learning for Edge Flows

Author: Benson Austin R.
Jia Junteng
Schaub Michael T.
Segarra Santiago
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

We present a graph-based semi-supervised learning (SSL) method for learning edge flows defined on a graph. Specifically, given flow measurements on a subset of edges, we want to predict the flows on the remaining edges. To this end, we develop a computational framework that imposes certain constraints on the overall flows, such as (approximate) flow conservation. These constraints render our approach different from classical graph-based SSL for vertex labels, which posits that tightly connected nodes share similar labels and leverages the graph structure accordingly to extrapolate from a few vertex labels to the unlabeled vertices. We derive bounds for our method's reconstruction error and demonstrate its strong performance on synthetic and real-world flow networks from transportation, physical infrastructure, and the Web. Furthermore, we provide two active learning algorithms for selecting informative edges on which to measure flow, which has applications for optimal sensor deployment. The first strategy selects edges to minimize the reconstruction error bound and works well on flows that are approximately divergence-free. The second approach clusters the graph and selects bottleneck edges that cross cluster-boundaries, which works well on flows with global trends

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Distributed and Asynchronous Methods for Semi-supervised Learning

Author: Avrachenkov Konstantin
Borkar Vivek S.
Saboo Krishnakant
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

International audienceWe propose two asynchronously distributed approaches for graph-based semi-supervised learning. The first approach is based on stochastic approximation, whereas the second approach is based on randomized Kaczmarz algorithm. In addition to the possibility of distributed implementation, both approaches can be naturally applied online to streaming data. We analyse both approaches theoretically and by experiments. It appears that there is no clear winner and we provide indications about cases of superiority for each approach

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Local Hypergraph Clustering using Capacity Releasing Diffusion

Author: Gleich David F.
Ibrahim Rania
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2020
Field of study

Local graph clustering is an important machine learning task that aims to find a well-connected cluster near a set of seed nodes. Recent results have revealed that incorporating higher order information significantly enhances the results of graph clustering techniques. The majority of existing research in this area focuses on spectral graph theory-based techniques. However, an alternative perspective on local graph clustering arises from using max-flow and min-cut on the objectives, which offer distinctly different guarantees. For instance, a new method called capacity releasing diffusion (CRD) was recently proposed and shown to preserve local structure around the seeds better than spectral methods. The method was also the first local clustering technique that is not subject to the quadratic Cheeger inequality by assuming a good cluster near the seed nodes. In this paper, we propose a local hypergraph clustering technique called hypergraph CRD (HG-CRD) by extending the CRD process to cluster based on higher order patterns, encoded as hyperedges of a hypergraph. Moreover, we theoretically show that HG-CRD gives results about a quantity called motif conductance, rather than a biased version used in previous experiments. Experimental results on synthetic datasets and real world graphs show that HG-CRD enhances the clustering quality.Comment: 18 pages, 6 figure

arXiv.org e-Print Archive

Directory of Open Access Journals