Unsupervised Instance and Subnetwork Selection for Network Data

Bogdanov, Petko; Larsen, Melinda; Moskwa, Nicholas; Zhang, Lin

Authors: Petko Bogdanov, Melinda Larsen, Nicholas Moskwa, Lin Zhang
Publication date: 24 December 2022
Abstract

Unlike tabular data, features in network data are interconnected within a domain-specific graph. Examples of this setting include gene expression overlaid on a protein-protein interaction (PPI) network and user opinions in a social network. Network data is typically high-dimensional (a large number of nodes) and often contains noise and outlier snapshot instances. In addition, annotating instances with global labels (e.g., disease or normal) is often non-trivial and time-consuming. How can we jointly select discriminative subnetworks and representative instances for network data without supervision? We address these challenges with UISS, an unsupervised framework for joint subnetwork and instance selection in network data, based on a convex self-representation objective. Given an unlabeled network dataset, UISS identifies representative instances while ignoring outliers. It outperforms state-of-the-art baselines on both discriminative subnetwork selection and representative instance selection, achieving up to 10% accuracy improvement on all real-world datasets we use for evaluation. When employed for exploratory analysis of RNA-seq network samples from multiple studies, it produces interpretable and informative summaries.
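The abstract does not state the UISS objective itself, so the following is only a minimal sketch of a generic convex self-representation formulation for instance selection, not the authors' method. It assumes the objective min_C ||X - XC||_F^2 + lam * sum_i ||C[i, :]||_2, where the columns of X are instances, and solves it with plain proximal gradient descent. The function name `representative_scores` and the parameters `lam` and `n_iter` are hypothetical. Instances whose rows of C have a large norm are used to reconstruct many other instances and can be read as candidate representatives; the graph structure and the subnetwork selection described in the abstract are not modeled here.

```python
import numpy as np


def representative_scores(X, lam=1.0, n_iter=500):
    """Score instances by how much they help reconstruct the others.

    X has shape (n_features, n_instances). Returns one non-negative
    score per instance: the L2 norm of the matching row of C.
    Generic illustration only; this is not the UISS objective.
    """
    n = X.shape[1]
    G = X.T @ X                            # Gram matrix between instances
    lipschitz = 2.0 * np.linalg.norm(G, 2)  # Lipschitz constant of the gradient
    step = 1.0 / max(lipschitz, 1e-12)
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = 2.0 * (G @ C - G)           # gradient of ||X - XC||_F^2
        Z = C - step * grad
        # Row-wise group soft-thresholding: proximal step for the L2,1 penalty
        row_norms = np.linalg.norm(Z, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(row_norms, 1e-12))
        C = shrink * Z
    return np.linalg.norm(C, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))            # 5 features (nodes), 8 instances
    scores = representative_scores(X, lam=0.1, n_iter=200)
    print(np.argsort(scores)[::-1])        # instances ranked by score
```

In this sketch a larger `lam` zeroes out more rows of C, leaving fewer candidate representatives; a sufficiently large value selects none.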

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2212.12771

Last time updated on 16/01/2023