
1,730 research outputs found

The Expressive Power of Graph Neural Networks: A Survey

Author: Fan Changjun
Huang Jincai
Huang Kuihua
Liu Shixuan
Liu Zhong
Zhang Bingxu
Zhao Xiang
Publication date: 16/08/2023

Graph neural networks (GNNs) are effective machine learning models for many graph-related applications. Despite their empirical success, many research efforts focus on the theoretical limitations of GNNs, i.e., their expressive power. Early works in this domain mainly study the graph isomorphism recognition ability of GNNs, while recent works leverage properties such as subgraph counting and connectivity learning to characterize the expressive power of GNNs, which are more practical and closer to real-world settings. However, no survey papers or open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a first survey of models for enhancing expressive power under different forms of definition. Concretely, the models are reviewed in three categories: graph feature enhancement, graph topology enhancement, and GNN architecture enhancement.

arXiv.org e-Print Archive

Information Markets and Nonmarkets

Author: Bergemann Dirk
Ottaviani Marco
Publication venue: EliScholar – A Digital Platform for Scholarly Publishing at Yale
Publication date: 01/01/2021

As large amounts of data become available and can be communicated more easily and processed more effectively, information has come to play a central role for economic activity and welfare in our age. This essay overviews contributions to the industrial organization of information markets and nonmarkets, while attempting to maintain a balance between foundational frameworks and more recent developments. We start by reviewing mechanism-design approaches to modeling the trade of information. We then cover ratings, predictions, and recommender systems. We turn to forecasting contests, prediction markets, and other institutions designed for collecting and aggregating information from decentralized participants. Finally, we discuss science as a prototypical information nonmarket with participants who interact in a non-anonymous way to produce and disseminate information. We aim to make the reader familiar with the central notions and insights in this burgeoning literature and also point to some open critical questions that future research will have to address.

Archivio istituzionale della Ricerca - Bocconi

Yale University

Machine Unlearning: A Survey

Author: Xu Heng
Yu Philip S.
Zhang Lefeng
Zhou Wanlei
Zhu Tianqing
Publication date: 06/06/2023

Machine learning has attracted widespread attention and evolved into an enabling technology for a wide range of highly successful applications, such as intelligent computer vision, speech recognition, medical diagnosis, and more. Yet a special need has arisen where, due to privacy, usability, and/or the right to be forgotten, information about some specific samples needs to be removed from a model; this is called machine unlearning. This emerging technology has drawn significant interest from both academia and industry due to its innovation and practicality. At the same time, this ambitious problem has led to numerous research efforts aimed at confronting its challenges. To the best of our knowledge, no study has analyzed this complex topic or compared the feasibility of existing unlearning solutions in different kinds of scenarios. Accordingly, with this survey, we aim to capture the key concepts of unlearning techniques. The existing solutions are classified and summarized based on their characteristics within an up-to-date and comprehensive review of each category's advantages and limitations. The survey concludes by highlighting some of the outstanding issues with unlearning techniques, along with some feasible directions for new research opportunities.

arXiv.org e-Print Archive

MobilityMirror: Bias-Adjusted Transportation Datasets

Author: A Datta
A Ghosh
C Dwork
DA McFarland
DB Rubin
I Markovsky
I Zliobaite
J Xu
K Kirkpatrick
L Sweeney
M Hay
R Chen
S Barocas
S Ma
X Meng
X Xiao
Y Zhang
Y-A Montjoye de
Publication venue: Springer Science and Business Media LLC
Publication date: 24/01/2019

We describe customized synthetic datasets for publishing mobility data. Private companies are providing new transportation modalities, and their data is of high value for integrative transportation research, policy enforcement, and public accountability. However, these companies are disincentivized from sharing data not only to protect the privacy of individuals (drivers and/or passengers), but also to protect their own competitive advantage. Moreover, demographic biases arising from how the services are delivered may be amplified if released data is used in other contexts. We describe a model and algorithm for releasing origin-destination histograms that removes selected biases in the data using causality-based methods. We compute the origin-destination histogram of the original dataset and then adjust the counts to remove undesirable causal relationships that can lead to discrimination or violate contractual obligations with data owners. We evaluate the utility of the algorithm on real data from a dockless bike share program in Seattle and taxi data in New York, and show that these adjusted transportation datasets can retain utility while removing bias in the underlying data.

Comment: Presented at BIDU 2018 workshop and published in Springer Communications in Computer and Information Science vol 92.
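For readers unfamiliar with the term, an origin-destination histogram is simply a count of trips per (origin, destination) pair. The sketch below illustrates only that first counting step; the function name `od_histogram` and the sample trips are hypothetical, and the causality-based adjustment described in the abstract is not reproduced here.

```python
from collections import Counter

def od_histogram(trips):
    """Count trips per (origin, destination) pair.

    `trips` is an iterable of (origin, destination) tuples; the zone
    labels used below are made up for illustration.
    """
    return Counter((origin, destination) for origin, destination in trips)

# Hypothetical trips between three zones.
trips = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "C")]
hist = od_histogram(trips)

for (origin, destination), count in sorted(hist.items()):
    print(f"{origin} -> {destination}: {count}")
```

Any bias adjustment would then operate on these counts rather than on individual trip records.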

arXiv.org e-Print Archive

Crossref

Search Efficient Binary Network Embedding

Author: Yin Jie
Zhang Chengqi
Zhang Daokun
Zhu Xingquan
Publication date: 13/01/2019

Traditional network embedding primarily focuses on learning a dense vector representation for each node, which encodes network structure and/or node content information, such that off-the-shelf machine learning algorithms can be easily applied to the vector-format node representations for network analysis. However, the learned dense vector representations are inefficient for large-scale similarity search, which requires finding the nearest neighbor measured by Euclidean distance in a continuous vector space. In this paper, we propose a search-efficient binary network embedding algorithm called BinaryNE to learn a sparse binary code for each node, by simultaneously modeling node context relations and node attribute relations through a three-layer neural network. BinaryNE learns binary node representations efficiently through an online learning algorithm based on stochastic gradient descent. The learned binary encoding not only reduces the memory needed to represent each node, but also allows fast bit-wise comparisons to support much quicker network node search compared to Euclidean distance or other distance measures. Our experiments and comparisons show that BinaryNE not only delivers more than 23 times faster search speed, but also provides comparable or better search quality than traditional continuous-vector-based network embedding methods.
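To illustrate why binary codes permit fast bit-wise comparison, the sketch below stores each node's code as a Python integer and ranks nodes by Hamming distance (XOR followed by a bit count). This is not the BinaryNE implementation; the codes, node names, and the helper functions `hamming_distance` and `nearest` are assumptions made for illustration only.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")

def nearest(query: int, codes: dict) -> str:
    """Return the node whose code has the smallest Hamming distance to `query`."""
    return min(codes, key=lambda node: hamming_distance(query, codes[node]))

# Hypothetical 8-bit codes for three nodes.
codes = {
    "u": 0b10110010,
    "v": 0b10110011,
    "w": 0b01001100,
}

best = nearest(0b10110111, codes)
print(best)
```

Each comparison needs only one XOR and one bit count, whereas a Euclidean distance over dense vectors needs a floating-point subtraction and multiplication per dimension.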

arXiv.org e-Print Archive