Journal of general virology – introduction to ‘ICTV virus taxonomy profiles’

Author: Davison Andrew J.
Publication venue: 'Microbiology Society'
Publication date: 01/01/2017

No abstract available

Crossref

PubMed Central

Enlighten

Challenges and solutions in replicability and provenance tracking for simulation projects

Author: Davison Andrew P
Publication venue: BioMed Central
Publication date: 01/01/2010

Springer - Publisher Connector

PubMed Central

The design and implementation of a PROLOG interpreter

Author: Davison Andrew
Publication venue: Lehigh Preserve

Lehigh University: Lehigh Preserve

Asymptotics of Network Embeddings Learned via Subsampling

Authors: Austern Morgane, Davison Andrew
Publication date: 05/07/2021

Network data are ubiquitous in modern machine learning, with tasks of interest including node classification, node clustering and link prediction. A frequent approach begins by learning a Euclidean embedding of the network, to which algorithms developed for vector-valued data are applied. For large networks, embeddings are learned using stochastic gradient methods where the sub-sampling scheme can be freely chosen. Despite the strong empirical performance of such methods, they are not well understood theoretically. Our work encapsulates representation methods using a subsampling approach, such as node2vec, into a single unifying framework. We prove, under the assumption that the graph is exchangeable, that the distribution of the learned embedding vectors asymptotically decouples. Moreover, we characterize the asymptotic distribution and provide rates of convergence, in terms of the latent parameters, which include the choice of loss function and the embedding dimension. This provides a theoretical foundation to understand what the embedding vectors represent and how well these methods perform on downstream tasks. Notably, we observe that typically used loss functions may lead to shortcomings, such as a lack of Fisher consistency.

Comment: 98 pages, 3 figures, 1 table

arXiv.org e-Print Archive