Bayesian Fused Lasso regression for dynamic binary networks

Author: Betancourt Brenda
Boyd Naomi
Rodríguez Abel
Publication venue: 'Informa UK Limited'
Publication date: 03/10/2017

We propose a multinomial logistic regression model for link prediction in a time series of directed binary networks. To account for the dynamic nature of the data, we employ a dynamic model for the model parameters that is strongly connected with the fused lasso penalty. In addition to promoting sparseness, this prior allows us to explore the presence of change points in the structure of the network. We introduce fast computational algorithms for estimation and prediction using both optimization and Bayesian approaches. The performance of the model is illustrated using simulated data and data from a financial trading network in the NYMEX natural gas futures market. Supplementary material containing the trading network data set and code to implement the algorithms is available online.
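
For orientation, the generic fused lasso objective that the abstract alludes to can be written as below. This is the textbook form, not necessarily the exact penalty used by the authors; the symbols $\theta_t$ (parameter vector at time $t$), $\ell$ (log-likelihood), and the tuning constants $\lambda_1, \lambda_2$ are assumptions introduced here for illustration.

```latex
\hat{\theta} \;=\; \arg\min_{\theta_1,\dots,\theta_T}
  \Big\{ -\ell(\theta_1,\dots,\theta_T)
  \;+\; \lambda_1 \sum_{t=1}^{T} \lVert \theta_t \rVert_1
  \;+\; \lambda_2 \sum_{t=2}^{T} \lVert \theta_t - \theta_{t-1} \rVert_1 \Big\}
```

The first penalty term shrinks individual coefficients to zero (sparseness); the second shrinks successive differences to zero, so the estimated parameters tend to be piecewise constant in time, and the jumps can be read as candidate change points.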

arXiv.org e-Print Archive

FigShare

Large scale bias and the inaccuracy of the peak-background split

Author: Bardeen
Bernardeau
Bond
Boylan-Kolchin
Cole
Cooray
Crocce
Desjacques
Fry
Gaztanaga
Jenkins
Kaiser
Kim
Lam
Lee
M. Manera
Martino
Matsubara
Mo
Ohta
Peacock
Press
R. Scoccimarro
Ravi K. Sheth
Reed
Scoccimarro
Seljak
Sheth
Smith
Springel
Sánchez
Tinker
Warren
White
Publication venue: 'Wiley'
Publication date: 01/01/2009

The peak-background split argument is commonly used to relate the abundance of dark matter halos to their spatial clustering. Testing this argument requires an accurate determination of the halo mass function. We present a Maximum Likelihood method for fitting parametric functional forms to halo abundances which differs from previous work because it does not require binned counts. Our conclusions do not depend on whether we use our method or more conventional ones. In addition, halo abundances depend on how halos are defined. Our conclusions do not depend on the choice of link length associated with the friends-of-friends halo-finder, nor do they change if we identify halos using a spherical overdensity algorithm instead. The large scale halo bias measured from the matter-halo cross spectrum b_x and the halo autocorrelation function b_xi (on scales k ~ 0.03 h/Mpc and r ~ 50 Mpc/h) can differ by as much as 5% for halos that are significantly more massive than the characteristic mass M*. At these large masses, the peak-background split estimate of the linear bias factor b1 is 3-5% smaller than b_xi, which is 5% smaller than b_x. We discuss the origin of these discrepancies: deterministic nonlinear local bias, with parameters determined by the peak-background split argument, is unable to account for the discrepancies we see. A simple linear but nonlocal bias model, motivated by peaks theory, may also be difficult to reconcile with our measurements. More work on such nonlocal bias models may be needed to understand the nature of halo bias at this level of precision. Comment: MNRAS accepted. New section with Spherical Overdensity identified halos included. Appendix enlarged
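
As a reading aid, a fit without binned counts is commonly set up through the standard unbinned (extended) Poisson likelihood sketched below. The abstract does not give the authors' exact expression, so this is the textbook form under stated assumptions: $n(M \mid \theta)$ is the parametric halo mass function per unit volume, $V$ the survey or simulation volume, $M_{\min}$ the lowest halo mass considered, and $M_1,\dots,M_N$ the individual halo masses.

```latex
\ln \mathcal{L}(\theta) \;=\; \sum_{i=1}^{N} \ln n(M_i \mid \theta)
  \;-\; V \int_{M_{\min}}^{\infty} n(M \mid \theta)\, \mathrm{d}M
  \;+\; \text{const}
```

Each halo contributes through its own mass rather than through a bin, and the integral term fixes the expected total number of halos; maximizing over $\theta$ gives the fitted mass function.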

arXiv.org e-Print Archive

CiteSeerX

Crossref

Representation Learning for Attributed Multiplex Heterogeneous Network

Author: Bhagat Smriti
Bojchevski Aleksandar
Hamilton Will
Huang Xiao
Kingma Diederik P
Lin Zhouhan
Mikolov Tomas
Tang Lei
Taskar Ben
Thomas
Yang Cheng
Yang Zhilin
Zhang Hongming
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/05/2019

Network embedding (or graph embedding) has been widely used in many real-world applications. However, existing methods mainly focus on networks with single-typed nodes/edges and cannot scale well to handle large networks. Many real-world networks consist of billions of nodes and edges of multiple types, and each node is associated with different attributes. In this paper, we formalize the problem of embedding learning for the Attributed Multiplex Heterogeneous Network and propose a unified framework to address this problem. The framework supports both transductive and inductive learning. We also give the theoretical analysis of the proposed framework, showing its connection with previous works and proving its better expressiveness. We conduct systematic evaluations of the proposed framework on four different genres of challenging datasets: Amazon, YouTube, Twitter, and Alibaba. Experimental results demonstrate that with the learned embeddings from the proposed framework, we can achieve statistically significant improvements (e.g., 5.99-28.23% lift by F1 scores; p<<0.01, t-test) over previous state-of-the-art methods for link prediction. The framework has also been successfully deployed on the recommendation system of a worldwide leading e-commerce company, Alibaba Group. Results of the offline A/B tests on product recommendation further confirm the effectiveness and efficiency of the framework in practice. Comment: Accepted to KDD 2019. Website: https://sites.google.com/view/gatn

arXiv.org e-Print Archive

Crossref

Network Kriging

Author: Chua David B.
Crovella Mark
Kolaczyk Eric D.
Publication date: 01/01/2005

Network service providers and customers are often concerned with aggregate performance measures that span multiple network paths. Unfortunately, forming such network-wide measures can be difficult, due to the issues of scale involved. In particular, the number of paths grows too rapidly with the number of endpoints to make exhaustive measurement practical. As a result, it is of interest to explore the feasibility of methods that dramatically reduce the number of paths measured in such situations while maintaining acceptable accuracy. We cast the problem as one of statistical prediction, in the spirit of the so-called 'kriging' problem in spatial statistics, and show that end-to-end network properties may be accurately predicted in many cases using a surprisingly small set of carefully chosen paths. More precisely, we formulate a general framework for the prediction problem, propose a class of linear predictors for standard quantities of interest (e.g., averages, totals, differences) and show that linear algebraic methods of subset selection may be used to effectively choose which paths to measure. We characterize the performance of the resulting methods, both analytically and numerically. The success of our methods derives from the low effective rank of routing matrices as encountered in practice, which appears to be a new observation in its own right with potentially broad implications for network measurement generally. Comment: 16 pages, 9 figures, single-spaced
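
The idea in the abstract above (measure a few well-chosen paths, then predict a network-wide total linearly) can be illustrated with a small, simplified sketch. It is not the authors' algorithm: it assumes noiseless, additive path metrics `y = G x`, uses a greedy pivoted Gram-Schmidt selection as one example of a linear algebraic subset-selection method, and ignores the covariance weighting a kriging-style predictor would normally include. The routing matrix `G`, the link delays `x`, and the function names are hypothetical.

```python
import numpy as np

def select_paths(G, k):
    """Greedily pick up to k rows (paths) of the routing matrix G.

    Each step takes the row with the largest residual norm after projecting
    out the rows already chosen (the pivoting rule of a pivoted QR on G.T).
    """
    R = G.astype(float).copy()          # residual of every path row
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=1)
        i = int(np.argmax(norms))
        if norms[i] < 1e-12:            # remaining paths add no new information
            break
        chosen.append(i)
        q = R[i] / norms[i]
        R -= np.outer(R @ q, q)         # remove the chosen direction from all rows
    return chosen

def predict_total(G, chosen, y_measured):
    """Predict the sum of all path metrics from the measured subset."""
    # Minimum-norm link-level solution consistent with the measured paths.
    x_hat, *_ = np.linalg.lstsq(G[chosen].astype(float), y_measured, rcond=None)
    return float(np.sum(G @ x_hat))

# Hypothetical toy network: three links in a line, all six contiguous paths.
G = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
])
x = np.array([2.0, 5.0, 1.0])           # assumed per-link delays
y = G @ x                               # true per-path delays

chosen = select_paths(G, k=3)           # measure only 3 of the 6 paths
total_hat = predict_total(G, chosen, y[chosen])
print(chosen, total_hat, y.sum())
```

Because this toy routing matrix has rank 3, three independent paths determine every other path delay, so the predicted total matches the true one; with real routing matrices the rank is only approximately low and the prediction is approximate.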

arXiv.org e-Print Archive

CiteSeerX

Boston University Institutional Repository (OpenBU)