162,575 research outputs found

    Bayesian Fused Lasso regression for dynamic binary networks

    We propose a multinomial logistic regression model for link prediction in a time series of directed binary networks. To account for the dynamic nature of the data, we employ a dynamic model for the model parameters that is strongly connected with the fused lasso penalty. In addition to promoting sparseness, this prior allows us to explore the presence of change points in the structure of the network. We introduce fast computational algorithms for estimation and prediction using both optimization and Bayesian approaches. The performance of the model is illustrated using simulated data and data from a financial trading network in the NYMEX natural gas futures market. Supplementary material containing the trading network data set and code to implement the algorithms is available online.
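
    The abstract above does not spell out its prior, so the following is only a minimal sketch of the standard fused lasso penalty that it refers to, applied to one coefficient tracked over time. The function names `fused_lasso_penalty` and `change_points` and the weights `lam_sparse` and `lam_fuse` are hypothetical and are not taken from the paper.

```python
def fused_lasso_penalty(beta, lam_sparse, lam_fuse):
    """Standard fused lasso penalty for a coefficient path beta[0..T-1].

    lam_sparse weights the sum of |beta_t| (encourages zeros);
    lam_fuse weights the sum of |beta_t - beta_{t-1}| (encourages
    piecewise-constant paths, i.e. few change points).
    """
    sparsity = sum(abs(b) for b in beta)
    fusion = sum(abs(b1 - b0) for b0, b1 in zip(beta, beta[1:]))
    return lam_sparse * sparsity + lam_fuse * fusion


def change_points(beta, tol=1e-8):
    """Time indices t at which the coefficient differs from its value at t-1."""
    return [t for t in range(1, len(beta)) if abs(beta[t] - beta[t - 1]) > tol]


# Illustrative path: zero, then a constant level, then back to zero.
path = [0.0, 0.0, 1.5, 1.5, 1.5, 0.0]
penalty = fused_lasso_penalty(path, lam_sparse=1.0, lam_fuse=2.0)
jumps = change_points(path)
```

    A path that is exactly constant between jumps pays the fusion term only at the jumps, which is why the penalty is associated with change-point detection.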

    Large scale bias and the inaccuracy of the peak-background split

    The peak-background split argument is commonly used to relate the abundance of dark matter halos to their spatial clustering. Testing this argument requires an accurate determination of the halo mass function. We present a Maximum Likelihood method for fitting parametric functional forms to halo abundances which differs from previous work because it does not require binned counts. Our conclusions do not depend on whether we use our method or more conventional ones. In addition, halo abundances depend on how halos are defined. Our conclusions do not depend on the choice of link length associated with the friends-of-friends halo-finder, nor do they change if we identify halos using a spherical overdensity algorithm instead. The large scale halo bias measured from the matter-halo cross spectrum b_x and the halo autocorrelation function b_xi (on scales k ~ 0.03 h/Mpc and r ~ 50 Mpc/h) can differ by as much as 5% for halos that are significantly more massive than the characteristic mass M*. At these large masses, the peak-background split estimate of the linear bias factor b1 is 3-5% smaller than b_xi, which is 5% smaller than b_x. We discuss the origin of these discrepancies: deterministic nonlinear local bias, with parameters determined by the peak-background split argument, is unable to account for the discrepancies we see. A simple linear but nonlocal bias model, motivated by peaks theory, may also be difficult to reconcile with our measurements. More work on such nonlocal bias models may be needed to understand the nature of halo bias at this level of precision. Comment: MNRAS accepted. New section with Spherical Overdensity identified halos included. Appendix enlarge
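
    To illustrate what fitting "without binned counts" can look like, here is a minimal sketch of an unbinned maximum likelihood fit. It assumes a pure power-law abundance above a mass threshold, which is a stand-in chosen for its closed-form estimator and is not the functional form used in the paper; the names `fit_power_law_index` and `m_min` are hypothetical.

```python
import math


def fit_power_law_index(masses, m_min):
    """Unbinned maximum likelihood estimate of alpha for p(m) ~ m**(-alpha), m >= m_min.

    Each object contributes its own log-likelihood term, so no histogram
    is needed. For this density the estimator has the closed form
    alpha = 1 + n / sum(log(m_i / m_min)).
    """
    logs = [math.log(m / m_min) for m in masses if m >= m_min]
    total = sum(logs)
    if not logs or total <= 0.0:
        raise ValueError("need at least one mass strictly above m_min")
    return 1.0 + len(logs) / total


# Toy catalogue in arbitrary mass units; the 0.5 entry lies below the
# threshold and is excluded from the fit.
alpha_hat = fit_power_law_index([math.e, math.e, math.e, 0.5], m_min=1.0)
```

    Because every object enters the likelihood individually, the estimate does not depend on any choice of bin edges.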

    Representation Learning for Attributed Multiplex Heterogeneous Network

    Network embedding (or graph embedding) has been widely used in many real-world applications. However, existing methods mainly focus on networks with single-typed nodes/edges and cannot scale well to handle large networks. Many real-world networks consist of billions of nodes and edges of multiple types, and each node is associated with different attributes. In this paper, we formalize the problem of embedding learning for the Attributed Multiplex Heterogeneous Network and propose a unified framework to address this problem. The framework supports both transductive and inductive learning. We also give a theoretical analysis of the proposed framework, showing its connection with previous works and proving its better expressiveness. We conduct systematic evaluations of the proposed framework on four different genres of challenging datasets: Amazon, YouTube, Twitter, and Alibaba. Experimental results demonstrate that with the learned embeddings from the proposed framework, we can achieve statistically significant improvements (e.g., 5.99-28.23% lift by F1 scores; p<<0.01, t-test) over previous state-of-the-art methods for link prediction. The framework has also been successfully deployed on the recommendation system of a worldwide leading e-commerce company, Alibaba Group. Results of the offline A/B tests on product recommendation further confirm the effectiveness and efficiency of the framework in practice. Comment: Accepted to KDD 2019. Website: https://sites.google.com/view/gatn

    Network Kriging

    Network service providers and customers are often concerned with aggregate performance measures that span multiple network paths. Unfortunately, forming such network-wide measures can be difficult, due to the issues of scale involved. In particular, the number of paths grows too rapidly with the number of endpoints to make exhaustive measurement practical. As a result, it is of interest to explore the feasibility of methods that dramatically reduce the number of paths measured in such situations while maintaining acceptable accuracy. We cast the problem as one of statistical prediction--in the spirit of the so-called `kriging' problem in spatial statistics--and show that end-to-end network properties may be accurately predicted in many cases using a surprisingly small set of carefully chosen paths. More precisely, we formulate a general framework for the prediction problem, propose a class of linear predictors for standard quantities of interest (e.g., averages, totals, differences) and show that linear algebraic methods of subset selection may be used to effectively choose which paths to measure. We characterize the performance of the resulting methods, both analytically and numerically. The success of our methods derives from the low effective rank of routing matrices as encountered in practice, which appears to be a new observation in its own right with potentially broad implications for network measurement generally. Comment: 16 pages, 9 figures, single-space
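
    The idea of predicting unmeasured paths from a few measured ones can be sketched as follows. This is a minimal illustration and not the paper's predictor: it assumes an additive path metric (a path's value is the sum of its link values), takes the set of measured paths as given rather than choosing it by subset selection, and uses hypothetical helper names `fit_basis` and `predict`.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_basis(rows, values, tol=1e-10):
    """Orthonormalise measured routing rows, carrying the measurements along.

    rows[i] is a 0/1 vector marking the links used by measured path i and
    values[i] is that path's measured metric. Returns pairs (q, z) where
    the q vectors are orthonormal and z is the metric along direction q.
    """
    basis = []
    for g, y in zip(rows, values):
        q, z = [float(a) for a in g], float(y)
        for qj, zj in basis:  # modified Gram-Schmidt step
            c = dot(q, qj)
            q = [a - c * b for a, b in zip(q, qj)]
            z -= c * zj
        norm = math.sqrt(dot(q, q))
        if norm > tol:  # skip paths that add no new information
            basis.append(([a / norm for a in q], z / norm))
    return basis


def predict(basis, g):
    """Linear prediction for an unmeasured path with routing row g."""
    return sum(dot(g, q) * z for q, z in basis)


# Three links, two measured paths, one unmeasured path.
measured_rows = [[1, 1, 0], [0, 1, 1]]
measured_vals = [5.0, 8.0]
basis = fit_basis(measured_rows, measured_vals)
estimate = predict(basis, [1, 2, 1])  # row equals the sum of the measured rows
```

    The prediction is exact whenever the unmeasured row lies in the span of the measured rows, which is the situation a low effective rank routing matrix makes common; otherwise it is the projection onto that span.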