441 research outputs found

    Recovery Conditions and Sampling Strategies for Network Lasso

    The network Lasso is a recently proposed convex optimization method for machine learning from massive network-structured datasets, i.e., big data over networks. It is a variant of the well-known least absolute shrinkage and selection operator (Lasso), which underlies many methods in learning and signal processing involving sparse models. Highly scalable implementations of the network Lasso can be obtained with state-of-the-art proximal methods, e.g., the alternating direction method of multipliers (ADMM). By generalizing the concept of the compatibility condition put forward by van de Geer and Buehlmann as a powerful tool for the analysis of plain Lasso, we derive a sufficient condition, i.e., the network compatibility condition (NCC), on the underlying network topology such that network Lasso accurately learns a clustered underlying graph signal. The NCC relates the location of the sampled nodes to the clustering structure of the network. In particular, the NCC informs the choice of which nodes to sample, or in machine learning terms, which data points provide the most information if labeled. Comment: nominated as student paper award finalist at Asilomar 2017. arXiv admin note: substantial text overlap with arXiv:1704.0210

    A Comparative Analysis of Graph Signal Recovery Methods for Big Data Networks

    Keywords: Graph signal processing, signal recovery, semi-supervised learning, tra
    Graph-based signal recovery (GSR) techniques have been successfully used in different domains for labelling complete graphs from partial subsets of given labels. Much research has been devoted to finding new efficient approaches for solving this learning problem. However, we have identified a lack of research empirically comparing different GSR methods on big data graphs. In this work, we implement highly scalable versions of five state-of-the-art methods, which we benchmark under identical conditions on a number of real and synthetic datasets. We perform a comprehensive evaluation of these methods in terms of accuracy, scalability, robustness to noise and graph topology, and sampling set selection. We find that recently proposed methods based on total variation (TV) minimization outperform more classical approaches that measure the graph's smoothness through the quadratic form. We draw other interesting conclusions and discuss the merits and faults of each of the methods studied.
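    The contrast between TV-based and quadratic-form smoothness can be illustrated with a minimal sketch. The toy graph, the two signals, and the function names below are assumptions made for illustration; they are not taken from the paper or its benchmark code.

```python
def total_variation(edges, x):
    """Weighted sum of absolute signal differences across edges."""
    return sum(w * abs(x[i] - x[j]) for i, j, w in edges)


def quadratic_form(edges, x):
    """Laplacian quadratic form: weighted sum of squared differences."""
    return sum(w * (x[i] - x[j]) ** 2 for i, j, w in edges)


# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5}
# joined by the single edge (2, 3). All weights are 1.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]

# A clustered (piecewise-constant) signal and a smooth ramp.
piecewise = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
ramp = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

for name, x in (("piecewise", piecewise), ("ramp", ramp)):
    print(name, total_variation(edges, x), quadratic_form(edges, x))
```

    On this toy graph, TV assigns the lower cost to the piecewise-constant signal, while the quadratic form assigns the lower cost to the ramp. This is one intuition for why TV-based recovery may suit clustered graph signals.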

    Benchmarking Network Embedding Models for Link Prediction: Are We Making Progress?

    Network embedding methods map a network's nodes to vectors in an embedding space in such a way that these representations are useful for estimating some notion of similarity or proximity between pairs of nodes in the network. The quality of these node representations is then showcased through the results of downstream prediction tasks. Commonly used benchmark tasks such as link prediction, however, present complex evaluation pipelines and an abundance of design choices. This, together with a lack of standardized evaluation setups, can obscure the real progress in the field. In this paper, we aim to shed light on the state of the art of network embedding methods for link prediction and show, using a consistent evaluation pipeline, that only limited progress has been made over the last years. The newly conducted benchmark that we present here, which includes 17 embedding methods, also shows that many approaches are outperformed even by simple heuristics. Finally, we argue that standardized evaluation tools can repair this situation and boost future progress in this field.

    EvalNE : a framework for evaluating network embeddings on link prediction

    In this paper, we present EvalNE, a Python toolbox for evaluating network embedding methods on link prediction tasks. Link prediction is one of the most popular choices for evaluating the quality of network embeddings. However, the complexity of this task requires a carefully designed evaluation pipeline to provide consistent, reproducible and comparable results. EvalNE simplifies this process by providing automation and abstraction of tasks such as hyper-parameter tuning and model validation, edge sampling and negative edge sampling, computation of edge embeddings from node embeddings, and evaluation metrics. The toolbox allows for the evaluation of any off-the-shelf embedding method without the need to write extra code. Moreover, it can also be used for evaluating link prediction methods and integrates several link prediction heuristics as baselines. Finally, demonstrating the usefulness of EvalNE in practice, we conduct an extensive analysis where we replicate the experimental sections of several influential papers in the community
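    The step "computation of edge embeddings from node embeddings" can be sketched as follows. The four element-wise operators shown (average, Hadamard, weighted L1, weighted L2) are common choices in the link prediction literature; the function name and signature are hypothetical and are not EvalNE's API.

```python
def edge_embedding(u, v, op="hadamard"):
    """Combine two node embeddings into one edge embedding.

    u, v: equal-length sequences of floats (node embeddings).
    op:   name of an element-wise operator (illustrative names only).
    """
    if len(u) != len(v):
        raise ValueError("node embeddings must have the same dimension")
    if op == "average":
        return [(a + b) / 2 for a, b in zip(u, v)]
    if op == "hadamard":
        return [a * b for a, b in zip(u, v)]
    if op == "weighted_l1":
        return [abs(a - b) for a, b in zip(u, v)]
    if op == "weighted_l2":
        return [(a - b) ** 2 for a, b in zip(u, v)]
    raise ValueError(f"unknown operator: {op}")


# Hypothetical 2-dimensional embeddings of two nodes.
emb_u = [1.0, 2.0]
emb_v = [3.0, 4.0]
for op in ("average", "hadamard", "weighted_l1", "weighted_l2"):
    print(op, edge_embedding(emb_u, emb_v, op))
```

    The resulting edge vectors would typically be fed to a binary classifier that separates true edges from sampled non-edges.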

    CSNE : Conditional Signed Network Embedding

    Signed networks are mathematical structures that encode positive and negative relations between entities such as friend/foe or trust/distrust. Recently, several papers studied the construction of useful low-dimensional representations (embeddings) of these networks for the prediction of missing relations or signs. Existing embedding methods for sign prediction generally enforce different notions of status or balance theories in their optimization function. These theories, however, are often inaccurate or incomplete, which negatively impacts method performance. In this context, we introduce conditional signed network embedding (CSNE). Our probabilistic approach models structural information about the signs in the network separately from fine-grained detail. Structural information is represented in the form of a prior, while the embedding itself is used for capturing fine-grained information. These components are then integrated in a rigorous manner. CSNE's accuracy depends on the existence of sufficiently powerful structural priors for modelling signed networks, which are currently unavailable in the literature. Thus, as a second main contribution, which we find to be highly valuable in its own right, we also introduce a novel approach to construct priors based on the Maximum Entropy (MaxEnt) principle. These priors can model the polarity of nodes (the degree to which their links are positive) as well as signed triangle counts (a measure of the degree to which structural balance holds in a network). Experiments on a variety of real-world networks confirm that CSNE outperforms the state of the art on the task of sign prediction. Moreover, the MaxEnt priors on their own, while less accurate than full CSNE, achieve accuracies competitive with the state of the art at very limited computational cost, thus providing an excellent runtime-accuracy trade-off in resource-constrained situations.
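    The two statistics named in the abstract, node polarity and signed triangle counts, can be computed on a small example as sketched below. The data layout, function names, and toy network are assumptions for illustration; this is not the CSNE implementation and does not construct the MaxEnt prior itself.

```python
from itertools import combinations


def polarity(signed_edges, node):
    """Fraction of a node's links that are positive (None if isolated)."""
    signs = [s for (u, v), s in signed_edges.items() if node in (u, v)]
    if not signs:
        return None
    return sum(1 for s in signs if s > 0) / len(signs)


def signed_triangle_counts(signed_edges):
    """Count triangles by their number of negative edges (0 to 3).

    Under structural balance theory, triangles with an even number
    of negative edges (0 or 2) are considered balanced.
    Brute force over node triples, so only suitable for small graphs.
    """
    sign = {frozenset(e): s for e, s in signed_edges.items()}
    nodes = sorted({n for e in sign for n in e})
    counts = {0: 0, 1: 0, 2: 0, 3: 0}
    for a, b, c in combinations(nodes, 3):
        tri = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in sign for e in tri):
            negatives = sum(1 for e in tri if sign[e] < 0)
            counts[negatives] += 1
    return counts


# Hypothetical undirected signed network: +1 friend, -1 foe.
signed_edges = {
    ("a", "b"): +1,
    ("b", "c"): -1,
    ("a", "c"): -1,
    ("c", "d"): +1,
    ("a", "d"): -1,
}

print(polarity(signed_edges, "b"))
print(signed_triangle_counts(signed_edges))
```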

    Network representation learning for link prediction : are we improving upon simple heuristics?

    Network representation learning has become an active research area in recent years, with many new methods showcasing their performance on downstream prediction tasks such as link prediction. Despite the efforts of the community to ensure reproducibility of research by providing method implementations, important issues remain. The complexity of the evaluation pipelines and the abundance of design choices have led to difficulties in quantifying the progress in the field and identifying the state of the art. In this work, we analyse 17 network embedding methods on 7 real-world datasets and find, using a consistent evaluation pipeline, only limited progress over recent years. Moreover, many embedding methods are outperformed by simple heuristics. Finally, we discuss how standardized evaluation tools can repair this situation and boost progress in this field.
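    The abstract does not name the simple heuristics it refers to. As a minimal sketch, three classic neighbourhood-based link prediction scores are shown below; their selection, the function names, and the toy graph are assumptions for illustration.

```python
def common_neighbours(adj, u, v):
    """Number of neighbours shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbours divided by the size of the neighbourhood union."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def preferential_attachment(adj, u, v):
    """Product of the two node degrees."""
    return len(adj[u]) * len(adj[v])


# Hypothetical undirected graph as a dict of neighbour sets.
adj = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {1, 2, 4},
    4: {3},
}

# Score two candidate (currently unconnected) node pairs.
for u, v in ((0, 3), (0, 4)):
    print((u, v), common_neighbours(adj, u, v),
          jaccard(adj, u, v), preferential_attachment(adj, u, v))
```

    A higher score suggests a more likely link, so candidate pairs can be ranked directly without any training step.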

    Block-Approximated Exponential Random Graphs

    An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs. By utilizing fast matrix block-approximation techniques, we propose an approximative framework for such non-trivial ERGs that results in dyadic independence (i.e., edge-independent) distributions, while being able to meaningfully model both local information of the graph (e.g., degrees) and global information (e.g., clustering coefficient, assortativity, etc.) if desired. This allows one to efficiently generate random networks with properties similar to those of an observed network, and the models can be used for several downstream tasks such as link prediction. Our methods are scalable to sparse graphs consisting of millions of nodes. Empirical evaluation demonstrates competitiveness in terms of both speed and accuracy with state-of-the-art methods, which are typically based on embedding the graph into some low-dimensional space, for link prediction, showcasing the potential of a more direct and interpretable probabilistic model for this task. Comment: Accepted for DSAA 2020 conference

    A Systematic Evaluation of Node Embedding Robustness

    Node embedding methods map network nodes to low-dimensional vectors that can be subsequently used in a variety of downstream prediction tasks. The popularity of these methods has significantly increased in recent years, yet their robustness to perturbations of the input data is still poorly understood. In this paper, we assess the empirical robustness of node embedding models to random and adversarial poisoning attacks. Our systematic evaluation covers representative embedding methods based on Skip-Gram, matrix factorization, and deep neural networks. We compare edge addition, deletion, and rewiring strategies computed using network properties as well as node labels. We also investigate the effect of label homophily and heterophily on robustness. We report qualitative results via embedding visualization and quantitative results in terms of downstream node classification and network reconstruction performance. We found that node classification suffers greater performance degradation than network reconstruction, and that degree-based and label-based attacks are on average the most damaging.

    Antibiotic Drug Delivery Systems for the Intracellular Targeting of Bacterial Pathogens

    Intracellular bacterial pathogens are hard to treat because conventional antimicrobial agents belonging to widely used classes, such as aminoglycosides, β-lactams, fluoroquinolones, or macrolides, are unable to penetrate, accumulate, or be retained in mammalian cells. The increasing problem of antibiotic resistance further complicates the treatment of the diseases caused by these agents. In many cases, the increase in therapeutic doses and treatment duration is accompanied by the occurrence of severe side effects. Taking into account the huge financial investment associated with bringing a new antibiotic to the market and the limited lifetime of antibiotics, the design of drug delivery systems to enable the targeting of antibiotics inside the cells, to improve their activity in different intracellular niches at different pH and oxygen concentrations, and to achieve a reduced dosage and frequency of administration could represent a prudent choice. An ideal drug delivery system should possess several properties, such as antimicrobial activity, biodegradability, and biocompatibility, making it suitable for use in biomedical and pharmaceutical formulations. This approach will allow reviving old antibiotics rendered useless by resistance or toxicity, rescuing last-line therapy antibiotics by increasing the therapeutic index, and widening the antimicrobial spectrum of antibiotic scaffolds that failed due to membrane permeability problems, thus reducing the gap between increasingly drug-resistant pathogens and the development of new antibiotics. Different improved drug carriers have been developed for treating intracellular pathogens, including antibiotics loaded into liposomes, microspheres, polymeric carriers, and nanoplexes. The purpose of this chapter is to present the limitations of each class of antibiotics in targeting intracellular pathogens and the main research directions for the development of drug delivery systems for the intracellular release of antibiotics.