
    Reconstructing dynamical networks via feature ranking

    Empirical data on real complex systems are becoming increasingly available. Parallel to this is the need for new methods of reconstructing (inferring) the topology of networks from time-resolved observations of their node dynamics. Methods based on physical insights often rely on strong assumptions about the properties and dynamics of the network under study. Here, we use insights from machine learning to design a new method of network reconstruction that makes essentially no such assumptions. Specifically, we interpret the available trajectories (data) as features and use two independent feature ranking approaches, Random Forest and RReliefF, to rank the importance of each node for predicting the value of each other node, which yields the reconstructed adjacency matrix. We show that our method is fairly robust to coupling strength, system size, trajectory length and noise. We also find that the reconstruction quality strongly depends on the dynamical regime.
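The pipeline in this abstract (trajectories as features, a ranker scoring each node's importance for predicting each other node, scores read as an adjacency matrix) can be sketched roughly as follows. This is only an illustration under stated assumptions: the scorer is plain absolute Pearson correlation, swapped in as a stand-in for the Random Forest and RReliefF rankers the abstract names; the prediction target is assumed to be a node's next-step value; and `rank_links`, the toy dynamics and the 0.5 threshold are all hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def rank_links(traj):
    """traj[t][j] is the state of node j at time step t.

    Returns scores[i][j]: how useful node j's current value is for
    predicting node i's next value. Stand-in scorer: |Pearson correlation|.
    """
    n = len(traj[0])
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        target = [row[i] for row in traj[1:]]        # x_i(t + 1)
        for j in range(n):
            if j == i:
                continue                             # self-links are not scored
            feature = [row[j] for row in traj[:-1]]  # x_j(t)
            scores[i][j] = abs(pearson(feature, target))
    return scores


# Hypothetical toy system: node 0 drives node 1; node 2 is independent noise.
rng = random.Random(0)
traj = [[rng.gauss(0, 1), 0.0, rng.gauss(0, 1)]]
for _ in range(500):
    prev = traj[-1]
    traj.append([
        rng.gauss(0, 1),
        0.8 * prev[0] + 0.3 * rng.gauss(0, 1),
        rng.gauss(0, 1),
    ])

scores = rank_links(traj)
# Assumed threshold for turning scores into a binary adjacency matrix.
adjacency = [[1 if s > 0.5 else 0 for s in row] for row in scores]
```

A nonlinear ranker would replace `pearson` here without changing the surrounding loop, which is the part of the design the abstract describes.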

    A multi-class approach for ranking graph nodes: models and experiments with incomplete data

    After the phenomenal success of the PageRank algorithm, many researchers have extended the PageRank approach to ranking graphs with richer structures besides the simple linkage structure. In some scenarios we have to deal with multi-parameter data, where each node has additional features and there are relationships between such features. This paper stems from the need for a systematic approach to multi-parameter data. We propose models and ranking algorithms that can be used with little adjustment for a large variety of networks (bibliographic data, patent data, Twitter and social data, healthcare data). In this paper we focus on several aspects that have not been addressed in the literature: (1) we propose different models for ranking multi-parameter data and a class of numerical algorithms for efficiently computing the ranking scores of such models; (2) by analyzing the stability and convergence properties of the numerical schemes, we tune a fast and stable technique for the ranking problem; (3) we consider the robustness of our models when data are incomplete. Comparing the rank on the incomplete data with the rank on the full structure shows that our models compute consistent rankings whose correlation is up to 60% when just 10% of the links of the attributes are maintained, suggesting that our model is also suitable when the data are incomplete.
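For orientation, the baseline that this abstract says is being extended, plain PageRank on the link structure alone, can be computed by power iteration. The sketch below is not the paper's multi-parameter model; the function name, the toy graph, the damping factor of 0.85 and the uniform treatment of dangling nodes are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration for PageRank on a dict {node: [outgoing neighbours]}."""
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            outs = links.get(u, [])
            for v in outs:
                # Each node splits its damped rank evenly over its out-links.
                new[v] += damping * rank[u] / len(outs)
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: "c" collects links from every other node.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_links)
ranking = sorted(scores, key=scores.get, reverse=True)
```

On this toy graph the scores sum to one, and the node that receives links from all the others ranks first.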

    MGL2Rank: Learning to Rank the Importance of Nodes in Road Networks Based on Multi-Graph Fusion

    Identifying important nodes with strong propagation capabilities in road networks is a significant topic in the field of urban planning. However, existing methods for evaluating the importance of nodes in traffic networks consider only topological information and traffic volumes. They ignore the diversity of characteristics in road networks, such as the number of lanes and the average speed of road segments, which limits their performance. To solve this problem, we propose a graph learning-based framework (MGL2Rank) that integrates the rich characteristics of the road network for ranking the importance of nodes. In this framework, we first develop an embedding module that contains a sampling algorithm (MGWalk) and an encoder network to learn a latent representation for each road segment. MGWalk uses multi-graph fusion to capture the topology of the road network and to establish associations among road segments based on their attributes. Then, we use the obtained node representations to learn the importance ranking of road segments. Finally, we construct a synthetic dataset for ranking tasks based on the regional road network of Shenyang city, and our ranking results on this dataset demonstrate the effectiveness of our proposed method. The data and source code of MGL2Rank are available at https://github.com/ZJ726

    Integrating and Ranking Uncertain Scientific Data

    Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve the prediction of well-known ones). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates.