
    Defining and identifying communities in networks

    The investigation of community structures in networks is an important issue in many domains and disciplines. This problem is relevant for social tasks (objective analysis of relationships on the web), biological inquiries (functional studies in metabolic, cellular or protein networks) or technological problems (optimization of large infrastructures). Several types of algorithm exist for revealing the community structure in networks, but a general and quantitative definition of community is still lacking, which makes the results of the algorithms intrinsically difficult to interpret without additional non-topological information. In this paper we address this problem by introducing two quantitative definitions of community and by showing how they are implemented in practice in the existing algorithms. In this way the algorithms for the identification of the community structure become fully self-contained. Furthermore, we propose a new local algorithm to detect communities which outperforms the existing algorithms with respect to computational cost while keeping the same level of reliability. The new algorithm is tested on artificial and real-world graphs. In particular, we show the application of the new algorithm to a network of scientific collaborations which, because of its size, cannot be attacked with the usual methods. This new class of local algorithms could open the way to large-scale technological and biological applications. Comment: Revtex, final form, 14 pages, 6 figures

    An Oracle Approach for Interaction Neighborhood Estimation in Random Fields

    We consider the problem of interaction neighborhood estimation from the partial observation of a finite number of realizations of a random field. We introduce a model selection rule to choose estimators of conditional probabilities among natural candidates. Our main result is an oracle inequality satisfied by the resulting estimator. We then use this selection rule in a two-step procedure to evaluate the interacting neighborhoods. The selection rule selects a small prior set of possible interacting points, and a cutting step removes the irrelevant points from this prior set. We also prove that the Ising models satisfy the assumptions of the main theorems, without restrictions on the temperature, on the structure of the interacting graph or on the range of the interactions. They therefore provide a large class of applications for our results. We give a computationally efficient procedure in these models. Finally, we show the practical efficiency of our approach in a simulation study. Comment: 36 pages, 10 figures

    A structural Markov property for decomposable graph laws that allows control of clique intersections

    We present a new kind of structural Markov property for probabilistic laws on decomposable graphs, which allows the explicit control of interactions between cliques and is therefore capable of encoding some interesting structure. We prove the equivalence of this property to an exponential family assumption, and discuss identifiability, modelling, inferential and computational implications. Comment: 10 pages, 3 figures; updated from V1 following journal review, new more explicit title and added section on inference

    Negative association in uniform forests and connected graphs

    We consider three probability measures on subsets of edges of a given finite graph G, namely those which govern, respectively, a uniform forest, a uniform spanning tree, and a uniform connected subgraph. A conjecture concerning the negative association of two edges is reviewed for a uniform forest, and a related conjecture is posed for a uniform connected subgraph. The former conjecture is verified numerically for all graphs G having eight or fewer vertices, or having nine vertices and no more than eighteen edges, using a certain computer algorithm which is summarised in this paper. Negative association is known already to be valid for a uniform spanning tree. The three cases of uniform forest, uniform spanning tree, and uniform connected subgraph are special cases of a more general conjecture arising from the random-cluster model of statistical mechanics. Comment: With minor corrections
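
    As a rough illustration of what such a numerical check involves, the sketch below tests pairwise negative association for a uniform forest on a small graph by brute force. It assumes the property means P(e and f in F) <= P(e in F) P(f in F) for every pair of distinct edges e, f, where F is a forest chosen uniformly at random. It is not the algorithm summarised in the paper: the function names are hypothetical, and enumerating all edge subsets is only feasible for very small graphs.

```python
from itertools import combinations


def is_forest(n, edges):
    """Return True if the given edges on vertices 0..n-1 contain no cycle."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # u and v already connected: this edge closes a cycle
            return False
        parent[ru] = rv
    return True


def forest_counts(n, edges):
    """Count forests overall, per edge, and per unordered pair of edges."""
    m = len(edges)
    total = 0
    single = [0] * m
    pair = {p: 0 for p in combinations(range(m), 2)}
    for mask in range(1 << m):  # every subset of the edge set
        chosen = [i for i in range(m) if (mask >> i) & 1]
        if not is_forest(n, [edges[i] for i in chosen]):
            continue
        total += 1
        for i in chosen:
            single[i] += 1
        for p in combinations(chosen, 2):
            pair[p] += 1
    return total, single, pair


def edges_negatively_associated(n, edges):
    """Check P(e, f) <= P(e) P(f) for all edge pairs under the uniform forest.

    With N forests in total, the inequality is equivalent to
    N_ef * N <= N_e * N_f, so exact integer arithmetic suffices.
    """
    total, single, pair = forest_counts(n, edges)
    return all(pair[(i, j)] * total <= single[i] * single[j] for (i, j) in pair)


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(forest_counts(3, triangle)[0])
    print(edges_negatively_associated(3, triangle))
```

    Comparing integer counts rather than floating-point probabilities keeps the check exact, which matters when the two sides of the inequality are close.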

    Optimal model-free prediction from multivariate time series

    Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting for more than a few predictors, which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since the right combination of predictors often matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced. It uses a causal pre-selection step which drastically reduces the number of possible predictors to the most predictive set of causal drivers, making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used sub-optimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of the El Niño Southern Oscillation. Comment: 14 pages, 9 figures