279 research outputs found

    A Triclustering Approach for Time Evolving Graphs

    This paper introduces a novel technique to track structures in time evolving graphs. The method is based on a parameter free approach for three-dimensional co-clustering of the source vertices, the target vertices and the time. All these features are simultaneously segmented in order to build time segments and clusters of vertices whose edge distributions are similar and evolve in the same way over the time segments. The main novelty of this approach lies in that the time segments are directly inferred from the evolution of the edge distribution between the vertices, thus not requiring the user to make an a priori discretization. Experiments conducted on a synthetic dataset illustrate the good behaviour of the technique, and a study of a real-life dataset shows the potential of the proposed approach for exploratory data analysis.

    Two-level histograms for dealing with outliers and heavy tail distributions

    Histograms are among the most popular methods used in exploratory analysis to summarize univariate distributions. In particular, irregular histograms are good non-parametric density estimators that require very few parameters: the number of bins with their lengths and frequencies. Many approaches have been proposed in the literature to infer these parameters, either assuming hypotheses about the underlying data distributions or exploiting a model selection approach. In this paper, we focus on the G-Enum histogram method, which exploits the Minimum Description Length (MDL) principle to build histograms without any user parameter and achieves state-of-the-art performance with respect to accuracy, parsimony and computation time. We investigate the limits of this method in the case of outliers or heavy-tailed distributions. We suggest a two-level heuristic to deal with such cases. The first level exploits a logarithmic transformation of the data to split the data set into a list of data subsets with a controlled range of values. The second level builds a sub-histogram for each data subset and aggregates them to obtain a complete histogram. Extensive experiments show the benefits of the approach. Comment: 30 pages, 47 figures
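
The two-level idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `two_level_histogram`, the choice of one subset per power of ten, the restriction to strictly positive values, and the use of `numpy.histogram` with a fixed bin count as a stand-in for G-Enum are all assumptions made for illustration.

```python
import numpy as np

def two_level_histogram(values, bins_per_subset=4):
    """Hypothetical sketch; returns (edges, counts) for strictly positive data."""
    x = np.asarray(values, dtype=float)
    if x.size == 0 or np.any(x <= 0):
        raise ValueError("this sketch assumes a non-empty array of strictly positive values")

    # Level 1 (assumed form): a log10 transform assigns each value to a decade
    # [10^k, 10^(k+1)), so every subset has a controlled range of values.
    decade = np.floor(np.log10(x)).astype(int)

    edges, counts = [], []
    for k in range(decade.min(), decade.max() + 1):
        subset = x[decade == k]
        lo, hi = 10.0 ** k, 10.0 ** (k + 1)
        # Level 2: one sub-histogram per subset. np.histogram with equal-width
        # bins is only a placeholder for the G-Enum method named in the abstract.
        c, e = np.histogram(subset, bins=bins_per_subset, range=(lo, hi))
        counts.extend(c.tolist())
        edges.extend(e[:-1].tolist())  # drop the right edge; the next decade supplies it

    # Aggregation: close the last bin with the upper bound of the last decade.
    edges.append(10.0 ** (decade.max() + 1))
    return np.array(edges), np.array(counts)

edges, counts = two_level_histogram([1.5, 2.5, 3.0, 7.0, 20.0, 60.0, 5000.0])
```

Splitting by decade keeps a single extreme value (here 5000.0) from stretching the bins that cover the bulk of the data, which is the failure mode the abstract attributes to outliers and heavy tails.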

    Computing with functions in the ball

    A collection of algorithms in object-oriented MATLAB is described for numerically computing with smooth functions defined on the unit ball in the Chebfun software. Functions are numerically and adaptively resolved to essentially machine precision by using a three-dimensional analogue of the double Fourier sphere method to form "ballfun" objects. Operations such as function evaluation, differentiation, integration, fast rotation by an Euler angle, and a Helmholtz solver are designed. Our algorithms are particularly efficient for vector calculus operations, and we describe how to compute the poloidal-toroidal and Helmholtz–Hodge decomposition of a vector field defined on the ball. Comment: 23 pages, 9 figures

    Discovering Patterns in Time-Varying Graphs: A Triclustering Approach

    This paper introduces a novel technique to track structures in time varying graphs. The method uses a maximum a posteriori approach for adjusting a three-dimensional co-clustering of the source vertices, the destination vertices and the time, to the data under study, in a way that does not require any hyper-parameter tuning. The three dimensions are simultaneously segmented in order to build clusters of source vertices, destination vertices and time segments where the edge distributions across clusters of vertices follow the same evolution over the time segments. The main novelty of this approach lies in that the time segments are directly inferred from the evolution of the edge distribution between the vertices, thus not requiring the user to make any a priori quantization. Experiments conducted on artificial data illustrate the good behavior of the technique, and a study of a real-life data set shows the potential of the proposed approach for exploratory data analysis.

    Bifurcation analysis of a two-dimensional magnetic Rayleigh-Bénard problem

    We perform bifurcation analysis of a two-dimensional magnetic Rayleigh-Bénard problem using a numerical technique called deflated continuation. Our aim is to study the influence of the magnetic field on the bifurcation diagram as the Chandrasekhar number $Q$ increases, and compare it to the standard (non-magnetic) Rayleigh-Bénard problem. We compute steady states at a high Chandrasekhar number of $Q = 10^3$ over a range of the Rayleigh number $0 \leq \text{Ra} \leq 10^5$. These solutions are obtained by combining deflation with a continuation of steady states at low Chandrasekhar number, which allows us to explore the influence of the strength of the magnetic field as $Q$ increases from low coupling, where the magnetic effect is almost negligible, to strong coupling at $Q = 10^3$. We discover a large profusion of states with rich dynamics and observe a complex bifurcation structure with several pitchfork, Hopf and saddle-node bifurcations. Our numerical simulations show that the onset of bifurcations in the problem is delayed when $Q$ increases, while solutions with fluid velocity patterns aligning with the background vertical magnetic field are privileged. Additionally, we report a branch of states that stabilizes at high magnetic coupling, suggesting that one may take advantage of the magnetic field to discriminate solutions. Comment: 14 pages, 7 figures

    Triclustering for detecting temporal structures in graphs (Triclustering pour la détection de structures temporelles dans les graphes)

    This paper introduces a novel technique to track structures in time evolving graphs. The method is based on a parameter free approach for three-dimensional co-clustering of the source vertices, the target vertices and the time. All these features are simultaneously segmented in order to build time segments and clusters of vertices whose edge distributions are similar and evolve in the same way over the time segments. The main novelty of this approach lies in that the time segments are directly inferred from the evolution of the edge distribution between the vertices, thus not requiring the user to make an a priori discretization. Experiments conducted on a synthetic dataset illustrate the good behaviour of the technique, and a study of a real-life dataset shows the potential of the proposed approach for exploratory data analysis.

    A study of the spatio-temporal correlations of mobile calls in France (Étude des corrélations spatio-temporelles des appels mobiles en France)

    In this article we present an application of the analysis of a large database from the telecommunications sector. The problem is to segment a territory and to characterize the resulting zones through the mobile phone behavior of their inhabitants. For this purpose we have a network of calls between antennas, built over a five-month period across the whole of France. We propose a two-phase analysis. The first phase groups together the emitting antennas whose calls are similarly distributed over the receiving antennas, and vice versa. Projecting these groups of antennas onto a map of France makes it possible to visualize the correlations between the geography of the territory and the telephone behavior of its inhabitants. The second phase divides the year into periods between which the distributions of calls going out from the groups of antennas change. This makes it possible to characterize how the behavior of mobile users evolves over time in each zone of the country.

    Geographic segmentation through the study of a telephone call log (Segmentation géographique par étude d'un journal d'appels téléphoniques)

    This article addresses geographic segmentation through the study of a call log aggregated by city. Instead of directly clustering the nodes, we propose to co-cluster the edges, defined as two-dimensional instances described by two variables: the source node and the target node. Once the optimal segmentation is obtained, the clusters are merged successively so as to degrade the clustering model as little as possible. Experiments were conducted on a call log from the Belgian telecommunications operator Mobistar.
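
To make the edge-based representation concrete, here is a small sketch of how a call log might be turned into two-dimensional instances and then into a source-by-target count table, which is the kind of input a co-clustering method works on. It covers only the data representation, not the co-clustering or the successive merging of clusters. The function name `edge_contingency` and the city labels are hypothetical and do not come from the article.

```python
from collections import Counter

def edge_contingency(calls):
    """Hypothetical helper: calls is an iterable of (source, target) pairs, one per call."""
    # Each call is one two-dimensional instance: (source node, target node).
    counts = Counter(calls)
    sources = sorted({s for s, _ in counts})
    targets = sorted({t for _, t in counts})
    # Rows are source nodes, columns are target nodes, cells are call counts.
    # Counter returns 0 for pairs that never occur in the log.
    table = [[counts[(s, t)] for t in targets] for s in sources]
    return sources, targets, table

# Made-up city labels, for illustration only.
calls = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C")]
sources, targets, table = edge_contingency(calls)
```

Keeping sources and targets as separate axes means a city can fall into one group as a caller and another as a callee, which a plain node clustering would not express.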