3,973 research outputs found

    A network approach to topic models

    One of the main computational and scientific challenges of the modern age is to extract useful information from unstructured texts. Topic models are a popular machine-learning approach that infers the latent topical structure of a collection of documents. Despite their success, in particular that of the most widely used variant, Latent Dirichlet Allocation (LDA), and numerous applications in sociology, history, and linguistics, topic models are known to suffer from severe conceptual and practical problems, e.g. a lack of justification for the Bayesian priors, discrepancies with statistical properties of real texts, and the inability to properly choose the number of topics. Here we obtain a fresh view on the problem of identifying topical structures by relating it to the problem of finding communities in complex networks. This is achieved by representing text corpora as bipartite networks of documents and words. By adapting existing community-detection methods, using a stochastic block model (SBM) with non-parametric priors, we obtain a more versatile and principled framework for topic modeling (e.g., it automatically detects the number of topics and hierarchically clusters both the words and documents). The analysis of artificial and real corpora demonstrates that our SBM approach leads to better topic models than LDA in terms of statistical model selection. More importantly, our work shows how to formally relate methods from community detection and topic modeling, opening the possibility of cross-fertilization between these two fields.
    Comment: 22 pages, 10 figures, code available at https://topsbm.github.io
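    The bipartite representation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes whitespace tokenization and lowercasing, and the names `bipartite_edges` and `degrees` are hypothetical helpers chosen for illustration; this is not the authors' code and it does not fit a stochastic block model, it only builds the document-word network that such a model would take as input.

```python
from collections import Counter


def bipartite_edges(docs):
    """Map (doc_index, word) -> number of times the word occurs in that document.

    Each key is an edge between a document node and a word node;
    the count is the edge multiplicity. Tokenization by whitespace
    is an assumption made for this sketch.
    """
    edges = Counter()
    for doc_index, text in enumerate(docs):
        for word in text.lower().split():
            edges[(doc_index, word)] += 1
    return edges


def degrees(edges):
    """Return (document degrees, word degrees) of the weighted bipartite network."""
    doc_degree = Counter()
    word_degree = Counter()
    for (doc_index, word), weight in edges.items():
        doc_degree[doc_index] += weight   # document length in tokens
        word_degree[word] += weight       # corpus-wide word frequency
    return doc_degree, word_degree


# Toy corpus (illustrative data, not taken from the paper).
corpus = ["networks of words", "words and documents", "networks networks"]
edges = bipartite_edges(corpus)
doc_degree, word_degree = degrees(edges)
```

    In this picture a document's degree is its length in tokens and a word's degree is its frequency in the corpus, which is what lets community-detection methods operate directly on text data.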

    Sampling motif-constrained ensembles of networks

    The statistical significance of network properties is conditioned on null models which satisfy specified properties but that are otherwise random. Exponential random graph models are a principled theoretical framework to generate such constrained ensembles, but they often fail in practice, either due to model inconsistency or due to the impossibility of sampling networks from them. These problems affect the important case of networks with prescribed clustering coefficient or number of small connected subgraphs (motifs). In this paper we use the Wang-Landau method to obtain a multicanonical sampling that overcomes both these problems. We sample, in polynomial time, networks with arbitrary degree sequences from ensembles with imposed motif counts. Applying this method to social networks, we investigate the relation between transitivity and homophily, and we quantify the correlation between different types of motifs, finding that single motifs can explain up to 60% of the variation of motif profiles.
    Comment: Updated version, as published in the journal. 7 pages, 5 figures, one Supplemental Material

    Trajectories in a space with a spherically symmetric dislocation

    We consider a new type of defect in the scope of linear elasticity theory, using geometrical methods. This defect is produced by a spherically symmetric dislocation, or ball dislocation. We derive the induced metric as well as the affine connections and curvature tensors. Since the induced metric is discontinuous, one can expect ambiguities in these quantities, due to products of delta functions or their derivatives, plaguing a description of ball dislocations based on the Geometric Theory of Defects. However, exactly as in the previous case of the cylindrical defect, one can obtain some well-defined physical predictions of the induced geometry. In particular, we explore some properties of test particle trajectories around the defect and show that these trajectories are curved but cannot be circular orbits.
    Comment: 11 pages, 3 figures

    Symmetry aspects of fermions coupled to torsion and electromagnetic fields

    We study the symmetry properties of fermions coupled to dynamical torsion and electromagnetic fields. The stability of the theory under radiative corrections as well as the presence of anomalies are investigated.
    Comment: 9 pages, LaTeX

    Characterization of pumpkins by total carotenoid, alpha-carotene, and beta-carotene content.

    This work aimed to characterize local pumpkin varieties of different origins with respect to their total carotenoid, alpha-carotene, and beta-carotene contents.

    Distribution of epicenters in the Olami-Feder-Christensen model

    We show that the well-established Olami-Feder-Christensen (OFC) model for the dynamics of earthquakes is able to reproduce a striking new property of real earthquake data. Recently, it has been pointed out by Abe and Suzuki that the epicenters of earthquakes could be connected in order to generate a graph with properties of a scale-free network of the Barabasi-Albert type. However, only the non-conservative version of the Olami-Feder-Christensen model is able to reproduce this behavior. The conservative version, instead, behaves like a random graph. Besides indicating the robustness of the model in describing earthquake dynamics, these findings reinforce that the conservative and non-conservative versions of the OFC model are qualitatively different. Also, we propose a completely new dynamical mechanism that, even without an explicit rule of preferential attachment, generates a scale-free network. The preferential attachment is in this case a "by-product" of the long-term correlations associated with the self-organized critical state. The detailed study of the properties of this network can reveal new aspects of the dynamics of the OFC model, contributing to the understanding of self-organized criticality in non-conserving models.
    Comment: 7 pages, 7 figures

    Genetic divergence among sweet potato accessions using morpho-agronomic root descriptors.

    The objectives were to characterize the morpho-agronomic traits of 23 sweet potato genotypes from the active germplasm bank maintained at Embrapa Hortaliças; to use these traits to assess the genetic variability among the materials by applying hierarchical cluster analysis and principal component analysis; and to estimate population parameters.

    Superdiffusion of massive particles induced by multi-scale velocity fields

    We study drag-induced diffusion of massive particles in scale-free velocity fields, where superdiffusive behavior emerges due to the scale-free size distribution of the vortices of the underlying velocity field. The results show qualitative resemblance to what is observed in fluid systems, namely the diffusive exponent for the mean square separation of pairs of particles and the preferential concentration of the particles, both as a function of the response time.
    Comment: 5 pages, 5 figures. Accepted for publication in EP