845 research outputs found

    Network properties of written human language

    We investigate the nature of written human language within the framework of complex network theory. In particular, we analyse the topology of Orwell's 1984, focusing on the local properties of the network, such as the properties of the nearest neighbors and the clustering coefficient. We find a composite power-law behavior for both the average nearest neighbor's degree and the average clustering coefficient as a function of the vertex degree. This implies the existence of different functional classes of vertices. Furthermore, we find that the second-order vertex correlations are an essential component of the network architecture. To model our empirical results we extend a previously introduced model for language due to Dorogovtsev and Mendes. We propose an accelerated growing network model that contains three growth mechanisms: linear preferential attachment, local preferential attachment and the random growth of a pre-determined small finite subset of initial vertices. We find that with these elementary stochastic rules we are able to produce a network showing syntactic-like structures.
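
The local quantities named in this abstract (vertex degree, average nearest-neighbor degree and clustering coefficient) can be illustrated with a minimal sketch. It assumes an undirected word-adjacency network in which two words are linked when they appear next to each other in the text; the abstract does not spell out the construction, so the tokenizer and the function names below are hypothetical choices made for illustration only.

```python
import re
from collections import defaultdict


def word_adjacency_network(text):
    """Return an undirected graph {word: set(neighbours)} linking adjacent words.

    Assumption: lower-cased alphabetic tokens, no sentence boundaries.
    """
    words = re.findall(r"[a-z']+", text.lower())
    graph = defaultdict(set)
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops produced by a repeated word
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def local_properties(graph):
    """Map each vertex to (degree, average neighbour degree, clustering)."""
    props = {}
    for v, nbrs in graph.items():
        k = len(nbrs)
        knn = sum(len(graph[u]) for u in nbrs) / k
        # Each edge between two neighbours of v is seen twice, so halve the sum.
        links = sum(len(graph[u] & nbrs) for u in nbrs) / 2
        c = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        props[v] = (k, knn, c)
    return props


def average_by_degree(props):
    """Average the two local quantities over all vertices of equal degree k."""
    buckets = defaultdict(list)
    for k, knn, c in props.values():
        buckets[k].append((knn, c))
    return {
        k: (sum(x for x, _ in vals) / len(vals),
            sum(y for _, y in vals) / len(vals))
        for k, vals in sorted(buckets.items())
    }
```

Plotting the output of `average_by_degree` on log-log axes is one way to look for the kind of power-law dependence on degree that the abstract reports; the sketch itself makes no claim about what a given text will show.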

    Testing the robustness of laws of polysemy and brevity versus frequency

    The pioneering research of G.K. Zipf on the relationship between word frequency and other word features led to the formulation of various linguistic laws. Here we focus on two of them: the meaning-frequency law, i.e. the tendency of more frequent words to be more polysemous, and the law of abbreviation, i.e. the tendency of more frequent words to be shorter. We evaluate the robustness of these laws in contexts where, to our knowledge, they have not yet been explored. The recovery of the laws in these new conditions supports the hypothesis that they originate from abstract mechanisms. Peer Reviewed. Postprint (author's final draft)

    Investigating people: a qualitative analysis of the search behaviours of open-source intelligence analysts

    The Internet and the World Wide Web have become integral parts of the lives of many modern individuals, enabling almost instantaneous communication, sharing and broadcasting of thoughts, feelings and opinions. Much of this information is publicly facing, and as such, it can be utilised in a multitude of online investigations, ranging from employee vetting and credit checking to counter-terrorism and fraud prevention/detection. However, the search needs and behaviours of these investigators are not well documented in the literature. In order to address this gap, an in-depth qualitative study was carried out in cooperation with a leading investigation company. The research contribution is an initial identification of Open-Source Intelligence investigator search behaviours, the procedures and practices that they undertake, along with an overview of the difficulties and challenges that they encounter as part of their domain. This lays the foundation for future research into the varied domain of Open-Source Intelligence gathering.

    Zipf's Law in Gene Expression

    Using data from gene expression databases on various organisms and tissues, including yeast, nematodes, human normal and cancer tissues, and embryonic stem cells, we found that the abundances of expressed genes exhibit a power-law distribution with an exponent close to -1, i.e., they obey Zipf's law. Furthermore, by simulations of a simple model with an intra-cellular reaction network, we found that Zipf's law of chemical abundance is a universal feature of cells where such a network optimizes the efficiency and faithfulness of self-reproduction. These findings provide novel insights into the nature of the organization of reaction dynamics in living cells. Comment: revtex, 11 pages, 3 figures, submitted to Phys. Rev. Let
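
A rank-abundance exponent of the kind quoted in this abstract can be estimated, as a rough diagnostic, by fitting a straight line to log(abundance) against log(rank). The sketch below is not the authors' procedure, which the abstract does not describe; the function name `zipf_exponent` is hypothetical, and an ordinary least-squares fit on log-log axes is only a crude first check.

```python
import math


def zipf_exponent(abundances):
    """Least-squares slope of log(abundance) versus log(rank).

    Zero or negative abundances are dropped because their logarithm
    is undefined. A slope near -1 corresponds to Zipf's law.
    """
    ranked = sorted((a for a in abundances if a > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive abundances")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(a) for a in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data that follows abundance = 1000 / rank exactly.
synthetic = [1000 / rank for rank in range(1, 201)]
slope = zipf_exponent(synthetic)
```

On real expression data the fit would normally be restricted to the range of ranks where the log-log plot looks linear; choosing that range is a judgment call that this sketch leaves to the reader.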

    Power-law distributions from additive preferential redistributions

    We introduce a non-growth model that generates the power-law distribution with the Zipf exponent. There are N elements, each of which is characterized by a quantity, and at each time step these quantities are redistributed through binary random interactions with a simple additive preferential rule, while the sum of quantities is conserved. The situation described by this model is similar to that of closed N-particle systems in which only conservative two-body collisions are allowed. We obtain stationary distributions of these quantities both analytically and numerically while varying parameters of the model, and find that the model exhibits scaling behavior for some parameter ranges. Unlike well-known growth models, this alternative mechanism generates the power-law distribution when growth is not expected and the dynamics of the system is based on interactions between elements. This model can be applied to examples such as personal wealth, city sizes, and the generation of scale-free networks when only rewiring is allowed. Comment: 12 pages, 4 figures; Changed some expressions and notations; Added more explanations and changed the order of presentation in Sec.III while results are the sam

    Gravity vs radiation model: on the importance of scale and heterogeneity in commuting flows

    We test the recently introduced radiation model against the gravity model for the system composed of England and Wales, both for commuting patterns and for public transportation flows. The analysis is performed both at macroscopic scales, i.e. at the national scale, and at microscopic scales, i.e. at the city level. It is shown that the thermodynamic limit assumption for the original radiation model significantly underestimates the commuting flows for large cities. We then generalize the radiation model, introducing the correct normalisation factor for finite systems. We show that even if the gravity model has a better overall performance, the parameter-free radiation model gives competitive results, especially for large scales. Comment: in press Phys. Rev. E, 201
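
The finite-size point made in this abstract can be made concrete with a small sketch. The abstract gives no formulas, so everything below is an assumption: the flux expression is the commonly quoted form of the radiation model, T_ij = T_i m_i n_j / ((m_i + s_ij)(m_i + n_j + s_ij)), and the finite-system correction is taken to be a factor 1 / (1 - m_i / M), with M the total population of the system. The function and parameter names are hypothetical.

```python
def radiation_flux(t_i, m_i, n_j, s_ij, total_population=None):
    """Expected flow from origin i to destination j (assumed formula).

    t_i: total number of trips leaving the origin
    m_i: population of the origin
    n_j: population of the destination
    s_ij: population inside the circle of radius r_ij around the
          origin, excluding origin and destination
    total_population: if given, apply the assumed finite-size factor
          1 / (1 - m_i / M); if None, use the infinite-system form
    """
    if m_i <= 0 or n_j <= 0 or s_ij < 0:
        raise ValueError("populations must be positive and s_ij non-negative")
    p = (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))
    if total_population is not None:
        if total_population <= m_i:
            raise ValueError("total population must exceed the origin population")
        p /= 1.0 - m_i / total_population
    return t_i * p


# Same origin-destination pair, with and without the correction.
uncorrected = radiation_flux(100, 10, 10, 0)
corrected = radiation_flux(100, 10, 10, 0, total_population=100)
```

Under these assumptions the correction factor grows with m_i / M, so the uncorrected form underestimates flows most for origins that hold a large share of the total population, which is in line with what the abstract reports for large cities.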

    Universal scaling in sports ranking

    Ranking is a ubiquitous phenomenon in human society. Browsing the web pages of Forbes, one finds all kinds of rankings, such as the world's most powerful people, the world's richest people, top-paid tennis stars, and so on. Here we study a specific kind, sports ranking systems, in which players' scores and prize money are calculated based on their performances in various tournaments. A typical example is tennis. It is found that the distributions of both scores and prize money follow universal power laws, with exponents nearly identical for most sports fields. In order to understand the origin of this universal scaling we focus on the tennis ranking systems. By checking the data we find that, for any pair of players, the probability that the higher-ranked player will beat the lower-ranked opponent is proportional to the rank difference between the pair. Such a dependence can be well fitted to a sigmoidal function. Using this feature, we propose a simple toy model which can simulate the competition of players in different tournaments. The simulations yield results consistent with the empirical findings. Extensive studies indicate that the model is robust with respect to modifications of its minor parts. Comment: 8 pages, 7 figure
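
The rank-difference dependence described in this abstract can be sketched as follows. The abstract says only that the win probability is well fitted by a sigmoidal function of the rank difference; the logistic form, the `scale` parameter and both function names below are assumptions chosen for illustration, not the authors' fitted model.

```python
import math
import random


def win_probability(rank_gap, scale):
    """Probability that the higher-ranked player wins (assumed logistic form).

    rank_gap: rank of the lower-ranked player minus rank of the
              higher-ranked player (non-negative)
    scale: hypothetical parameter setting how fast the curve saturates
    """
    return 1.0 / (1.0 + math.exp(-rank_gap / scale))


def simulate_matches(rank_gap, scale, n_matches, rng):
    """Fraction of simulated matches won by the higher-ranked player."""
    p = win_probability(rank_gap, scale)
    wins = sum(rng.random() < p for _ in range(n_matches))
    return wins / n_matches


# Equal ranks give an even match; larger gaps favour the higher-ranked player.
even = win_probability(0, 10.0)
observed = simulate_matches(10, 10.0, 20000, random.Random(0))
```

A toy tournament model along the lines the abstract mentions would call `simulate_matches` (or a single-match variant) for each pairing and award points to winners; the scoring rules are not given in the abstract and are left out here.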

    Bidding process in online auctions and winning strategy: rate equation approach

    Online auctions have expanded rapidly over the last decade and have become a fascinating new type of business or commercial transaction in this digital era. Here we introduce a master equation for the bidding process that takes place in online auctions. We find that the number of distinct bidders who bid $k$ times, called the $k$-frequent bidder, up to the $t$-th bidding progresses as $n_k(t) \sim t k^{-2.4}$. The successfully transmitted bidding rate by the $k$-frequent bidder is obtained as $q_k(t) \sim k^{-1.4}$, independent of $t$ for large $t$. This theoretical prediction is in agreement with empirical data. These results imply that bidding at the last moment is a rational and effective strategy to win in an eBay auction. Comment: 4 pages, 6 figure

    Complex network analysis of literary and scientific texts

    We present results from our quantitative study of statistical and network properties of literary and scientific texts written in two languages: English and Polish. We show that Polish texts are described by the Zipf law with a scaling exponent smaller than the one for the English language. We also show that the scientific texts are typically characterized by rank-frequency plots with a relatively short range of power-law behavior compared to the literary texts. We then transform the texts into their word-adjacency network representations and find another difference between the languages. For the majority of the literary texts in both languages, the corresponding networks revealed a scale-free structure, while this was not always the case for the scientific texts. However, all the network representations of texts were hierarchical. We do not observe any qualitative or quantitative difference between the languages. However, if we look at other network statistics like the clustering coefficient and the average shortest path length, the English texts appear to possess a more clustered structure than the Polish ones. This result was attributed to differences in the grammar of the two languages, which was also indicated in the Zipf plots. All the texts, however, show network structure that differs from any of the Watts-Strogatz, the Barabasi-Albert, and the Erdos-Renyi architectures.

    A Prototype Model of Stock Exchange

    A prototype model of a stock market is introduced and studied numerically. In this self-organized system, we consider only the interaction among traders, without external influences. Agents trade according to their own strategies, aiming to accumulate assets by speculating on price fluctuations that they themselves produce. The model reproduces rather realistic price histories whose statistical properties are also similar to those observed in real markets. Comment: LaTex, 4 pages, 4 Encapsulated Postscript figures, uses psfi