591 research outputs found

    Can Zipf's law be adapted to normalize microarrays?

    BACKGROUND: Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for varying organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e. obey Zipf's law). Having observed that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law and to examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented. RESULTS: Using pairwise comparisons on MA plots (log ratio vs. log intensity), we compared this novel method to previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. Results indicated that, for single channel microarrays, the quantile method was superior with regard to eliminating intensity-dependent effects (banana curves), but Zipf's law normalization does minimize this effect by rotating the data distribution such that the maximal number of data points lie on the zero of the log ratio axis. For two channel boutique microarrays, the Zipf's law normalizations performed as well as, or better than, existing techniques. CONCLUSION: Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).
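The MA plot named in this abstract (log ratio vs. log intensity) can be illustrated with a minimal sketch. It assumes the conventional definitions M = log2(x) - log2(y) and A = (log2(x) + log2(y)) / 2 for two paired, strictly positive intensity vectors. The function name `ma_values` is hypothetical, and this is not the authors' normalization procedure, only the coordinates in which such methods are usually compared.

```python
import math

def ma_values(x, y):
    """Return (M, A) for paired, strictly positive intensities x and y.

    M is the log2 ratio and A is the mean log2 intensity. These are the
    conventional MA-plot coordinates, assumed here for illustration.
    """
    m, a = [], []
    for xi, yi in zip(x, y):
        lx, ly = math.log2(xi), math.log2(yi)
        m.append(lx - ly)          # log ratio
        a.append(0.5 * (lx + ly))  # average log intensity
    return m, a

# Two probes: one fourfold higher in x, one unchanged.
m, a = ma_values([4.0, 8.0], [1.0, 8.0])
print(m, a)
```

In a well-normalized comparison most points would be expected to sit near M = 0 across the range of A; a curved trend in M against A is the intensity-dependent effect the abstract calls a banana curve.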

    Zipf's Law and Avoidance of Excessive Synonymy

    Zipf's law states that if the words of a language are ranked in order of decreasing frequency in texts, the frequency of a word is inversely proportional to its rank. It is very robust as an experimental observation, but to date it has escaped satisfactory theoretical explanation. We suggest that Zipf's law may arise from the evolution of word semantics dominated by expansion of meanings and competition of synonyms. Comment: 47 pages; fixed reference list missing in v.
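The rank-frequency statement in this abstract can be checked on any token list with a short sketch. The helper names `rank_frequency` and `loglog_slope` are hypothetical, and the ordinary least-squares fit on log-log axes is only a rough estimate of the exponent, assumed here for simplicity rather than taken from the paper.

```python
import math
from collections import Counter

def rank_frequency(words):
    """Return [(rank, count), ...] with counts sorted in decreasing order."""
    counts = sorted(Counter(words).values(), reverse=True)
    return list(enumerate(counts, start=1))

def loglog_slope(rank_freq):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is what Zipf's law predicts; this simple fit is a
    rough diagnostic, not a rigorous power-law estimator.
    """
    xs = [math.log(r) for r, _ in rank_freq]
    ys = [math.log(f) for _, f in rank_freq]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Idealized frequencies f(r) = 1200 / r follow Zipf's law by construction.
ideal = [(r, 1200 / r) for r in range(1, 6)]
print(loglog_slope(ideal))
```

On real text the slope would typically be fitted over a limited range of ranks, since the lowest-frequency words tend to deviate from the straight line.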

    Sociological and Economic Inequality and the Second Law

    There are two fair ways to distribute particles in boxes. The first one is the Casino's way, namely an equal chance to any box. The second one is the thermodynamic way, namely an equal chance to any different configuration of particles and boxes. The second way, calculated here, yields an uneven distribution of the particles in the boxes. It is shown that this distribution fits sociological phenomena well, such as the distribution of votes in polls and the distribution of wealth. This distribution yields the Benford law (the distribution of digits in numerical data) as a special case. Keywords: wealth distribution; power law; Zipf law; thermodynamics
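One possible reading of the two schemes in this abstract can be sketched numerically. The sketch assumes that the "Casino's way" means each particle independently picks one of the boxes uniformly, and that the "thermodynamic way" means every distinct occupancy vector of indistinguishable particles is equally likely. Both the interpretation and the function names are assumptions made for illustration, not the paper's own derivation.

```python
from math import comb

def casino_occupancy(n_particles, n_boxes):
    """P(a given box holds k particles) when each particle picks a box uniformly."""
    p = 1.0 / n_boxes
    return [comb(n_particles, k) * p**k * (1 - p) ** (n_particles - k)
            for k in range(n_particles + 1)]

def configuration_occupancy(n_particles, n_boxes):
    """P(a given box holds k particles) when every occupancy vector is equally likely.

    Assumes indistinguishable particles and n_boxes >= 2. The number of
    occupancy vectors is C(N + K - 1, K - 1); fixing k particles in one box
    leaves C(N - k + K - 2, K - 2) ways to fill the remaining boxes.
    """
    total = comb(n_particles + n_boxes - 1, n_boxes - 1)
    return [comb(n_particles - k + n_boxes - 2, n_boxes - 2) / total
            for k in range(n_particles + 1)]

# Toy case: 10 particles in 4 boxes.
print(casino_occupancy(10, 4))
print(configuration_occupancy(10, 4))
```

Under these assumptions the first distribution peaks near the mean occupancy, while the second decreases monotonically with k in this toy case, so most boxes hold few particles and a few hold many.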

    Modeling fractal structure of city-size distributions using correlation function

    Zipf's law is one of the most conspicuous empirical facts for cities; however, there is no convincing explanation for the scaling relation between rank and size or for its scaling exponent. Based on ideas from general fractals and scaling, this paper proposes a dual competition hypothesis of city development to explain the value intervals and the special value, 1, of the power exponent. Zipf's law and Pareto's law can be mathematically transformed into one another. Based on the Pareto distribution, a frequency correlation function can be constructed. By scaling analysis and the multifractal spectrum, the parameter interval of the Pareto exponent is derived as (0.5, 1]. Based on the Zipf distribution, a size correlation function can be built, and it is opposite to the first one. By the second correlation function and the multifractal notion, the Pareto exponent interval is derived as [1, 2). Thus the process of urban evolution falls into two effects: one is the Pareto effect, indicating city number increase (external complexity), and the other is the Zipf effect, indicating city size growth (internal complexity). Because of the struggle between the two effects, the scaling exponent varies from 0.5 to 2; but if the two effects reach equilibrium with each other, the scaling exponent approaches 1. A series of mathematical experiments on hierarchical correlation is employed to verify the models, and the conclusion can be drawn that if cities in a given region follow Zipf's law, the frequency and size correlations will follow the scaling law. This theory can be generalized to interpret the inverse power-law distributions in various fields of physical and social sciences. Comment: 18 pages, 3 figures, 3 tables
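The statement that Zipf's law and Pareto's law transform into one another can be made explicit with the standard textbook argument. The symbols $S_1$, $q$ and $\alpha$ are notation assumed here for illustration and are not taken from the paper.

```latex
% Zipf (rank-size) form: the city of rank r has size
S(r) = S_1\, r^{-q}
% Solving for the rank gives
r = \left( \frac{S}{S_1} \right)^{-1/q}
% The rank of a city of size S equals the number of cities at least that large,
% so the cumulative (Pareto) form is
N(\geq S) = r \propto S^{-\alpha}, \qquad \alpha = \frac{1}{q}
```

In this notation the two exponents are reciprocals, so the special value $q = 1$ corresponds to $\alpha = 1$.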

    Power Law Scaling in the World Income Distribution

    We show that over the period 1960-1997, the range comprised between the 30th and the 85th percentiles of the world income distribution expressed in terms of GDP per capita invariably scales down as a Pareto distribution. Furthermore, the time path of the power law exponent displays a negatively sloped trend. Our findings suggest that the cross-country average growth process appears to be scale invariant except for countries in the tails of the world income distribution, and that the relative volatility of smaller countries' growth processes has increased over time. Keywords: growth

    Research on Zipf's Law of Hot Events in Search Engines

    This paper focuses on the volume of searching and browsing for hot events in China and finds that the search index sequences of daily hot events and weekly hot events are in line with Zipf's law. Through continuous collection of large data samples over multiple dates, we find that the Zipf index of the search index series for daily hot events fluctuates within a very small range. Through Zipf analysis, we find that only a few events maintain long-term heat. A few events will be the focus of most people, while a few will focus on some directional events. The Zipf distribution thus describes the balance of economic propensity of sender and receiver during the transmission of information. This research offers some reference for commercial activities that make use of hot events for e-commerce.