Can Zipf's law be adapted to normalize microarrays?
BACKGROUND: Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for various organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e. they obey Zipf's law). Having observed that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law and to examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented. RESULTS: In pairwise comparisons on MA plots (log ratio vs. log intensity), we compared this novel method to previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. Results indicated that, for single channel microarrays, the quantile method was superior with regard to eliminating intensity-dependent effects (banana curves), but Zipf's law normalization does minimize this effect by rotating the data distribution such that the maximal number of data points lie on the zero of the log ratio axis. For two channel boutique microarrays, the Zipf's law normalizations performed as well as, or better than, existing techniques. CONCLUSION: Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).
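The MA-plot comparison and the global-mean baseline named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the base-2 logarithm, and the use of the mean log intensity for A are assumptions following common microarray convention, and the Zipf's law normalization itself is not reproduced because the abstract does not specify it.

```python
import math


def global_mean_normalize(intensities, target=1.0):
    """Scale one array so that its mean intensity equals `target`.

    `target` is a hypothetical common reference level shared by all arrays.
    """
    mean = sum(intensities) / len(intensities)
    return [value * target / mean for value in intensities]


def ma_values(x, y):
    """Return (M, A) for two equal-length vectors of positive intensities.

    M is the log ratio and A is the mean log intensity (base 2 assumed).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    m = [math.log2(a) - math.log2(b) for a, b in zip(x, y)]
    a_vals = [0.5 * (math.log2(a) + math.log2(b)) for a, b in zip(x, y)]
    return m, a_vals


if __name__ == "__main__":
    m, a = ma_values([8.0, 4.0], [2.0, 4.0])
    print(m, a)
```

On a well-normalized pair of arrays the M values scatter around zero across the whole range of A; an intensity-dependent bend in that scatter is the banana curve the abstract refers to.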
Zipf's Law and Avoidance of Excessive Synonymy
Zipf's law states that if the words of a language are ranked in order of
decreasing frequency in texts, the frequency of a word is inversely
proportional to its rank. It is very robust as an experimental observation, but
to date it has escaped satisfactory theoretical explanation. We suggest that Zipf's
law may arise from the evolution of word semantics dominated by expansion of
meanings and competition of synonyms.
Comment: 47 pages; fixed reference list missing in v.
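The rank-frequency statement in this abstract can be checked on any text with a short Python sketch. The function names are hypothetical, whitespace tokenization is a simplifying assumption, and the ordinary least-squares fit on log-log values is only one of several ways to estimate the exponent.

```python
import math
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(text.lower().split())
    ordered = counts.most_common()
    return [(rank, word, n) for rank, (word, n) in enumerate(ordered, start=1)]


def zipf_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    `counts` must already be sorted in decreasing order and contain at
    least two positive values. A slope near -1 matches Zipf's law.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    table = rank_frequency("the cat and the dog and the bird")
    print(table)
    print(zipf_slope([n for _, _, n in table]))
```

Counts that are exactly proportional to 1/rank, such as 120, 60, 40, 30, 24, give a slope of -1 up to rounding error.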
Sociological and Economic Inequality and the Second Law
There are two fair ways to distribute particles in boxes. The first one is the Casino’s way, namely an equal chance to any box. The second one is the thermodynamic way, namely an equal chance to any different configuration of particles and boxes. The second way, calculated here, yields an uneven distribution of the particles in the boxes. It is shown that this distribution fits sociological phenomena well, such as the distribution of votes in polls and the distribution of wealth. This distribution yields the Benford law (the distribution of digits in numerical data) as a special case.
Keywords: wealth distribution; power law; Zipf law; thermodynamics
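The difference between the two counting schemes can be made concrete by brute-force enumeration for small numbers. This is one reading of the abstract, not the author's calculation: it assumes indistinguishable particles, so that a "configuration" is just the tuple of occupancy numbers, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def occupancy(choice, boxes):
    """Convert per-particle box choices into a tuple of counts per box."""
    return tuple(choice.count(box) for box in range(boxes))


def casino_distribution(particles, boxes):
    """Each particle picks a box uniformly at random.

    Returns the probability of every occupancy tuple.
    """
    tally = Counter(
        occupancy(choice, boxes)
        for choice in product(range(boxes), repeat=particles)
    )
    total = boxes ** particles
    return {occ: Fraction(n, total) for occ, n in tally.items()}


def configuration_distribution(particles, boxes):
    """Every distinct occupancy tuple is equally likely."""
    configs = {
        occupancy(choice, boxes)
        for choice in product(range(boxes), repeat=particles)
    }
    return {occ: Fraction(1, len(configs)) for occ in configs}


if __name__ == "__main__":
    print(casino_distribution(2, 2))
    print(configuration_distribution(2, 2))
```

With two particles and two boxes, the even split (1, 1) has probability 1/2 under the first scheme but only 1/3 under the second, so uneven outcomes carry more weight when configurations rather than box choices are equally likely.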
Modeling fractal structure of city-size distributions using correlation function
Zipf's law is one of the most conspicuous empirical facts for cities; however,
there is no convincing explanation for the scaling relation between rank and
size and its scaling exponent. Based on ideas from general fractals and
scaling, this paper proposes a dual competition hypothesis of city development to
explain the value intervals and the special value, 1, of the power exponent.
Zipf's law and Pareto's law can be mathematically transformed into one another.
Based on the Pareto distribution, a frequency correlation function can be
constructed. By scaling analysis and the multifractal spectrum, the parameter
interval of the Pareto exponent is derived as (0.5, 1]. Based on the Zipf
distribution, a size correlation function can be built, and it is opposite to
the first one. By the second correlation function and the multifractal notion, the
Pareto exponent interval is derived as [1, 2). Thus the process of urban
evolution falls into two effects: one is the Pareto effect, indicating city number
increase (external complexity), and the other is the Zipf effect, indicating city size
growth (internal complexity). Because of the struggle between the two effects, the
scaling exponent varies from 0.5 to 2; but if the two effects reach equilibrium
with each other, the scaling exponent approaches 1. A series of mathematical
experiments on hierarchical correlation are employed to verify the models and a
conclusion can be drawn that if cities in a given region follow Zipf's law, the
frequency and size correlations will follow the scaling law. This theory can be
generalized to interpret the inverse power-law distributions in various fields
of physical and social sciences.
Comment: 18 pages, 3 figures, 3 tables
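The statement that Zipf's law and Pareto's law can be transformed into one another can be written out in a few lines. This is the standard textbook argument rather than the paper's own derivation, and the symbols S(r), S_1, q, N(s), C and \alpha are notation assumed here for illustration.

```latex
\begin{align}
  % Zipf (rank-size) form: size of the city of rank r
  S(r) &= S_1 \, r^{-q}, \\
  % Pareto form: number of cities whose size is at least s
  N(s) &= C \, s^{-\alpha}, \\
  % The city of rank r has exactly r cities at least as large, so r = N(S(r))
  r &= C \, S(r)^{-\alpha}
  \;\Longrightarrow\;
  S(r) = C^{1/\alpha} \, r^{-1/\alpha}, \\
  % Matching the two expressions for S(r)
  q &= \frac{1}{\alpha}, \qquad S_1 = C^{1/\alpha}.
\end{align}
```

Under this reciprocal relation an exponent in (0.5, 1] in one form corresponds to an exponent in [1, 2) in the other, and the value 1 is the only one shared by both forms.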
Power Law Scaling in the World Income Distribution
We show that over the period 1960-1997, the range between the 30th and the 85th percentiles of the world income distribution expressed in terms of GDP per capita invariably scales down as a Pareto distribution. Furthermore, the time path of the power law exponent displays a negatively sloped trend. Our findings suggest that the cross-country average growth process appears to be scale invariant except for countries in the tails of the world income distribution, and that the relative volatility of smaller countries' growth processes has increased over time.
Keywords: growth
Research on Zipf's Law of Hot Events in Search Engines
This paper focuses on the amount of searching and browsing of hot events in China and finds that the search index sequences of daily hot events and weekly hot events are in line with Zipf's law. Through continuous collection of large data samples over multiple dates, we find that the Zipf index of the search index series for daily hot events fluctuates in a very small range. Through Zipf analysis, we find that only a few events maintain long-term heat. A few events become the focus of most people, while a few people focus on certain directional events. Thus the Zipf distribution describes the balance of the economic propensities of sender and receiver during the transmission of information. This research offers some reference value for commercial activities that make use of hot events for e-commerce.