The Nature of the Chemical Process. 1. Symmetry Evolution - Revised Information Theory, Similarity Principle and Ugly Symmetry
Three laws of information theory have been proposed. Labeling by introducing
nonsymmetry and formatting by introducing symmetry are defined. The function L
(L=lnw, w is the number of microstates, or the sum of entropy and information,
L=S+I) of the universe is a constant (the first law of information theory). The
entropy S of the universe tends toward a maximum (the second law of
information theory). For a perfectly symmetric static structure, the information
is zero and the static entropy is the maximum (the third law of information
theory). Based on the Gibbs inequality and the second law of the revised
information theory we have proved the similarity principle (a continuous higher
similarity-higher entropy relation after the rejection of the Gibbs paradox)
and proved the Curie-Rosen symmetry principle (a higher symmetry-higher
stability relation) as a special case of the similarity principle. Some
examples in chemical physics have been given. Spontaneous processes of all
kinds of molecular interaction, phase separation and phase transition,
including symmetry breaking and the densest molecular packing and
crystallization, are all driven by information minimization or symmetry
maximization. The evolution of the universe in general and evolution of life in
particular can be quantitatively considered as a series of symmetry breaking
processes. The two empirical rules - similarity rule and complementarity rule -
have been given a theoretical foundation. All kinds of periodicity in space and
time are symmetries and contribute to the stability. Symmetry is beautiful
because it renders stability. However, symmetry is in principle ugly because it
is associated with information loss.

Comment: 29 pages, 14 figures
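As an illustration (not part of the paper), the bookkeeping L = ln w = S + I can be sketched numerically: for a system of w microstates, the Gibbs entropy S of a probability distribution over those microstates never exceeds L, and the information I is the shortfall. The function names and toy distributions below are hypothetical.

```python
import math

def log_microstates(w):
    """L = ln w, the fixed sum of entropy and information: L = S + I."""
    return math.log(w)

def gibbs_entropy(probs):
    """S = -sum p ln p over microstate probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical system with w = 4 microstates.
w = 4
L = log_microstates(w)

# Uniform distribution: maximal entropy, zero information.
S_uniform = gibbs_entropy([0.25] * 4)
I_uniform = L - S_uniform          # ~ 0: no structure, no information

# Biased distribution: lower entropy, nonzero information.
S_biased = gibbs_entropy([0.7, 0.1, 0.1, 0.1])
I_biased = L - S_biased            # > 0: structure carries information
```

With L held constant, information minimization and entropy maximization are the same drive, which is how the abstract frames spontaneous processes.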
Alternative Kullback-Leibler information entropy for enantiomers
In our series of studies on quantifying chirality, a new chirality measure is proposed in this work based on the Kullback-Leibler information entropy. The index computes the extra information that the shape function of one enantiomer carries over a normalized shape function of the racemate, whereas in our previous studies the shape functions of the R and S enantiomers were used, taking one as the reference for the other. Besides being mathematically more elegant (symmetric, positive definite, and zero for a nonchiral system), this new index bears a more direct relation to chirality-oriented experimental measurements such as circular dichroism (CD) and optical rotation measurements, where the racemate is frequently used as a reference. The five chiral halomethanes carrying one asymmetric carbon atom and H, F, Cl, Br, and I as substituents have been analyzed. A comparison with our calculated optical rotation and with Avnir's Continuous Chirality Measure (CCM) is presented. The results show that with this index the emphasis lies on the differences between the noncoinciding substituents.
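A minimal numerical sketch of the proposed index (a reading of the abstract, not the authors' code): discretize the shape functions of the R and S enantiomers, form the racemate as their normalized average, and compute the Kullback-Leibler divergence of one enantiomer against that average. The toy "shape functions" below are hypothetical stand-ins for the real electron-density shape functions.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def chirality_index(p_R, p_S):
    """Extra information one enantiomer's (discretized, normalized) shape
    function carries over the racemate's shape function m = (p_R + p_S)/2.
    Symmetric in R/S and zero when p_R == p_S (achiral case)."""
    m = [(r + s) / 2 for r, s in zip(p_R, p_S)]
    return kl(p_R, m)

# Toy discretized shape functions on 4 grid points (hypothetical).
p_R = [0.4, 0.3, 0.2, 0.1]
p_S = list(reversed(p_R))        # mirror image of p_R

achiral = chirality_index([0.25] * 4, [0.25] * 4)   # identical functions -> 0
chiral = chirality_index(p_R, p_S)                  # enantiomers -> positive
```

Because mirror symmetry makes D(p_R || m) = D(p_S || m), the index does not depend on which enantiomer is chosen, which is the symmetry property the abstract highlights.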
Element-centric clustering comparison unifies overlaps and hierarchy
Clustering is one of the most universal approaches for understanding complex
data. A pivotal aspect of clustering analysis is quantitatively comparing
clusterings; clustering comparison is the basis for many tasks such as
clustering evaluation, consensus clustering, and tracking the temporal
evolution of clusters. In particular, the extrinsic evaluation of clustering
methods requires comparing the uncovered clusterings to planted clusterings or
known metadata. Yet, as we demonstrate, existing clustering comparison measures
have critical biases which undermine their usefulness, and no measure
accommodates both overlapping and hierarchical clusterings. Here we unify the
comparison of disjoint, overlapping, and hierarchically structured clusterings
by proposing a new element-centric framework: elements are compared based on
the relationships induced by the cluster structure, as opposed to the
traditional cluster-centric philosophy. We demonstrate that, in contrast to
standard clustering similarity measures, our framework does not suffer from
critical biases and naturally provides unique insights into how the clusterings
differ. We illustrate the strengths of our framework by revealing new insights
into the organization of clusters in two applications: the improved
classification of schizophrenia based on the overlapping and hierarchical
community structure of fMRI brain networks, and the disentanglement of various
social homophily factors in Facebook social networks. The universality of
clustering suggests far-reaching impact of our framework throughout all areas
of science.
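The element-centric idea can be sketched in a simplified form. The paper's actual framework builds per-element affinities from the cluster-induced relationships; the co-membership Jaccard similarity below is a hypothetical stand-in that nonetheless shows the shift of perspective from clusters to elements and already accommodates overlapping clusterings.

```python
def co_members(clustering, element):
    """Set of elements sharing at least one cluster with `element`
    (works for overlapping clusterings, where an element may appear
    in several clusters)."""
    return {e for cluster in clustering if element in cluster for e in cluster}

def element_centric_similarity(c1, c2):
    """Per-element Jaccard similarity of co-membership neighborhoods,
    averaged over all elements: a simplified stand-in for the paper's
    affinity-based element-centric measure."""
    elements = {e for cluster in c1 for e in cluster}
    scores = []
    for e in sorted(elements):
        n1, n2 = co_members(c1, e), co_members(c2, e)
        scores.append(len(n1 & n2) / len(n1 | n2))
    return sum(scores) / len(scores)

c_a = [{1, 2, 3}, {4, 5}]
c_b = [{1, 2}, {3, 4, 5}]

identical = element_centric_similarity(c_a, c_a)   # 1.0 for equal clusterings
partial = element_centric_similarity(c_a, c_b)     # strictly between 0 and 1
```

The per-element scores also localize *where* two clusterings disagree (here, element 3), which is the kind of insight a single cluster-centric number cannot provide.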
Entropy-scaling search of massive biological data
Many datasets exhibit a well-defined structure that can be exploited to
design faster search tools, but it is not always clear when such acceleration
is possible. Here, we introduce a framework for similarity search based on
characterizing a dataset's entropy and fractal dimension. We prove that
searching scales in time with metric entropy (number of covering hyperspheres),
if the fractal dimension of the dataset is low, and scales in space with the
sum of metric entropy and information-theoretic entropy (randomness of the
data). Using these ideas, we present accelerated versions of standard tools,
with no loss in specificity and little loss in sensitivity, for use in three
domains---high-throughput drug screening (Ammolite, 150x speedup), metagenomics
(MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search
(esFragBag, 10x speedup of FragBag). Our framework can be used to achieve
"compressive omics," and the general theory can be readily applied to data
science problems outside of biology.

Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
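The two-stage idea behind entropy-scaling search can be sketched as follows (a hypothetical simplification, not the released tools): greedily cover the dataset with spheres of a fixed radius, so the number of representatives reflects the metric entropy, then answer a query by scanning only clusters whose representative passes a triangle-inequality filter.

```python
def cover(points, radius, dist):
    """Greedy metric cover: pick representatives so every point lies within
    `radius` of some representative. The representative count is the
    dataset's covering number (metric entropy) at this scale."""
    clusters = []   # list of (representative, members)
    for p in points:
        for rep, members in clusters:
            if dist(p, rep) <= radius:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters

def coarse_fine_search(clusters, query, threshold, radius, dist):
    """Scan only clusters whose representative could contain a hit: if
    dist(p, query) <= threshold and dist(p, rep) <= radius, the triangle
    inequality forces dist(query, rep) <= threshold + radius, so the
    coarse filter loses no true hits."""
    hits = []
    for rep, members in clusters:
        if dist(query, rep) <= threshold + radius:      # coarse filter
            hits.extend(p for p in members if dist(p, query) <= threshold)
    return hits

# Toy 1-D dataset with an absolute-difference metric (hypothetical).
dist = lambda a, b: abs(a - b)
data = [0.0, 0.1, 0.2, 5.0, 5.1, 9.9]
clusters = cover(data, radius=0.5, dist=dist)
hits = coarse_fine_search(clusters, query=5.05, threshold=0.1,
                          radius=0.5, dist=dist)
```

Search time scales with the number of covering spheres rather than the number of points, which is the sense in which the paper's search "scales with metric entropy" when the fractal dimension is low.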