The Nature of the Chemical Process. 1. Symmetry Evolution - Revised Information Theory, Similarity Principle and Ugly Symmetry
Three laws of information theory have been proposed. Labeling by introducing
nonsymmetry and formatting by introducing symmetry are defined. The function L
(L=lnw, w is the number of microstates, or the sum of entropy and information,
L=S+I) of the universe is a constant (the first law of information theory). The
entropy S of the universe tends toward a maximum (the second law of
information theory). For a perfectly symmetric static structure, the information
is zero and the static entropy is the maximum (the third law of information
theory). Based on the Gibbs inequality and the second law of the revised
information theory we have proved the similarity principle (a continuous higher
similarity-higher entropy relation after the rejection of the Gibbs paradox)
and proved the Curie-Rosen symmetry principle (a higher symmetry-higher
stability relation) as a special case of the similarity principle. Some
examples in chemical physics have been given. Spontaneous processes of all
kinds of molecular interaction, phase separation and phase transition,
including symmetry breaking and the densest molecular packing and
crystallization, are all driven by information minimization or symmetry
maximization. The evolution of the universe in general and evolution of life in
particular can be quantitatively considered as a series of symmetry breaking
processes. The two empirical rules - similarity rule and complementarity rule -
have been given a theoretical foundation. All kinds of periodicity in space and
time are symmetries and contribute to the stability. Symmetry is beautiful
because it renders stability. However, symmetry is in principle ugly because it
is associated with information loss.

Comment: 29 pages, 14 figures
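As an illustration (not part of the paper), the bookkeeping L = ln w = S + I can be sketched numerically: for a system of w microstates, the Gibbs entropy S of a probability distribution over those microstates never exceeds L, and the information I is the shortfall. The function names and toy distributions below are hypothetical.

```python
import math

def log_microstates(w):
    """L = ln w, the fixed sum of entropy and information: L = S + I."""
    return math.log(w)

def gibbs_entropy(probs):
    """S = -sum p ln p over microstate probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical system with w = 4 microstates.
w = 4
L = log_microstates(w)

# Uniform distribution: maximal entropy, zero information.
S_uniform = gibbs_entropy([0.25] * 4)
I_uniform = L - S_uniform          # ~ 0: no structure, no information

# Biased distribution: lower entropy, nonzero information.
S_biased = gibbs_entropy([0.7, 0.1, 0.1, 0.1])
I_biased = L - S_biased            # > 0: structure carries information
```

With L held constant, information minimization and entropy maximization are the same drive, which is how the abstract frames spontaneous processes.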
Alternative Kullback-Leibler information entropy for enantiomers
In our series of studies on quantifying chirality, a new chirality measure is proposed in this work based on the Kullback-Leibler information entropy. The index computes the extra information that the shape function of one enantiomer carries over a normalized shape function of the racemate, whereas in our previous studies the shape functions of the R and S enantiomers were used, taking one as the reference for the other. Besides being mathematically more elegant (symmetric, positive definite, and zero for a nonchiral system), this new index bears a more direct relation to chirality-oriented experimental measurements such as circular dichroism (CD) and optical rotation measurements, where the racemate is frequently used as a reference. The five chiral halomethanes carrying one asymmetric carbon atom and H, F, Cl, Br, and I as substituents have been analyzed. A comparison with our calculated optical rotation and with Avnir's Continuous Chirality Measure (CCM) is presented. The results show that with this index the emphasis lies on the differences between the noncoinciding substituents.
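A minimal numerical sketch of the proposed index (a reading of the abstract, not the authors' code): discretize the shape functions of the R and S enantiomers, form the racemate as their normalized average, and compute the Kullback-Leibler divergence of one enantiomer against that average. The toy "shape functions" below are hypothetical stand-ins for the real electron-density shape functions.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def chirality_index(p_R, p_S):
    """Extra information one enantiomer's (discretized, normalized) shape
    function carries over the racemate's shape function m = (p_R + p_S)/2.
    Symmetric in R/S and zero when p_R == p_S (achiral case)."""
    m = [(r + s) / 2 for r, s in zip(p_R, p_S)]
    return kl(p_R, m)

# Toy discretized shape functions on 4 grid points (hypothetical).
p_R = [0.4, 0.3, 0.2, 0.1]
p_S = list(reversed(p_R))        # mirror image of p_R

achiral = chirality_index([0.25] * 4, [0.25] * 4)   # identical functions -> 0
chiral = chirality_index(p_R, p_S)                  # enantiomers -> positive
```

Because mirror symmetry makes D(p_R || m) = D(p_S || m), the index does not depend on which enantiomer is chosen, which is the symmetry property the abstract highlights.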
Element-centric clustering comparison unifies overlaps and hierarchy
Clustering is one of the most universal approaches for understanding complex
data. A pivotal aspect of clustering analysis is quantitatively comparing
clusterings; clustering comparison is the basis for many tasks such as
clustering evaluation, consensus clustering, and tracking the temporal
evolution of clusters. In particular, the extrinsic evaluation of clustering
methods requires comparing the uncovered clusterings to planted clusterings or
known metadata. Yet, as we demonstrate, existing clustering comparison measures
have critical biases which undermine their usefulness, and no measure
accommodates both overlapping and hierarchical clusterings. Here we unify the
comparison of disjoint, overlapping, and hierarchically structured clusterings
by proposing a new element-centric framework: elements are compared based on
the relationships induced by the cluster structure, as opposed to the
traditional cluster-centric philosophy. We demonstrate that, in contrast to
standard clustering similarity measures, our framework does not suffer from
critical biases and naturally provides unique insights into how the clusterings
differ. We illustrate the strengths of our framework by revealing new insights
into the organization of clusters in two applications: the improved
classification of schizophrenia based on the overlapping and hierarchical
community structure of fMRI brain networks, and the disentanglement of various
social homophily factors in Facebook social networks. The universality of
clustering suggests far-reaching impact of our framework throughout all areas
of science.
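The element-centric idea can be sketched in a simplified form. The paper's actual framework builds per-element affinities from the cluster-induced relationships; the co-membership Jaccard similarity below is a hypothetical stand-in that nonetheless shows the shift of perspective from clusters to elements and already accommodates overlapping clusterings.

```python
def co_members(clustering, element):
    """Set of elements sharing at least one cluster with `element`
    (works for overlapping clusterings, where an element may appear
    in several clusters)."""
    return {e for cluster in clustering if element in cluster for e in cluster}

def element_centric_similarity(c1, c2):
    """Per-element Jaccard similarity of co-membership neighborhoods,
    averaged over all elements: a simplified stand-in for the paper's
    affinity-based element-centric measure."""
    elements = {e for cluster in c1 for e in cluster}
    scores = []
    for e in sorted(elements):
        n1, n2 = co_members(c1, e), co_members(c2, e)
        scores.append(len(n1 & n2) / len(n1 | n2))
    return sum(scores) / len(scores)

c_a = [{1, 2, 3}, {4, 5}]
c_b = [{1, 2}, {3, 4, 5}]

identical = element_centric_similarity(c_a, c_a)   # 1.0 for equal clusterings
partial = element_centric_similarity(c_a, c_b)     # strictly between 0 and 1
```

The per-element scores also localize *where* two clusterings disagree (here, element 3), which is the kind of insight a single cluster-centric number cannot provide.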
Entropy-scaling search of massive biological data
Many datasets exhibit a well-defined structure that can be exploited to
design faster search tools, but it is not always clear when such acceleration
is possible. Here, we introduce a framework for similarity search based on
characterizing a dataset's entropy and fractal dimension. We prove that
searching scales in time with metric entropy (number of covering hyperspheres),
if the fractal dimension of the dataset is low, and scales in space with the
sum of metric entropy and information-theoretic entropy (randomness of the
data). Using these ideas, we present accelerated versions of standard tools,
with no loss in specificity and little loss in sensitivity, for use in three
domains---high-throughput drug screening (Ammolite, 150x speedup), metagenomics
(MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search
(esFragBag, 10x speedup of FragBag). Our framework can be used to achieve
"compressive omics," and the general theory can be readily applied to data
science problems outside of biology.

Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
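The two-stage idea behind entropy-scaling search can be sketched as follows (a hypothetical simplification, not the released tools): greedily cover the dataset with spheres of a fixed radius, so the number of representatives reflects the metric entropy, then answer a query by scanning only clusters whose representative passes a triangle-inequality filter.

```python
def cover(points, radius, dist):
    """Greedy metric cover: pick representatives so every point lies within
    `radius` of some representative. The representative count is the
    dataset's covering number (metric entropy) at this scale."""
    clusters = []   # list of (representative, members)
    for p in points:
        for rep, members in clusters:
            if dist(p, rep) <= radius:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters

def coarse_fine_search(clusters, query, threshold, radius, dist):
    """Scan only clusters whose representative could contain a hit: if
    dist(p, query) <= threshold and dist(p, rep) <= radius, the triangle
    inequality forces dist(query, rep) <= threshold + radius, so the
    coarse filter loses no true hits."""
    hits = []
    for rep, members in clusters:
        if dist(query, rep) <= threshold + radius:      # coarse filter
            hits.extend(p for p in members if dist(p, query) <= threshold)
    return hits

# Toy 1-D dataset with an absolute-difference metric (hypothetical).
dist = lambda a, b: abs(a - b)
data = [0.0, 0.1, 0.2, 5.0, 5.1, 9.9]
clusters = cover(data, radius=0.5, dist=dist)
hits = coarse_fine_search(clusters, query=5.05, threshold=0.1,
                          radius=0.5, dist=dist)
```

Search time scales with the number of covering spheres rather than the number of points, which is the sense in which the paper's search "scales with metric entropy" when the fractal dimension is low.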