52 research outputs found

Shannon Meets Carnot: Generalized Second Thermodynamic Law

Author: Blahut R. E.
Carnot S.
Cover T. M.
Gallager R. G.
I. Kanter
Mendoza E.
Nishimori H.
O. Shental
Reichl L. E.
Reif F.
Shannon C. E.
Shental O. Kanter I.
Tishby N. Pereira F. C. Bialek W.
Publication venue: IOP Publishing
Publication date: 23/06/2008

The classical thermodynamic laws fail to capture the behavior of systems whose energy Hamiltonian is an explicit function of the temperature. Such a Hamiltonian arises, for example, in modeling information processing systems, like communication channels, as thermal systems. Here we generalize the second thermodynamic law to encompass systems with temperature-dependent energy levels, $dQ = T\,dS + \langle d\mathcal{E}/dT \rangle\,dT$, where $\langle \cdot \rangle$ denotes averaging over the Boltzmann distribution, and reveal a new definition of the basic notion of temperature. This generalization makes it possible to express, for instance, the mutual information of the Gaussian channel as a consequence of the fundamental laws of nature: the laws of thermodynamics.
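For orientation, the quantity this abstract refers to can be written in its textbook form. This is the standard result for a Gaussian input, not the paper's thermodynamic derivation; the symbols P (signal power) and N (noise variance) are assumptions introduced here.

```latex
I(X;Y) = \tfrac{1}{2}\log\!\left(1 + \frac{P}{N}\right),
\qquad Y = X + Z,\quad X \sim \mathcal{N}(0, P),\quad Z \sim \mathcal{N}(0, N).
```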

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Parallel vs. Sequential Belief Propagation Decoding of LDPC Codes over GF(q) and Markov Sources

Author: Davey
Gallager
H. Efraim
H. Kfir
I. Kanter
Kabashima
Kanter
Kfir
Kfir
Mackay
Montanari
N. Yacov
O. Shental
Pearl
Richardson
Sharon
Sourlas
Tong
Vicente
Publication venue: Elsevier BV
Publication date: 16/05/2006

A sequential updating scheme (SUS) for belief propagation (BP) decoding of LDPC codes over Galois fields, GF(q), and correlated Markov sources is proposed and compared with the standard parallel updating scheme (PUS). A thorough experimental study of various transmission settings indicates that the convergence rate, in iterations, of the BP algorithm (and subsequently its complexity) for the SUS is about one half of that for the PUS, independent of the finite field size q. Moreover, this 1/2 factor appears regardless of the correlations of the source and the channel's noise model, while the error correction performance remains unchanged. These results may imply the 'universality' of the one half convergence speed-up of SUS decoding.

arXiv.org e-Print Archive

Crossref

High-resolution microbial community reconstruction by integrating short reads from multiple 16S rRNA regions

Author: Amir A
Elgart M
Shamir O
Shental N
Soen Y
Stern S
Turnbaugh Peter
Turnbaugh PJ
Zeisel A
Zuk O
Publication venue: Oxford University Press (OUP)
Publication date: 07/11/2013

The emergence of massively parallel sequencing technology has revolutionized microbial profiling, allowing the unprecedented comparison of microbial diversity across time and space in a wide range of host-associated and environmental ecosystems. Although the high-throughput nature of such methods enables the detection of low-frequency bacteria, these advances come at the cost of sequencing read length, limiting the phylogenetic resolution possible by current methods. Here, we present a generic approach for integrating short reads from large genomic regions, thus enabling phylogenetic resolution far exceeding current methods. The approach is based on a mapping to a statistical model that is later solved as a constrained optimization problem. We demonstrate the utility of this method by analyzing human saliva and Drosophila samples, using Illumina single-end sequencing of a 750 bp amplicon of the 16S rRNA gene. Phylogenetic resolution is significantly extended while reducing the number of falsely detected bacteria, as compared with standard single-region Roche 454 Pyrosequencing. Our approach can be seamlessly applied to simultaneous sequencing of multiple genes, providing a higher resolution view of the composition and activity of complex microbial communities.

Crossref

Harvard University - DASH

PubMed Central

eScholarship - University of California

An information theoretic approach to statistical dependence: copula information

Author: Amari S.-I.
Caticha A.
Dotsenko V.
Efron B.
Jaynes E. T.
Ma J. Sun Z.
Mari D. D.
Nelsen R. B.
Opper M.
R. S. Calsaverini
R. Vicente
Shental O.
Publication venue: IOP Publishing
Publication date: 01/01/2009

We discuss the connection between information and copula theories by showing that a copula can be employed to decompose the information content of a multivariate distribution into marginal and dependence components, with the latter quantified by the mutual information. We define the information excess as a measure of deviation from a maximum entropy distribution. The idea of marginal invariant dependence measures is also discussed and used to show that empirical linear correlation underestimates the amplitude of the actual correlation in the case of non-Gaussian marginals. The mutual information is shown to provide an upper bound for the asymptotic empirical log-likelihood of a copula. An analytical expression for the information excess of T-copulas is provided, allowing for simple model identification within this family. We illustrate the framework in a financial data set. Comment: to appear in Europhysics Letters
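The decomposition mentioned in this abstract can be illustrated with the standard bivariate identity below. It is a textbook sketch rather than the paper's own notation: the copula density c, the marginal distribution functions F_X and F_Y, and the correlation parameter ρ are symbols assumed here.

```latex
f(x,y) = c\bigl(F_X(x), F_Y(y)\bigr)\, f_X(x)\, f_Y(y)
\;\;\Longrightarrow\;\;
I(X;Y) = \int_{[0,1]^2} c(u,v)\,\log c(u,v)\;du\,dv .
% Special case: Gaussian copula with correlation \rho
I(X;Y) = -\tfrac{1}{2}\log\bigl(1 - \rho^{2}\bigr).
```

The mutual information depends only on the copula, so it is unchanged by any strictly monotone transformation of the marginals.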

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Universidade de São Paulo

Optimal Location of Sources in Transportation Networks

Author: Bickson D Dolev D Shental O Siegel P H Wolf J K
C H Yeung
Devaney R L
Garey M R
K Y Michael Wong
Kopparapu C
Montanari A
Mézard M
Pearl J
Rardin R L
Selman B
Toulouse G
Whyte W
Wong K Y M
Yeung C H
Publication venue: IOP Publishing
Publication date: 01/01/2010

We consider the problem of optimizing the locations of source nodes in transportation networks. A reduction of the fraction of surplus nodes induces a glassy transition. In contrast to most constraint satisfaction problems involving discrete variables, our problem involves continuous variables which lead to cavity fields in the form of functions. The one-step replica symmetry breaking (1RSB) solution involves solving a stable distribution of functionals, which is in general infeasible. In this paper, we obtain small closed sets of functional cavity fields and demonstrate how functional recursions are converted to simple recursions of probabilities, which make the 1RSB solution feasible. The physical results in the replica symmetric (RS) and the 1RSB frameworks are thus derived and the stability of the RS and 1RSB solutions is examined. Comment: 38 pages, 18 figures

arXiv.org e-Print Archive

Crossref

Hong Kong University of Science and Technology Institutional Repository

Fermions and Loops on Graphs. I. Loop Calculus for Determinant

Author: Berezin F
Chernyak V Y
Chertkov M
Chertkov M
Cseke B Heskes T
Faddeev L
Gallager R
Hartmann A K
Johnson J
MacKay D J
Malioutov D M
Mezard M
Michael Chertkov
Moallemi C C Van Roy B
Pearl J
Richardson T
Rue H
Shental O Siegel P H Wolf J K Bickson D Dolev D
Vladimir Y Chernyak
Publication venue: IOP Publishing
Publication date: 20/11/2008

This paper is the first in the series devoted to evaluation of the partition function in statistical models on graphs with loops in terms of the Berezin/fermion integrals. The paper focuses on a representation of the determinant of a square matrix in terms of a finite series, where each term corresponds to a loop on the graph. The representation is based on a fermion version of the Loop Calculus, previously introduced by the authors for graphical models with finite alphabets. Our construction contains two levels. First, we represent the determinant in terms of an integral over anti-commuting Grassmann variables, with some reparametrization/gauge freedom hidden in the formulation. Second, we show that a special choice of the gauge, called BP (Bethe-Peierls or Belief Propagation) gauge, yields the desired loop representation. The set of gauge-fixing BP conditions is equivalent to the Gaussian BP equations, discussed in the past as efficient (linear scaling) heuristics for estimating the covariance of a sparse positive matrix. Comment: 11 pages, 1 figure; misprints corrected
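The Gaussian BP equations referred to in this abstract can be sketched in a few lines. This is a minimal illustration of the generic precision-message update on a small symmetric matrix, not the paper's loop-calculus construction; the function name `gabp_variances`, the iteration count, and the example matrix are assumptions made here.

```python
def gabp_variances(A, iters=50):
    """Estimate diag(A^-1) for a symmetric positive definite matrix A
    (given as a list of lists) with parallel Gaussian BP updates."""
    n = len(A)
    # P[i][j] holds the precision message sent from node i to node j.
    P = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        new_P = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j or A[i][j] == 0:
                    continue  # messages only flow along nonzero off-diagonals
                # Precision at node i from all neighbours except j.
                cavity = A[i][i] + sum(
                    P[k][i] for k in range(n) if k != i and k != j
                )
                new_P[i][j] = -A[i][j] ** 2 / cavity
        P = new_P
    # Marginal precision at i collects every incoming message.
    return [
        1.0 / (A[i][i] + sum(P[k][i] for k in range(n) if k != i))
        for i in range(n)
    ]


# Chain-structured (loop-free) example, where the estimates are exact:
# the diagonal of the true inverse is 5/8, 1/2, 5/8.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
print(gabp_variances(A))
```

On graphs with loops the same updates generally give only approximate variances, and they may fail to converge for matrices that are not sufficiently diagonally dominant.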

arXiv.org e-Print Archive

Crossref

Identification of rare alleles and their carriers using compressed se(que)nsing

Author: A. Amir
Bodmer
Cohen
Gnirke
Harris
Hirschhorn
Ingman
Li
Lin
Lustig
Macgregor
Mardis
Margulies
Martinez Barrio
McClellan
N. Shental
Ng
Norton
O. Zuk
Out
Rosa-Rosa
Shaw
Sladek
Yang
Publication venue: Oxford University Press
Publication date: 01/01/2010

Identification of rare variants by resequencing is important both for detecting novel variations and for screening individuals for known disease alleles. New technologies enable low-cost resequencing of target regions, although it is still prohibitive to test more than a few individuals. We propose a novel pooling design that enables the recovery of novel or known rare alleles and their carriers in groups of individuals. The method is based on a Compressed Sensing (CS) approach, which is general, simple and efficient. CS allows the use of generic algorithmic tools for simultaneous identification of multiple variants and their carriers. We model the experimental procedure and show via computer simulations that it enables the recovery of rare alleles and their carriers in larger groups than were possible before. Our approach can also be combined with barcoding techniques to provide a feasible solution based on current resequencing costs. For example, when targeting a small enough genomic region (∼100 bp) and using only ∼10 sequencing lanes and ∼10 distinct barcodes per lane, one recovers the identity of 4 rare allele carriers out of a population of over 4000 individuals. We demonstrate the performance of our approach over several publicly available experimental data sets.

CiteSeerX

Crossref

PubMed Central

PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions

Author: A Bar-Hilel
A Sette
A Sette
AP Dempster
AY Hung
CA Janeway
Chen Yanover
D Klein
DR Flower
DR Madden
E Xing
H Mamitsuka
HG Rammensee
JW Yewdell
K Gulukota
K WagstafF
K Yu
M Andersen
M Bhasin
M Bilenko
MS Venkatarajan
N Shental
O Schueler-Furman
P Donnes
PA Reche
RE Schapire
RE Schapire
S Buus
T Bailey
T Hertz
T Hertz
Tomer Hertz
U Wiedemann
V Brusic
V Brusic
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Many different aspects of cellular signalling, trafficking and targeting mechanisms are mediated by interactions between proteins and peptides. Representative examples are MHC-peptide complexes in the immune system. Developing computational methods for protein-peptide binding prediction is therefore an important task with applications to vaccine and drug design. METHODS: Previous learning approaches address the binding prediction problem using traditional margin based binary classifiers. In this paper we propose PepDist: a novel approach for predicting binding affinity. Our approach is based on learning peptide-peptide distance functions. Moreover, we suggest learning a single peptide-peptide distance function over an entire family of proteins (e.g. MHC class I). This distance function can be used to compute the affinity of a novel peptide to any of the proteins in the given family. In order to learn these peptide-peptide distance functions, we formalize the problem as a semi-supervised learning problem with partial information in the form of equivalence constraints. Specifically, we propose to use DistBoost [1,2], which is a semi-supervised distance learning algorithm. RESULTS: We compare our method to various state-of-the-art binding prediction algorithms on MHC class I and MHC class II datasets. In almost all cases, our method outperforms all of its competitors. One of the major advantages of our novel approach is that it can also learn an affinity function over proteins for which only small amounts of labeled peptides exist. In these cases, our method's performance gain, when compared to other computational methods, is even more pronounced. We have recently uploaded the PepDist webserver which provides binding prediction of peptides to 35 different MHC class I alleles. The webserver which can be found at is powered by a prediction engine which was trained using the framework presented in this paper.
CONCLUSION: The results obtained suggest that learning a single distance function over an entire family of proteins achieves higher prediction accuracy than learning a set of binary classifiers for each of the proteins separately. We also show the importance of obtaining information on experimentally determined non-binders. Learning with real non-binders generalizes better than learning with randomly generated peptides that are assumed to be non-binders. This suggests that information about non-binding peptides should also be published and made publicly available.

Crossref

Springer - Publisher Connector

PubMed Central

Hypertension, pregnancy and weather: is seasonality involved?

Author: Algert CS
Bodnar LM
Brena Melo
Elongi JP
Gerber Y
Isabela Coutinho
José Natal Figueiroa
Khan KS
Leila Katz
Magann EF
Magnu P
Makhseed M
Melania Amorim
Mignini LE
Okafor U
Roberts JM
Shental O
Siffel C
Subramaniam V
Tam WH
TePoel MR
Publication venue: FapUNIFESP (SciELO)
Publication date: 01/01/2014

Crossref

Importance of Post-Translational Modifications for Functionality of a Chloroplast-Localized Carbonic Anhydrase (CAH1) in Arabidopsis thaliana

Background: The Arabidopsis CAH1 alpha-type carbonic anhydrase is one of the few plant proteins known to be targeted to the chloroplast through the secretory pathway. CAH1 is post-translationally modified at several residues by the attachment of N-glycans, resulting in a mature protein harbouring complex-type glycans. The reason why trafficking through this non-canonical pathway is beneficial for certain chloroplast resident proteins is not yet known. Therefore, to elucidate the significance of glycosylation in trafficking and the effect of glycosylation on the stability and function of the protein, epitope-labelled wild type and mutated versions of CAH1 were expressed in plant cells. Methodology/Principal Findings: Transient expression of mutant CAH1 with disrupted glycosylation sites showed that the protein harbours four, or in certain cases five, N-glycans. While the wild type protein trafficked through the secretory pathway to the chloroplast, the non-glycosylated protein formed aggregates and associated with the ER chaperone BiP, indicating that glycosylation of CAH1 facilitates folding and ER-export. Using cysteine mutants we also assessed the role of disulphide bridge formation in the folding and stability of CAH1. We found that a disulphide bridge between cysteines at positions 27 and 191 in the mature protein was required for correct folding of the protein. Using a mass spectrometric approach we were able to measure the enzymatic activity of CAH1 protein. Under circumstances where protein N-glycosylation is blocked in vivo, the activity of CAH1 is completely inhibited. Conclusions/Significance: We show for the first time the importance of post-translational modifications such as N-glycosylation and intramolecular disulphide bridge formation in folding and trafficking of a protein from the secretory pathway to the chloroplast in higher plants. Requirements for these post-translational modifications for a fully functional native protein explain the need for an alternative route to the chloroplast. This work was supported by the Swedish Research Council (VR), the Kempe Foundations and Carl Tryggers Foundation to GS, and grant numbers BIO2006-08946 and BIO2009-11340 from the Spanish Ministerio de Ciencia e Innovación (MICINN) to A

Public Library of Science (PLOS)

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Publikationer från Umeå universitet

Directory of Open Access Journals

PubMed Central

Biblos-e Archivo