Homologous Codes for Multiple Access Channels
Building on recent development by Padakandla and Pradhan, and by Lim, Feng,
Pastore, Nazer, and Gastpar, this paper studies the potential of structured
nested coset coding as a complete replacement for random coding in network
information theory. The roles of two techniques used in nested coset coding to
generate nonuniform codewords, namely, shaping and channel transformation, are
clarified and illustrated via the simple example of the two-sender multiple
access channel. While individually deficient, the optimal combination of
shaping and channel transformation is shown to achieve the same performance as
traditional random codes for the general two-sender multiple access channel.
The achievability proof of the capacity region is extended to the multiple
access channels with more than two senders, and with one or more receivers. A
quantization argument consistent with the construction of nested coset codes is
presented to prove achievability for their Gaussian counterparts. These results
open up new possibilities of utilizing nested coset codes with the same
generator matrix for a broader class of applications.
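The structural property behind homologous codes, namely that senders encoding with the same generator matrix produce codewords whose sum is again a codeword of the same code, can be sketched with a toy GF(2) example; the matrix and dithers below are arbitrary illustrative choices, not the paper's construction.

```python
G = [[1, 0, 1], [0, 1, 1]]  # shared generator matrix over GF(2) (toy example)

def encode(message, dither):
    """Nested coset codeword: message * G + dither, all arithmetic mod 2."""
    mg = [sum(m * g for m, g in zip(message, col)) for col in zip(*G)]
    return [(a + b) % 2 for a, b in zip(mg, dither)]

m1, d1 = [1, 0], [1, 1, 0]
m2, d2 = [0, 1], [0, 1, 1]
x1, x2 = encode(m1, d1), encode(m2, d2)

# Key structural property: the modulo-2 sum of two codewords is itself a
# codeword of the same code, with summed message and summed dither.
sum_x = [(a + b) % 2 for a, b in zip(x1, x2)]
assert sum_x == encode([(a + b) % 2 for a, b in zip(m1, m2)],
                       [(a + b) % 2 for a, b in zip(d1, d2)])
```

Because both senders use the same `G`, the receiver can decode functions of the messages directly from the sum, which is the property the paper builds on.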
Coding mechanisms for communication and compression: analysis of wireless channels and DNA sequencing
This thesis comprises two related but distinct components: coding arguments for communication channels and information-theoretic analysis for haplotype assembly. The common thread for both problems is the use of information- and coding-theoretic principles to understand their underlying mechanisms. For the first class of problems, I study two practical challenges that prevent optimal discrete codes from being used in real communication and compression systems, namely, coding over analog alphabets and fading. In particular, I use an expansion coding scheme to convert the original analog channel coding and source coding problems into a set of independent discrete subproblems. By adopting optimal discrete codes over the expanded levels, this low-complexity coding scheme can approach the Shannon limit, either exactly or in ratio. Meanwhile, I design a polar coding scheme to deal with the unstable state of fading channels. This novel coding mechanism, which hierarchically utilizes different types of polar codes, is proven to achieve the ergodic capacity of several fading systems without channel state information at the transmitter. For the second class of problems, I build an information-theoretic view of haplotype assembly. More precisely, the recovery of the target pair of haplotype sequences from short reads is rephrased as a joint source-channel coding problem. Two binary messages, representing haplotypes and chromosome memberships of reads, are encoded and transmitted over a channel with erasures and errors, where the channel model reflects salient features of high-throughput sequencing. The focus is on determining the number of reads required for reliable haplotype reconstruction.
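The expansion coding idea, converting a problem over an analog alphabet into independent discrete levels, can be sketched as a plain binary expansion; this is a minimal illustration of the level decomposition, not the thesis's actual scheme.

```python
def expand(x: float, num_levels: int) -> list:
    """Binary expansion of x in [0, 1) into `num_levels` bit levels;
    each level can then be handled by an independent discrete code."""
    bits = []
    for _ in range(num_levels):
        x *= 2
        bit = int(x)
        bits.append(bit)
        x -= bit
    return bits

def reconstruct(bits: list) -> float:
    """Invert the expansion; truncation error shrinks as 2**-len(bits)."""
    return sum(b * 2 ** -(i + 1) for i, b in enumerate(bits))

levels = expand(0.625, 8)
assert levels[:3] == [1, 0, 1]   # 0.625 = 0.101 in binary
assert abs(reconstruct(levels) - 0.625) < 2 ** -8
```

Adding more levels tightens the approximation, which is how a scheme of this flavor can approach an analog limit using only discrete codes.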
Neuronal energy consumption: biophysics, efficiency and evolution
Electrical and chemical signaling within and between neurons consumes energy. Recent studies have sought to refine our understanding of the processes that consume energy and their relationship to information processing by coupling experiments with computational models and energy budgets. These studies have produced insights into both how neurons and neural circuits function, and why they evolved to function in the way they do.
Fast decoders for qudit topological codes
Qudit toric codes are a natural higher-dimensional generalization of the well-
studied qubit toric code. However, standard methods for error correction of
the qubit toric code are not applicable to them. Novel decoders are needed. In
this paper we introduce two renormalization group decoders for qudit codes and
analyse their error correction thresholds and efficiency. The first decoder is
a generalization of a 'hard-decisions' decoder due to Bravyi and Haah
(arXiv:1112.3252). We modify this decoder to overcome a percolation effect
which limits its threshold performance for many-level quantum systems. The
second decoder is a generalization of a 'soft-decisions' decoder due to Poulin
and Duclos-Cianci (2010 Phys. Rev. Lett. 104 050504), with a small cell size
to optimize the efficiency of implementation in the high dimensional case. In
each case, we estimate thresholds for the uncorrelated bit-flip error model
and provide a comparative analysis of the performance of both these approaches
to error correction of qudit toric codes.
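The toric code and renormalization group decoding are far beyond a short sketch, but the 'hard-decisions' idea can be illustrated with a drastically simplified stand-in: majority voting on a toy qudit repetition code. The dimension d = 5 and code length 3 are arbitrary choices for this illustration.

```python
from collections import Counter

d = 5  # qudit dimension (arbitrary illustrative choice)

def encode(symbol):
    """Toy length-3 repetition code over a d-level alphabet."""
    return [symbol] * 3

def decode(received):
    """'Hard-decisions' decoding: commit to the most frequent symbol."""
    return Counter(received).most_common(1)[0][0]

word = encode(2)
word[1] = (word[1] + 1) % d   # a single shift error on one qudit
assert decode(word) == 2      # majority vote recovers the symbol
```

A 'soft-decisions' decoder would instead keep a probability distribution over the d symbols at each position and combine them, which is the distinction the abstract draws between the two decoder families.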
Protein Function Prediction using Phylogenomics, Domain Architecture Analysis, Data Integration, and Lexical Scoring
“As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally.” (Radivojac, Clark, Oron, et al. 2013) To address this challenge, three new protein function annotation tools were developed, which produce trustworthy and concise protein annotations, are easy to obtain and install, and are capable of processing large sets of proteins with reasonable computational resource demands. Especially for high-throughput analysis, e.g. on genome scale, these tools improve over existing tools in both ease of use and accuracy. They are:
• Automated Assignment of Human Readable Descriptions (AHRD) (github.com/groupschoof/AHRD; Hallab, Klee, Srinivas, and Schoof 2014),
• AHRD on gene clusters, and
• phylogenetic prediction of Gene Ontology (GO) terms with specific calibrations (PhyloFun v2).
“AHRD” assigns human readable descriptions (HRDs) to query proteins and was developed to mimic the decision-making process of an expert curator. To this end it processes the descriptions of reference proteins obtained by searching selected databases with BLAST (Altschul, Madden, Schaffer, et al. 1997); the trust a user puts into results from each of these databases can be weighted separately. In the next step the descriptions of the homologous proteins found are filtered: accessions and species information are removed, and uninformative candidate descriptions such as “putative protein” are discarded. Afterwards a dictionary of meaningful words is constructed from those found in the remaining candidates, applying another filter to ignore words that convey no information, such as the word “protein” itself. In a lexical approach each word is assigned a score based on its frequency in all candidate descriptions, the sequence alignment quality associated with the candidate reference proteins, and the already mentioned trust put into the database the reference was obtained from.
Subsequently each candidate description is assigned a score, computed from the scores of the meaningful words it contains and from the description’s frequency among all regarded candidates. In the final step the highest-scoring description is assigned to the query protein. The performance of this lexical algorithm, implemented in “AHRD”, was then compared with that of competing methods, namely Blast2GO and “best Blast”, where the latter simply passes the description of the best-scoring hit to the query protein. To enable this comparison, and in the absence of a robust evaluation procedure, a new method to measure the accuracy of textual human readable protein descriptions was developed and applied successfully. Here the accuracy of each assigned description was measured with the frequently used “F-measure”, the harmonic mean of precision and recall, counting meaningful words that appear in both the reference and the assigned description as true positives. The results showed that “AHRD” not only outperforms its competitors by far, but is also very robust and thus does not require carefully selected parameters; AHRD’s robustness was demonstrated through cross-validation and the use of three different reference sets. The second annotation tool, “AHRD on gene clusters”, uses conserved protein domains from the InterPro database (Apweiler, Attwood, Bairoch, et al. 2000) to annotate clusters of homologous proteins. In a first step the domains found in each cluster are filtered so that only the most informative are retained; for example, family descriptions are discarded if more detailed sub-family descriptions are also annotated to members of the cluster. Subsequently the most frequent candidate description is assigned, favoring those of type “family” over “domain”.
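The word-overlap F-measure described above can be sketched as follows; the tokenizer and stop-word list are illustrative assumptions, not AHRD's actual filters.

```python
STOPWORDS = {"protein", "putative", "the", "of"}  # illustrative filter list

def meaningful_words(description: str) -> set:
    """Tokenize a description and drop uninformative words."""
    return {w for w in description.lower().split() if w not in STOPWORDS}

def f_measure(predicted: str, reference: str) -> float:
    """Harmonic mean of precision and recall, where shared meaningful
    words between prediction and reference count as true positives."""
    pred, ref = meaningful_words(predicted), meaningful_words(reference)
    if not pred or not ref:
        return 0.0
    tp = len(pred & ref)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)

score = f_measure("putative kinase protein", "serine/threonine kinase")
# "kinase" is the only shared meaningful word: precision 1.0, recall 0.5
assert abs(score - 2 / 3) < 1e-9
```

This word-level view is why filtering uninformative tokens matters: without the stop-word filter, trivially shared words like "protein" would inflate every score.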
Finally, the third tool, “PhyloFun (v2)”, was developed to annotate large sets of query proteins with terms from the Gene Ontology. This work focussed on extending the “belief propagation” (Pearl 1988) algorithm implemented in the “Sifter” annotation tool (Engelhardt, Jordan, Muratore, and Brenner 2005; Engelhardt, Jordan, Srouji, and Brenner 2011). Jöcker had developed a phylogenetic pipeline generating the input fed into the Sifter program. This pipeline executes stringent sequence similarity searches in a database of selected reference proteins and reconstructs a phylogenetic tree from the orthologs and inparalogs found. This tree is then used by the Sifter program and interpreted as a “Bayesian network”, into which the GO term annotations of the homologous reference proteins are fed as “diagnostic evidence” (Pearl 1988). The strength of belief, i.e. the probability that this evidence is also the true state of ancestral tree nodes, is then spread recursively through the tree towards its root, and then back towards the tips. These, of course, include the query protein, which in the final step is annotated with those GO terms that have the strongest belief. Note that during this recursive belief propagation a given GO term’s annotation probability depends both on the length of the currently processed branch and on the type of evolutionary event that took place. This event can be either “speciation” or “duplication”, such that function mutation becomes more likely on longer branches and particularly after “duplication” events. A particular goal in extending this algorithm was to base the annotation probability of a given GO term not on a preconceived model of function evolution among homologous proteins, as implemented in Sifter, but instead to compute these GO term annotation probabilities from empirical measurements.
To achieve this, calibrations were computed for each GO term separately: reference proteins annotated with a given GO term were investigated such that the probability of function loss could be assessed empirically for decreasing sequence homology among related proteins. A second goal was to overcome errors in the identification of the type of evolutionary event. These errors arose from missing knowledge of true species trees, which, in version 1 of the PhyloFun pipeline, were compared with the actual protein trees in order to tell “duplication” from “speciation” events (Zmasek and Eddy 2001). As reliable reference species trees are scarce or in many cases unavailable, the part of the algorithm incorporating the type of evolutionary event was discarded. Finally, the third goal for the development of PhyloFun version 2 was to enable easy installation, usage, and calibration on the latest available knowledge. This was motivated by observations made during the application of the first version of PhyloFun, in which maintaining the knowledge base was barely feasible. This obstacle was overcome in version 2 by obtaining the required reference data directly from publicly available databases. The accuracy and performance of the new PhyloFun version 2 was assessed and compared with selected competing methods, chosen for their widespread usage as well as their applicability to large sets of query proteins without surpassing reasonable time and computational resource requirements. Each method’s performance was measured on a “gold standard” of 1000 selected reference proteins obtained from the Uniprot/Swissprot public database (Boeckmann, Bairoch, Apweiler, et al. 2003), all of which had GO term annotations made by expert curators and mostly based on experimental verification.
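A highly simplified sketch of branch-length-dependent annotation belief follows; the exponential retention model and the loss rate are assumptions standing in for PhyloFun's per-GO-term empirical calibrations, and the two-leaf tree is a toy topology.

```python
import math

def retention_prob(branch_length: float, loss_rate: float = 0.3) -> float:
    """P(function retained across one branch). The exponential decay is an
    illustrative stand-in for an empirically calibrated loss probability."""
    return math.exp(-loss_rate * branch_length)

def annotate_query(ref_branch: float, query_branch: float) -> float:
    """Belief that the query shares a reference leaf's GO term: the evidence
    is passed up to their common ancestor and back down to the query."""
    return retention_prob(ref_branch) * retention_prob(query_branch)

# Longer evolutionary distance -> weaker belief in shared function.
assert annotate_query(0.1, 0.1) > annotate_query(1.0, 1.0)
```

In the real algorithm the message passing runs over the whole tree and combines evidence from many annotated leaves, but the monotone decay with branch length shown here is the core intuition.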
Subsequently the performance assessment was carried out with a slightly modified version of the “Critical Assessment of Function Annotation (CAFA)” experiment (Radivojac, Clark, Oron, et al. 2013). CAFA compares the performance of different protein function annotation tools on a worldwide scale using a provided set of reference proteins; the predictions the competitors deliver are evaluated using the already introduced “F-measure”. Our performance evaluation showed that PhyloFun outperformed all of its competitors. Its use is further recommended by the highly accurate phylogenetic trees the pipeline computes for each query and the homologous reference proteins found. In conclusion, three new tools addressing important matters in the computational prediction of protein function were developed and, in two cases, their performance assessed. Both AHRD and PhyloFun (v2) outperformed their competitors. Further arguments for the use of all three tools are that they are easy to install and use and have reasonable resource demands. Because of these results, publications on AHRD and PhyloFun (v2) are in preparation, and AHRD is already applied by researchers worldwide.
On the schedulability of deadline-constrained traffic in TDMA Wireless Mesh Networks
In this paper, we evaluate the schedulability of traffic with arbitrary end-to-end deadline constraints in Wireless Mesh Networks (WMNs). We formulate the problem as a mixed integer linear optimization problem, and show that, depending on the flow aggregation policy used in the network, the problem can be either convex or non-convex. We optimally solve the problem in both cases, and prove that the schedulability does depend on the aggregation policy. This allows us to derive rules of thumb to identify which policy improves the schedulability for a given traffic. Furthermore, we propose a heuristic solution strategy that allows good suboptimal solutions to the scheduling problem to be computed in relatively short times, comparable to those required for online admission control in relatively large WMNs.
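The paper's MILP formulation is beyond a short sketch, but the flavor of a fast schedulability check can be illustrated with a greedy earliest-deadline-first test on a single TDMA link; the single-link model and unit-slot demands are simplifying assumptions, not the paper's multi-hop formulation.

```python
def edf_schedulable(flows) -> bool:
    """Greedy earliest-deadline-first feasibility check for one TDMA link.
    flows: list of (demand_slots, deadline_slot) pairs; each slot carries
    one unit of traffic, and slots are assigned back to back."""
    t = 0
    for demand, deadline in sorted(flows, key=lambda f: f[1]):
        t += demand           # serve this flow's slots consecutively
        if t > deadline:      # it would finish after its deadline
            return False
    return True

assert edf_schedulable([(2, 3), (1, 1)])        # tight but feasible
assert not edf_schedulable([(2, 2), (2, 3)])    # 4 slots needed by slot 3
```

Sorting by deadline is what makes the check cheap enough for online admission control, which mirrors the paper's motivation for a heuristic alongside the exact MILP.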
Modelling the evolution of biological complexity with a two-dimensional lattice self-assembly process
Self-assembling systems are prevalent across numerous scales of nature, lying at the heart of diverse physical and biological phenomena.
Individual protein subunits self-assembling into complexes is often a vital first step of biological processes.
Errors during protein assembly, due to mutations or misfolds, can have devastating effects and are responsible for an assortment of protein diseases, known as proteopathies.
With proteins exhibiting endless layers of complexity, building any all-encompassing model is unrealistic.
Coarse-grained models, despite not faithfully capturing every detail of the original system, have massive potential to assist in understanding complex phenomena.
Principal actors in self-assembly are the binding interactions between subunits, so geometric constraints, polarity, kinetic forces, etc. can often be marginalised.
This work explores how self-assembly and its outcomes are inextricably tied to the involved interactions through the use of a two-dimensional lattice polyomino model.
First, this thesis addresses how the interaction characteristics of self-assembly building blocks determine what structures they form.
Specifically, whether the same structures are consistently produced and whether they remain finite in size.
Assembly graphs store subunit interaction information and are used to classify these two properties: determinism and boundedness, respectively.
Arbitrary sets of building blocks are classified without the costly overhead of repeated stochastic assembly, improving both analysis speed and accuracy.
Furthermore, assembly graphs naturally integrate combinatorial and graph techniques, enabling a wider range of future polyomino studies.
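A minimal illustration of classifying boundedness from a graph of subunit interactions: if recruitment relations among tile types contain a directed cycle, assembly can keep adding copies and grow without bound. The dictionary encoding below is a toy assumption, not the thesis's polyomino formalism.

```python
def has_cycle(graph) -> bool:
    """Detect a directed cycle via DFS three-coloring. In this toy assembly
    graph, a cycle means tiles can recruit each other indefinitely, i.e.
    the assembly is unbounded; acyclic graphs terminate (bounded)."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}

    def dfs(v):
        color[v] = GREY
        for w in graph.get(v, ()):
            if color[w] == GREY or (color[w] == WHITE and dfs(w)):
                return True  # back edge found: cycle
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and dfs(v) for v in graph)

bounded   = {"A": ["B"], "B": []}      # A recruits B, then assembly stops
unbounded = {"A": ["B"], "B": ["A"]}   # mutual recruitment: infinite strip
assert not has_cycle(bounded)
assert has_cycle(unbounded)
```

Classifying the graph once replaces many stochastic assembly runs, which is the speed and accuracy gain the abstract describes.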
The second part focuses on the implications of nondeterministic assembly for the evolution of interaction strengths.
Generalising subunit binding sites with mutable binary strings introduces such interaction strengths into the polyomino model.
Deterministic assemblies obey analytic expectations.
Conversely, interactions in nondeterministic assemblies rapidly diverge from equilibrium to minimise assembly inconsistency.
Optimal interaction strengths during assembly are also reflected in evolution.
Transitions between certain polyominoes are effectively forbidden when interaction strengths are misaligned.
The third aspect focuses on genetic duplication, an evolutionary event observed in organisms across all taxa.
Through polyomino evolutions, a duplication-heteromerisation pathway emerges as an efficient process.
This pathway exploits the advantages of both self-interactions and pairwise-interactions, and accelerates evolution by avoiding complexity bottlenecks.
Several simulation predictions are successfully validated against a large data set of protein complexes.
These results come from coarse-grained models rather than quantitative biological measurement.
Despite this, they reinforce existing observations of protein complexes and propose several new mechanisms for the evolution of biological complexity.