Thermodynamically Stable DNA Code Design using a Similarity Significance Model
DNA code design aims to generate a set of DNA sequences (codewords) with
minimum likelihood of undesired hybridizations among sequences and their
reverse-complement (RC) pairs (cross-hybridization). Inspired by the distinct
hybridization affinities (or stabilities) of the perfect double helix formed
by an individual single-stranded DNA (ssDNA) and its RC pair, we propose a novel
similarity significance (SS) model to measure the similarity between DNA
sequences. In particular, instead of directly measuring the similarity of two
sequences with a metric, the proposed SS model evaluates how much more likely
undesirable hybridizations are than desirable hybridizations in the presence
of the two measured sequences and their RC pairs. With this SS model, we
construct thermodynamically stable DNA codes
subject to several combinatorial constraints using a sorting-based algorithm.
The proposed scheme results in DNA codes with larger code sizes and wider free
energy gaps (hence better cross-hybridization performance) compared to the
existing methods.
Comment: To appear in ISIT 202
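The abstract does not define the SS model itself, so the following is only a minimal sketch of the kind of combinatorial constraints such codes are usually subject to: a minimum Hamming distance between codewords and between codewords and reverse complements. The function names, the choice of Hamming distance, and the inclusion of each codeword's own RC in the check are assumptions for illustration, not the authors' algorithm.

```python
# Illustrative only: Hamming-distance and reverse-complement constraints.
# This is NOT the similarity significance (SS) model from the abstract.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence over {A, C, G, T}."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_constraints(code, d):
    """Check an assumed pair of constraints with minimum distance d.

    1. Every two distinct codewords differ in at least d positions.
    2. Every codeword differs from the reverse complement of every
       codeword (including its own) in at least d positions.
    """
    for i, x in enumerate(code):
        for y in code[i:]:
            if hamming(x, reverse_complement(y)) < d:
                return False
        for y in code[i + 1:]:
            if hamming(x, y) < d:
                return False
    return True
```

For instance, `satisfies_constraints(["AACC", "CAAC"], 2)` holds, while the single palindromic codeword `"ACGT"` fails for any `d >= 1` because it equals its own reverse complement.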
On Critical Relative Distance of DNA Codes for Additive Stem Similarity
We consider DNA codes based on the nearest-neighbor (stem) similarity model
which adequately reflects the "hybridization potential" of two DNA sequences.
Our aim is to present a survey of bounds on the rate of DNA codes with respect
to a thermodynamically motivated similarity measure called an additive stem
similarity. These results yield a method to analyze and compare known samples
of the nearest-neighbor "thermodynamic weights" associated with stacked pairs
that occur in DNA secondary structures.
Comment: 5 or 6 pages (compiler-dependent), 0 figures, submitted to 2010 IEEE
International Symposium on Information Theory (ISIT 2010), uses IEEEtran.cl
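The abstract does not spell out the similarity measure, so the sketch below assumes one common formulation: for two aligned sequences of equal length, a "stem" is a pair of adjacent positions at which both sequences agree, and the additive stem similarity sums a weight for each such common stem. The function name and the weight values in the usage note are hypothetical.

```python
# Illustrative only: an assumed form of additive stem similarity.

def additive_stem_similarity(x, y, weights=None, default_weight=1.0):
    """Sum the weights of stems (adjacent matching pairs) shared by x and y.

    weights maps a two-letter stem such as "AC" to its weight. Stems
    missing from the mapping get default_weight, so calling the function
    without weights simply counts the common stems.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    weights = weights or {}
    total = 0.0
    for i in range(len(x) - 1):
        # A common stem needs agreement at positions i and i + 1.
        if x[i] == y[i] and x[i + 1] == y[i + 1]:
            total += weights.get(x[i:i + 2], default_weight)
    return total
```

With made-up weights `{"AC": 1.5, "CG": 2.0}`, the sequences `"ACGTA"` and `"ACGAA"` share the stems `AC` and `CG`, so the sketch returns their weighted sum. The isolated match at the last position contributes nothing, which is the point of a stem-based rather than a position-based measure.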
Coding limits on the number of transcription factors
Transcription factor proteins bind specific DNA sequences to control the
expression of genes. They contain DNA binding domains which belong to several
super-families, each with a specific mechanism of DNA binding. The total number
of transcription factors encoded in a genome increases with the number of genes
in the genome. Here, we examined the number of transcription factors from each
super-family in diverse organisms.
We find that the number of transcription factors from most super-families
appears to be bounded. For example, the number of winged helix factors does not
generally exceed 300, even in very large genomes. The magnitude of the maximal
number of transcription factors from each super-family seems to correlate with
the number of DNA bases effectively recognized by the binding mechanism of that
super-family. Coding theory predicts that such upper bounds on the number of
transcription factors should exist, in order to minimize cross-binding errors
between transcription factors. This theory further predicts that factors with
similar binding sequences should tend to have similar biological effects, so
that errors caused by mis-recognition are minimal. We present evidence that
transcription factors with similar binding sequences tend to regulate genes
with similar biological functions, supporting this prediction.
The present study suggests limits on the transcription factor repertoire of
cells, and suggests coding constraints that might apply more generally to the
mapping between binding sites and biological function.
Comment: http://www.weizmann.ac.il/complex/tlusty/papers/BMCGenomics2006.pdf
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1590034/
http://www.biomedcentral.com/1471-2164/7/23
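To make the coding-theory intuition concrete, the sketch below computes the classical sphere-packing (Hamming) bound on how many length-n sequences over a 4-letter alphabet can be kept at pairwise Hamming distance at least d. Treating binding sites as fixed-length codewords and cross-binding as Hamming errors is an assumption made here for illustration; the abstract does not state which bound or error model the paper uses.

```python
# Illustrative only: the sphere-packing (Hamming) bound for a q-ary code.
from math import comb


def hamming_bound(n, d, q=4):
    """Upper bound on the size of a q-ary code of length n and distance d.

    Balls of radius t = (d - 1) // 2 around distinct codewords cannot
    overlap, so the code size is at most q**n divided by the ball volume.
    """
    t = (d - 1) // 2
    ball_volume = sum(comb(n, k) * (q - 1) ** k for k in range(t + 1))
    return q ** n // ball_volume
```

Under these assumptions the bound grows with the number of recognized bases n, which matches the qualitative claim that the maximal repertoire of a super-family correlates with the number of bases its binding mechanism effectively recognizes.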
MirBot: A collaborative object recognition system for smartphones using convolutional neural networks
MirBot is a collaborative application for smartphones that allows users to
perform object recognition. This app can be used to take a photograph of an
object, select the region of interest and obtain the most likely class (dog,
chair, etc.) by means of similarity search using features extracted from a
convolutional neural network (CNN). The answers provided by the system can be
validated by the user so as to improve the results for future queries. All the
images are stored together with a series of metadata, thus enabling a
multimodal incremental dataset labeled with synset identifiers from the WordNet
ontology. This dataset grows continuously thanks to the users' feedback, and is
publicly available for research. This work details the MirBot object
recognition system, analyzes the statistics gathered after more than four years
of usage, describes the image classification methodology, and performs an
exhaustive evaluation using handcrafted features, convolutional neural codes
and different transfer learning techniques. After comparing various models and
transformation methods, the results show that the CNN features keep the
accuracy of MirBot constant over time, despite the increasing number of new
classes. The app is freely available at the Apple and Google Play stores.
Comment: Accepted in Neurocomputing, 201
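The similarity search described in the abstract can be sketched as a nearest-neighbor lookup over stored feature vectors. The abstract only says that classes are obtained by similarity search on CNN features; the cosine metric, the k-nearest-neighbor majority vote, and all names below are assumptions, and the short lists stand in for real CNN feature vectors.

```python
# Illustrative only: k-nearest-neighbor classification over feature vectors.
from collections import Counter
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(query, labeled_features, k=3):
    """Return the majority label among the k stored vectors most similar to query.

    labeled_features is a list of (feature_vector, label) pairs.
    """
    ranked = sorted(
        labeled_features,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In a design like this, a user-validated answer can be added to the system by appending a new `(feature_vector, label)` pair to the stored list, with no retraining step, which is one plausible reason a similarity-search approach suits an incrementally growing dataset.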