Subwords in reverse-complement order
We examine finite words over an alphabet of pairs of letters, where each word is identical with its reverse complement (where ). We seek the smallest such that every word of length composed from , is uniquely determined by the set of its subwords of length up to . Our almost sharp result () is an analogue of a classical result for "normal" words.
This classical problem was originally posed by M.P. Schützenberger and I. Simon, and attracted considerable interest from several researchers, foremost Vladimir Levenshtein.
Our problem has its roots in bioinformatics
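The reconstruction question in this abstract can be explored by brute force for very short words. The sketch below is illustrative only and is not the authors' method. It assumes that "subword" means a not necessarily contiguous subsequence, as in the classical problem, and it uses the DNA letters A/T and C/G as a stand-in alphabet of complementary pairs, since the abstract's own symbols did not survive extraction. The function names are hypothetical.

```python
from itertools import combinations, product

# Assumed alphabet of two complementary pairs (A/T and C/G); illustrative only.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(word):
    """Reverse the word and replace each letter by its complement."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def subwords_up_to(word, k):
    """Set of all non-empty subsequences of `word` with length at most k."""
    result = set()
    for length in range(1, k + 1):
        for positions in combinations(range(len(word)), length):
            result.add("".join(word[i] for i in positions))
    return result


def smallest_determining_k(n):
    """Smallest k such that the subword sets up to length k distinguish all
    words of length n that equal their own reverse complement.
    Exhaustive search, so only feasible for small n."""
    candidates = [
        w
        for w in ("".join(p) for p in product("ACGT", repeat=n))
        if w == reverse_complement(w)
    ]
    for k in range(1, n + 1):
        seen = set()
        distinct = True
        for w in candidates:
            key = frozenset(subwords_up_to(w, k))
            if key in seen:
                distinct = False
                break
            seen.add(key)
        if distinct:
            return k
    return n
```

With this stand-in alphabet no letter is its own complement, so only even lengths n admit words equal to their reverse complement; for odd n the candidate list is empty and the search returns 1 trivially.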
A finite word poset : In honor of Aviezri Fraenkel on the occasion of his 70th birthday
Our word posets have finite words of bounded length as their elements, with the words composed from a finite alphabet. Their partial ordering follows from the inclusion of a word as a subsequence of another word. The elemental combinatorial properties of such posets are established. Their automorphism groups are determined (along with a similar result for the word poset studied by Burosch, Frank and Röhl [4]) and a BLYM inequality is verified (via the normalized matching property)
Assessing rotation-invariant feature classification for automated wildebeest population counts
Accurate and on-demand animal population counts are the holy grail for wildlife conservation organizations throughout the world because they enable fast and responsive adaptive management policies. While the collection of image data from camera traps, satellites, and manned or unmanned aircraft has advanced significantly, the detection and identification of animals within images remains a major bottleneck since counting is primarily conducted by dedicated enumerators or citizen scientists. Recent developments in the field of computer vision suggest a potential resolution to this issue through the use of rotation-invariant object descriptors combined with machine learning algorithms. Here we implement an algorithm to detect and count wildebeest from aerial images collected in the Serengeti National Park in 2009 as part of the biennial wildebeest count. We find that the per-image error rates are greater than, but comparable to, two separate human counts. For the total count, the algorithm is more accurate than both manual counts, suggesting that human counters have a tendency to systematically over- or undercount images. While the accuracy of the algorithm is not yet at an acceptable level for fully automatic counts, our results show this method is a promising avenue for further research and we highlight specific areas where future research should focus in order to develop fast and accurate enumeration of aerial count data. If combined with a bespoke image collection protocol, this approach may yield a fully automated wildebeest count in the near future
Integrin Clustering Is Driven by Mechanical Resistance from the Glycocalyx and the Substrate
Integrins have emerged as key sensory molecules that translate chemical and physical cues from the extracellular matrix (ECM) into biochemical signals that regulate cell behavior. Integrins function by clustering into adhesion plaques, but the molecular mechanisms that drive integrin clustering in response to interaction with the ECM remain unclear. To explore how deformations in the cell-ECM interface influence integrin clustering, we developed a spatial-temporal simulation that integrates the micro-mechanics of the cell, glycocalyx, and ECM with a simple chemical model of integrin activation and ligand interaction. Due to mechanical coupling, we find that integrin-ligand interactions are highly cooperative, and this cooperativity is sufficient to drive integrin clustering even in the absence of cytoskeletal crosslinking or homotypic integrin-integrin interactions. The glycocalyx largely mediates this cooperativity and hence may be a key regulator of integrin function. Remarkably, integrin clustering in the model is naturally responsive to the chemical and physical properties of the ECM, including ligand density, matrix rigidity, and the chemical affinity of ligand for receptor. Consistent with experimental observations, we find that integrin clustering is robust on rigid substrates with high ligand density, but is impaired on substrates that are highly compliant or have low ligand density. We thus demonstrate how integrins themselves could function as sensory molecules that begin sensing matrix properties even before large multi-molecular adhesion complexes are assembled
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead
Teachers’ Work: Institutional Isomorphism and Cultural Variation in the U.S., Germany, and Japan
Systems of linear congruences with individual moduli
Consider an n×n matrix A, with integer elements, a column vector x of n integer indeterminates, and a column vector Q of n integers greater than unity. Ax modulo Q constitutes another n-vector b of nonnegative integers. The elemental feature of interest for such systems is whether they are regular (i.e., nonsingular): whether b uniquely determines x modulo Q. Let Pσ denote the permutation matrix corresponding to a permutation σ of {1,2,…,n}. Then, for the special case of all pairs of elements of Q having the same greatest common factor, it is established that regularity obtains if and only if there exists a permutation σ so that PσAPσT is a triangular matrix with each element on the main diagonal coprime to its respective modulus (from PσQ). To resolve systems with general Q, a set of moduli is first derived from each original modulus by factoring it into prime-power factors. We introduce a corresponding regularity-preserving transformation of A and Q into an A′ and Q′, the latter containing, exclusively, prime-power moduli. Elementary transformations of A′ preserving regularity modulo Q′ (denoted equivalences) are introduced. A′ is shown to be regular modulo Q′ if and only if there exists a permutation σ so that PσA′PσT is equivalent to a triangular matrix, having each element on the main diagonal coprime to its respective modulus (from PσQ′). Whence, regularity is fully resolved for general systems. An algorithm for solving an arbitrary regular system Ax≡b (mod Q) is, furthermore, implicit in these results
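For small systems, the notion of regularity described in this abstract can be checked by exhaustive enumeration. The sketch below is a naive illustration, not the triangularization criterion or the algorithm from the paper. It assumes that each x_i is restricted to the representatives 0, …, q_i − 1 and that row i of Ax is reduced modulo q_i; the function names `image` and `is_regular` are hypothetical.

```python
from itertools import product


def image(A, Q, x):
    """Compute b = Ax, with row i reduced modulo its own modulus Q[i]."""
    return tuple(
        sum(a * xi for a, xi in zip(row, x)) % q
        for row, q in zip(A, Q)
    )


def is_regular(A, Q):
    """Brute-force check that distinct x (with 0 <= x_i < Q[i]) give distinct b.

    The number of candidate vectors is the product of the moduli,
    so this is only practical for very small systems.
    """
    seen = set()
    for x in product(*(range(q) for q in Q)):
        b = image(A, Q, x)
        if b in seen:
            return False
        seen.add(b)
    return True
```

For example, under these assumptions `is_regular([[1, 0], [1, 1]], [2, 3])` holds for a triangular matrix whose diagonal entries are coprime to their moduli, while `is_regular([[2, 0], [0, 1]], [2, 3])` fails because the first row is always 0 modulo 2.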
Group Testing With DNA Chips: Generating Designs and Decoding Experiments
DNA microarrays are a valuable tool for massively parallel DNA-DNA hybridization experiments. Currently, most applications rely on the existence of sequence-specific oligonucleotide probes. In large families of closely related target sequences, such as different virus subtypes, the high degree of similarity often makes it impossible to find a unique probe for every target. Fortunately, this is unnecessary. We propose a microarray design methodology based on a group testing approach. While probes might bind to multiple targets simultaneously, a properly chosen probe set can still unambiguously distinguish the presence of one target set from the presence of a different target set. Our method is the first one that explicitly takes cross-hybridization and experimental errors into account while accommodating several targets. The approach consists of three steps: (1) Pre-selection of probe candidates, (2) Generation of a suitable group testing design, and (3) Decoding of hybridization results to infer presence or absence of individual targets. Our results show that this approach is very promising, even for challenging data sets and experimental error rates of up to 5%. On a data set of 28S rDNA sequences we were able to identify 660 sequences, a substantial improvement over a prior approach using unique probes which only identified 408 sequences