A framework for space-efficient string kernels
String kernels are typically used to compare genome-scale sequences whose
length makes alignment impractical, yet their computation is based on data
structures that are either space-inefficient, or incur large slowdowns. We show
that a number of exact string kernels, like the k-mer kernel, the substrings
kernels, a number of length-weighted kernels, the minimal absent words kernel,
and kernels with Markovian corrections, can all be computed in O(nd) time and
in o(n) bits of space in addition to the input, using just a rangeDistinct
data structure on the Burrows-Wheeler transform of the
input strings, which takes O(d) time per element in its output. The same
bounds hold for a number of measures of compositional complexity based on
multiple values of k, like the k-mer profile and the k-th order empirical
entropy, and for calibrating the value of k using the data.
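
As a point of reference for the k-mer kernel named in the abstract above, here is a minimal sketch in Python. It assumes the kernel is the cosine of the angle between the two k-mer count vectors, which is one common definition. It is a naive hash-table baseline, not the Burrows-Wheeler-based method the abstract describes, and the function names `kmer_profile` and `kmer_kernel` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_profile(s: str, k: int) -> Counter:
    """Count every length-k substring of s (its k-mer profile)."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def kmer_kernel(s: str, t: str, k: int) -> float:
    """Cosine-normalised inner product of the two k-mer profiles.

    Assumption: cosine normalisation; other normalisations exist.
    Returns 0.0 when either string is shorter than k.
    """
    ps, pt = kmer_profile(s, k), kmer_profile(t, k)
    # Only k-mers present in both profiles contribute to the dot product.
    dot = sum(count * pt[kmer] for kmer, count in ps.items())
    norm = sqrt(sum(c * c for c in ps.values())) * sqrt(sum(c * c for c in pt.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(kmer_kernel("ACGTACGT", "ACGTTCGT", 3))
```

This baseline stores every distinct k-mer explicitly, so its space grows with the number of distinct k-mers. That is the kind of overhead the abstract's space bound is meant to avoid.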
Filogenia y evolución de la conducta en los ictéridos (Phylogeny and evolution of behavior in the icterids)
The evolution of selected behavioral characteristics in the family Icteridae is discussed in the light of the new DNA phylogeny. The woven pensile nest is found in only two of the main icterid clades: the caciques plus oropendolas (Cacicus and Psarocolius), and the genus Icterus. It is difficult to determine whether this nest type is a character ancestral to both lineages (a plesiomorphy) or a case of convergence. Nest building mostly by males is known only in the South American genus Chrysomus. Cooperative breeding is found mostly in the South American quiscaline clade, with reports for 13 species. The hypothesis that cooperative breeding is an ancestral trait in this clade is supported by its unusual frequency in the group, and also by its presence in the genus Macroagelaius, which occupies a basal position in the lineage. Brood parasitism evolved only once in the family, probably in ancestral North American cowbirds (Molothrus). Without denying a role for the environment in shaping icterid behavior, the new molecular data support the idea of an important phylogenetic component in behavioral evolution.
Affiliation: Fraga, Rosendo Manuel. Centro de Investigaciones Científicas y Transferencia de Tecnología a la Producción (Provincia de Entre Ríos; Universidad Autónoma de Entre Ríos; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - Santa Fe); Argentina
Minimal Forbidden Factors of Circular Words
Minimal forbidden factors are a useful tool for investigating properties of
words and languages. Two factorial languages are distinct if and only if they
have different (antifactorial) sets of minimal forbidden factors. There exist
algorithms for computing the minimal forbidden factors of a word, as well as of
a regular factorial language. Conversely, Crochemore et al. [IPL, 1998] gave an
algorithm that, given the trie recognizing a finite antifactorial language M,
computes a DFA recognizing the language whose set of minimal forbidden factors
is M. In the same paper, they showed that the obtained DFA is minimal if the
input trie recognizes the minimal forbidden factors of a single word. We
generalize this result to the case of a circular word. We discuss several
combinatorial properties of the minimal forbidden factors of a circular word.
As a byproduct, we obtain a formal definition of the factor automaton of a
circular word. Finally, we investigate the case of minimal forbidden factors of
the circular Fibonacci words. Comment: To appear in Theoretical Computer Science
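
To make the notion of a minimal forbidden factor concrete, the following Python sketch enumerates them by brute force for an ordinary (linear) word. It assumes the standard definition: a word w is a minimal forbidden factor of x if w is not a factor of x but every proper factor of w is. The circular case studied in the abstract is not covered, and the function names are hypothetical. The running time is polynomial but far from optimal, so this is only suitable for short words.

```python
def factors(x: str) -> set:
    """All factors (contiguous substrings) of x, including the empty word."""
    return {x[i:j] for i in range(len(x) + 1) for j in range(i, len(x) + 1)}


def minimal_forbidden_factors(x: str, alphabet: str) -> set:
    """Brute-force minimal forbidden factors of the linear word x.

    A candidate w = v + b (v a factor of x, b a letter) is kept when w is
    not a factor of x but its longest proper suffix w[1:] is. Its longest
    proper prefix v is a factor by construction, so every proper factor
    of w occurs in x.
    """
    fact = factors(x)
    result = set()
    for v in fact:
        for b in alphabet:
            w = v + b
            if w not in fact and w[1:] in fact:
                result.add(w)
    return result


if __name__ == "__main__":
    print(sorted(minimal_forbidden_factors("abba", "ab")))
```

For the word "abba" over the alphabet {a, b} this yields aa, aba, bab and bbb. Adding a letter that never occurs in the word, such as c, adds that single letter to the set.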
BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction
A novel discrete mathematical approach is proposed as an additional tool for molecular systematics which does not require prior statistical assumptions concerning the evolutionary process. The method is based on algorithms generating mathematical representations directly from DNA/RNA or protein sequences, followed by the output of numerical (scalar or vector) and visual characteristics (graphs). The binary encoded sequence information is transformed into a compact analytical form, called the Iterative Canonical Form (or ICF) of Boolean functions, which can then be used as a generalized molecular descriptor. The method provides raw vector data for calculating different distance matrices, which in turn can be analyzed by neighbor-joining or UPGMA to derive a phylogenetic tree, or by principal coordinates analysis to get an ordination scattergram. The new method and the associated software for inferring phylogenetic trees are called the Boolean analysis or BOOL-AN
TrAp: a Tree Approach for Fingerprinting Subclonal Tumor Composition
Revealing the clonal composition of a single tumor is essential for
identifying cell subpopulations with metastatic potential in primary tumors or
with resistance to therapies in metastatic tumors. Sequencing technologies
provide an overview of an aggregate of numerous cells, rather than
subclonal-specific quantification of aberrations such as single nucleotide
variants (SNVs). Computational approaches to de-mix a single collective signal
from the mixed cell population of a tumor sample into its individual components
are currently not available. Herein we propose a framework for deconvolving
data from a single genome-wide experiment to infer the composition, abundance
and evolutionary paths of the underlying cell subpopulations of a tumor. The
method is based on the plausible biological assumption that tumor progression
is an evolutionary process where each individual aberration event stems from a
unique subclone and is present in all its descendant subclones. We have
developed an efficient algorithm (TrAp) for solving this mixture problem. In
silico analyses show that TrAp correctly deconvolves mixed subpopulations when
the number of subpopulations and the measurement errors are moderate. We
demonstrate the applicability of the method using tumor karyotypes and somatic
hypermutation datasets. We applied TrAp to the SNV frequency profile from an Exome-Seq
experiment on a renal cell carcinoma tumor sample and compared the mutational
profile of the inferred subpopulations to the mutational profiles of twenty
single cells of the same tumor. Despite the large experimental noise, specific
co-occurring mutations found in clones inferred by TrAp are also present in
some of these single cells. Finally, we deconvolve Exome-Seq data from three
distinct metastases from different body compartments of one melanoma patient
and exhibit the evolutionary relationships of their subpopulations
Evolution of substrate specificity in a retained enzyme driven by gene loss.
The connection between gene loss and the functional adaptation of retained proteins is still poorly understood. We apply phylogenomics and metabolic modeling to detect bacterial species that are evolving by gene loss, with the finding that Actinomycetaceae genomes from human cavities are undergoing sizable reductions, including loss of L-histidine and L-tryptophan biosynthesis. We observe that the dual-substrate phosphoribosyl isomerase A or priA gene, at which these pathways converge, appears to coevolve with the occurrence of trp and his genes. Characterization of a dozen PriA homologs shows that these enzymes adapt from bifunctionality in the largest genomes, to a monofunctional, yet not necessarily specialized, inefficient form in genomes undergoing reduction. These functional changes are accomplished via mutations, which result from relaxation of purifying selection, in residues structurally mapped after sequence and X-ray structural analyses. Our results show how gene loss can drive the evolution of substrate specificity from retained enzymes
Comparative study of spinning field development in two species of araneophagic spiders (Araneae, Mimetidae, Australomimetus)
External studies of spider spinning fields allow us to make inferences about internal silk gland biology, including what happens to silk glands when the spider molts. Such studies often focus on adults, but juveniles can provide additional insight on spinning apparatus development and character polarity. Here we document and describe spinning fields at all stadia in two species of pirate spider (Mimetidae: Australomimetus spinosus, A. djuka). Pirate spiders nest within the ecribellate orb-building spiders (Araneoidea), but are vagrant, araneophagic members that do not build prey-capture webs. Correspondingly, they lack aggregate and flagelliform silk glands (AG, FL), specialized for forming prey-capture lines in araneoid orb webs. However, occasional possible vestiges of an AG or FL spigot, as observed in one juvenile A. spinosus specimen, are consistent with secondary loss of AG and FL. By comparing spigots from one stadium to tartipores from the next stadium, silk glands can be divided into those that are tartipore-accommodated (T-A), and thus functional during proecdysis, and those that are not (non-T-A). Though evidence was more extensive in A. spinosus, it was likely true for both species that the number of non-T-A piriform silk glands (PI) was constant (two pairs) through all stadia, while numbers of T-A PI rose incrementally. The two species differed in that A. spinosus had T-A minor ampullate and aciniform silk glands (MiA, AC) that were absent in A. djuka. First instars of A. djuka, however, appeared to retain vestiges of T-A MiA spigots, consistent with a plesiomorphic state in which T-A MiA (called secondary MiA) are present. T-A AC have not previously been observed in Australomimetus and the arrangement of their spigots on posterior lateral spinnerets was unlike that seen thus far in other mimetid genera. 
Though new AC and T-A PI apparently form throughout much of a spider’s ontogeny, recurring spigot/tartipore arrangements indicated that AC and PI, after functioning during one stadium, were used again in each subsequent stadium (if non-T-A) or in alternate subsequent stadia (if T-A). In A. spinosus, sexual and geographic dimorphisms involving AC were noted. Cylindrical silk gland (CY) spigots were observed in mid-to-late juvenile, as well as adult, females of both species. Their use in juveniles, however, should not be assumed and only adult CY spigots had wide openings typical of mimetids. Neither species exhibited two pairs of modified PI spigots present in some adult male mimetids