    A framework for space-efficient string kernels

    String kernels are typically used to compare genome-scale sequences whose length makes alignment impractical, yet their computation is based on data structures that are either space-inefficient or incur large slowdowns. We show that a number of exact string kernels, like the k-mer kernel, the substrings kernels, a number of length-weighted kernels, the minimal absent words kernel, and kernels with Markovian corrections, can all be computed in O(nd) time and in o(n) bits of space in addition to the input, using just a rangeDistinct data structure on the Burrows-Wheeler transform of the input strings, which takes O(d) time per element in its output. The same bounds hold for a number of measures of compositional complexity based on multiple values of k, like the k-mer profile and the k-th order empirical entropy, and for calibrating the value of k using the data.
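
    As an illustration of what the k-mer kernel measures, the sketch below computes it directly from its definition as the inner product of two k-mer count vectors. This is a naive hash-based version, not the space-efficient Burrows-Wheeler approach described in the abstract; the function names are hypothetical and the cosine normalization is an assumption added for convenience.

```python
from collections import Counter
from math import sqrt


def kmer_profile(s: str, k: int) -> Counter:
    """Count every length-k substring of s (the k-mer profile)."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def kmer_kernel(s: str, t: str, k: int) -> int:
    """Inner product of the k-mer count vectors of s and t."""
    ps, pt = kmer_profile(s, k), kmer_profile(t, k)
    # A Counter returns 0 for k-mers that do not occur in t.
    return sum(count * pt[kmer] for kmer, count in ps.items())


def normalized_kmer_kernel(s: str, t: str, k: int) -> float:
    """Cosine-normalized kernel (assumed normalization, for illustration)."""
    denom = sqrt(kmer_kernel(s, s, k) * kmer_kernel(t, t, k))
    return kmer_kernel(s, t, k) / denom if denom else 0.0


if __name__ == "__main__":
    print(kmer_kernel("ACGT", "ACGA", 2))
    print(normalized_kmer_kernel("ACGT", "ACGA", 2))
```

    This version stores every distinct k-mer explicitly, which is the kind of space cost the paper aims to avoid on genome-scale inputs.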

    Phylogeny and evolution of behavior in the icterids (Filogenia y evolución de la conducta en los ictéridos)

    The evolution of selected behavioral characteristics in the family Icteridae is discussed in the light of the new DNA phylogeny. The woven pensile nest is found in only two of the main icterid clades, the caciques plus oropendolas (Cacicus and Psarocolius), and in the genus Icterus. It is difficult to determine whether this nest type represents a character ancestral to both lineages (a plesiomorphy) or a case of convergence. Nest building mostly by males is known only in the South American genus Chrysomus. Cooperative breeding is found mostly in the South American quiscaline clade, with reports for 13 species. The hypothesis that cooperative breeding is an ancestral trait in this clade is supported by its unusual frequency in the group, and also because it is found in the genus Macroagelaius, placed in a basal position in the lineage. Brood parasitism evolved only once in the family, probably in ancestral North American cowbirds (Molothrus). Without denying a role for the environment in shaping icterid behavior, the new molecular data support the idea of an important phylogenetic component in behavioral evolution.
    Affiliation: Fraga, Rosendo Manuel. Provincia de Entre Ríos. Centro de Investigaciones Científicas y Transferencia de Tecnología a la Producción. Universidad Autónoma de Entre Ríos. Centro de Investigaciones Científicas y Transferencia de Tecnología a la Producción. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Centro de Investigaciones Científicas y Transferencia de Tecnología a la Producción; Argentina

    Minimal Forbidden Factors of Circular Words

    Minimal forbidden factors are a useful tool for investigating properties of words and languages. Two factorial languages are distinct if and only if they have different (antifactorial) sets of minimal forbidden factors. There exist algorithms for computing the minimal forbidden factors of a word, as well as of a regular factorial language. Conversely, Crochemore et al. [IPL, 1998] gave an algorithm that, given the trie recognizing a finite antifactorial language M, computes a DFA recognizing the language whose set of minimal forbidden factors is M. In the same paper, they showed that the obtained DFA is minimal if the input trie recognizes the minimal forbidden factors of a single word. We generalize this result to the case of a circular word. We discuss several combinatorial properties of the minimal forbidden factors of a circular word. As a byproduct, we obtain a formal definition of the factor automaton of a circular word. Finally, we investigate the case of minimal forbidden factors of the circular Fibonacci words.
    Comment: To appear in Theoretical Computer Science
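
    To make the notion concrete, the sketch below enumerates the minimal forbidden factors of an ordinary (linear) word by brute force, using the standard characterization: w is minimal forbidden if w is not a factor of the word but both w without its first letter and w without its last letter are. It does not cover the circular case studied in the paper and is not one of the algorithms cited in the abstract; the function names are hypothetical.

```python
def factors(s: str) -> set[str]:
    """All factors (substrings) of s, including the empty word."""
    return {s[i:j] for i in range(len(s)) for j in range(i, len(s) + 1)} | {""}


def minimal_forbidden_factors(s: str, alphabet: str) -> set[str]:
    """Brute-force minimal forbidden factors of the linear word s."""
    fact = factors(s)
    result = set()
    # Every minimal forbidden factor has the form u + b, where u is a factor.
    for u in fact:
        for b in alphabet:
            w = u + b
            # w must be absent while its longest proper suffix is present;
            # its longest proper prefix u is present by construction.
            if w not in fact and w[1:] in fact:
                result.add(w)
    return result


if __name__ == "__main__":
    print(sorted(minimal_forbidden_factors("aab", "ab")))
```

    The quadratic number of factors makes this suitable only for short words; it is meant to illustrate the definition rather than to be efficient.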

    BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction

    A novel discrete mathematical approach is proposed as an additional tool for molecular systematics which does not require prior statistical assumptions concerning the evolutionary process. The method is based on algorithms generating mathematical representations directly from DNA/RNA or protein sequences, followed by the output of numerical (scalar or vector) and visual characteristics (graphs). The binary encoded sequence information is transformed into a compact analytical form, called the Iterative Canonical Form (or ICF) of Boolean functions, which can then be used as a generalized molecular descriptor. The method provides raw vector data for calculating different distance matrices, which in turn can be analyzed by neighbor-joining or UPGMA to derive a phylogenetic tree, or by principal coordinates analysis to get an ordination scattergram. The new method and the associated software for inferring phylogenetic trees are called the Boolean analysis or BOOL-AN.

    TrAp: a Tree Approach for Fingerprinting Subclonal Tumor Composition

    Revealing the clonal composition of a single tumor is essential for identifying cell subpopulations with metastatic potential in primary tumors or with resistance to therapies in metastatic tumors. Sequencing technologies provide an overview of an aggregate of numerous cells, rather than subclonal-specific quantification of aberrations such as single nucleotide variants (SNVs). Computational approaches to de-mix a single collective signal from the mixed cell population of a tumor sample into its individual components are currently not available. Herein we propose a framework for deconvolving data from a single genome-wide experiment to infer the composition, abundance and evolutionary paths of the underlying cell subpopulations of a tumor. The method is based on the plausible biological assumption that tumor progression is an evolutionary process where each individual aberration event stems from a unique subclone and is present in all its descendant subclones. We have developed an efficient algorithm (TrAp) for solving this mixture problem. In silico analyses show that TrAp correctly deconvolves mixed subpopulations when the number of subpopulations and the measurement errors are moderate. We demonstrate the applicability of the method using tumor karyotypes and somatic hypermutation datasets. We applied TrAp to the SNV frequency profile from an Exome-Seq experiment on a renal cell carcinoma tumor sample and compared the mutational profile of the inferred subpopulations to the mutational profiles of twenty single cells of the same tumor. Despite the large experimental noise, specific co-occurring mutations found in clones inferred by TrAp are also present in some of these single cells. Finally, we deconvolve Exome-Seq data from three distinct metastases from different body compartments of one melanoma patient and exhibit the evolutionary relationships of their subpopulations.
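
    The evolutionary assumption stated above implies a simple forward model: if each aberration arises in exactly one subclone and is inherited by all its descendants, the aggregate frequency observed for that aberration is the summed cell fraction of the originating subclone and its whole subtree. The sketch below computes only this forward mixing step for a made-up clonal tree. It is not the TrAp algorithm, which solves the inverse problem; all names and numbers are hypothetical.

```python
def aggregate_frequencies(parent: dict, fraction: dict) -> dict:
    """Aggregate frequency of the aberration that arose in each subclone.

    parent:   maps each subclone to its parent subclone (None for the root).
    fraction: maps each subclone to its share of cells in the sample.
    """
    children = {clone: [] for clone in parent}
    for clone, par in parent.items():
        if par is not None:
            children[par].append(clone)

    def subtree_total(clone):
        # An aberration is carried by its subclone of origin and all descendants.
        return fraction[clone] + sum(subtree_total(ch) for ch in children[clone])

    return {clone: subtree_total(clone) for clone in parent}


if __name__ == "__main__":
    # Hypothetical tree: normal -> A -> {B, C}
    parent = {"normal": None, "A": "normal", "B": "A", "C": "A"}
    fraction = {"normal": 0.5, "A": 0.25, "B": 0.125, "C": 0.125}
    print(aggregate_frequencies(parent, fraction))
```

    Deconvolution runs in the opposite direction: from the observed aggregate frequencies alone, it must recover both the tree and the fractions.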

    Comparative study of spinning field development in two species of araneophagic spiders (Araneae, Mimetidae, Australomimetus)

    External studies of spider spinning fields allow us to make inferences about internal silk gland biology, including what happens to silk glands when the spider molts. Such studies often focus on adults, but juveniles can provide additional insight on spinning apparatus development and character polarity. Here we document and describe spinning fields at all stadia in two species of pirate spider (Mimetidae: Australomimetus spinosus, A. djuka). Pirate spiders nest within the ecribellate orb-building spiders (Araneoidea), but are vagrant, araneophagic members that do not build prey-capture webs. Correspondingly, they lack aggregate and flagelliform silk glands (AG, FL), specialized for forming prey-capture lines in araneoid orb webs. However, occasional possible vestiges of an AG or FL spigot, as observed in one juvenile A. spinosus specimen, are consistent with secondary loss of AG and FL. By comparing spigots from one stadium to tartipores from the next stadium, silk glands can be divided into those that are tartipore-accommodated (T-A), and thus functional during proecdysis, and those that are not (non-T-A). Though evidence was more extensive in A. spinosus, it was likely true for both species that the number of non-T-A piriform silk glands (PI) was constant (two pairs) through all stadia, while numbers of T-A PI rose incrementally. The two species differed in that A. spinosus had T-A minor ampullate and aciniform silk glands (MiA, AC) that were absent in A. djuka. First instars of A. djuka, however, appeared to retain vestiges of T-A MiA spigots, consistent with a plesiomorphic state in which T-A MiA (called secondary MiA) are present. T-A AC have not previously been observed in Australomimetus and the arrangement of their spigots on posterior lateral spinnerets was unlike that seen thus far in other mimetid genera. 
Though new AC and T-A PI apparently form throughout much of a spider’s ontogeny, recurring spigot/tartipore arrangements indicated that AC and PI, after functioning during one stadium, were used again in each subsequent stadium (if non-T-A) or in alternate subsequent stadia (if T-A). In A. spinosus, sexual and geographic dimorphisms involving AC were noted. Cylindrical silk gland (CY) spigots were observed in mid-to-late juvenile, as well as adult, females of both species. Their use in juveniles, however, should not be assumed and only adult CY spigots had wide openings typical of mimetids. Neither species exhibited two pairs of modified PI spigots present in some adult male mimetids.