39,464 research outputs found
Recommended from our members
XML-based genetic rules for scene boundary detection in a parallel processing environment
Genetic programming is based on Darwinian evolutionary theory that suggests that the best solution for a problem can be evolved by methods of natural selection of the fittest organisms in a population. These principles are translated into genetic programming by populating the solution space with an initial number of computer programs that can possibly solve the problem and then evolving the programs by means of mutation, reproduction and crossover until a candidate solution can be found that is close to or is the optimal solution for the problem. The computer programs are not fully formed source code but rather a derivative that is represented as a parse tree. The initial solutions are randomly generated and set to a certain population size that the system can compute efficiently. Research has shown that better solutions can be obtained if 1) the population size is increased and 2) if multiple runs are performed of each experiment. If multiple runs are initiated on many machines the probability of finding an optimal solution are increased exponentially and computed more efficiently. With the proliferation of the web and high speed bandwidth connections genetic programming can take advantage of grid computing to both increase population size and increasing the number of runs by utilising machines connected to the web. Using XML-Schema as a global referencing mechanism for defining the parameters and syntax of the evolvable computer programs all machines can synchronise ad-hoc to the ever changing environment of the solution space. Another advantage of using XML is that rules are constructed that can be transformed by XSLT or DOM tree viewers so they can be understood by the GP programmer. This allows the programmer to experiment by manipulating rules to increase the fitness of a rule and evaluate the selection of parameters used to define a solution
TrAp: a Tree Approach for Fingerprinting Subclonal Tumor Composition
Revealing the clonal composition of a single tumor is essential for
identifying cell subpopulations with metastatic potential in primary tumors or
with resistance to therapies in metastatic tumors. Sequencing technologies
provide an overview of an aggregate of numerous cells, rather than
subclonal-specific quantification of aberrations such as single nucleotide
variants (SNVs). Computational approaches to de-mix a single collective signal
from the mixed cell population of a tumor sample into its individual components
are currently not available. Herein we propose a framework for deconvolving
data from a single genome-wide experiment to infer the composition, abundance
and evolutionary paths of the underlying cell subpopulations of a tumor. The
method is based on the plausible biological assumption that tumor progression
is an evolutionary process where each individual aberration event stems from a
unique subclone and is present in all its descendants subclones. We have
developed an efficient algorithm (TrAp) for solving this mixture problem. In
silico analyses show that TrAp correctly deconvolves mixed subpopulations when
the number of subpopulations and the measurement errors are moderate. We
demonstrate the applicability of the method using tumor karyotypes and somatic
hypermutation datasets. We applied TrAp to SNV frequency profile from Exome-Seq
experiment of a renal cell carcinoma tumor sample and compared the mutational
profile of the inferred subpopulations to the mutational profiles of twenty
single cells of the same tumor. Despite the large experimental noise, specific
co-occurring mutations found in clones inferred by TrAp are also present in
some of these single cells. Finally, we deconvolve Exome-Seq data from three
distinct metastases from different body compartments of one melanoma patient
and exhibit the evolutionary relationships of their subpopulations
Learning mutational graphs of individual tumour evolution from single-cell and multi-region sequencing data
Background. A large number of algorithms is being developed to reconstruct
evolutionary models of individual tumours from genome sequencing data. Most
methods can analyze multiple samples collected either through bulk multi-region
sequencing experiments or the sequencing of individual cancer cells. However,
rarely the same method can support both data types.
Results. We introduce TRaIT, a computational framework to infer mutational
graphs that model the accumulation of multiple types of somatic alterations
driving tumour evolution. Compared to other tools, TRaIT supports multi-region
and single-cell sequencing data within the same statistical framework, and
delivers expressive models that capture many complex evolutionary phenomena.
TRaIT improves accuracy, robustness to data-specific errors and computational
complexity compared to competing methods.
Conclusions. We show that the application of TRaIT to single-cell and
multi-region cancer datasets can produce accurate and reliable models of
single-tumour evolution, quantify the extent of intra-tumour heterogeneity and
generate new testable experimental hypotheses
Mitochondrial and nuclear genes suggest that stony corals are monophyletic but most families of stony corals are not (Order Scleractinia, Class Anthozoa, Phylum Cnidaria)
Modern hard corals (Class Hexacorallia; Order Scleractinia) are widely studied because of their fundamental role in reef
building and their superb fossil record extending back to the Triassic. Nevertheless, interpretations of their evolutionary
relationships have been in flux for over a decade. Recent analyses undermine the legitimacy of traditional suborders,
families and genera, and suggest that a non-skeletal sister clade (Order Corallimorpharia) might be imbedded within the
stony corals. However, these studies either sampled a relatively limited array of taxa or assembled trees from heterogeneous
data sets. Here we provide a more comprehensive analysis of Scleractinia (127 species, 75 genera, 17 families) and various
outgroups, based on two mitochondrial genes (cytochrome oxidase I, cytochrome b), with analyses of nuclear genes (Ăźtubulin,
ribosomal DNA) of a subset of taxa to test unexpected relationships. Eleven of 16 families were found to be
polyphyletic. Strikingly, over one third of all families as conventionally defined contain representatives from the highly
divergent "robust" and "complex" clades. However, the recent suggestion that corallimorpharians are true corals that have
lost their skeletons was not upheld. Relationships were supported not only by mitochondrial and nuclear genes, but also
often by morphological characters which had been ignored or never noted previously. The concordance of molecular
characters and more carefully examined morphological characters suggests a future of greater taxonomic stability, as well as
the potential to trace the evolutionary history of this ecologically important group using fossils
Evolutionary distances in the twilight zone -- a rational kernel approach
Phylogenetic tree reconstruction is traditionally based on multiple sequence
alignments (MSAs) and heavily depends on the validity of this information
bottleneck. With increasing sequence divergence, the quality of MSAs decays
quickly. Alignment-free methods, on the other hand, are based on abstract
string comparisons and avoid potential alignment problems. However, in general
they are not biologically motivated and ignore our knowledge about the
evolution of sequences. Thus, it is still a major open question how to define
an evolutionary distance metric between divergent sequences that makes use of
indel information and known substitution models without the need for a multiple
alignment. Here we propose a new evolutionary distance metric to close this
gap. It uses finite-state transducers to create a biologically motivated
similarity score which models substitutions and indels, and does not depend on
a multiple sequence alignment. The sequence similarity score is defined in
analogy to pairwise alignments and additionally has the positive semi-definite
property. We describe its derivation and show in simulation studies and
real-world examples that it is more accurate in reconstructing phylogenies than
competing methods. The result is a new and accurate way of determining
evolutionary distances in and beyond the twilight zone of sequence alignments
that is suitable for large datasets.Comment: to appear in PLoS ON
- …