433 research outputs found

    Cancer progression models and fitness landscapes: A many-to-many relationship

    Motivation: The identification of constraints, due to gene interactions, in the order of accumulation of mutations during cancer progression can allow us to single out therapeutic targets. Cancer progression models (CPMs) use genotype frequency data from cross-sectional samples to identify these constraints, and return Directed Acyclic Graphs (DAGs) of restrictions where arrows indicate dependencies or constraints. On the other hand, fitness landscapes, which map genotypes to fitness, contain all possible paths of tumor progression. Thus, we expect a correspondence between DAGs from CPMs and the fitness landscapes where evolution happened. But many fitness landscapes - e.g. those with reciprocal sign epistasis - cannot be represented by CPMs. Results: Using simulated data under 500 fitness landscapes, I show that CPMs' performance (prediction of genotypes that can exist) degrades with reciprocal sign epistasis. There is large variability in the DAGs inferred from each landscape, which is also affected by mutation rate, detection regime and fitness landscape features, in ways that depend on CPM method. Using three cancer datasets, I show that these problems strongly affect the analysis of empirical data: fitness landscapes that are widely different from each other produce data similar to the empirically observed ones and lead to DAGs that infer very different restrictions. Because reciprocal sign epistasis can be common in cancer, these results question the use and interpretation of CPMs. This study was supported by BFU2015-67302-R (MINECO/FEDER, EU
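
    As a rough illustration of the reciprocal sign epistasis mentioned in the abstract above, the following sketch classifies the interaction between two loci from four fitness values. The function name and the toy fitness values are assumptions made for illustration; this is not the simulation or analysis code used in the study.

```python
def epistasis_type(w00, w10, w01, w11):
    """Classify two-locus epistasis from four fitness values.

    w00: wild type, w10: only mutation A, w01: only mutation B,
    w11: both mutations. Values are illustrative (hypothetical).
    """
    # Fitness effect of A in each background of B, and of B in each background of A.
    a_without_b = w10 - w00
    a_with_b = w11 - w01
    b_without_a = w01 - w00
    b_with_a = w11 - w10

    # A mutation shows sign epistasis if its effect changes sign with the background.
    a_flips = a_without_b * a_with_b < 0
    b_flips = b_without_a * b_with_a < 0

    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "sign"
    if a_without_b != a_with_b:
        return "magnitude"
    return "none"


# Toy landscape: each single mutant is less fit than the wild type,
# but the double mutant is the fittest genotype.
print(epistasis_type(1.0, 0.8, 0.8, 1.5))
```

    In the toy landscape, each mutation is deleterious alone and beneficial in the presence of the other, so both effects change sign with the genetic background.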

    PaLS: filtering common literature, biological terms and pathway information

    Many biological experiments and their subsequent analysis yield lists of genes or proteins that can potentially be important to the prognosis or diagnosis of certain diseases (e.g. cancer). Information about the function of those genes or proteins may already be available in databases, but it is essential to determine whether some members of those lists share a function or belong to the same metabolic pathway. To help researchers filter those genes or proteins that have such information in common, we have developed PaLS (pathway and literature strainer, http://pals.bioinfo.cnio.es). PaLS takes a list or a set of lists of gene or protein identifiers and shows which ones share certain descriptors. Four publicly available databases have been used for this purpose: PubMed, which links genes with the articles that refer to them; Gene Ontology, an annotated ontology of terms related to the cellular component, biological process or molecular function in which those genes or proteins are involved; KEGG pathways; and Reactome pathways. PaLS highlights the descriptors, across these four sources of information, that are shared by the largest number of members of the list (or lists).
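
    The filtering idea described above (finding descriptors shared by many members of a gene list) can be sketched as follows. This is a minimal sketch, not the PaLS implementation: the function name, the `min_fraction` threshold, and the gene and term identifiers are all hypothetical.

```python
from collections import Counter


def shared_descriptors(annotations, min_fraction=0.5):
    """Return (descriptor, count) pairs shared by at least `min_fraction`
    of the genes, most widely shared first.

    `annotations` maps each gene identifier to an iterable of descriptors
    (e.g. article IDs, ontology terms, pathway names).
    """
    n_genes = len(annotations)
    if n_genes == 0:
        return []
    # Count each descriptor once per gene, even if it is listed repeatedly.
    counts = Counter(d for descs in annotations.values() for d in set(descs))
    hits = [(d, c) for d, c in counts.items() if c / n_genes >= min_fraction]
    # Sort by decreasing count, then alphabetically for a stable order.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical gene list with made-up descriptors.
toy = {
    "geneA": {"term1", "term2"},
    "geneB": {"term1"},
    "geneC": {"term1", "term3"},
    "geneD": {"term2"},
}
print(shared_descriptors(toy, min_fraction=0.5))
```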

    Conditional prediction of consecutive tumor evolution using cancer progression models: What genotype comes next?

    Accurate prediction of tumor progression is key for adaptive therapy and precision medicine. Cancer progression models (CPMs) can be used to infer dependencies in mutation accumulation from cross-sectional data and provide predictions of tumor progression paths. However, their performance when predicting complete evolutionary trajectories is limited by violations of assumptions and the size of available data sets. Instead of predicting full tumor progression paths, here we focus on short-term predictions, more relevant for diagnostic and therapeutic purposes. We examine whether five distinct CPMs can be used to answer the question “Given that a genotype with n mutations has been observed, what genotype with n + 1 mutations is next in the path of tumor progression?” or, in short, “What genotype comes next?”. Using simulated data we find that under specific combinations of genotype and fitness landscape characteristics CPMs can provide predictions of short-term evolution that closely match the true probabilities, and that some genotype characteristics can be much more relevant than global features. Application of these methods to 25 cancer data sets shows that their use is hampered by a lack of information needed to make principled decisions about method choice. Fruitful use of these methods for short-term predictions requires adapting the methods’ use to local genotype characteristics and obtaining reliable indicators of performance; it will also be necessary to clarify the interpretation of the methods’ results when key assumptions do not hold. Supported by grant BFU2015-67302-R (MINECO/FEDER, EU) funded by MCIN/AEI/10.13039/501100011033 and by ERDF “A way of making Europe”, and by grant PID2019-111256RBI00 funded by MCIN/AEI/10.13039/501100011033 to RDU. JDC supported by PEJD-2018-POST/BMD-8960 from Comunidad de Madrid to RDU. Funder: Gobierno de España (BFU2015-67302-R, PID2019-111256RBI00).
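
    To make the question “What genotype comes next?” concrete, the sketch below assumes that a model supplies, for the observed genotype, a rate of gaining each remaining mutation. Under that assumption (competing exponential waiting times), the probability that a given mutation is acquired next is its rate divided by the sum of rates. The function name, the rate values, and the gene labels are hypothetical; the abstract does not specify how the five CPMs compute their predictions.

```python
def next_genotype_probs(genotype, gain_rates):
    """Conditional probabilities of each genotype with one more mutation.

    genotype:   iterable of mutations already present.
    gain_rates: dict mapping each gene to the (assumed) rate at which it is
                gained from the current genotype; a rate of 0 means the
                mutation cannot be gained yet.
    """
    current = frozenset(genotype)
    # Only mutations not yet present, with a positive rate, can come next.
    eligible = {g: r for g, r in gain_rates.items() if g not in current and r > 0}
    total = sum(eligible.values())
    if total == 0:
        return {}
    return {current | {g}: r / total for g, r in eligible.items()}


# Hypothetical example: genotype {A} observed; B and C can be gained next.
probs = next_genotype_probs({"A"}, {"A": 5.0, "B": 1.0, "C": 3.0})
for nxt, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(sorted(nxt), p)
```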

    Detection of Recurrent Copy Number Alterations in the Genome: a Probabilistic Approach

    Copy number variation (CNV) in genomic DNA is linked to a variety of human diseases (including cancer, HIV acquisition, autoimmune and neurodegenerative diseases), and array-based CGH (aCGH) is currently the main technology to locate CNVs. Several methods can analyze aCGH data at the single sample level, but disease-critical genes are more likely to be found in regions that are common or recurrent among samples. Unfortunately, defining recurrent CNV regions remains a challenge. Moreover, the heterogeneous nature of many diseases requires that we search for CNVs that affect only some subsets of the samples (without prior knowledge of which regions and subsets of samples are affected), but this is neglected by current methods. We have developed two methods to define recurrent CNV regions. Our methods are unique and qualitatively different from existing approaches: they detect both regions over the complete set of arrays and alterations that are common only to some subsets of the samples and, thus, CNV alterations that might characterize previously unknown groups; they use probabilities of alteration as input (not discretized gain/loss calls, which discard uncertainty and variability) and return probabilities of being a shared common region, thus allowing researchers to modify thresholds as needed; the two parameters of the methods have an immediate, straightforward, biological interpretation. Using data from previous studies, we show that we can detect patterns that other methods miss and, by using probabilities, that researchers can modify, as needed, thresholds of immediate interpretability to answer specific research questions. These methods are a qualitative advance in the location of recurrent CNV regions and will be instrumental in efforts to standardize definitions of recurrent CNVs and cluster samples with respect to patterns of CNV, and ultimately in the search for genomic regions harboring disease-critical genes.

    Finding Recurrent Regions of Copy Number Variation: A Review

    Copy number variation (CNV) in genomic DNA is linked to a variety of human diseases, and array-based CGH (aCGH) is currently the main technology to locate CNVs. Although many methods have been developed to analyze aCGH from a single array/subject, disease-critical genes are more likely to be found in regions that are common or recurrent among subjects. Unfortunately, finding recurrent CNV regions remains a challenge. We review existing methods for the identification of recurrent CNV regions. The working definition of "common" or "recurrent" region differs between methods, leading to approaches that use different types of input (discretized output from a previous CGH segmentation analysis or intensity ratios), or that incorporate biological considerations to varying degrees (which play a role in the identification of "interesting" regions and in the details of null models used to assess statistical significance). Very few approaches use and/or return probabilities, and code is not easily available for several methods. We suggest that finding recurrent CNVs could benefit from reframing the problem in a biclustering context. We also emphasize that, when analyzing data from complex diseases with significant among-subject heterogeneity, methods should be able to identify CNVs that affect only a subset of subjects. We make some recommendations about choice among existing methods, and we suggest further methodological research.

    Multiculturalism in five-year-old children in early childhood education

    In Peru, and especially in Cajamarca, there are many multicultural communities: in the province of San Ignacio, in its districts of Huarango, La Coipa, Namballe and San José de Lourdes, and also in the province of Jaén, in the districts of Pomahuaca, Pucará and Santa Rosa. There, students in the first years of schooling, that is, children in early childhood education, have great difficulty communicating with their peers because they speak different languages; this is where the teacher plays a very important role in enabling the children to communicate fluently. For this reason, this research work aims to provide the strategies the teacher needs to communicate fluently and respectfully.

    Asterias: a parallelized web-based suite for the analysis of expression and aCGH data

    Asterias (http://www.asterias.info) is an integrated collection of freely-accessible web tools for the analysis of gene expression and aCGH data. Most of the tools use parallel computing (via MPI). Most of our applications allow the user to obtain additional information for user-selected genes by using clickable links in tables and/or figures. Our tools include: normalization of expression and aCGH data; converting between different types of gene/clone and protein identifiers; filtering and imputation; finding differentially expressed genes related to patient class and survival data; searching for models of class prediction; using random forests to search for minimal models for class prediction or for large subsets of genes with predictive capacity; searching for molecular signatures and predictive genes with survival data; detecting regions of genomic DNA gain or loss. The capability to send results between different applications, access to additional functional information, and parallelized computation make our suite unique and exploit features only available to web-based applications. Comment: web based application; 3 figures

    Engagement at two financial institutions in Chiclayo, 2016.

    This research aims to determine the difference in engagement between workers of two financial institutions in Chiclayo. The research is quantitative and basic, with a non-experimental design and an a posteriori comparison method, using a non-probabilistic convenience sample of workers from the two institutions with more than one year under contract. The data collection instrument was the "Work Involvement or Organizational Commitment Questionnaire (UWES)" by HallBerg and Schaufeli (1999). The results indicate statistically significant differences in engagement between the public and the private financial institution (p < 0.01), with workers at the public institution showing lower engagement than workers at the private institution [Pm=47.19; PM=109.16]. Likewise, statistically significant differences between the public and private institutions were found in the engagement dimensions of vigor, dedication and absorption [p < 0.01], with results favoring the private institution. Thesis

    Global epistasis on fitness landscapes

    Epistatic interactions between mutations add substantial complexity to adaptive landscapes, and are often thought of as detrimental to our ability to predict evolution. Yet, patterns of global epistasis, in which the fitness effect of a mutation is well-predicted by the fitness of its genetic background, may actually be of help in our efforts to reconstruct fitness landscapes and infer adaptive trajectories. Microscopic interactions between mutations, or inherent nonlinearities in the fitness landscape, may cause global epistasis patterns to emerge. In this brief review, we provide a succinct overview of recent work about global epistasis, with an emphasis on building intuition about why it is often observed. To this end, we reconcile simple geometric reasoning with recent mathematical analyses, using these to explain why different mutations in an empirical landscape may exhibit different global epistasis patterns - ranging from diminishing to increasing returns. Finally, we highlight open questions and research directions. Comment: 20 pages, 4 figures
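
    One simple way to picture a global epistasis pattern, as described above, is to regress a mutation's fitness effect on the fitness of the background in which it is measured. The sketch below fits an ordinary least-squares line and labels the trend by the sign of the slope. The linear form, the function names, and the toy numbers are assumptions for illustration only, and are not taken from the review.

```python
def fit_global_epistasis(background_fitness, effects):
    """Least-squares line: effect = intercept + slope * background fitness."""
    n = len(background_fitness)
    if n < 2 or n != len(effects):
        raise ValueError("need at least two paired observations")
    mean_x = sum(background_fitness) / n
    mean_y = sum(effects) / n
    sxx = sum((x - mean_x) ** 2 for x in background_fitness)
    if sxx == 0:
        raise ValueError("background fitness values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(background_fitness, effects))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def returns_pattern(slope, tol=1e-9):
    """Label the trend of a mutation's effect with background fitness."""
    if slope < -tol:
        return "diminishing returns"
    if slope > tol:
        return "increasing returns"
    return "no global trend"


# Hypothetical measurements: the same mutation is less beneficial
# in fitter backgrounds.
slope, intercept = fit_global_epistasis([1.0, 2.0, 3.0], [0.5, 0.25, 0.0])
print(slope, intercept, returns_pattern(slope))
```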