12 research outputs found

    Mutagenesis as a Diversity Enhancer and Preserver in Evolution Strategies

    Proceedings of: 9th International Symposium on Distributed Computing and Artificial Intelligence (DCAI 2012), Salamanca, March 28-30, 2012. Mutagenesis is a process which forces the coverage of certain zones of the search space during the generations of an evolution strategy, by keeping track of the covered ranges for the different variables in the so-called gene matrix. Originally introduced as a device to control the automated stopping criterion in a memetic algorithm, ESLAT, it also improved the exploration capabilities of the algorithm, even though this was considered a secondary matter and was not properly analyzed or tested. This work focuses on this diversity enhancement, redefining mutagenesis to strengthen it and measuring the improvement over a set of twenty-seven unconstrained optimization functions to provide statistically significant results. This work was supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02. Publicad
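The gene matrix idea in the abstract above can be illustrated with a minimal sketch. It is not the ESLAT implementation: the equal-width sub-ranges, the function names (`make_gene_matrix`, `record`, `mutagenesis`, `fully_covered`) and the rule of moving a single variable into a randomly chosen uncovered sub-range are all assumptions made for illustration.

```python
import random


def make_gene_matrix(n_vars, n_bins):
    """One row per variable, one flag per sub-range (False = not yet covered)."""
    return [[False] * n_bins for _ in range(n_vars)]


def bin_index(x, low, high, n_bins):
    """Map a value to the index of the equal-width sub-range that contains it."""
    frac = (x - low) / (high - low)
    return min(n_bins - 1, max(0, int(frac * n_bins)))


def record(matrix, individual, bounds):
    """Mark the sub-ranges visited by an individual as covered."""
    for i, (x, (low, high)) in enumerate(zip(individual, bounds)):
        matrix[i][bin_index(x, low, high, len(matrix[i]))] = True


def mutagenesis(matrix, individual, bounds, rng):
    """Return a copy of the individual with one variable moved into an
    uncovered sub-range (assumed rule); unchanged copy if none remain."""
    uncovered = [(i, j) for i, row in enumerate(matrix)
                 for j, seen in enumerate(row) if not seen]
    if not uncovered:
        return list(individual)
    i, j = rng.choice(uncovered)
    low, high = bounds[i]
    width = (high - low) / len(matrix[i])
    mutated = list(individual)
    mutated[i] = low + (j + rng.random()) * width  # random point inside sub-range j
    matrix[i][j] = True
    return mutated


def fully_covered(matrix):
    """True once every sub-range of every variable has been visited."""
    return all(all(row) for row in matrix)
```

In a sketch like this, `fully_covered` could double as a stopping signal, which mirrors the original role of the gene matrix described in the abstract.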

    The segmentation issue: general stopping criteria and specific design considerations for practical application of evolutionary algorithms

    Segmentation is a tool for the representation and approximation of data according to a set of appropriate models. These procedures have applications in many different domains, such as time series analysis, polygonal approximation and Air Traffic Control. Different heuristic and metaheuristic proposals have been introduced to deal with this issue. This thesis provides a novel multiobjective evolutionary method, analyzing the general tools required for the application of evolutionary algorithms to real problems and the specific modifications required over the different steps of general proposals to adapt them to the segmentation domain. An introduction to the domain is presented by means of the design of a specific heuristic for the segmentation of Air Traffic Control (ATC) data. This domain has a series of characteristics which make it difficult to tackle with traditional techniques: noisy data and a large number of measurements. The proposal works in two phases, using a pre-segmentation which introduces available domain information and then applying a standard technique over the results of this initial step. Its results in the presented domain, tested with a set of eight different representative trajectories, show competitive advantages compared to general approaches, which oversegment noisy data and, in some cases, exhibit poor scalability. This heuristic proposal shows the costly process of adapting available approaches and designing specific ones, along with the multi-objective nature of the problem, which requires the use of quality indicators for a proper comparison process. Applying evolutionary algorithms to segmentation provides several advantages, most notably because the problem dependence of heuristics makes it costly to adapt them to new domains, as illustrated by the heuristic designed for ATC. 
However, the practical application of these algorithms requires the study of a topic which has received little research effort from the community: stopping criteria. An evolutionary approach should contain a dynamic procedure which can determine when stagnation has taken place and stop the algorithm accordingly (as opposed to a-priori cost budgets, either in function evaluations or generations, which are usually applied for test datasets). Stopping criteria have been addressed for both the single and multi-objective cases in this thesis. Single-objective stopping criteria have been approached by giving the stopping criterion an active role, increasing the diversity in the variable space while tracking the updates in the fitness function. Thus, the algorithm reuses the information obtained for the stopping decision and feeds it to a stopping prevention mechanism in order to prevent problematic situations such as early convergence. The presented algorithm has been tested on a set of 27 different functions, with different characteristics regarding their dimensionality, search space, local minima, etc. The results show that the introduced mechanisms enhance the robustness of the results, due to the improved exploration and the early convergence prevention. Multi-objective stopping criteria are addressed with the use of progress indicators (comparison measures of the quality of the evolution results at different generations) and an associated data gathering tool. The final proposal uses three different progress indicators (hypervolume, epsilon and Mutual Dominance Rate) and considers them jointly according to a decision fusion architecture. The stagnation analysis is based on the least squares regression parameters of the indicator values, including a normality analysis as well. 
The online nature of these algorithms is highlighted, avoiding the recomputation of indicator values which was present in other available alternatives, and also focusing on the simplicity of the final proposal, in order to reduce the cost of introducing it into available algorithms. The proposal has been tested with instances of the DTLZ test problem family, obtaining satisfactory stops with a standard set of configuration values for the technique. However, there is a lack of quantitative measures to determine the objective quality of a stop and to properly compare its value to other alternatives. The multi-objective nature of the segmentation problem is analyzed to propose a multiobjective evolutionary algorithm (MOEA) to deal with it. This nature is analyzed according to a selection of available approaches, highlighting the difficulties which had to be faced in the parameter configuration in order to guide the processes to the desired solution values. A multi-objective a-posteriori approach such as the one presented allows the decision maker to choose, from the front of possible final solutions, the one which suits them best, simplifying this process. The presented approach chooses SPEA2 as its underlying MOEA, analyzing different representation and initialization proposals. The results have been validated against a representative set of heuristic and metaheuristic techniques, using three widely used curves from the polygonal approximation domain (chromosome, leaf and semicircle), obtaining statistically better results for almost all the different test cases. This initial MOEA approach had unresolved issues, such as the complexity order of the archiving technique, and also lacked the specific design considerations needed to adapt it to the application domain. These issues have been addressed through different improvements. 
First of all, an alternative representation is proposed, including partial fitness information and associated fitness-aware transformation operators (transformation operators which compute children's fitness values according to their changes and the parents' partial values). A novel archiving procedure is introduced based on the bi-objective nature of the domain, in which one of the objectives is discrete. This leads to a relaxed Pareto dominance check, named epsilon glitches. Multi-objective local search versions of the traditional algorithms are proposed and tested for the initialization of the algorithm, along with the stopping criterion proposal, which has also been adapted to the problem characteristics. The archive size in this case is big enough to contain all the different individuals in the optimal front, such that quality assessment is simplified and a simpler mechanism can be introduced to detect stagnation, according to the improvements in each of the possible individuals. The final evolutionary proposal is scalable, requires few configuration parameters and introduces an efficient dynamic stopping criterion. Its results have been tested against the original technique and the set of heuristic and metaheuristic techniques previously used, including the three original curves and also more complex versions of them (obtained with an introduced generation mechanism according to these original shapes). Even though the stopping results are very satisfactory, the obtained results are slightly worse than those of the original MOEA for the three simpler problem instances with the established configuration parameters (as was expected, due to the computational effort of the a-priori established number of generations and population size, based on the analysis of the algorithm's results). 
However, the comparison against the alternative techniques still shows the same statistically better results, and its reduced computational cost allows its application to a wider set of problems. Official Doctoral Program in Computer Science and Technology. President: Pedro Isasi Viñuela. Secretary: Rafael Martínez Tomás. Member: Javier Segovia Pére
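The regression-based stagnation analysis mentioned in the abstract above can be sketched as follows. This is an illustrative simplification, not the thesis's procedure: it fits an ordinary least squares line to the most recent values of each progress indicator and treats a near-zero slope as stagnation. The window length, the tolerance and the majority-vote fusion rule are assumptions, and the normality analysis is omitted.

```python
def ols_slope(values):
    """Ordinary least squares slope of values against generation index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to fit a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


def is_stagnated(history, window=10, tol=1e-3):
    """True when the trend over the last `window` values is flat (assumed rule)."""
    if len(history) < window:
        return False  # not enough evidence yet
    return abs(ols_slope(history[-window:])) < tol


def fused_stop(histories, window=10, tol=1e-3):
    """Majority vote over indicators; the fusion rule is an assumption here."""
    votes = sum(is_stagnated(h, window, tol) for h in histories.values())
    return votes * 2 > len(histories)
```

Because each call only needs the stored indicator histories, a check like this can run once per generation without recomputing earlier indicator values.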

    The Role of RNA Editing in Cancer Development and Metabolic Disorders

    Numerous human diseases arise from alterations of genetic information, most notably DNA mutations. Because RNA was thought to be merely the intermediate between DNA and protein, changes in RNA sequence were an afterthought until the discovery of RNA editing 30 years ago. RNA editing alters RNA sequence without altering the sequence or integrity of genomic DNA. The most common RNA editing events are A-to-I changes mediated by adenosine deaminase acting on RNA (ADAR), and C-to-U editing mediated by apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1 (APOBEC1). Both A-to-I and C-to-U editing were first identified in the context of embryonic development and physiological homeostasis. The role of RNA editing in human disease has only recently started to be understood. In this review, the impact of RNA editing on the development of cancer and metabolic disorders will be examined. Distinctive functions of each RNA editase that regulates either A-to-I or C-to-U editing will be highlighted, along with important regulatory mechanisms governing these processes. The potential of developing novel therapeutic approaches through intervention in RNA editing will be explored. As the role of RNA editing in human disease is elucidated, the clinical utility of RNA editing targeted therapies will need to be established. This review aims to serve as a bridge of information between past findings and future directions of RNA editing in the context of cancer and metabolic disease.

    A High-Throughput Approach to Uncover Novel Roles of APOBEC2, a Functional Orphan of the AID/APOBEC Family

    APOBEC2 is a member of the AID/APOBEC cytidine deaminase family of proteins. Unlike most of the AID/APOBEC family, however, APOBEC2's function remains elusive. Previous research has implicated APOBEC2 in diverse organisms and cellular processes such as muscle biology (in Mus musculus), regeneration (in Danio rerio), and development (in Xenopus laevis). APOBEC2 has also been implicated in cancer. However, the enzymatic activity, substrate or physiological target(s) of APOBEC2 are unknown. For this thesis, I have combined Next Generation Sequencing (NGS) techniques with state-of-the-art molecular biology to determine the physiological targets of APOBEC2. Using a cell culture muscle differentiation system and RNA sequencing (RNA-Seq) by polyA capture, I demonstrated that, unlike the AID/APOBEC family member APOBEC1, APOBEC2 is not an RNA editor. Using the same system combined with enhanced Reduced Representation Bisulfite Sequencing (eRRBS) analyses, I showed that, unlike the AID/APOBEC family member AID, APOBEC2 does not act as a 5-methyl-C deaminase. Finally, using a combination of biochemical, Chromatin Immunoprecipitation Sequencing (ChIP-Seq) and polyA RNA-Seq analyses, I show that APOBEC2 is a (negative) regulator of gene expression (at least in muscle cells) and binds chromatin directly to inhibit transcription of genes involved in muscle cell differentiation. While the precise mechanism behind this activity is still a matter of investigation, this role of APOBEC2 in inhibiting genes involved in cell cycle exit might have implications for its role in cancer.

    The role of notch signalling in colorectal cancer

    Analysis of the p53 Regulator MDM2 and the Identification of the Novel p53 Target Gene LRP1

    The transcription factor p53 responds to many stresses and regulates many different pathways. The earliest characterized functions of p53 include the induction of cell cycle arrest, apoptosis, and senescence; however, more recent studies have shown that p53 regulates other pathways, including lipid and glucose metabolism, DNA damage repair, and autophagy. While the activation of many of these pathways likely overlaps in many contexts, a current model proposes that p53 plays a critical role in deciding cell fate in response to stress. Indeed, depending on the type and severity of stress, p53 can induce genes that promote the resolution of cellular damage thereby allowing the cell to continue to proliferate, or p53 can induce genes that promote apoptosis to prevent the cell from propagating deleterious mutations. The chief negative regulators of p53, MDM2 and its homologous binding partner MDMX, are overexpressed in many cancers, especially those with wild-type p53. Although MDM2 and MDMX have been intensely studied, basic aspects regarding their interaction, such as how MDM2 preferentially heterooligomerizes with MDMX over homooligomerizing with MDM2, remain unknown. In my research, I generated multiple MDM2 mutant constructs to test their ability to homooligomerize with MDM2 and heterooligomerize with MDMX. Surprisingly, despite many studies suggesting that the C-terminal Really Interesting New Gene (RING) domain is critical for both MDM2 homooligomerization and MDM2-MDMX heterooligomerization, my results show that MDM2 RING structural mutations that prevent MDM2 enzymatic function and MDMX binding retain the ability to homooligomerize. Interestingly, deletion of the regulatory central acidic domain of MDM2 inhibits the ability of MDM2 to homooligomerize but does not impede its ability to heterooligomerize with MDMX, suggesting that MDM2-MDM2 homooligomerization and MDM2-MDMX heterooligomerization occur through different mechanisms. 
In another study, I identified the gene low-density lipoprotein receptor related protein 1 (LRP1) as a novel p53 target gene. Further analysis revealed that LRP1 protein induction occurs in response to sub-lethal but not lethal p53-activating stresses. Interestingly, although lethal p53-activating stress can induce LRP1 transcription, protein expression is impeded at the translational level. Collectively, these studies contribute to our knowledge of p53 regulation as well as the p53 regulome. Doctor of Philosoph

    Discovery of novel molecular and biochemical predictors of response and outcome in diffuse large B-cell lymphoma

    PhD. Diffuse large B-cell lymphoma (DLBCL) is the commonest form of non-Hodgkin lymphoma and responds to treatment with a 5-year overall survival (OS) of 40-50%. Predicting outcome using the best available method, the International Prognostic Index (IPI), is inaccurate and unsatisfactory. This thesis describes research undertaken to discover, explore and validate new molecular and biochemical predictors of response and long-term outcome, with the aims of improving on the inaccurate IPI and of suggesting novel therapeutic approaches. Two strategies were adopted: a rational and an empirical approach. The rational strategy used gene expression profiling to identify transcriptional signatures that correlated with outcome to treatment, from which a 13-gene model was derived that accurately predicts long-term OS. Two components of the 13-gene model, PKC and PDE4B, were studied using inhibitors in lymphoma cell-lines and primary cell cultures. PKC inhibition using SC-236 proved to be cytostatic and cytotoxic in the cell-lines examined and, to a lesser extent, in primary tumours. PDE4 inhibition using piclamilast and rolipram had no effect either alone or in combination with chemotherapy. The empirical approach investigated the trace element selenium in presentation serum and found that it was a biochemical predictor of response and outcome to treatment. In an attempt to provide evidence of a causal relationship as an explanation for the associations between presentation serum selenium, response and outcome, two selenium compounds, methylseleninic acid (MSA) and selenodiglutathione (SDG), were studied in vitro in the same lymphoma cell-lines and primary cell cultures. Both MSA and SDG exhibited cytostatic and cytotoxic activity and caspase-8 and caspase-9 driven apoptosis. 
For SDG, reactive oxygen species generation was important for its activity in three of the four cell-lines. In conclusion, molecular and biochemical predictors of response and survival were discovered in DLBCL that led to viable targets for drug intervention being validated in vitro.

    Overexpression and purification of membrane proteins in yeast

    Effects of Irrigation Rate and Planting Density on Maize Yield and Water Use Efficiency in the Temperate Climate of Serbia

    Scarce water resources severely limit maize (Zea mays L.) cultivation in the temperate regions of northern Serbia. A two-year field experiment was conducted to investigate the effects of irrigation and planting density on yield and water use efficiency in a temperate climate under sprinkler irrigation. The experiment included five irrigation treatments (full irrigated treatment, FIT; 80% FIT; 60% FIT; 40% FIT; and rainfed) and three planting densities (PD1: 54,900 plants ha⁻¹; PD2: 64,900 plants ha⁻¹; PD3: 75,200 plants ha⁻¹). Irrigation increased yield by 1.05–80.00% compared to the rainfed crop. Results showed that decreasing irrigation rates resulted in a decrease in yield, crop water use efficiency (WUE), and irrigation water use efficiency (IWUE). Planting density had significant effects on yield, WUE, and IWUE, and these effects differed between the two years. Increasing planting density gradually increased yield, WUE, and IWUE. For the pooled data, irrigation rate, planting density and their interaction were significant (P < 0.05). The highest two-year average yield, WUE, and IWUE were found for FIT-PD3 (14,612 kg ha⁻¹), rainfed-PD2 (2.764 kg m⁻³), and 60% FIT-PD3 (2.356 kg m⁻³), respectively. The results revealed that irrigation is necessary for maize cultivation because rainfall is insufficient to meet the crop water needs. In addition, if water becomes a limiting factor, 80% FIT-PD3, with an average yield loss of 15%, would be the best agronomic practice for growing maize with a sprinkler irrigation system in the temperate climate of Serbia.
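The water use efficiency figures in the abstract above are reported in kg m⁻³, which implies dividing a yield in kg ha⁻¹ by a water volume in m³ ha⁻¹. A minimal sketch of that unit arithmetic follows, assuming the common definition of efficiency as yield per unit of water. The abstract does not state the exact formulas the authors used for WUE and IWUE, so the function names and the example inputs are hypothetical.

```python
MM_TO_M3_PER_HA = 10.0  # 1 mm of water over 1 ha (10,000 m^2) equals 10 m^3


def water_use_efficiency(yield_kg_ha, water_mm):
    """Yield per unit of water, in kg per cubic metre (assumed definition)."""
    if water_mm <= 0:
        raise ValueError("water depth must be positive")
    return yield_kg_ha / (water_mm * MM_TO_M3_PER_HA)


def relative_yield_loss(full_yield_kg_ha, deficit_yield_kg_ha):
    """Fraction of the fully irrigated yield lost under deficit irrigation."""
    return (full_yield_kg_ha - deficit_yield_kg_ha) / full_yield_kg_ha


# Hypothetical inputs, not values taken from the study.
wue = water_use_efficiency(yield_kg_ha=12000.0, water_mm=500.0)
loss = relative_yield_loss(full_yield_kg_ha=10000.0, deficit_yield_kg_ha=8500.0)
```

Passing crop water use (evapotranspiration) as `water_mm` would give a WUE-style figure, while passing the applied irrigation depth would give an IWUE-style figure under this assumed definition.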