
    Relatedness Measures to Aid the Transfer of Building Blocks among Multiple Tasks

    Multitask Learning is a learning paradigm that deals with multiple different tasks in parallel and transfers knowledge among them. XOF, a Learning Classifier System using tree-based programs to encode building blocks (meta-features), constructs and collects features with rich discriminative information for classification tasks in an observed list. This paper seeks to facilitate the automation of feature transfer between tasks by utilising the observed list. We hypothesise that the best discriminative features of a classification task carry its characteristics. Therefore, the relatedness between any two tasks can be estimated by comparing their most appropriate patterns. We propose a multiple-XOF system, called mXOF, that can dynamically adapt feature transfer among XOFs. This system utilises the observed list to estimate task relatedness, which enables the automation of feature transfer. In terms of knowledge discovery, the resemblance estimation provides insightful relations among multiple datasets. We experimented with mXOF on various scenarios, e.g. representative Hierarchical Boolean problems, classification of distinct classes in the UCI Zoo dataset, and unrelated tasks, to validate its abilities of automatic knowledge transfer and task-relatedness estimation. Results show that mXOF can reasonably estimate the relatedness between multiple tasks to aid learning performance with dynamic feature transfer. Comment: accepted by The Genetic and Evolutionary Computation Conference (GECCO 2020).
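
    As an illustration of the relatedness idea described above, the following sketch estimates how related two tasks are from the overlap of their observed lists of best discriminative features. The set-based encoding and the Jaccard-style overlap are assumptions made for illustration; they are not the exact relatedness measure or transfer rule used in mXOF.

        # Illustrative sketch: estimating task relatedness from the "observed lists"
        # of discriminative features. The overlap measure below is a hypothetical
        # stand-in for mXOF's relatedness estimate, not the paper's formula.

        def relatedness(observed_a, observed_b):
            """Jaccard overlap between two tasks' best-feature signatures."""
            if not observed_a and not observed_b:
                return 0.0
            return len(observed_a & observed_b) / len(observed_a | observed_b)

        def transfer_candidates(observed_a, observed_b, threshold=0.3):
            """Features worth sending from task A to task B when the two tasks
            look sufficiently related."""
            if relatedness(observed_a, observed_b) >= threshold:
                return observed_a - observed_b   # features B has not discovered yet
            return set()

        # Example: two Boolean tasks sharing a low-level building block.
        task_a = {"AND(x0,x1)", "OR(x2,x3)", "XOR(x0,x3)"}
        task_b = {"AND(x0,x1)", "NOT(x4)"}
        print(relatedness(task_a, task_b))                       # 0.25
        print(transfer_candidates(task_a, task_b, threshold=0.2))

    Raising the threshold makes transfer more conservative, which matters for the unrelated-task scenarios the paper uses to validate mXOF.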

    Constructing Complexity-efficient Features in XCS with Tree-based Rule Conditions

    A major goal of machine learning is to create techniques that abstract away irrelevant information. The generalisation property of standard Learning Classifier Systems (LCSs) removes such information at the feature level but not at the feature-interaction level. Code Fragments (CFs), a form of tree-based programs, introduced feature manipulation to discover important interactions, but they often contain irrelevant information, which causes structural inefficiency. XOF is a recently introduced LCS that uses CFs to encode building blocks of knowledge about feature interaction. This paper aims to optimise the structural efficiency of CFs in XOF. We propose two measures for constructing CFs to achieve this goal. First, a new CF-fitness update estimates the applicability of CFs while also accounting for their structural complexity. Second, a niche-based method is used to generate CFs. These approaches were tested on Even-parity and Hierarchical problems, which require highly complex combinations of input features to capture the data patterns. The results show that the proposed methods significantly increase the structural efficiency of CFs, as estimated by the rule "generality rate". This results in faster learning performance on the Hierarchical Majority-on problem. Furthermore, a user-set depth limit for CF generation is not needed, as the learning agent will not adopt higher-level CFs once optimal CFs are constructed.
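
    The following sketch illustrates one way a CF-fitness update could fold in structural complexity, as the abstract describes: equally useful code fragments end up with different fitness depending on their size. The node-count penalty, the learning-rate update, and the tuple encoding of trees are assumptions for illustration, not XOF's published update equations.

        # Hypothetical sketch of a complexity-aware CF-fitness update: the
        # usefulness target and the node-count penalty are illustrative
        # assumptions, not XOF's published update rule.

        def cf_size(tree):
            """Number of nodes in a code-fragment tree of (op, child, ...) tuples."""
            if not isinstance(tree, tuple):
                return 1
            return 1 + sum(cf_size(child) for child in tree[1:])

        def update_cf_fitness(fitness, usefulness, tree, beta=0.2, alpha=0.05):
            """Move fitness toward the usefulness estimate, discounted by size."""
            target = usefulness / (1.0 + alpha * cf_size(tree))
            return fitness + beta * (target - fitness)

        # Two code fragments that are equally useful; the smaller one ends up fitter.
        small = ("AND", "x0", "x1")
        large = ("AND", ("OR", "x0", "x0"), ("OR", "x1", "x1"))
        f_small = f_large = 0.5
        for _ in range(50):
            f_small = update_cf_fitness(f_small, 1.0, small)
            f_large = update_cf_fitness(f_large, 1.0, large)
        print(round(f_small, 3), round(f_large, 3))   # approx. 0.87 vs 0.741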

    Improving the Scalability of XCS-Based Learning Classifier Systems

    Using evolutionary intelligence and machine learning techniques, a broad range of intelligent machines have been designed to perform different tasks. An intelligent machine learns by perceiving its environmental status and taking an action that maximizes its chances of success. Human beings have the ability to apply knowledge learned from a smaller problem to more complex, large-scale problems of the same or a related domain, but currently the vast majority of evolutionary machine learning techniques lack this ability. This lack of ability to apply the already learned knowledge of a domain results in consuming more than the necessary resources and time to solve complex, large-scale problems of the domain. As the problem increases in size, it becomes difficult and sometimes even impractical (if not impossible) to solve due to the needed resources and time. Therefore, in order to scale in a problem domain, a system is needed that has the ability to reuse the learned knowledge of the domain and/or encapsulate the underlying patterns in the domain. To extract and reuse building blocks of knowledge or to encapsulate the underlying patterns in a problem domain, a rich encoding is needed, but the search space could then expand undesirably and cause bloat, e.g. as in some forms of genetic programming (GP). Learning classifier systems (LCSs) are a well-structured, evolutionary-computation-based learning technique with pressures that implicitly avoid bloat, such as fitness sharing through niche-based reproduction. The proposed thesis is that an LCS can scale to complex problems in a domain by reusing the learnt knowledge from simpler problems of the domain and/or encapsulating the underlying patterns in the domain. Wilson's XCS, a well-tested, online-learning, accuracy-based LCS model, is used to implement and test the proposed systems. To extract the reusable building blocks of knowledge, GP-tree-like code fragments are introduced, which are more than simply another representation (e.g. ternary or real-valued alphabets). This thesis is extended to capture the underlying patterns in a problem using a cyclic representation. Hard problems are used to test the newly developed scalable systems and compare them with benchmark techniques. Specifically, this work develops four systems to improve the scalability of XCS-based classifier systems. (1) Building blocks of knowledge are extracted from smaller problems of a Boolean domain and reused in learning more complex, large-scale problems in the domain, for the first time. By utilizing the learnt knowledge from small-scale problems, the developed XCSCFC (i.e. XCS with Code-Fragment Conditions) system readily solves problems of a scale that existing LCS and GP approaches cannot, e.g. the 135-bit MUX problem. (2) The introduction of code fragments in classifier actions in XCSCFA (i.e. XCS with Code-Fragment Actions) enables the rich representation of GP which, when coupled with the divide-and-conquer approach of LCS, successfully solves various complex, overlapping and niche-imbalanced Boolean problems that are difficult to solve using numeric-action-based XCS. (3) The underlying patterns in a problem domain are encapsulated in classifier rules encoded by a cyclic representation. The developed XCSSMA system produces general solutions of any scale n for a number of important Boolean problems, for the first time in the field of LCS, e.g. parity problems.
(4) Optimal solutions for various real-valued problems are evolved by extending the existing real-valued XCSR system with code-fragment actions to XCSRCFA. Exploiting the combined power of GP and LCS techniques, XCSRCFA successfully learns various continuous-action and function-approximation problems that are difficult to learn using the base techniques. This research work has shown that LCSs can scale to complex, large-scale problems through reusing learnt knowledge. The messy nature, disassociation of message order from condition order, masking, feature construction, and reuse of extracted knowledge add abilities to the XCS family of LCSs. The ability to use rich encoding in antecedent GP-like code fragments or consequent cyclic representations leads to the evolution of accurate, maximally general and compact solutions in learning various complex Boolean as well as real-valued problems. Effectively exploiting the combined power of GP and LCS techniques, various continuous-action and function-approximation problems are solved in a simple and straightforward manner. The analysis of the evolved rules reveals, for the first time in XCS, that no matter how specific or general the initial classifiers are, all the optimal classifiers converge through the mechanism 'be specific then generalize' near the final stages of evolution. It is also shown that standard XCS does not use all available information or all available genetic operators to evolve optimal rules, whereas the developed code-fragment-action-based systems effectively use figure and ground information during the training process. This work has created a platform to explore the reuse of learnt functionality, not just terminal knowledge as at present, which is needed to replicate human capabilities.
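
    To make the notion of GP-tree-like code-fragment conditions concrete, the sketch below shows a classifier whose condition is a small expression tree over binary input features rather than a ternary string. The tuple encoding, the function set, and the Classifier class are illustrative assumptions, not the thesis's exact XCSCFC representation.

        # Illustrative sketch of a code-fragment condition as a small GP-like tree
        # evaluated against a binary input state. The encoding below is an
        # assumption for illustration, not the exact XCSCFC representation.

        FUNCTIONS = {
            "AND": lambda a, b: a and b,
            "OR":  lambda a, b: a or b,
            "NOT": lambda a: not a,
        }

        def evaluate(cf, state):
            """Evaluate a code fragment: leaves are feature indices, internal
            nodes are (function_name, *children) tuples."""
            if isinstance(cf, int):
                return bool(state[cf])
            name, *children = cf
            return FUNCTIONS[name](*(evaluate(c, state) for c in children))

        class Classifier:
            """An XCS-style rule whose condition is a code fragment instead of a
            ternary string; it matches a state when the fragment evaluates to True."""
            def __init__(self, condition, action):
                self.condition = condition
                self.action = action

            def matches(self, state):
                return evaluate(self.condition, state)

        # Condition capturing a reusable building block: x0 AND (NOT x1 OR x2).
        rule = Classifier(("AND", 0, ("OR", ("NOT", 1), 2)), action=1)
        print(rule.matches([1, 0, 0]))   # True
        print(rule.matches([1, 1, 0]))   # False

    A rule's condition therefore expresses a feature interaction as a reusable building block rather than a fixed bit pattern.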

    Function of TALE1Xam in cassava bacterial blight: a transcriptomic approach

    Abstract. Xanthomonas axonopodis pv. manihotis (Xam) is a gram-negative bacterium causing Cassava Bacterial Blight (CBB) in Manihot esculenta Crantz. Cassava represents one of the most important sources of carbohydrates for around one billion people around the world, as well as a source of energy due to its high starch content. CBB represents an important limitation for massive cassava production, and little is known about this pathosystem. Bacterial pathogenicity often relies on the injection into eukaryotic host cells of effector proteins via a type III secretion system (TTSS). Among all the type III effectors described to date, Transcription Activator-Like Effectors (TALEs) appear particularly interesting. Once injected into the plant cell, TALEs enter the cell nucleus and modulate the expression of target host genes to the benefit of the invading bacteria by interacting directly with plant DNA. In Xam, only one gene belonging to this family has been functionally studied so far: TALE1Xam. This work aims to identify cassava genes whose expression is modified in the presence of TALE1Xam. By means of a microarray containing 5700 cassava genes, the TALE code, and two Hi-RNAseq lanes, we sought direct TALE1Xam target genes. Through functional qRT-PCR validation, the design of specific artificial TALEs, and statistical analyses between cassava plants challenged with Xam ΔTALE1Xam vs. Xam + TALE1Xam, we propose that TALE1Xam potentially interacts with a Heat Shock Transcription Factor B3. Moreover, we argue that this gene is responsible for susceptibility during Xam infection. Furthermore, this work represents the first complete transcriptomic approach in the cassava/Xam interaction and opens huge possibilities to understand and study CBB.

    Hybrid Genetic Relational Search for Inductive Learning

    An important characteristic of all natural systems is the ability to acquire knowledge through experience and to adapt to new situations. Learning is the single unifying theme of all natural systems. One of the basic ways of gaining knowledge is through examples of some concept. For instance, we may learn how to distinguish a dog from other creatures after we have seen a number of creatures and someone (a teacher, or supervisor) has told us which creatures are dogs and which are not. This way of learning is called supervised learning. Inductive Concept Learning (ICL) constitutes a central topic in machine learning. The problem can be formulated as follows: given a description language used to express possible hypotheses, a background knowledge, a set of positive examples, and a set of negative examples, one has to find a hypothesis which covers all positive examples and none of the negative ones. This is a supervised way of learning, since a supervisor has already classified the examples of the concept into positive and negative examples. The learned concept can then be used to classify previously unseen examples. In general, deriving general conclusions from specific observations is called induction. Thus, in ICL, concepts are induced because they are obtained from the observation of a limited set of training examples. The process can be seen as a search process: starting from an initial hypothesis, the space of possible hypotheses is searched for one that fits the given set of examples. A representation language has to be chosen in order to represent concepts, examples, and the background knowledge. This is an important choice, because it may limit the kind of concepts we can learn. With a representation language of low expressive power, we may not be able to represent some problem domains, because they are too complex for the language adopted. On the other hand, a very expressive language may allow us to represent all problem domains. However, such a language may also give us too much freedom, in the sense that we can build concepts in too many different ways, which could make it impossible to find the right concept. We are interested in learning concepts expressed in a fragment of first-order logic (FOL). This subject is known as Inductive Logic Programming (ILP), where the knowledge to be learned is expressed by Horn clauses, which are used in programming languages based on logic programming, like Prolog. Learning systems that use a representation based on first-order logic have been successfully applied to relevant real-life problems, e.g., learning a specific property related to carcinogenicity. Learning first-order hypotheses is a hard task, due to the huge search space one has to deal with. The approach used by the majority of ILP systems tries to overcome this problem by using specific search strategies, like top-down search and the inverse resolution mechanism. However, the greedy selection strategies adopted to reduce the computational effort render techniques based on this approach often incapable of escaping from local optima. An alternative approach is offered by genetic algorithms (GAs). GAs have proved to be successful in solving comparatively hard optimization problems, as well as problems like ICL.
GAs represent a good approach when the problems to solve are characterized by a high number of variables, when there is interaction among variables, when there are mixed types of variables, e.g., numerical and nominal, and when the search space presents many local optima. Moreover, it is easy to hybridize GAs with other techniques that are known to be good for solving some classes of problems. Another appealing feature of GAs is their intrinsic parallelism and their use of exploration operators, which give them the possibility of escaping from local optima. However, this latter characteristic of GAs is also responsible for their rather poor performance on learning tasks which are easy to tackle by algorithms that use specific search strategies. These observations suggest that the two approaches described above, i.e., standard ILP strategies and GAs, are applicable to partly complementary classes of learning problems. More importantly, they indicate that a system incorporating features from both approaches could profit from their different benefits. This motivates the aim of this thesis, which is to develop a GA-based system for ILP that incorporates search strategies used in successful ILP systems. Our approach is inspired by memetic algorithms, a population-based search method for combinatorial optimization problems. In evolutionary computation, memetic algorithms are GAs in which individuals can be refined during their lifetime.
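
    The coverage criterion behind ICL and the memetic idea of refining individuals during their lifetime can be sketched as below. The propositional attribute-value hypotheses and the greedy constraint-dropping step are deliberate simplifications of the first-order clauses and refinement operators used in real ILP systems; all names are illustrative only.

        # Illustrative sketch of the ICL coverage criterion and of a memetic-style
        # local refinement step. A hypothesis is a set of required attribute values
        # (a single propositional rule), a deliberate simplification of the
        # first-order clauses learned by ILP systems.

        def covers(hypothesis, example):
            """A hypothesis covers an example if every constraint it imposes holds."""
            return all(example.get(attr) == val for attr, val in hypothesis.items())

        def score(hypothesis, positives, negatives):
            """Goal: cover all positive examples and none of the negative ones."""
            tp = sum(covers(hypothesis, e) for e in positives)
            fp = sum(covers(hypothesis, e) for e in negatives)
            return tp - fp

        def refine(hypothesis, positives, negatives):
            """Lifetime refinement (memetic step): greedily drop the constraint
            whose removal most improves the score, if any removal helps."""
            best, best_score = hypothesis, score(hypothesis, positives, negatives)
            for attr in list(hypothesis):
                candidate = {a: v for a, v in hypothesis.items() if a != attr}
                s = score(candidate, positives, negatives)
                if s > best_score:
                    best, best_score = candidate, s
            return best

        positives = [{"legs": 4, "barks": True}, {"legs": 4, "barks": True, "size": "large"}]
        negatives = [{"legs": 4, "barks": False}, {"legs": 2, "barks": False}]
        h = {"legs": 4, "barks": True, "size": "large"}   # too specific: misses a positive
        h = refine(h, positives, negatives)
        print(h, score(h, positives, negatives))          # {'legs': 4, 'barks': True} 2

    In the thesis's setting, the refinement step would be an ILP-style specialisation or generalisation operator applied to a clause rather than this simple constraint dropping.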

    From cluster databases to cloud storage: Providing transactional support on the cloud

    Over the past three decades, technology constraints (e.g., capacity of storage devices, communication network bandwidth) and an ever-increasing set of user demands (e.g., information structures, data volumes) have driven the evolution of distributed databases. Since the flat-file data repositories developed in the early eighties, there have been important advances in concurrency control algorithms, replication protocols, and transaction management. However, modern concerns in data storage posed by Big Data and cloud computing, aimed at overcoming the scalability and elasticity limitations of classic databases, are pushing practitioners to relax some important properties featured by transactions, which excludes several applications that are unable to fit this strategy due to their intrinsically transactional nature. The purpose of this thesis is to address two important challenges still latent in distributed databases: (1) the scalability limitations of transactional databases and (2) providing transactional support on cloud-based storage repositories. Analyzing the traditional concurrency control and replication techniques used by classic databases to support transactions is critical to identify the reasons that make these systems degrade their throughput when the number of nodes and/or amount of data rockets. Besides, this analysis is devoted to justifying the design rationale behind cloud repositories, in which transactions have been generally neglected. Furthermore, enabling applications which are strongly dependent on transactions to take advantage of the cloud storage paradigm is crucial for their adaptation to current data demands and business models. This dissertation starts by proposing a custom protocol simulator for static distributed databases, which serves as a basis for revising and comparing the performance of existing concurrency control protocols and replication techniques. As this thesis is especially concerned with transactions, the effects on database scalability of different transaction profiles under different conditions are studied. This analysis is followed by a review of existing cloud storage repositories, which claim to be highly dynamic, scalable, and available, leading to an evaluation of the parameters and features that these systems have sacrificed in order to meet current large-scale data storage demands. To further explore the possibilities of the cloud computing paradigm in a real-world scenario, a cloud-inspired approach to storing data from Smart Grids is presented. More specifically, the proposed architecture combines classic database replication techniques and epidemic update propagation with the design principles of cloud-based storage. The key insights collected when prototyping the replication and concurrency control protocols in the database simulator, together with the experiences derived from building a large-scale storage repository for Smart Grids, are wrapped up into what we have coined Epidemia: a storage infrastructure conceived to provide transactional support on the cloud. In addition to inheriting the benefits of highly scalable cloud repositories, Epidemia includes a transaction management layer that forwards client transactions to a hierarchical set of data partitions, which allows the system to offer different consistency levels and elastically adapt its configuration to incoming workloads. Finally, experimental results highlight the feasibility of our contribution and encourage practitioners to pursue further research in this area.
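
    As a rough illustration of the transaction-management layer described above, the sketch below forwards each client transaction to a data partition and applies it under a caller-chosen consistency level. The class names, the hash-based partition choice, the flat (non-hierarchical) partition list, and the two consistency levels are hypothetical simplifications; they are not Epidemia's actual interfaces.

        # Hypothetical sketch of a transaction-management layer that forwards
        # client transactions to data partitions under a chosen consistency
        # level. Names and structure are illustrative only, not Epidemia's API.

        from enum import Enum
        from hashlib import sha1

        class Consistency(Enum):
            EVENTUAL = "eventual"   # apply locally, propagate epidemically later
            STRONG = "strong"       # apply synchronously on every replica

        class Partition:
            def __init__(self, name, replicas=3):
                self.name = name
                self.replicas = [dict() for _ in range(replicas)]

            def apply(self, txn, consistency):
                targets = self.replicas if consistency is Consistency.STRONG else self.replicas[:1]
                for replica in targets:
                    replica.update(txn)

        class TransactionManager:
            """Routes each transaction to a partition chosen by hashing its key."""
            def __init__(self, partitions):
                self.partitions = partitions

            def submit(self, key, txn, consistency=Consistency.EVENTUAL):
                index = int(sha1(key.encode()).hexdigest(), 16) % len(self.partitions)
                self.partitions[index].apply(txn, consistency)
                return self.partitions[index].name   # name of the chosen partition

        tm = TransactionManager([Partition("meters-eu"), Partition("meters-us")])
        print(tm.submit("meter-42", {"reading": 17.3}, Consistency.STRONG))
        print(tm.submit("meter-99", {"reading": 4.1}))   # eventual by default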

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway's Life according to a general MR streaming pattern. We chose Life because it is simple enough to serve as a testbed for MR's applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms' performance on Amazon's Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
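
    A minimal sketch of an MR streaming pass for one generation of Conway's Life with row-strip partitioning appears below: the mapper keys every live cell by the strip that owns it and also ships boundary cells to the neighbouring strip as ghost cells, and each reducer advances its own strip. The record format, the strip height, and the single-script layout are assumptions for illustration, not the paper's optimized algorithms.

        #!/usr/bin/env python3
        # Illustrative Hadoop-Streaming-style mapper/reducer for one generation of
        # Conway's Life with row-strip partitioning. Each reducer owns a horizontal
        # strip of the grid and also receives neighbouring boundary cells as
        # "ghost" cells. This is a sketch, not the paper's optimized algorithm.
        import sys
        from collections import defaultdict

        STRIP_HEIGHT = 100   # rows per strip (hypothetical tuning parameter)

        def mapper(lines=sys.stdin):
            """Input: one live cell per line as 'x y'. Emit it keyed by its own
            strip and, on a strip boundary, by the adjacent strip as well."""
            for line in lines:
                if not line.strip():
                    continue
                x, y = map(int, line.split())
                strips = {y // STRIP_HEIGHT}
                if y % STRIP_HEIGHT == 0:
                    strips.add(y // STRIP_HEIGHT - 1)
                if y % STRIP_HEIGHT == STRIP_HEIGHT - 1:
                    strips.add(y // STRIP_HEIGHT + 1)
                for s in strips:
                    print(f"{s}\t{x} {y}")

        def reducer(lines=sys.stdin):
            """Input: 'strip<TAB>x y' lines grouped by strip. Apply the Life rules
            to the cells each strip owns, using ghost cells only as neighbours."""
            live_by_strip = defaultdict(set)
            for line in lines:
                if not line.strip():
                    continue
                strip, cell = line.rstrip("\n").split("\t")
                x, y = map(int, cell.split())
                live_by_strip[int(strip)].add((x, y))
            for strip, live in live_by_strip.items():
                lo, hi = strip * STRIP_HEIGHT, (strip + 1) * STRIP_HEIGHT
                counts = defaultdict(int)
                for (x, y) in live:
                    for dx in (-1, 0, 1):
                        for dy in (-1, 0, 1):
                            if dx or dy:
                                counts[(x + dx, y + dy)] += 1
                for (x, y), n in sorted(counts.items()):
                    if lo <= y < hi and (n == 3 or (n == 2 and (x, y) in live)):
                        print(f"{x} {y}")   # live cell in the next generation

        if __name__ == "__main__":
            {"map": mapper, "reduce": reducer}[sys.argv[1]]()

    Locally, one generation can be simulated as "cat cells.txt | python life_mr.py map | sort | python life_mr.py reduce" (the file name is hypothetical); on a streaming cluster, the same two commands would be supplied as the mapper and reducer.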

    New insights on black rot of crucifers : disclosing novel virulence genes by in vivo host/pathogen transcriptomics and functional genetics

    Doctoral thesis, Biology (Microbiology), Universidade de Lisboa, Faculdade de Ciências, 2018. Xanthomonas campestris belongs to the gamma subdivision of Proteobacteria and is the type species of a genus comprising 28 species of plant-pathogenic bacteria, affecting 124 species of monocotyledonous and 268 species of dicotyledonous plants. Identified for the first time in 1895 in the United States of America, this species has undergone several taxonomic reclassifications and currently comprises the pathovars X. campestris pv. campestris (Xcc), X. campestris pv. raphani (Xcr) and X. campestris pv. incanae (Xci), causing distinct diseases in vegetables, ornamentals and spontaneous plants belonging to the Brassicaceae family, as well as some vegetable crops belonging to the Solanaceae family. Black rot disease, caused by Xcc, is the most important bacterial disease of Brassicaceae, affecting crops and weeds worldwide. After penetration through hydathodes on leaf margins, bacterial multiplication causes typical V-shaped lesions on leaf margins, followed by darkening of the veins that evolves into necrosis of the affected tissue, ultimately leading to plant death. Short-distance dispersal through plant-to-plant contact, human manipulation, wind, insects, aerosols and irrigation water or rain, combined with long-distance dispersal through infected seeds and plantlets following commercial routes across the globe, is responsible for the worldwide distribution of this disease. Studies on the interaction of Xcc with B. oleracea have identified several virulence genes; however, no resistance genes have been successfully cloned in Brassicaceae crops, and there is still a lack of effective black rot control measures, so the disease continues to cause severe economic losses worldwide. Despite the growing body of work on this subject, molecular mechanisms of host-pathogen interaction have been mostly inferred using in vitro approaches, resulting in a knowledge gap concerning the in vivo behavior of this pathogen during its interaction with plant hosts. In Portugal, an important centre of Brassicaceae domestication, Xcc has long been identified, and Portuguese Xcc strains have already been described as a unique sub-population of this pathogen. In this context, with the goal of bringing new insights into the molecular mechanisms of host/pathogen interaction, a set of 33 X. campestris strains collected in the country was characterized in terms of pathogenicity, virulence, population structure and phylogenetic diversity. Furthermore, the in planta transcriptomes of the Xcc strains representing the extremes of the virulence spectrum during the infection process on two selected hosts were profiled, in a total of four pathosystems. A high level of phenotypic diversity, supported by phylogenetic data, was found among X. campestris isolates, allowing the identification, for the first time in Portugal, of the pathovars Xcr and Xci, which are closely related to Xcc. Moreover, among Xcc isolates, the presence of races 4, 6 and 7 was recorded, and two novel races of this pathovar, race 10 and race 11, were also described. In contrast, the partial virulence profiles determined by the presence of known virulence genes were highly conserved among the set of X. campestris strains. The integration of the Portuguese strains in a global dataset comprising a total of 75 X. campestris strains provided a snapshot of the worldwide X. campestris phylogenetic diversity and population structure, correlating the existing pathovars with three distinct genetic lineages. The identification of an intermediate link between Xcc and Xcr suggests that these pathovars are more closely related to each other than to Xci. This gradient of genetic relatedness seems to be associated with the host range of each pathovar. While Xci appears to be pathogenic only on ornamental Brassicaceae, Xcc and Xcr have a partially overlapping host range: Xcc affects mostly Brassicaceous hosts, whereas Xcr, while affecting the same hosts, is also pathogenic on Solanaceous hosts. These findings suggest that, instead of causing a host shift, the genetic divergence between Xcc and Xcr conferred on the latter strains the ability to exploit additional hosts, resulting in a broader host range. Although the population displayed a mainly clonal structure, recombination events that may have driven the ecological specialization of X. campestris and its distinct host ranges were highlighted. Portuguese X. campestris strains provided a significant input of genetic diversity, confirming this region as an important diversification reservoir, most likely shaped by host-pathogen co-evolution. Virulence assessment of the Portuguese Xcc strains, based on the percentage of infected leaf area after inoculation, showed that virulence was not homogeneous within races and that higher pathogenicity was not necessarily correlated with higher virulence. It was then possible to select the extremes of the virulence spectrum: CPBF213, the least virulent strain (L-vir), and CPBF278, the most virulent strain (H-vir). The in vivo transcriptome profiling of these contrastingly virulent strains infecting two cultivars of B. oleracea, using RNA-Seq, established that Xcc undergoes transcriptional reprogramming in a host-independent manner, suggesting that virulence is an intrinsic feature of the pathogen. A total of 154 differentially expressed genes (DEGs) were identified between the two strains. The most represented functional category of DEGs was, as expected, 'pathogenicity and adaptation', accounting for 16% of the DEGs, although the complex adaptation of pathogen cells to the host environment was highlighted by the presence of DEGs from various functional categories. Among the DEGs, the Type III effector coding genes xopE2 and xopD were induced in the L-vir strain, while xopAC, xopX and xopR were induced in the H-vir strain. In addition to Type III effectors, genes encoding proteins involved in signal transduction systems, transport, detoxification mechanisms and other virulence-associated processes were found to be differentially expressed between the two strains, highlighting their role in virulence regulation. Overall, low virulence appears to be the combined result of impaired sensory mechanisms, reduced detoxification of reactive oxygen species, decreased motility and higher production of pathogen-associated molecular patterns (PAMPs), accompanied by an overexpression of avirulence proteins and a repression of virulence proteins targeting the hosts' PAMP-triggered immune responses. In contrast, the highly virulent strain appeared better equipped to escape initial plant defenses, either by avoiding detection or by counteracting those responses.
On the other hand, upon detection, highly virulent pathogens show a decreased expression of avirulence proteins, making them less recognizable by the host's Effector-Triggered Immunity mechanisms and thus able to continue multiplying and to cause more severe disease symptoms. Through the study of differential infections in planta, the highly innovative strategy used in this work contributed to the disclosure of novel virulence-related genes in the X. campestris pv. campestris / B. oleracea pathosystem, which will be crucial to further detail the virulence regulation network and to develop new tools for the control of black rot disease.

    Proceedings of The Multi-Agent Logics, Languages, and Organisations Federated Workshops (MALLOW 2010)

    Proceedings available at http://ceur-ws.org/Vol-627/allproceedings.pdf. MALLOW-2010 is the third edition of a series initiated in 2007 in Durham and pursued in 2009 in Turin. The objective, as initially stated, is to "provide a venue where: the cost of participation was minimum; participants were able to attend various workshops, so fostering collaboration and cross-fertilization; there was a friendly atmosphere and plenty of time for networking, by maximizing the time participants spent together".