2,640 research outputs found

    An evolutionary hill-climbing approach to symbolic theory revision


    Exploring Complex Disease Gene Relationships Using Simultaneous Analysis

    The characterization of complex diseases remains a great challenge for biomedical researchers due to the myriad interactions of genetic and environmental factors. Adaptation of phylogenomic techniques to increasingly available genomic data provides an evolutionary perspective that may elucidate important unknown features of complex diseases. Here an automated method is presented that leverages publicly available genomic data and phylogenomic techniques. The approach is tested with nine genes implicated in the development of Alzheimer Disease, a complex neurodegenerative syndrome. The developed technique, an update to a previously described Perl script called “ASAP” reimplemented as a suite of Ruby scripts entitled “ASAP2,” first compiles a list of sequence-similarity-based orthologues using PSI-BLAST and a recursive NCBI BLAST+ search strategy, then constructs maximum parsimony phylogenetic trees for each set of nucleotide and protein sequences, and finally calculates phylogenetic metrics (partitioned Bremer support values, combined branch scores, and Robinson-Foulds distance) to provide an empirical assessment of evolutionary conservation within a given genetic network. This study demonstrates the potential of automated simultaneous phylogenetic analysis to uncover previously unknown relationships among disease-associated genes that may not be apparent using traditional, single-gene methods. Furthermore, the results provide the first integrated evolutionary history of an Alzheimer Disease gene network and identify potentially important co-evolutionary clustering around components of oxidative stress pathways.
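
    As an illustration of the tree-comparison step, the sketch below computes the Robinson-Foulds distance between two gene trees with the Python library dendropy. This is a hedged stand-in for the ASAP2 Ruby pipeline, and the Newick file names are hypothetical.

        import dendropy
        from dendropy.calculate import treecompare

        # Trees must share a single TaxonNamespace to be comparable.
        tns = dendropy.TaxonNamespace()
        gene1 = dendropy.Tree.get(path="gene1.nwk", schema="newick", taxon_namespace=tns)
        gene2 = dendropy.Tree.get(path="gene2.nwk", schema="newick", taxon_namespace=tns)

        # Unweighted Robinson-Foulds distance: the symmetric difference
        # of the two trees' bipartition sets.
        print(treecompare.symmetric_difference(gene1, gene2))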

    Fitness Uniform Optimization

    In evolutionary algorithms, the fitness of a population increases with time by mutating and recombining individuals and by a biased selection of fitter individuals. The right selection pressure is critical in ensuring sufficient optimization progress on the one hand and in preserving genetic diversity to be able to escape from local optima on the other hand. Motivated by a universal similarity relation on the individuals, we propose a new selection scheme, which is uniform in the fitness values. It generates selection pressure toward sparsely populated fitness regions, rather than toward higher fitness as all other selection schemes do. We show analytically on a simple example that the new selection scheme can be much more effective than standard selection schemes. We also propose a new deletion scheme that achieves a similar effect when deleting individuals, and show how such a scheme preserves genetic diversity more effectively than standard approaches. We compare the performance of the new schemes to tournament selection and random deletion on an artificial deceptive problem and a range of NP-hard problems: traveling salesman, set covering, and satisfiability.
    Comment: 25 double-column pages, 12 figures
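
    A minimal Python sketch of the selection rule as described in the abstract (assuming the scheme draws a target fitness uniformly between the population's current extremes and picks the individual nearest to it):

        import random

        def fitness_uniform_select(population, fitness):
            # Draw a target fitness uniformly between the current minimum
            # and maximum, then return the individual whose fitness is
            # closest to that target. Pressure therefore goes toward
            # sparsely populated fitness regions, not toward high fitness.
            values = [fitness(x) for x in population]
            target = random.uniform(min(values), max(values))
            i = min(range(len(population)), key=lambda k: abs(values[k] - target))
            return population[i]

        # Hypothetical usage: maximize the number of 1-bits in a string.
        pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(20)]
        parent = fitness_uniform_select(pop, fitness=sum)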

    XML Matchers: approaches and challenges

    Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research for many years. In the past, it was investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). In recent years, however, the widespread adoption of XML in the most disparate application fields has pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, which aim at finding semantic matches between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them to DTDs/XSDs; they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs affect the Schema Matching task. Then we introduce a template, called the XML Matcher Template, that describes the main components of an XML Matcher and their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.
    Comment: 34 pages, 8 tables, 7 figures
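
    To give the flavor of the simplest family of matchers, here is a hedged Python sketch that extracts the element names declared in two XSDs and pairs them by string similarity. Real XML Matchers combine such linguistic heuristics with structural ones; the schema file names are hypothetical.

        import difflib
        import xml.etree.ElementTree as ET

        XSD = "{http://www.w3.org/2001/XMLSchema}"

        def element_names(path):
            # Collect the names of all element declarations in the schema.
            return [e.get("name") for e in ET.parse(path).iter(XSD + "element")
                    if e.get("name")]

        def match(names_a, names_b, threshold=0.8):
            # Pair names whose normalized edit similarity clears the threshold.
            return [(a, b) for a in names_a for b in names_b
                    if difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
                    >= threshold]

        # Hypothetical usage:
        # print(match(element_names("orders_v1.xsd"), element_names("orders_v2.xsd")))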

    Handling High-Level Model Changes Using Search Based Software Engineering

    Model-Driven Engineering (MDE) considers models as first-class artifacts during the software lifecycle. The number of available tools, techniques, and approaches for MDE is increasing as its use gains traction in driving quality and controlling cost in the evolution of large software systems. Software models, defined as code abstractions, are iteratively refined, restructured, and evolved for many reasons, such as fixing defects in design, reflecting changes in requirements, and modifying a design to enhance existing features. In this work, we focus on four main problems related to the evolution of software models: 1) the detection of applied model changes, 2) the merging of models evolved in parallel, 3) the detection of design defects in the merged model, and 4) the recommendation of new changes to fix defects in software models.

    For the first contribution, an a-posteriori multi-objective change detection approach is proposed for evolved models. The changes are expressed in terms of atomic and composite refactoring operations. The majority of existing approaches detect atomic changes but do not adequately address composite changes, which mask atomic operations in intermediate models. For the second contribution, several approaches exist to construct a merged model by incorporating all non-conflicting operations of the evolved models. Conflicts arise when the application of one operation disables the applicability of another. The essence of the problem is to identify and prioritize conflicting operations based on their importance and context, a gap in existing approaches. This work proposes a multi-objective formulation of model merging that aims to maximize the number of successfully applied merged operations.

    For the third and fourth contributions, the majority of existing work focuses on refactoring at the source code level and does not exploit the benefits of software design optimization at the model level. However, refactoring at the model level is inherently more challenging due to the difficulty of assessing the potential impact on the structural and behavioral features of the software system. This requires analysis of class and activity diagrams to appraise the overall system quality, feasibility, and inter-diagram consistency. This work focuses on designing, implementing, and evaluating a multi-objective refactoring framework for detecting and fixing design defects in software models.
    Ph.D. dissertation, Information Systems Engineering, College of Engineering and Computer Science, University of Michigan-Dearborn. http://deepblue.lib.umich.edu/bitstream/2027.42/136077/1/Usman Mansoor Final.pdf
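
    The merging problem above can be pictured with a small hedged sketch: given candidate operations, a pairwise conflict relation, and per-operation priorities, keep a conflict-free subset. The dissertation uses a multi-objective search; the greedy loop below, with hypothetical operation names, is only a stand-in.

        def merge_operations(ops, conflicts, priority):
            # Greedily apply operations in order of decreasing priority,
            # skipping any operation that conflicts with one already kept.
            applied = []
            for op in sorted(ops, key=lambda o: priority[o], reverse=True):
                if all(frozenset((op, kept)) not in conflicts for kept in applied):
                    applied.append(op)
            return applied

        # Hypothetical example: renaming class A conflicts with deleting it.
        print(merge_operations(
            ["rename_A", "delete_A", "move_B"],
            {frozenset(("rename_A", "delete_A"))},
            {"rename_A": 3, "delete_A": 1, "move_B": 2}))  # ['rename_A', 'move_B']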

    Current approaches to gene regulatory network modelling

    Many different approaches have been developed to model and simulate gene regulatory networks. We propose the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we describe some examples for each of these categories. We study the topology of gene regulatory networks in yeast in more detail, comparing a direct network derived from transcription factor binding data and an indirect network derived from genome-wide expression data in mutants. Regarding network dynamics, we briefly describe discrete and continuous approaches to network modelling, then describe a hybrid model called the Finite State Linear Model and demonstrate that some simple network dynamics can be simulated in this model.
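
    As a concrete taste of the discrete end of this spectrum, the sketch below simulates a tiny synchronous Boolean network in Python. It is a generic illustration with a made-up three-gene circuit, not the Finite State Linear Model itself.

        def step(state, rules):
            # Synchronous update: every gene's next value is computed
            # from the current state through its Boolean rule.
            return {gene: rule(state) for gene, rule in rules.items()}

        # Hypothetical circuit: B activates A, C represses B, and C
        # needs A present and B absent.
        rules = {
            "A": lambda s: s["B"],
            "B": lambda s: not s["C"],
            "C": lambda s: s["A"] and not s["B"],
        }
        state = {"A": False, "B": True, "C": False}
        for t in range(4):
            print(t, state)
            state = step(state, rules)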

    Inferring the role of transcription factors in regulatory networks

    Background: Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well-studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays.
    Results: We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to the S. cerevisiae transcriptional network (2419 nodes and 4344 interactions) by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions.
    Conclusion: Our approach requires neither accurate expression levels nor time series. Nevertheless, we show on both real and artificial data that a relatively small number of perturbation experiments is enough to determine a significant portion of regulatory effects. This is a key practical asset compared to statistical methods for network reconstruction. We demonstrate that our approach is able to provide accurate predictions even when the network is incomplete and the data is noisy.
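
    The consistency rule can be illustrated in a few lines of Python: an observed sign of variation at a target gene is explainable if at least one of its regulators changed in a direction compatible with the regulation's sign. This is a hedged toy version of the idea; the paper's rule is applied over the whole influence graph at once.

        def consistent(gene, variation, regulations):
            # variation: gene -> +1 or -1, the observed sign of change.
            # regulations: (regulator, target, sign) triples, with sign +1
            # for an inducer and -1 for a repressor.
            incoming = [(src, sign) for src, tgt, sign in regulations if tgt == gene]
            if not incoming:
                return True  # nothing to check against
            return any(variation[src] * sign == variation[gene]
                       for src, sign in incoming)

        # Hypothetical example: X induces Y, X goes up, Y goes down.
        print(consistent("Y", {"X": +1, "Y": -1}, [("X", "Y", +1)]))  # False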

    Minimal Algorithmic Information Loss Methods for Dimension Reduction, Feature Selection and Network Sparsification

    We introduce a family of unsupervised, domain-free, and (asymptotically) model-independent algorithms based on the principles of algorithmic probability and information theory, designed to minimize the loss of algorithmic information, including a lossless-compression-based lossy compression algorithm. The methods can select and coarse-grain data in an algorithmic-complexity fashion (without the use of popular compression algorithms) by collapsing regions that may procedurally be regenerated from a computable candidate model. We show that the method can preserve the salient properties of objects and perform dimension reduction, denoising, feature selection, and network sparsification. As a validation case, we demonstrate that the method preserves all the graph-theoretic indices measured on a well-known set of synthetic and real-world networks of very different kinds, ranging from degree distribution and clustering coefficient to edge betweenness and degree and eigenvector centralities, achieving equal or significantly better results than other data reduction methods and some of the leading network sparsification methods. The methods (InfoRank, MILS) can also be applied to tasks such as image segmentation based on algorithmic probability.
    Comment: 23 pages in double column including Appendix, online implementation at http://complexitycalculator.com/MILS
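
    The deletion loop behind an information-loss-minimizing sparsifier can be sketched as follows. The authors estimate algorithmic complexity via algorithmic probability rather than via standard compressors; the zlib-compressed length below is only a crude, hedged stand-in for that estimate.

        import zlib

        def complexity(edges):
            # Crude proxy for an algorithmic-complexity estimate.
            return len(zlib.compress(repr(sorted(edges)).encode()))

        def sparsify(edges, keep):
            # Repeatedly delete the edge whose removal perturbs the
            # complexity estimate the least, until `keep` edges remain.
            edges = set(edges)
            while len(edges) > keep:
                base = complexity(edges)
                victim = min(edges, key=lambda e: abs(complexity(edges - {e}) - base))
                edges.remove(victim)
            return edges

        # Hypothetical toy graph:
        print(sparsify({(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)}, keep=3))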

    A Comprehensive Bibliometric Analysis on Social Network Anonymization: Current Approaches and Future Directions

    In recent decades, social network anonymization has become a crucial research field due to its pivotal role in preserving users' privacy. However, the high diversity of approaches introduced in relevant studies poses a challenge to gaining a profound understanding of the field. In response, the current study presents an exhaustive and well-structured bibliometric analysis of the social network anonymization field. To begin our research, related studies from the period 2007-2022 were collected from the Scopus Database and then pre-processed. Following this, VOSviewer was used to visualize the network of authors' keywords. Subsequently, extensive statistical and network analyses were performed to identify the most prominent keywords and trending topics. Additionally, the application of co-word analysis through SciMAT and an Alluvial diagram allowed us to explore the themes of social network anonymization and scrutinize their evolution over time. These analyses culminated in an innovative taxonomy of the existing approaches and an anticipation of potential trends in this domain. To the best of our knowledge, this is the first bibliometric analysis in the social network anonymization field, offering a deeper understanding of the current state and an insightful roadmap for future research in this domain.
    Comment: 73 pages, 28 figures
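
    The counting at the heart of co-word analysis reduces to tallying keyword pairs per paper. A minimal Python sketch with hypothetical records, standing in for the SciMAT/VOSviewer toolchain used in the study:

        from collections import Counter
        from itertools import combinations

        papers = [  # hypothetical author-keyword lists
            ["anonymization", "social network", "privacy"],
            ["privacy", "k-anonymity", "social network"],
        ]

        cooc = Counter()
        for keywords in papers:
            for a, b in combinations(sorted(set(keywords)), 2):
                cooc[(a, b)] += 1  # edge weight in the co-word network

        print(cooc.most_common(3))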

    Evolving Graphs by Graph Programming

    Graphs are a ubiquitous data structure in computer science and can be used to represent solutions to difficult problems in many distinct domains. This motivates the use of Evolutionary Algorithms to search over graphs and efficiently find approximate solutions. However, existing techniques often represent and manipulate graphs in an ad-hoc manner. In contrast, rule-based graph programming offers a formal mechanism for describing relations over graphs. This thesis proposes the use of rule-based graph programming for representing and implementing genetic operators over graphs. We present the Evolutionary Algorithm Evolving Graphs by Graph Programming, and a number of its extensions, which are capable of learning stateful and stateless digital circuits, symbolic expressions, and Artificial Neural Networks. We demonstrate that rule-based graph programming may be used to implement new and effective constraint-respecting mutation operators, and show that these operators may strictly generalise others found in the literature. Through our proposal of Semantic Neutral Drift, we accelerate the search process by building plateaus into the fitness landscape using domain knowledge of equivalence. We also present Horizontal Gene Transfer, a mechanism whereby graphs may be passively recombined without disrupting their fitness. Through rigorous evaluation and analysis of over 20,000 independent executions of Evolutionary Algorithms, we establish numerous benefits of our approach. We find that on many problems, Evolving Graphs by Graph Programming and its variants may significantly outperform other approaches from the literature. Additionally, our empirical results provide further evidence that neutral drift aids the efficiency of evolutionary search.
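
    For a taste of what a constraint-respecting mutation over graphs must guarantee, the hedged Python sketch below rewires one edge of a directed acyclic graph and rejects any rewrite that would introduce a cycle. It is a generic illustration, not the thesis's rule-based graph-programming operators.

        import random

        def creates_cycle(edges, new_edge):
            # Adding u -> v creates a cycle iff v already reaches u.
            u, v = new_edge
            stack, seen = [v], set()
            while stack:
                n = stack.pop()
                if n == u:
                    return True
                if n not in seen:
                    seen.add(n)
                    stack.extend(w for x, w in edges if x == n)
            return False

        def mutate(edges, nodes, rng=random):
            # Remove one random edge, then insert a random edge that
            # preserves acyclicity (retrying until one is found).
            edges = set(edges)
            edges.remove(rng.choice(sorted(edges)))
            while True:
                cand = (rng.choice(nodes), rng.choice(nodes))
                if (cand[0] != cand[1] and cand not in edges
                        and not creates_cycle(edges, cand)):
                    edges.add(cand)
                    return edges

        print(mutate({(0, 1), (1, 2), (0, 2)}, nodes=[0, 1, 2, 3]))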