3,852 research outputs found

    DNA ANALYSIS USING GRAMMATICAL INFERENCE

    Get PDF
    An accurate language definition capable of distinguishing between coding and non-coding DNA has important applications and analytical significance to the field of computational biology. The method proposed here uses positive sample grammatical inference and statistical information to infer languages for coding DNA. An algorithm is proposed for the searching of an optimal subset of input sequences for the inference of regular grammars by optimizing a relevant accuracy metric. The algorithm does not guarantee the finding of the optimal subset; however, testing shows improvement in accuracy and performance over the basis algorithm. Testing shows that the accuracy of inferred languages for components of DNA are consistently accurate. By using the proposed algorithm languages are inferred for coding DNA with average conditional probability over 80%. This reveals that languages for components of DNA can be inferred and are useful independent of the process that created them. These languages can then be analyzed or used for other tasks in computational biology. To illustrate potential applications of regular grammars for DNA components, an inferred language for exon sequences is applied as post processing to Hidden Markov exon prediction to reduce the number of wrong exons detected and improve the specificity of the model significantly

    One-Step or Two-Step Optimization and the Overfitting Phenomenon: A Case Study on Time Series Classification

    Get PDF
    For the last few decades, optimization has been developing at a fast rate. Bio-inspired optimization algorithms are metaheuristics inspired by nature. These algorithms have been applied to solve different problems in engineering, economics, and other domains. Bio-inspired algorithms have also been applied in different branches of information technology such as networking and software engineering. Time series data mining is a field of information technology that has its share of these applications too. In previous works we showed how bio-inspired algorithms such as the genetic algorithms and differential evolution can be used to find the locations of the breakpoints used in the symbolic aggregate approximation of time series representation, and in another work we showed how we can utilize the particle swarm optimization, one of the famous bio-inspired algorithms, to set weights to the different segments in the symbolic aggregate approximation representation. In this paper we present, in two different approaches, a new meta optimization process that produces optimal locations of the breakpoints in addition to optimal weights of the segments. The experiments of time series classification task that we conducted show an interesting example of how the overfitting phenomenon, a frequently encountered problem in data mining which happens when the model overfits the training set, can interfere in the optimization process and hide the superior performance of an optimization algorithm

    Optimal Phase Swapping in Low Voltage Distribution Networks Based on Smart Meter Data and Optimization Heuristics

    Get PDF
    In this paper a modified version of the Harmony Search algorithm is proposed as a novel tool for phase swapping in Low Voltage Distribution Networks where the objective is to determine to which phase each load should be connected in order to reduce the unbalance when all phases are added into the neutral conductor. Unbalanced loads deteriorate power quality and increase costs of investment and operation. A correct assignment is a direct, effective alternative to prevent voltage peaks and network outages. The main contribution of this paper is the proposal of an optimization model for allocating phases consumers according to their individual consumption in the network of low-voltage distribution considering mono and bi-phase connections using real hourly load patterns, which implies that the computational complexity of the defined combinatorial optimization problem is heavily increased. For this purpose a novel metric function is defined in the proposed scheme. The performance of the HS algorithm has been compared with classical Genetic Algorithm. Presented results show that HS outperforms GA not only on terms of quality but on the convergence rate, reducing the computational complexity of the proposed scheme while provide mono and bi phase connections.This paper includes partial results of the UPGRID project. This project has re- ceived funding from the European Unions Horizon 2020 research and innovation programme under grant agreement No 646.531), for further information check the website: http://upgrid.eu. As well as by the Basque Government through the ELKARTEK programme (BID3A and BID3ABI projects)

    Designing Algorithms for Optimization of Parameters of Functioning of Intelligent System for Radionuclide Myocardial Diagnostics

    Full text link
    The influence of the number of complex components of Fast Fourier transformation in analyzing the polar maps of radionuclide examination of myocardium at rest and stress on the functional efficiency of the system of diagnostics of pathologies of myocardium was explored, and there were defined their optimum values in the information sense, which allows increasing the efficiency of the algorithms of forming the diagnostic decision rules by reducing the capacity of the dictionary of features of recognition.The information-extreme sequential cluster algorithms of the selection of the dictionary of features, which contains both quantitative and category features were developed and the results of their work were compared. The modificatios of the algorithms of the selection of the dictionary were suggested, which allows increasing both the search speed of the optimal in the information sense dictionary and reducing its capacity by 40 %. We managed to get the faultless by the training matrix decision rules, the accuracy of which is in the exam mode asymptotically approaches the limit.It was experimentally confirmed that the implementation of the proposed algorithm of the diagnosing system training has allowed to reduce the minimum representative volume of the training matrix from 300 to 81 vectors-implementations of the classes of recognition of the functional myocardium state

    Automata-based adaptive behavior for economic modeling using game theory

    Full text link
    In this paper, we deal with some specific domains of applications to game theory. This is one of the major class of models in the new approaches of modelling in the economic domain. For that, we use genetic automata which allow to buid adaptive strategies for the players. We explain how the automata-based formalism proposed - matrix representation of automata with multiplicities - allows to define a semi-distance between the strategy behaviors. With that tools, we are able to generate an automatic processus to compute emergent systems of entities whose behaviors are represented by these genetic automata

    Automata-based Adaptive Behavior for Economical Modelling Using Game Theory

    Full text link
    In this chapter, we deal with some specific domains of applications to game theory. This is one of the major class of models in the new approaches of modelling in the economic domain. For that, we use genetic automata which allow to build adaptive strategies for the players. We explain how the automata-based formalism proposed - matrix representation of automata with multiplicities - allows to define semi-distance between the strategy behaviors. With that tools, we are able to generate an automatic processus to compute emergent systems of entities whose behaviors are represented by these genetic automata

    A new approach for transport network design and optimization

    Get PDF
    The solution of the transportation network optimization problem actually requires, in most cases, very intricate and powerful computer resources, so that it is not feasible to use classical algorithms. One promising way is to use stochastic search techniques. In this context, Genetic Algorithms (GAs) seem to be - among all the available methodologies- one of the most efficient methods able to approach transport network design and optimization. Particularly, this paper will focus the attention on the possibility of modelling and optimizing Public Bus Networks by means of GAs. In the proposed algorithm, the specific class of Cumulative GAs(CGAs) will be used for solving the first level of the network optimization problem, while a classical assignment model ,or alternatively a neural network approach ,will be adopted for the Fitness Function(FF) evaluation. CGAs will then be utilized in order to generate new populations of networks, which will be evaluated by means of a suitable software package. For each new solution some indicators will be calculated .A unique FF will be finally evaluated by means of a multicriteria method. Altough the research is still in a preliminary stage, the emerging first results concerning numerical cases show very good perspectives for this new approach. A test in real cases will also follow.
    • …
    corecore