DNA ANALYSIS USING GRAMMATICAL INFERENCE
An accurate language definition capable of distinguishing between coding and non-coding DNA has important applications and analytical significance to the field of computational biology. The method proposed here uses positive sample grammatical inference and statistical information to infer languages for coding DNA.
An algorithm is proposed that searches for an optimal subset of input sequences for the inference of regular grammars by optimizing a relevant accuracy metric. The algorithm does not guarantee finding the optimal subset; however, testing shows improvement in accuracy and performance over the basis algorithm.
Testing shows that the inferred languages for components of DNA are consistently accurate. Using the proposed algorithm, languages are inferred for coding DNA with average conditional probability over 80%. This reveals that languages for components of DNA can be inferred and are useful independent of the process that created them. These languages can then be analyzed or used for other tasks in computational biology.
To illustrate potential applications of regular grammars for DNA components, an inferred language for exon sequences is applied as a post-processing step to Hidden Markov exon prediction to reduce the number of wrongly detected exons and significantly improve the specificity of the model.
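As a rough illustration of inference from positive samples only, the sketch below builds a prefix tree acceptor, a common starting point for regular grammar inference. It is not the algorithm of the abstract above, which is not specified here; the function names `build_pta` and `accepts` and the toy sequences are assumptions made for this example.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Transitions = Dict[Tuple[int, str], int]


def build_pta(samples: Iterable[str]) -> Tuple[Transitions, Set[int]]:
    """Build a prefix tree acceptor (PTA) from positive sample strings.

    State 0 is the root. Each distinct prefix of a sample gets its own state,
    and the state reached at the end of a sample is marked accepting.
    """
    transitions: Transitions = {}
    accepting: Set[int] = set()
    next_state = 1
    for sample in samples:
        state = 0
        for symbol in sample:
            key = (state, symbol)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accepting.add(state)
    return transitions, accepting


def accepts(transitions: Transitions, accepting: Set[int], sequence: str) -> bool:
    """Return True if the automaton accepts the given sequence."""
    state: Optional[int] = 0
    for symbol in sequence:
        state = transitions.get((state, symbol))
        if state is None:
            return False  # no transition defined: reject
    return state in accepting


# Hypothetical toy samples, not real coding DNA.
pta_transitions, pta_accepting = build_pta(["ATG", "ATGA", "ATC"])
```

A PTA accepts exactly the training samples; practical inference methods generalize it by merging states, and choosing which samples to feed in is where a subset search of the kind described above would apply.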
One-Step or Two-Step Optimization and the Overfitting Phenomenon: A Case Study on Time Series Classification
For the last few decades, optimization has been developing at a fast rate.
Bio-inspired optimization algorithms are metaheuristics inspired by nature.
These algorithms have been applied to solve different problems in engineering,
economics, and other domains. Bio-inspired algorithms have also been applied in
different branches of information technology such as networking and software
engineering. Time series data mining is a field of information technology that
has its share of these applications too. In previous work we showed how
bio-inspired algorithms such as genetic algorithms and differential
evolution can be used to find the locations of the breakpoints used in the
symbolic aggregate approximation representation of time series, and in another
work we showed how particle swarm optimization, one of the best-known
bio-inspired algorithms, can be used to set weights for the different segments in the
symbolic aggregate approximation representation. In this paper we present, in
two different approaches, a new meta optimization process that produces optimal
locations of the breakpoints in addition to optimal weights of the segments.
Our time series classification experiments show an
interesting example of how the overfitting phenomenon, a frequently encountered
problem in data mining which happens when the model overfits the training set,
can interfere in the optimization process and hide the superior performance of
an optimization algorithm.
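To make the two optimized quantities concrete, here is a minimal sketch of the symbolic aggregate approximation (SAX) with caller-supplied breakpoints and per-segment weights. The weighted form of the distance and the names `sax_word` and `weighted_mindist` are assumptions for illustration, and the sketch assumes the series length is divisible by the number of segments.

```python
import bisect
import math
from typing import List, Sequence


def znorm(series: Sequence[float]) -> List[float]:
    """Z-normalize a series (zero mean, unit variance)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in series]


def paa(series: Sequence[float], segments: int) -> List[float]:
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    size = len(series) // segments  # assumes len(series) % segments == 0
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(segments)]


def sax_word(series: Sequence[float], segments: int,
             breakpoints: Sequence[float]) -> List[int]:
    """Map each PAA value to a symbol index using sorted breakpoints."""
    return [bisect.bisect_right(breakpoints, v) for v in paa(znorm(series), segments)]


def symbol_dist(a: int, b: int, breakpoints: Sequence[float]) -> float:
    """Lower-bounding distance between two symbols (0 for equal or adjacent symbols)."""
    if abs(a - b) <= 1:
        return 0.0
    return breakpoints[max(a, b) - 1] - breakpoints[min(a, b)]


def weighted_mindist(word_a: Sequence[int], word_b: Sequence[int],
                     breakpoints: Sequence[float], weights: Sequence[float],
                     series_length: int) -> float:
    """MINDIST-style distance with one weight per segment (assumed form)."""
    total = sum(w * symbol_dist(a, b, breakpoints) ** 2
                for a, b, w in zip(word_a, word_b, weights))
    return math.sqrt(series_length / len(word_a)) * math.sqrt(total)
```

In this framing, a one-step approach would search over `breakpoints` and `weights` jointly, while a two-step approach would fix one before tuning the other; either way, the objective is evaluated on training data, which is where overfitting can enter.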
Optimal Phase Swapping in Low Voltage Distribution Networks Based on Smart Meter Data and Optimization Heuristics
In this paper a modified version of the Harmony Search (HS) algorithm is proposed as a novel tool for phase swapping in Low Voltage Distribution Networks, where the objective is to determine to which phase each load should be connected in order to reduce the unbalance when all phases are added into the neutral conductor. Unbalanced loads deteriorate power quality and increase costs of investment and operation. A correct assignment is a direct, effective alternative to prevent voltage peaks and network outages. The main contribution of this paper is the proposal of an optimization model for allocating consumers to phases according to their individual consumption in the low-voltage distribution network, considering mono- and bi-phase connections and using real hourly load patterns, which implies that the computational complexity of the defined combinatorial optimization problem is heavily increased. For this purpose a novel metric function is defined in the proposed scheme. The performance of the HS algorithm has been compared with a classical Genetic Algorithm (GA). The presented results show that HS outperforms GA not only in terms of quality but also in convergence rate, reducing the computational complexity of the proposed scheme while providing mono- and bi-phase connections.
This paper includes partial results of the UPGRID project. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 646.531; for further information check the website: http://upgrid.eu. It was also supported by the Basque Government through the ELKARTEK programme (BID3A and BID3ABI projects)
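The phase assignment problem can be pictured with a small sketch. The unbalance measure below (the spread between the most and least loaded phase, summed over hours) is a hypothetical stand-in, since the paper's metric function is not given here. The exhaustive search is only workable for toy instances and is neither Harmony Search nor a Genetic Algorithm. Only mono-phase connections are modelled, and all names and data are assumptions.

```python
from itertools import product
from typing import List, Sequence, Tuple

NUM_PHASES = 3


def unbalance(loads: Sequence[Sequence[float]], assignment: Sequence[int]) -> float:
    """Sum over hours of (max phase load - min phase load).

    loads[c][h] is the consumption of consumer c in hour h;
    assignment[c] is the phase index (0, 1 or 2) of consumer c.
    """
    hours = len(loads[0])
    total = 0.0
    for h in range(hours):
        phase_load = [0.0] * NUM_PHASES
        for consumer, phase in enumerate(assignment):
            phase_load[phase] += loads[consumer][h]
        total += max(phase_load) - min(phase_load)
    return total


def best_assignment(loads: Sequence[Sequence[float]]) -> Tuple[int, ...]:
    """Brute-force search over all 3**n assignments (toy sizes only)."""
    candidates = product(range(NUM_PHASES), repeat=len(loads))
    return min(candidates, key=lambda a: unbalance(loads, a))


# Hypothetical two-hour load patterns for three identical consumers.
toy_loads: List[List[float]] = [[3.0, 3.0], [3.0, 3.0], [3.0, 3.0]]
toy_best = best_assignment(toy_loads)
```

The search space grows as 3 to the power of the number of consumers, which is why a metaheuristic becomes attractive for realistic networks.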
Designing Algorithms for Optimization of Parameters of Functioning of Intelligent System for Radionuclide Myocardial Diagnostics
The influence of the number of complex components of the fast Fourier transform, used in analyzing polar maps of radionuclide myocardial examination at rest and under stress, on the functional efficiency of a system for diagnosing myocardial pathologies was explored. Their optimal values in the information sense were determined, which makes it possible to increase the efficiency of the algorithms that form the diagnostic decision rules by reducing the capacity of the dictionary of recognition features. Information-extreme sequential cluster algorithms for selecting the dictionary of features, which contains both quantitative and categorical features, were developed, and their results were compared. Modifications of the dictionary selection algorithms were suggested that both increase the speed of the search for the dictionary that is optimal in the information sense and reduce its capacity by 40%. Decision rules that are error-free on the training matrix were obtained, and their accuracy in exam mode asymptotically approaches the limit. It was experimentally confirmed that implementing the proposed training algorithm for the diagnostic system reduced the minimum representative volume of the training matrix from 300 to 81 vectors-implementations of the recognition classes of the functional state of the myocardium.
Automata-based adaptive behavior for economic modeling using game theory
In this paper, we deal with some specific domains of application of game
theory. This is one of the major classes of models in the new approaches to
modelling in the economic domain. For that, we use genetic automata, which allow
us to build adaptive strategies for the players. We explain how the proposed
automata-based formalism - matrix representation of automata with multiplicities -
allows us to define a semi-distance between the strategy behaviors. With these
tools, we are able to generate an automatic process to compute emergent
systems of entities whose behaviors are represented by these genetic automata.
Automata-based Adaptive Behavior for Economical Modelling Using Game Theory
In this chapter, we deal with some specific domains of application of game
theory. This is one of the major classes of models in the new approaches to
modelling in the economic domain. For that, we use genetic automata, which allow
us to build adaptive strategies for the players. We explain how the proposed
automata-based formalism - matrix representation of automata with multiplicities -
allows us to define a semi-distance between the strategy behaviors. With these tools,
we are able to generate an automatic process to compute emergent systems of
entities whose behaviors are represented by these genetic automata.
A new approach for transport network design and optimization
Solving the transportation network optimization problem requires, in most cases, very powerful computing resources, so that classical algorithms are not feasible. One promising alternative is to use stochastic search techniques. In this context, Genetic Algorithms (GAs) seem to be, among all the available methodologies, one of the most efficient methods for approaching transport network design and optimization. In particular, this paper focuses on the possibility of modelling and optimizing Public Bus Networks by means of GAs. In the proposed algorithm, the specific class of Cumulative GAs (CGAs) will be used for solving the first level of the network optimization problem, while a classical assignment model, or alternatively a neural network approach, will be adopted for the Fitness Function (FF) evaluation. CGAs will then be utilized to generate new populations of networks, which will be evaluated by means of a suitable software package. For each new solution some indicators will be calculated. A unique FF will finally be evaluated by means of a multicriteria method. Although the research is still at a preliminary stage, the first results on numerical cases show very good prospects for this new approach. A test on real cases will also follow.