79 research outputs found
On the role of metaheuristic optimization in bioinformatics
Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail molecular docking, protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From this analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics.
Metaheurísticas, optimización multiobjetivo y paralelismo para descubrir motifs en secuencias de ADN
The resolution of complex problems with evolutionary algorithms is one of the most researched topics in Computer Science.
The main goal of this thesis is to develop new algorithms that solve such problems in the least possible computational time while improving on the results achieved by existing methods. To this end, we combine three important concepts: metaheuristics, multiobjective optimization, and parallelism. We first looked for a significant optimization problem that had not yet been solved efficiently and found the Motif Discovery Problem (MDP). The MDP aims to discover short over-represented patterns (motifs) in a set of DNA sequences that may have some biological significance. To address it, we defined a multiobjective formulation adjusted to real-world biological requirements and implemented ten algorithms of different natures (population-based, trajectory-based, collective intelligence...), analyzing aspects such as their ability to scale and converge. Finally, we designed several parallel techniques, using programming environments such as OpenMP and MPI, that try to combine the properties of several metaheuristics in a single application. The results are studied in detail through numerous statistical tests, and the predictions are compared with those discovered by thirteen well-known biological tools.
The conclusions demonstrate that using multiobjective optimization in metaheuristic techniques favors the discovery of quality solutions, and that parallelism is useful for combining the evolutionary properties of different algorithms.
Ministerio de Economía y Competitividad - FEDER (TIN2008-06491-C04-04; TIN2012-30685)
Gobierno de Extremadura (GR10025-TIC015
New Fundamental Technologies in Data Mining
The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.
Using MapReduce Streaming for Distributed Life Simulation on the Cloud
Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
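As a rough illustration of how one generation of discrete Conway's life can be phrased as a map step and a reduce step, here is a minimal in-memory sketch. It is not the authors' streaming algorithm and does not implement strip partitioning; the function names (life_map, life_reduce, step) and the set-of-coordinates representation are assumptions made for this example, and the shuffle phase is simulated with a dictionary rather than run on a real MapReduce cluster.

```python
from collections import defaultdict
from itertools import product


def life_map(cell):
    """Map step (hypothetical helper): a live cell announces itself
    and contributes a count of 1 to each of its eight neighbours."""
    x, y = cell
    yield (x, y), "alive"
    for dx, dy in product((-1, 0, 1), repeat=2):
        if (dx, dy) != (0, 0):
            yield (x + dx, y + dy), 1


def life_reduce(cell, values):
    """Reduce step (hypothetical helper): apply Conway's rules to the
    values collected for one cell and emit the cell if it is alive next."""
    alive = "alive" in values
    neighbours = sum(v for v in values if v != "alive")
    if neighbours == 3 or (alive and neighbours == 2):
        yield cell


def step(live_cells):
    """One generation: map, group by key (the simulated shuffle), reduce."""
    groups = defaultdict(list)
    for cell in live_cells:
        for key, value in life_map(cell):
            groups[key].append(value)
    return {c for key, values in groups.items()
            for c in life_reduce(key, values)}


# A vertical "blinker" flips to horizontal after one generation.
blinker = {(1, 0), (1, 1), (1, 2)}
print(sorted(step(blinker)))
```

Because each map call looks at a single live cell and each reduce call at a single key, the two steps could in principle be distributed across workers; how the work is partitioned in practice is exactly what the abstract's optimization techniques address.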
Discovering multi-purpose modules through deep multitask learning
Machine learning scientists aim to discover techniques that can be applied across diverse sets of problems. Such techniques need to exploit regularities that are shared across tasks. This begs the question: What shared regularity is not yet being exploited? Complex tasks may share structure that is difficult for humans to discover. The goal of deep multitask learning is to discover and exploit this structure automatically by training a joint model across tasks. To this end, this dissertation introduces a deep multitask learning framework for collecting generic functional modules that are used in different ways to solve different problems. Within this framework, a progression of systems is developed based on assembling shared modules into task models and leveraging the complementary advantages of gradient descent and evolutionary optimization. In experiments, these systems confirm that modular sharing improves performance across a range of application areas, including general video game playing, computer vision, natural language processing, and genomics, yielding state-of-the-art results in several cases. The conclusion is that multi-purpose modules discovered by deep multitask learning can exceed those developed by humans in performance and generality.
Computer Science
Grain & Noise - Artists in Synthetic Biology Labs: Constructive Disturbances of Art in Science
The collaboration between scientists and artists in the form of Artist-in-Lab residencies may not only cause a productive disturbance for a day's work in the laboratory, but also reveal new ways of understanding. Research and science communication company Biofaction has brought together artists and synthetic biologists throughout Europe in a residence program that spans four truly cross-disciplinary collaborations. The contributors to this volume share their reflections on the dynamic frictions that occurred when their artistic and scientific worlds met. These stories, where chemistry labs, tobacco plants, genetically edited bacteria, and new-to-nature enzymes collide with music, photography, film, and visual arts, infuse the ongoing dialogue between art and sciences with grain, noise, and synergies.
A generic approach to behaviour-driven biochemical model construction
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.
Modelling of biochemical systems has received considerable attention over the last decade from bioengineering, biochemistry, computer science, and mathematics. This thesis investigates the application of computational techniques to computational systems biology, for the construction of biochemical models in terms of topology and kinetic rates. Due to the complexity of biochemical systems, it is natural to construct models representing them incrementally, in a piecewise manner. The syntax and semantics of two patterns are defined for the instantiation of components, which are extendable, reusable, and fundamental building blocks for model composition. We propose and implement a set of genetic operators and composition rules to tackle the issues of composing models piecewise from scratch. Quantitative Petri nets are evolved by the genetic operators, and the evolutionary process of modelling is guided by the composition rules. Metaheuristic algorithms are widely applied in BioModel Engineering to support intelligent and heuristic analysis of biochemical systems in terms of structure and kinetic rates. We illustrate the parameters of biochemical models based on Biochemical Systems Theory, and then the topology and kinetic rates of the models are manipulated by employing an evolution strategy and simulated annealing, respectively. A new hybrid modelling framework is proposed and implemented for model construction. Two heuristic algorithms are performed on two embedded layers in the hybrid framework: an outer layer for topology mutation and an inner layer for rate optimization. Moreover, variants of the hybrid piecewise modelling framework are investigated. Owing to the flexibility of these variants, various combinations of evolutionary operators, evaluation criteria, and design principles can be taken into account.
We examine the performance of five sets of variants on specific aspects of modelling. The comparison is not intended to show that one variant clearly outperforms the others; rather, it indicates which features are important for various aspects of modelling. Because of the very heavy computational demands, the modelling process is parallelized by employing a grid environment, GridGain. Applying GridGain and heuristic algorithms to analyze biological processes can support the computational modelling of biochemical systems, which can also benefit mathematical modelling in computer science and bioengineering. We apply our proposed modelling framework to model biochemical systems in a hybrid piecewise manner. Modelling variants of the framework are comparatively studied on specific aims of modelling. Simulation results show that our modelling framework can compose synthetic models exhibiting similar species behaviour, generate models with alternative topologies, and obtain general knowledge about key modelling features.
3rd EGEE User Forum
We have organized this book as a sequence of chapters, each associated with an application or technical theme and introduced by an overview of the contents and a summary of the main conclusions coming from the Forum for the chapter topic. The first chapter gathers all the plenary session keynote addresses, and following this there is a sequence of chapters covering the application-flavoured sessions. These are followed by chapters with the flavour of Computer Science and Grid Technology. The final chapter covers the large number of practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, and so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum.