17 research outputs found
A Class Representative Model for Pure Parsimony Haplotyping under Uncertain Data
The Pure Parsimony Haplotyping (PPH) problem is an NP-hard combinatorial optimization problem that consists of finding the minimum number of haplotypes necessary to explain a given set of genotypes. PPH has attracted increasing attention in recent years because of its importance in the analysis of fine-scale genetic data. Its applications range from mapping complex disease genes to inferring population histories, and include drug design, functional genomics and pharmacogenetics. In this article we investigate, for the first time, a recent version of PPH called the Pure Parsimony Haplotype problem under Uncertain Data (PPH-UD). This version mainly arises when the input genotypes are not accurate, i.e., when some single nucleotide polymorphisms are missing or affected by errors. We propose an exact approach to the solution of PPH-UD based on an extended version of the class representative model for PPH by Catanzaro et al. [1], currently the state-of-the-art integer programming model for PPH. The model is efficient, accurate, compact, polynomial-sized, easy to implement, solvable with any solver for mixed integer programming, and usable in all those cases for which the parsimony criterion is well suited for haplotype estimation.
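To make the PPH definition above concrete, here is a minimal brute-force sketch in Python. It is not the class representative model from the article, only an exhaustive search that is practical for toy instances. The encoding is an assumption: each genotype site is 0 or 1 when homozygous and 2 when heterozygous, and the function names `explains` and `min_haplotypes` are hypothetical.

```python
from itertools import combinations, product


def explains(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) resolves genotype g.

    Assumed encoding: g[i] in {0, 1} means both haplotypes carry that
    allele at site i; g[i] == 2 means the two haplotypes differ at site i.
    """
    for a, b, s in zip(h1, h2, g):
        if s == 2:
            if a == b:
                return False
        elif not (a == b == s):
            return False
    return True


def min_haplotypes(genotypes):
    """Exhaustively search for a smallest haplotype set explaining all genotypes."""
    n = len(genotypes[0])
    candidates = list(product((0, 1), repeat=n))  # all 2^n binary haplotypes
    # Two haplotypes per genotype always suffice, so 2 * len(genotypes) bounds the search.
    for k in range(1, 2 * len(genotypes) + 1):
        for subset in combinations(candidates, k):
            if all(
                any(explains(h1, h2, g) for h1 in subset for h2 in subset)
                for g in genotypes
            ):
                return list(subset)
    return None


genotypes = [(0, 0, 2), (2, 0, 2), (1, 0, 1)]
solution = min_haplotypes(genotypes)
print(len(solution), solution)
```

The search enumerates exponentially many subsets, which is consistent with the NP-hardness noted above and is why exact integer programming models are used on realistic instances.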
Modèles et méthodes pour les études d'association à l'échelle du génome
The interdisciplinary field of systems biology has evolved rapidly over the last few years. Different disciplines have contributed to the development of both its experimental and theoretical branches. Although computational biology has been a growing activity in computer science for more than two decades, it is only in the past few years that optimization models have been increasingly developed and analyzed by researchers whose primary background is Operations Research (OR). This dissertation aims to contribute to the field of computational biology by applying mathematical programming to certain problems in molecular biology. Specifically, we address three problems in the domain of Genome-Wide Association Studies: (i) the Pure Parsimony Haplotyping under Uncertain Data Problem, which consists of finding the minimum number of haplotypes necessary to explain a given set of genotypes containing possible reading errors; (ii) the Parsimonious Loss of Heterozygosity Problem, which consists of partitioning suspected polymorphisms from a set of individuals into a minimum number of deletion areas; and (iii) the Multiple Individuals Polymorphic ALU Insertions Recognition Problem, which consists of finding the set of locations in the genome where ALU sequences are inserted in some individual(s). All three problems are NP-hard combinatorial optimization problems. We therefore analyse their combinatorial structure and propose an exact solution approach for each of them. The proposed models are efficient, accurate, compact, polynomial-sized and usable in all those cases for which the parsimony criterion is well suited for estimation.
La gestione del credito alla clientela
The learning objectives of this chapter are as follows:
• understand what it means for the banking intermediary to assess the customer's creditworthiness and the risk characteristics of the individual loan transaction, and how this is done in practice;
• analyze the role of the rating and of guarantees in the lending process;
• know the main criteria followed by the banking intermediary in the qualitative and quantitative composition of the loan portfolio;
• briefly examine the tools and techniques adopted by intermediaries for the advanced management of credit risk (securitization, credit derivatives).
Organizzazione, controlli e risk management
This chapter analyzes three of the components that together characterize the bank's system of governance:
• organizational elements;
• the definition of the control system;
• risk management activities.
Consequently, the objectives of this chapter are as follows:
• understand the relevance of organizational design to the overall management of the banking firm;
• examine the key features of the internal control system of the credit intermediary;
• analyze the risk management function.
A Branch&Price Algorithm for the Minimum Cost Clique Cover Problem in Max-Point Tolerance Graphs
A point-interval (I_v, p_v) is a pair consisting of an interval I_v of ℝ and a point p_v ∈ I_v. A graph G = (V, E) is a Max-Point-Tolerance (MPT) graph if each vertex v ∈ V can be mapped to a point-interval in such a way that (u, v) is an edge of G iff I_u ∩ I_v ⊇ {p_u, p_v}. MPT graphs constitute a superclass of interval graphs and naturally arise in genetic analysis as a way to represent specific relationships among DNA fragments extracted from a population of individuals. One of the most important applications of MPT graphs concerns the search for an association between major human diseases and chromosome regions from patients that exhibit loss of heterozygosity events. This task can be formulated as a minimum cost clique cover problem in an MPT graph and gives rise to an NP-hard combinatorial optimization problem known in the literature as the Parsimonious Loss of Heterozygosity Problem (PLOHP). In this article, we investigate ways to speed up the best known exact solution algorithm for the PLOHP as well as techniques to enlarge the size of the instances that can be optimally solved. In particular, we present a Branch&Price algorithm for the PLOHP and we develop a number of preprocessing techniques and decomposition strategies to dramatically reduce the size of its instances. Computational experiments show that the proposed approach is 10-30x faster than previous approaches described in the literature, and suggest new directions for the development of future exact solution approaches that may prove of fundamental assistance in practice.
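The edge rule in the MPT definition above can be illustrated with a short Python sketch. It only builds the edge set from given point-intervals and does not implement the Branch&Price algorithm. The data layout (a dict mapping each vertex to a tuple `(lo, hi, p)` with closed intervals) and the function name `mpt_edges` are assumptions made for illustration.

```python
from itertools import combinations


def mpt_edges(point_intervals):
    """Build the edge set of the MPT graph defined by point-intervals.

    point_intervals maps each vertex to (lo, hi, p) with lo <= p <= hi,
    where [lo, hi] is assumed to be a closed interval and p its point.
    Vertices u and v are adjacent iff both points lie in the
    intersection of the two intervals.
    """
    edges = set()
    for u, v in combinations(sorted(point_intervals), 2):
        lo_u, hi_u, p_u = point_intervals[u]
        lo_v, hi_v, p_v = point_intervals[v]
        lo, hi = max(lo_u, lo_v), min(hi_u, hi_v)  # the intersection [lo, hi]
        if lo <= p_u <= hi and lo <= p_v <= hi:
            edges.add((u, v))
    return edges


example = {
    "a": (0, 10, 5),
    "b": (4, 8, 6),
    "c": (7, 12, 9),
    "d": (3, 9, 7),
}
print(sorted(mpt_edges(example)))
```

In this example the intervals of `a` and `c` overlap on [7, 10], yet the two vertices are not adjacent because the point of `a` lies outside the overlap. In the interval graph of the same intervals they would be adjacent, so the point condition can only remove edges.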
Graphical representation of an instance of PPH and the corresponding solution.