24 research outputs found

    Designing Focused Chemical Libraries Enriched in Protein-Protein Interaction Inhibitors using Machine-Learning Methods

    Protein-protein interactions (PPIs) may represent one of the next major classes of therapeutic targets. So far, only a minute fraction of the estimated 650,000 PPIs that comprise the human interactome are known, and only a tiny number of complexes have been drugged. Such intricate biological systems cannot be tackled cost-efficiently with conventional high-throughput screening methods. Rather, the time has come to design new strategies that maximize the chance of hit identification by rationalizing the PPI inhibitor chemical space and designing PPI-focused compound libraries (global or target-specific). Here, we train machine-learning models, mainly decision trees, on a dataset of known PPI inhibitors and regular drugs in order to determine a global physico-chemical profile for putative PPI inhibitors. This statistical analysis reveals two important molecular descriptors for PPI inhibitors, characterizing specific molecular shapes and the presence of a privileged number of aromatic bonds. The best model has been implemented as a computer program, PPI-HitProfiler, that can extract from any drug-like compound collection a focused chemical library enriched in putative PPI inhibitors. Our PPI inhibitor profiler is challenged on the experimental screening results of 11 different PPIs, among them the p53/MDM2 interaction screened on our own CDithem platform, which, in addition to validating our concept, led to the identification of 4 novel p53/MDM2 inhibitors. Collectively, our tool behaves robustly on the 11 experimental datasets, correctly profiling 70% of the experimentally identified hits while removing 52% of the inactive compounds from the initial compound collections. We strongly believe that this new tool can be used as a global PPI inhibitor profiler prior to screening assays, reducing the size of the compound collections to be experimentally screened while keeping most of the true PPI inhibitors.
PPI-HitProfiler is freely available on request from our CDithem platform website, www.CDithem.com.
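
To make the reported figures concrete, the following is a minimal arithmetic sketch of what keeping 70% of hits while removing 52% of inactives could mean for the hit rate of a filtered library. The 1% starting hit rate and the function name `enriched_hit_rate` are illustrative assumptions, not values or code from the paper.

```python
def enriched_hit_rate(base_hit_rate, hit_recall=0.70, inactive_removed=0.52):
    """Hit rate of the focused library after filtering.

    hit_recall:       fraction of true hits the filter keeps (70% in the abstract)
    inactive_removed: fraction of inactives the filter discards (52% in the abstract)
    """
    kept_hits = hit_recall * base_hit_rate
    kept_inactives = (1.0 - inactive_removed) * (1.0 - base_hit_rate)
    return kept_hits / (kept_hits + kept_inactives)


base = 0.01  # hypothetical 1% hit rate before filtering (assumption)
focused = enriched_hit_rate(base)
enrichment = focused / base  # how many times richer in hits the focused library is
print(f"hit rate {base:.2%} -> {focused:.2%}, enrichment x{enrichment:.2f}")
```

Under these assumptions the focused library is roughly half the size of the original and about 1.45 times richer in hits; the actual gain depends on the real hit rate of each screen.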

    Etude des Algorithmes génétiques et application aux données de protéomique [Study of genetic algorithms and application to proteomics data]

    Genetic algorithms are optimization methods for solving complex problems, and they are likely to play a useful role in proteomics. This fairly recent discipline studies the proteins of individuals and produces high-dimensional data. The first part covers the history and operation of genetic algorithms and introduces some theoretical results. The next part presents the construction of a genetic algorithm for biomarker selection in mass spectrometry and for the alignment of 2D electrophoresis gels; it highlights the difficulty of choosing an appropriate criterion to optimize. The last part deals with theoretical results. Convergence of elitist genetic algorithms is proved in the non-homogeneous case and with directed mutations. We then build a convergence criterion that combines theoretical foundations with practical applicability, based on occurrences of the locally optimal solution. Finally, introducing catastrophic events is shown to be effective in overcoming certain convergence problems in practice.
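
The two ideas named in this abstract, elitism and catastrophic events, can be illustrated with a small sketch. This is not the thesis's algorithm: the OneMax fitness, tournament selection, one-point crossover, and all parameter values (including `stall_limit`) are assumptions chosen only to keep the example short.

```python
import random


def onemax(bits):
    """Toy fitness (an assumption): the number of 1 bits."""
    return sum(bits)


def elitist_ga(n_bits=20, pop_size=30, generations=200, p_mut=0.05,
               stall_limit=15, seed=0):
    rng = random.Random(seed)

    def new_individual():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    pop = [new_individual() for _ in range(pop_size)]
    best = max(pop, key=onemax)[:]
    history = [onemax(best)]
    stall = 0
    for _ in range(generations):
        children = []
        while len(children) < pop_size - 1:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=onemax)
            p2 = max(rng.sample(pop, 2), key=onemax)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            children.append(child)
        # Elitism: the best individual found so far always survives.
        pop = children + [best[:]]
        challenger = max(pop, key=onemax)
        if onemax(challenger) > onemax(best):
            best, stall = challenger[:], 0
        else:
            stall += 1
        if stall >= stall_limit:
            # Catastrophic event: re-randomize everyone except the elite.
            pop = [new_individual() for _ in range(pop_size - 1)] + [best[:]]
            stall = 0
        history.append(onemax(best))
    return best, history


best, history = elitist_ga()
print(onemax(best), len(history))
```

Because the elite is never discarded, the best fitness in `history` can never decrease; the catastrophe step only restores diversity when the search stalls.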

    Extensions of simple component analysis and simple linear discriminant analysis using genetic algorithms

    Extensions of Simple Component Analysis are proposed. Two methods are obtained: a new Simple Component Analysis and a Simple Linear Discriminant Analysis. Both methods use genetic algorithms to optimize a criterion derived from the usual method, subject to added constraints. The objective is to obtain loadings consisting of a small number of integers that determine blocks of variables. The programs implementing the methods were developed in the R language. Four applications show that the algorithms are robust and yield solutions close to the optimal ones obtained from the usual PCA and LDA.

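    Carta de Pere Pascual a Félix López Deza (Junta de Energía Nuclear, Madrid) demanant transferències a Ramon Pascual per pagaments a Michel, Minnaert i Santisteban [Letter from Pere Pascual to Félix López Deza (Junta de Energía Nuclear, Madrid) requesting transfers to Ramon Pascual for payments to Michel, Minnaert and Santisteban]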

    The palaeohydrological history of the lower flood plain of the Vidourle River during the Little Ice Age is examined by comparing two kinds of proxies: sedimentary archives and attested historical floods. The flood frequency of the Vidourle was established from a high-resolution micromorphological study of the upper three meters of the Lièvre core, sampled near the Vidourle River mouth in the Mauguio coastal lagoon (Hérault). Seven 14C dates were used to estimate the sedimentation rate. The age-depth model shows a break in the sedimentation rate shortly after 1000 AD and a marked acceleration from the seventeenth century onward. Stationarity tests show that the flood series, all classes combined, does not follow a homogeneous distribution. For the Vidourle River, as for all the coastal rivers of eastern Lower Languedoc considered, a very active phase of about a century, from 1680 to 1780 AD, is characterized by an upsurge in “extraordinary” and “catastrophic” floods. This reveals a cyclic variability of the hydroclimatic component. Although sedimentary forcing linked to the proximity of the core to the delta front disturbs the signal, it is possible to identify phases of abundant hydrology responsible for high-energy floods and to place them in the context of land development and climatic fluctuations in the north-western Mediterranean basin during the last millennium.

    A new genetic algorithm in proteomics: Feature selection for SELDI-TOF data

    Mass spectrometry of clinical specimens is used to identify biomarkers for diagnosis, so a reliable method for both feature selection and classification is required. A novel method is proposed to find biomarkers in SELDI-TOF data in order to perform robust classification. The feature selection is based on a new genetic algorithm. For classification, a method based on decision stumps has been developed that takes into account the great variability in intensity. Moreover, as the samples are often small, it is more appropriate to use the decision stumps simultaneously than to build a complete tree. The thresholds of the decision stumps are determined within the same genetic algorithm. Finally, the method was generalized to more than two groups using pairwise coupling. The resulting algorithm was applied to two data sets: a publicly available one containing two groups, which allows a comparison with other methods from the literature, and a new one containing three groups.
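
As a rough illustration of using decision stumps simultaneously rather than chaining them into a tree, the sketch below lets each stump vote independently and takes the majority. The voting rule, the stump encoding, and the thresholds are assumptions made for this example; in the paper the thresholds are determined by the genetic algorithm, and its exact combination rule is not given in the abstract.

```python
def stump_vote(sample, stumps):
    """Classify by majority vote over independent decision stumps.

    Each stump is (feature_index, threshold, polarity). It votes for class 1
    when polarity * (sample[feature_index] - threshold) > 0, else for class 0.
    """
    votes = sum(
        1 for idx, thr, pol in stumps if pol * (sample[idx] - thr) > 0
    )
    return 1 if 2 * votes > len(stumps) else 0


# Hypothetical stumps on three peak intensities (indices 0, 2 and 3).
stumps = [(0, 5.0, +1), (2, 1.5, -1), (3, 10.0, +1)]

sample_a = [7.2, 0.0, 1.0, 12.5]  # all three stumps vote for class 1
sample_b = [3.1, 0.0, 2.0, 11.0]  # only the last stump votes for class 1

print(stump_vote(sample_a, stumps), stump_vote(sample_b, stumps))
```

Each stump sees the full sample, so no stump's decision depends on data being split by an earlier one, which is the appeal of this design when samples are small.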