31 research outputs found

    The model of proteolysis

    This document presents an original approach to estimating the parameters of the proteolysis process. The data used to fit the model are taken from mass spectrometric experiments, and the parameters are estimated with the Levenberg-Marquardt algorithm. The model is motivated by the hypothesis that cancer patients can be discriminated from healthy donors based on the activity of peptide-cleaving enzymes (i.e., peptidases)
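
    The abstract mentions fitting degradation-rate parameters to mass-spectrometric data with the Levenberg-Marquardt algorithm. As a rough illustration of that kind of fit, the sketch below estimates the rates of a toy exponential-decay model with SciPy's Levenberg-Marquardt least-squares routine; the model form, time points, and data are hypothetical placeholders, not the authors' actual proteolysis model.

# Illustrative sketch only: a toy exponential-decay "proteolysis" model fitted
# with a Levenberg-Marquardt least-squares routine. The model form, parameter
# values, and data below are hypothetical placeholders, not the authors' model.
import numpy as np
from scipy.optimize import least_squares

def predicted_intensity(rates, times):
    # Toy model: each peptide's signal decays exponentially with its own cleavage rate.
    return np.exp(-np.outer(times, rates))

def residuals(rates, times, observed):
    return (predicted_intensity(rates, times) - observed).ravel()

times = np.array([0.0, 1.0, 2.0, 4.0])             # hypothetical sampling times
true_rates = np.array([0.3, 1.2])                  # hypothetical "true" cleavage rates
observed = predicted_intensity(true_rates, times)  # stand-in for measured MS intensities

fit = least_squares(residuals, x0=np.ones(2), args=(times, observed), method="lm")
print("estimated rates:", fit.x)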

    Metody obliczeniowe dla wielkoskalowych danych w diagnostyce medycznej (Computational methods for large-scale data in medical diagnostics)

    This thesis covers the fast and reliable processing of high-throughput biomedical data that is currently needed in genetics and proteomics, and concentrates on these two rapidly developing research areas in the life sciences. First, we perform a systematic analysis of the human reference genome build in the context of its potential local instability caused by recurrent genomic rearrangements such as deletions, duplications, and inversions. For deletions and duplications, our approach also enables the analysis of a large and unique database of clinical cases. Secondly, we present various analyses of mass spectrometry data. In particular, we model isotopic distributions at several levels of accuracy, considering both the aggregated and the fine isotopic structure. We also present case studies involving high-throughput processing that are potentially applicable in proteomics and lipidomics. Of note, this thesis is also an example of an interdisciplinary approach to basic science, in which a deeper understanding of both the biomedical and the computational aspects can be mutually beneficial.
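
    To make the distinction between the aggregated and the fine isotopic structure mentioned above concrete, the hedged sketch below enumerates the isotopologues of a tiny example molecule (carbon monoxide) and then groups them by the number of extra neutrons; the isotope masses and abundances are approximate natural values, and the example is not taken from the thesis itself.

# Illustrative sketch of "fine" vs "aggregated" isotopic structure for carbon
# monoxide (CO). The fine structure lists every isotopologue separately; the
# aggregated structure merges isotopologues by total number of extra neutrons.
# Masses and abundances are approximate natural values.
from itertools import product

CARBON = [(12.00000, 0.9893, 0), (13.00335, 0.0107, 1)]                     # (mass, abundance, extra neutrons)
OXYGEN = [(15.99491, 0.99757, 0), (16.99913, 0.00038, 1), (17.99916, 0.00205, 2)]

fine, aggregated = [], {}
for (mc, pc, nc), (mo, po, no) in product(CARBON, OXYGEN):
    mass, prob, shift = mc + mo, pc * po, nc + no
    fine.append((round(mass, 5), prob))                       # one fine-structure peak per isotopologue
    aggregated[shift] = aggregated.get(shift, 0.0) + prob     # merged aggregated peak

print("fine structure:", sorted(fine))
print("aggregated:", dict(sorted(aggregated.items())))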

    MIND: A Double-Linear Model To Accurately Determine Monoisotopic Precursor Mass in High-Resolution Top-Down Proteomics

    Top-down proteomics approaches are becoming ever more popular, due to the advantages offered by knowledge of the intact protein mass in correctly identifying the various proteoforms that potentially arise due to point mutation, alternative splicing, post-translational modifications, etc. Usually, the average mass is used in this context; however, it is known that this can fluctuate significantly due to both natural and technical causes. Ideally, one would prefer to use the monoisotopic precursor mass, but this falls below the detection limit for all but the smallest proteins. Methods that predict the monoisotopic mass based on the average mass are potentially affected by imprecisions associated with the average mass. To address this issue, we have developed a framework based on simple, linear models that allows prediction of the monoisotopic mass based on the exact mass of the most-abundant (aggregated) isotope peak, which is a robust measure of mass, insensitive to the aforementioned natural and technical causes. This linear model was tested experimentally, as well as in silico, and typically predicts monoisotopic masses to within a few parts per million. A confidence measure is associated with the predicted monoisotopic mass to handle the off-by-one-Da prediction error. Furthermore, we introduce a correction function to extract the “true” (i.e., theoretical) most-abundant isotope peak from a spectrum, even if the observed isotope distribution is distorted by noise or poor ion statistics. The method is available online as an R shiny app: https://valkenborg-lab.shinyapps.io/mind
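
    As a hedged sketch of the kind of linear relationship the abstract describes, the code below fits the mass difference between the most-abundant aggregated isotope peak and the monoisotopic peak as a linear function of the observed mass, and snaps the predicted offset to a whole number of isotope spacings to avoid off-by-one-Da errors. The fitting procedure, spacing constant, and example data are illustrative stand-ins, not the published MIND model or its confidence measure.

# Illustrative sketch only: a linear offset model from most-abundant peak mass to
# monoisotopic mass. Coefficients are fitted to toy data, not MIND's parameters.
import numpy as np

ISOTOPE_SPACING = 1.00235  # approximate average spacing between aggregated isotope peaks (Da)

def fit_offset_model(most_abundant, monoisotopic):
    """Fit offset = a + b * most_abundant_mass by ordinary least squares (toy stand-in)."""
    offsets = most_abundant - monoisotopic
    b, a = np.polyfit(most_abundant, offsets, deg=1)
    return a, b

def predict_monoisotopic(most_abundant_mass, a, b):
    """Predict the monoisotopic mass from the most-abundant aggregated peak mass."""
    offset = a + b * most_abundant_mass
    n = round(offset / ISOTOPE_SPACING)          # snap to an integer number of isotope spacings
    return most_abundant_mass - n * ISOTOPE_SPACING

# Hypothetical training pairs (most-abundant peak mass, monoisotopic mass) in Da.
ma = np.array([10005.2, 20011.6, 30018.1])
mono = ma - np.round(0.0006 * ma) * ISOTOPE_SPACING
a, b = fit_offset_model(ma, mono)
print(predict_monoisotopic(25000.0, a, b))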

    Inferring serum proteolytic activity from LC-MS/MS data

    Background: In this paper we deal with modeling the serum proteolysis process from tandem mass spectrometry data. The parameters of the peptide degradation process inferred from LC-MS/MS data correspond directly to the activity of specific enzymes present in the serum samples of patients and healthy donors. Our approach integrates the existing knowledge about peptidase activity stored in the MEROPS database with an efficient procedure for estimating the model parameters. Results: Taking into account the inherent stochasticity of the process, the proteolytic activity is modeled with the chemical master equation (CME). Assuming stationarity of the Markov process, we calculate the expected values of digested peptides in the model. The parameters are fitted to minimize the discrepancy between those expected values and the peptide levels observed in the MS data. The constrained optimization problem is solved with the Levenberg-Marquardt algorithm. Conclusions: Our results demonstrate the feasibility and potential of high-level analysis for LC-MS proteomic data. The estimated enzyme activities give insights into the molecular pathology of colorectal cancer. Moreover, the developed framework is general and can be applied to study proteolytic activity in different systems.
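
    As a hedged illustration of the "stationary Markov process" step described above, the sketch below builds a toy three-state digestion chain, solves for its stationary distribution, and takes an expectation under it. The states, rate matrix, and fragment counts are placeholders; this is not the chemical master equation model actually used in the paper.

# Illustrative sketch only: stationary distribution of a small continuous-time
# Markov chain and an expected value under it. The three-state chain
# (intact / partially digested / fully digested) and its rates are toy values.
import numpy as np

# Toy generator (rate) matrix Q: rows sum to zero, off-diagonal entries are
# transition rates between hypothetical digestion states.
Q = np.array([
    [-0.7,  0.7,  0.0],   # intact peptide -> partially digested
    [ 0.0, -0.4,  0.4],   # partially digested -> fully digested
    [ 0.1,  0.0, -0.1],   # toy replenishment back to the intact state
])

# The stationary distribution pi solves pi @ Q = 0 with pi summing to one.
A = np.vstack([Q.T, np.ones(3)])
b = np.concatenate([np.zeros(3), [1.0]])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

fragments_per_state = np.array([0.0, 1.0, 2.0])   # hypothetical fragment counts per state
print("stationary distribution:", pi)
print("expected fragments:", pi @ fragments_per_state)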

    Computational planning of the synthesis of complex natural products

    Training algorithms to computationally plan multistep organic syntheses has been a challenge for more than 50 years(1-7). However, the field has progressed greatly since the development of early programs such as LHASA(1,7), for which reaction choices at each step were made by human operators. Multiple software platforms(6,8-14) are now capable of completely autonomous planning. But these programs 'think' only one step at a time and have so far been limited to relatively simple targets, the syntheses of which could arguably be designed by human chemists within minutes, without the help of a computer. Furthermore, no algorithm has yet been able to design plausible routes to complex natural products, for which much more far-sighted, multistep planning is necessary(15,16) and closely related literature precedents cannot be relied on. Here we demonstrate that such computational synthesis planning is possible, provided that the program's knowledge of organic chemistry and data-based artificial intelligence routines are augmented with causal relationships(17,18), allowing it to 'strategize' over multiple synthetic steps. Using a Turing-like test administered to synthesis experts, we show that the routes designed by such a program are largely indistinguishable from those designed by humans. We also successfully validated three computer-designed syntheses of natural products in the laboratory. Taken together, these results indicate that expert-level automated synthetic planning is feasible, pending continued improvements to the reaction knowledge base and further code optimization. A synthetic route-planning algorithm, augmented with causal relationships that allow it to strategize over multiple steps, can design complex natural-product syntheses that are indistinguishable from those designed by human experts
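
    The abstract describes multistep retrosynthetic planning only at a high level. Purely as a generic, hedged illustration of what a recursive retrosynthetic search looks like, the toy sketch below expands hypothetical product-to-precursor templates until it reaches hypothetical purchasable building blocks; the template table, molecule names, and depth limit are invented for the example and bear no relation to the program evaluated in the paper.

# Illustrative sketch only: a toy recursive retrosynthetic search over
# hypothetical product -> precursors rules, stopping at hypothetical
# purchasable building blocks. A generic textbook-style planner, not the
# algorithm from the paper.
from typing import Dict, List, Optional, Tuple

TEMPLATES: Dict[str, List[Tuple[str, ...]]] = {
    "target":         [("intermediate_A", "intermediate_B")],
    "intermediate_A": [("building_block_1",)],
    "intermediate_B": [("building_block_2", "building_block_3")],
}
PURCHASABLE = {"building_block_1", "building_block_2", "building_block_3"}

def plan(molecule: str, depth: int = 5) -> Optional[list]:
    """Return a list of retrosynthetic steps, or None if no route is found."""
    if molecule in PURCHASABLE:
        return []                       # nothing left to make
    if depth == 0 or molecule not in TEMPLATES:
        return None                     # dead end within the search horizon
    for precursors in TEMPLATES[molecule]:
        subroutes = [plan(p, depth - 1) for p in precursors]
        if all(r is not None for r in subroutes):
            return [(molecule, precursors)] + [step for r in subroutes for step in r]
    return None

print(plan("target"))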

    BRAIN 2.0: time and memory complexity improvements in the algorithm for calculating the isotope distribution

    Recently, an elegant iterative algorithm called BRAIN (Baffling Recursive Algorithm for Isotopic distributioN calculations) was presented. The algorithm is based on the classic polynomial method for calculating aggregated isotope distributions, and it introduces algebraic identities using the Newton-Girard and Viète formulae to solve the problem of polynomial expansion. Due to the iterative nature of the BRAIN method, the calculations must start from the lightest isotope variant. As such, the complexity of BRAIN scales quadratically with the mass of the putative molecule, since it depends on the number of aggregated peaks that need to be calculated. In this manuscript, we suggest two improvements to the algorithm that decrease both the time and the memory complexity of obtaining the aggregated isotope distribution. We also illustrate a concept for representing the element isotope distributions in a generic manner. This representation makes it possible to omit the root calculation of the element polynomial required in the original BRAIN method. A generic formulation for the roots is of special interest for higher-order element polynomials, so that root-finding algorithms and their inaccuracies can be avoided. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s13361-013-0796-5) contains supplementary material, which is available to authorized users
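
    The "classic polynomial method" that BRAIN builds on can be illustrated briefly: the aggregated isotope distribution of a molecule is obtained by multiplying one abundance polynomial per atom, which the hedged sketch below does by repeated convolution for glycine (C2H5NO2) using approximate natural isotope abundances. This shows only the underlying polynomial expansion, not the Newton-Girard/Viète recursion introduced by BRAIN or the improvements described in the paper.

# Illustrative sketch of the classic polynomial method: the aggregated isotope
# distribution is the product of per-atom abundance polynomials, computed here
# by repeated convolution. Abundances are approximate natural values; glycine
# (C2H5NO2) is just an example, and this is not the BRAIN recursion itself.
import numpy as np

# Approximate abundance polynomials, indexed by extra neutrons (+0, +1, +2, ...).
ELEMENT_POLY = {
    "C": np.array([0.9893, 0.0107]),
    "H": np.array([0.999885, 0.000115]),
    "N": np.array([0.99636, 0.00364]),
    "O": np.array([0.99757, 0.00038, 0.00205]),
}

def aggregated_distribution(formula: dict, n_peaks: int = 6) -> np.ndarray:
    """Probabilities of the first n_peaks aggregated isotope variants."""
    dist = np.array([1.0])
    for element, count in formula.items():
        for _ in range(count):
            dist = np.convolve(dist, ELEMENT_POLY[element])
    return dist[:n_peaks]

print(aggregated_distribution({"C": 2, "H": 5, "N": 1, "O": 2}))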

    Navigating around Patented Routes by Preserving Specific Motifs along Computer-Planned Retrosynthetic Pathways

    By keeping track of lists of specific bonds one wishes to preserve, a computer program is able to identify the key disconnections used in the patented syntheses and design synthetic routes that circumvent these approaches. Here, we provide examples of computer-designed syntheses relevant to medicinal chemistry, in which the machine avoids "strategic" disconnections common to industrial patents and is forced to use different starting materials. The ability of modern retrosynthetic planners to navigate around patented solutions may have significant implications for the ways in which intellectual property related to multistep syntheses is protected and/or challenged
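
    As a hedged sketch of the bond-preservation idea described above, the code below represents bonds as pairs of atom indices and filters out any candidate retrosynthetic disconnection that would break a bond on the user's preserve list; the bond indices and candidate disconnections are invented for the example and this is not the actual planner's data model.

# Illustrative sketch only: filtering retrosynthetic disconnections against a
# set of bonds the user wants to preserve (e.g. bonds whose cleavage would
# reproduce a patented "strategic" disconnection). Bonds are modeled as frozen
# pairs of atom indices; all values below are hypothetical.
from typing import FrozenSet, List, Set

Bond = FrozenSet[int]

def allowed_disconnections(candidates: List[Set[Bond]],
                           protected: Set[Bond]) -> List[Set[Bond]]:
    """Keep only disconnections that break no protected bond."""
    return [d for d in candidates if not (d & protected)]

protected_bonds = {frozenset({3, 4})}                    # bond we refuse to cut
candidates = [{frozenset({3, 4})},                       # the "patented" disconnection
              {frozenset({7, 8}), frozenset({1, 2})}]    # an alternative disconnection
print(allowed_disconnections(candidates, protected_bonds))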