12 research outputs found

Bioprospecting for Genes Encoding Hydrocarbon-Degrading Enzymes from Metagenomic Samples Isolated from Northern Adriatic Sea Sediments

Authors: Baranasic Damir; Blažina Maria; Cullum John; Diminic Janko; Gacesa Ranko; Hranueli Daslav; Kolesarić Domagoj; Korlević Marino; Long Paul F.; Najdek-Dragić Mirjana; Orlić Sandi; Oršolić Davor; Starcevic Antonio; Zucko Jurica
Publication venue: 'Faculty of Food Technology and Biotechnology - University of Zagreb'
Publication date: 01/01/2018

Three metagenomic libraries were constructed using surface sediment samples from the northern Adriatic Sea. Two of the samples were taken from a highly polluted and an unpolluted site, respectively. The third sample, from a polluted site, had been enriched using crude oil. The results of the metagenome analyses were incorporated in the REDPET relational database (http://redpet.bioinfo.pbf.hr/REDPET), which was generated using the previously developed MEGGASENSE platform. The database includes taxonomic data to allow the assessment of the biodiversity of metagenomic libraries and a general functional analysis of genes using hidden Markov model (HMM) profiles based on the KEGG database. A set of 22 specialised HMM profiles was developed to detect putative genes for hydrocarbon-degrading enzymes. Use of these profiles showed that the metagenomic library generated after selection on crude oil was enriched in genes for aerobic n-alkane degradation. The use of this system for bioprospecting was exemplified using potential alkB and almA genes from this library.

Directory of Open Access Journals

Full-text Institutional Repository of the Ruđer Bošković Institute

King's Research Portal

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

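Optimiranje metoda i reprezentacija za prediktivno modeliranje mehanizama djelovanja i afiniteta vezanja biološki aktivnih molekula [Optimisation of methods and representations for predictive modelling of the mechanisms of action and binding affinities of biologically active molecules]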

Author: Oršolić Davor
Publication venue: University of Zagreb. Faculty of Food Technology and Biotechnology. Department of Biochemical Engineering. Section for Bioinformatics.
Publication date: 01/01/2023

The chemical space of compound scaffolds is vast, and it represents a large playground for lead drug discovery or repurposing. With the accumulation of experimental data over the years, together with the development of more complex statistical frameworks, screening of such elaborate chemical spaces is finally possible. There are several well-defined problem areas for drug screening efforts, the most popular being the inhibition of a multitude of protein targets in human cells that are related to common diseases. Some examples of highly targeted protein spaces include protein kinases, G-protein-coupled receptors, and the targets of (non)selective serotonin reuptake inhibitors. Mutation and dysregulation in any of these three protein groups can result in hereditary disorders, tumors, and mental disorders. In contrast to the machine learning frameworks available for predicting direct physical interactions between compounds and protein targets, certain chemical activity predictions, such as phytotoxic activity, are not well represented or defined in the literature. In this work, publicly available data are collected on the experimentally measured binding affinities of diverse compounds against one of the most popular target protein families, the protein kinases. This protein superfamily is one of the most important enzyme groups, responsible for regulating most of the important cellular processes, including cell metabolism, cell growth, and division. Protein kinases regulate biochemical cycles by transferring a high-energy phosphoryl group from adenosine triphosphate (ATP) to specific amino acid residues of the target protein substrates. All members of this enzyme family are characterised by the highly conserved protein kinase (PK) domain, but depending on the phosphorylation site and the activation mechanisms of individual members, the superfamily can be divided into several kinase groups.
Due to the specific characteristics of this protein group and of kinase inhibitors, it is important to investigate how each of these chemical or biological spaces affects model performance and how to achieve better predictive performance. On the other hand, we examine a different subspace of biological activity, focusing mostly on synthetic compounds with determined phytotoxic or herbicidal activity. We define this problem as a multiclass classification problem using two predefined classification systems: the main one by the Herbicide Resistance Action Committee (HRAC) and the second by the Weed Science Society of America (WSSA). Because no defined machine learning framework for modeling and predicting herbicidal activity was publicly available, an effort was made to collect a representative data set and define the optimal computational approach to maximize the accuracy of mode of action (MoA) prediction. Because the classification of phytotoxic compounds has mostly been performed by visual inspection of phenotypic changes in the affected weeds, there is a great need for an automated, systematic approach to this endeavor. Due to the limited size of the collected data, consisting of molecular structures of known activity labeled with a MoA group, we tested several “shallow” learners. The panel of tested algorithms includes naive Bayes (NB), support vector machines (SVM), extreme gradient boosting (XGBoost) and random forest (RF). All of these approaches were trained in ten-times-repeated ten-fold (10x10-fold) cross-validation. The trained models were compared over all hundred resamples using a non-frequentist approach, Bayesian analysis. For the first time in herbicide activity modeling, we have implemented a computational framework that runs from feature processing and selection to the training of several learners and, ultimately, a statistical comparison of their performance.
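The 10x10-fold resampling scheme mentioned in this abstract can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the thesis's actual pipeline; the function name `repeated_kfold`, the seed, and the sample count of 50 are assumptions made for the example.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle per repeat gives different folds
        # Deal the shuffled indices into n_splits folds of near-equal size.
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for k, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train_idx, test_idx


# Hypothetical data set of 50 compounds: 10 repeats x 10 folds = 100 resamples.
resamples = list(repeated_kfold(n_samples=50))
print(len(resamples))
```

Under this scheme each learner would be fitted and scored once per resample, so every learner ends up with 100 scores that a later statistical comparison can use.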
However, due to the sheer size of the publicly available experimental data for protein kinase inhibitors, modeling of physical interactions between small compound spaces and the human kinome has allowed for more complex modeling techniques - but has also been more challenging in defining and engineering the feature space for over 8000 compounds and the nuance of the protein kinase family. Both of the aforementioned methods are founded on the QSAR (Quantitative structure-activity relationship) modeling principles. The definition of the applicability domain (AD) for a specified problem is one of the pillars of QSAR modeling. However, defining the boundaries of the chemical space within which the model can make accurate predictions is not simple and is dependent on the nature of the trained model. In the case of predicting general biological activity in the form of a phenotypic signal, as is the case with herbicidal activity, the applicability domain can be simply defined in two-dimensional space by considering the structural similarity of available molecules and a model output, such as the probability of belonging to a particular class. Predicting the physical interaction between any two entities, such as compounds and protein targets, adds complexity that cannot be accommodated by the conventional applicability domain. In this instance, we intend to extend the standard applicability domain to include both entities and generate a quantitative estimate of prediction confidence using the conformal prediction framework. Conformal predictors can reliably estimate a prediction region based on the computed nonconformity of test samples. The disadvantage of this method is that the nonconformity is defined in the label space of predefined calibration samples, resulting in estimates that work well in general but are not specific to any tested compound-target pair, thus failing for samples that are not already available in the training set. 
Combining concepts from both frameworks, we dynamically define similarity-based applicability domains, or conformity regions, for each new sample and then calculate nonconformity scores; we refer to this approach as the dynamic applicability domain (dAD). The dAD approach was shown to produce tighter prediction regions than the original conformal predictors algorithm. More importantly, complementary to the prediction regions, dAD achieves lower error rates at any confidence level, which matters most in the harder, more realistic test scenarios of discovery (S2) and repurposing (S3). Merging the concept of the applicability domain with conformal predictors also corrects for existing bottlenecks in the traditional applicability domain definition and allows model behaviour to be evaluated in an abstract interaction space between any number of interacting entities. This makes it a valuable and informative approach for validating data quality in subregions of interaction space specific to biomolecular complexes.
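The idea of deriving a prediction region from the nonconformity of calibration samples can be illustrated with a small sketch. The code below shows a standard split-conformal interval for a regression output, followed by a variant that calibrates only on the calibration samples most similar to the query. The variant is a loose illustration of restricting calibration by similarity, not the dAD algorithm from the thesis, and it makes no claim about coverage guarantees. All function names, the `similarity` input, and the numbers are hypothetical.

```python
import math


def conformal_quantile(scores, confidence):
    """Calibration score at the finite-sample corrected rank for this confidence."""
    n = len(scores)
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        return math.inf  # too few calibration samples for this confidence level
    return sorted(scores)[rank - 1]


def global_interval(y_hat, cal_true, cal_pred, confidence):
    """Split-conformal interval using every calibration residual."""
    scores = [abs(t - p) for t, p in zip(cal_true, cal_pred)]
    q = conformal_quantile(scores, confidence)
    return y_hat - q, y_hat + q


def local_interval(y_hat, similarity, cal_true, cal_pred, k, confidence):
    """Same construction, calibrated only on the k samples most similar to the query.

    similarity[i] is a hypothetical similarity between the new sample and
    calibration sample i; how it is computed is left open here.
    """
    order = sorted(range(len(cal_true)), key=lambda i: similarity[i], reverse=True)
    scores = [abs(cal_true[i] - cal_pred[i]) for i in order[:k]]
    q = conformal_quantile(scores, confidence)
    return y_hat - q, y_hat + q


# Toy calibration set: measured vs. predicted affinities (made-up numbers).
cal_true = [5.0, 6.2, 7.1, 4.8, 6.9, 5.5, 7.4, 6.0, 5.1]
cal_pred = [5.3, 6.0, 6.5, 5.0, 7.0, 5.1, 7.9, 6.1, 5.8]
similarity = [0.2, 0.9, 0.1, 0.8, 0.85, 0.3, 0.15, 0.95, 0.25]

wide = global_interval(6.0, cal_true, cal_pred, confidence=0.8)
narrow = local_interval(6.0, similarity, cal_true, cal_pred, k=4, confidence=0.8)
print("global:", wide)
print("local: ", narrow)
```

In this toy data the most similar calibration samples happen to have small residuals, so the local interval comes out narrower than the global one. With other data the local interval could just as well be wider.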

Croatian Digital Dissertations Repository

Canada – Alberta Province – Banff – Lake Louise

Author: Oršolić Davor
Publication venue: Digital USD
Publication date: 10/09/2013

https://digital.sandiego.edu/pccanadawestern/1014/thumbnail.jp

Repository of Josip Juraj Strossmayer University of Osijek

Croatian Digital Thesis Repository

University of San Diego

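Optimiranje metoda i reprezentacija za prediktivno modeliranje mehanizama djelovanja i afiniteta vezanja biološki aktivnih molekula [Optimisation of methods and representations for predictive modelling of the mechanisms of action and binding affinities of biologically active molecules]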

Author: Oršolić Davor
Publication venue: University of Zagreb. Faculty of Food Technology and Biotechnology. Department of Biochemical Engineering. Section for Bioinformatics.
Publication date: 01/01/2023

University of Zagreb Repository

In silico characterisation of metagenomic alkane 1-monooxygenases

Author: Oršolić Davor
Publication venue: University of Zagreb. Faculty of Food Technology and Biotechnology. Department of Biochemical Engineering. Section for Bioinformatics.
Publication date: 28/09/2015

Phylogenetic analysis, functional characterisation and three-dimensional structural modelling were performed on putative alkane monooxygenases from a metagenomic library constructed from a moderately polluted sediment sample collected at a tanker berth station in Pula. Phylogenetic analysis conducted with MEGA software produced a phylogenetic tree representing a hypothesis of the evolutionary relationships between the proteins. Functional characterisation using the UniProt, InterPro and CDD databases confirmed that the putative proteins belong to the alkane 1-monooxygenases, while structural modelling with the SWISS-MODEL and I-TASSER web services did not give significant results for any of the proteins. The failure of structural modelling was attributed to the absence of homologous proteins with experimentally determined three-dimensional structures.

Croatian Digital Thesis Repository

In silico characterisation of metagenomic alkane 1-monooxygenases

Author: Oršolić Davor
Publication venue: University of Zagreb. Faculty of Food Technology and Biotechnology. Department of Biochemical Engineering. Section for Bioinformatics.
Publication date: 28/09/2015

University of Zagreb Repository

Croatian Digital Thesis Repository

cfDNA methylation in liquid biopsies as potential testicular seminoma biomarker

Authors: Barešić Anja; Gelo Nina; Ježek Davor; Katušić Bojanac Ana; Kaštelan Željko; Krasić Jure; Kuliš Tomislav; Mašić Silvija; Nikolac Gabaj Nora; Oršolić Davor; Raos Dora; Sinčić Nino; Tomašković Igor; Tomić Miroslav; Ulamec Monika
Publication date: 01/12/2022

Background: Seminoma is a testicular tumor type, routinely diagnosed after orchidectomy. As cfDNA offers a minimally invasive route to seminoma patient management, this study aimed to investigate whether cfDNA methylation of six genes from liquid biopsies has potential as a novel seminoma biomarker. Materials & methods: cfDNA methylation from liquid biopsies was assessed by pyrosequencing and compared with samples from healthy volunteers. Results: Detailed analysis revealed specific CpGs as possible seminoma biomarkers, but receiver operating characteristic curve analysis showed modest diagnostic performance. In an analysis of panels of statistically significant CpGs, two DNA methylation panels emerged as potential seminoma screening panels, one in blood, CpG8/CpG9/CpG10 (KITLG), and the other in seminal plasma, CpG1 (MAGEC2)/CpG1 (OCT3/4). Conclusion: The presented data promote the development of liquid biopsy epigenetic biomarkers in the screening of seminoma patients.

Veterinary medicine - Repository of PHD, master's thesis

University of Zagreb Repository

Bioprospecting for Genes Encoding Hydrocarbon-Degrading Enzymes from Metagenomic Samples Isolated from Northern Adriatic Sea Sediments

Authors: Antonio Starcevic; Damir Baranasic; Daslav Hranueli; Davor Oršolić; Domagoj Kolesarić; Janko Diminic; John Cullum; Jurica Zucko; Maria Blažina; Marino Korlević; Mirjana Najdek; Paul F. Long; Ranko Gacesa; Sandi Orlic
Publication venue: 'University of Zagreb - University Computing Centre'
Publication date: 01/01/2018

Three metagenomic libraries were constructed using surface sediment samples from the northern Adriatic Sea. Two of the samples were taken from a highly polluted and an unpolluted site, respectively. The third sample, from a polluted site, had been enriched using crude oil. The results of the metagenome analyses were incorporated in the REDPET relational database (http://redpet.bioinfo.pbf.hr/REDPET), which was generated using the previously developed MEGGASENSE platform. The database includes taxonomic data to allow the assessment of the biodiversity of metagenomic libraries and a general functional analysis of genes using hidden Markov model (HMM) profiles based on the KEGG database. A set of 22 specialised HMM profiles was developed to detect putative genes for hydrocarbon-degrading enzymes. Use of these profiles showed that the metagenomic library generated after selection on crude oil was enriched in genes for aerobic n-alkane degradation. The use of this system for bioprospecting was exemplified using potential alkB and almA genes from this library.

Directory of Open Access Journals

TLR5 Variants Are Associated with the Risk for COPD and NSCLC Development, Better Overall Survival of the NSCLC Patients and Increased Chemosensitivity in the H1299 Cell Line

Authors: Alexander N. R. Weber; Andrea Vukić Dugac; Asta Försti; Calogerina Catalano; Davor Nestić; Dragomira Majhen; Gordana Drpa; Irena Sokolović; Jelena Knežević; Jurica Baranašić; Lada Rumora; Maja Šutić; Marko Jakopović; Martina Bosnar; Matea Kurtović; Miroslav Samaržija; Nada Oršolić; Sanda Škrinjarić-Cincar; Sanja Popovic-Grle; Stefanie Huhn
Publication venue: 'MDPI AG'
Publication date: 01/09/2022

Chronic obstructive pulmonary disease (COPD) is considered the strongest independent risk factor for lung cancer (LC) development, suggesting an overlapping genetic background in both diseases. A common feature of both diseases is aberrant immunity in respiratory epithelia that is mainly regulated by Toll-like receptors (TLRs), key regulators of innate immunity. The function of the flagellin-sensing TLR5 in airway epithelia and in the pathophysiology of COPD and LC has remained elusive. We performed case–control genetic association and functional studies on the importance of TLR5 in COPD and LC development, comparing Caucasian COPD/LC patients (n = 974) and healthy donors (n = 1283). Association analysis of three single nucleotide polymorphisms (SNPs) (rs725084, rs2072493_N592S, and rs5744174_F616L) indicated that the minor allele of rs2072493_N592S is associated with increased risk for COPD (OR = 4.41, p < 0.0001) and NSCLC (OR = 5.17, p < 0.0001) development and with non-small cell LC risk in the presence of COPD (OR = 1.75, p = 0.0031). The presence of minor alleles (rs5744174 and rs725084) in a co-dominant model was associated with overall survival in squamous cell LC patients. Functional analysis indicated that overexpression of the rs2072493_N592S allele affected the activation of NF-κB and AP-1, which could be attributed to impaired phosphorylation of p38 and ERK. Overexpression of TLR5N592S was associated with increased chemosensitivity in the H1299 cell line. Finally, genome-wide transcriptomic analysis of WI-38 and H1299 cells overexpressing TLR5WT or TLR5N592S, respectively, indicated the existence of different transcription profiles affecting several cellular pathways potentially associated with a dysregulated immune response. Our results suggest that TLR5 could be recognized as a potential biomarker for COPD and LC development with functional relevance.

Multidisciplinary Digital Publishing Institute

Veterinary medicine - Repository of PHD, master's thesis

Directory of Open Access Journals

PubMed Central

University of Zagreb Repository