2,146 research outputs found

    MCMC-ODPR : primer design optimization using Markov Chain Monte Carlo sampling

    Background: Next-generation sequencing technologies often require numerous primer designs with good target coverage, which can be financially costly. We aimed to develop a system that implements primer reuse to design degenerate primers around SNPs, thereby finding the fewest necessary primers and the lowest cost while maintaining acceptable coverage. We implemented Metropolis-Hastings Markov Chain Monte Carlo sampling to optimize primer reuse, and call the result the Markov Chain Monte Carlo Optimized Degenerate Primer Reuse (MCMC-ODPR) algorithm. Results: After repeating the program 1020 times to assess the variance, MCMC-ODPR required an average of 17.14% fewer primers for equivalent coverage than a design without primer reuse. The algorithm was able to reuse primers up to five times. We compared MCMC-ODPR with the single-sequence primer design programs Primer3 and Primer-BLAST and achieved a lower primer cost per amplicon base covered, of 0.21, 0.19 and 0.18 primer nucleotides on three separate gene sequences, respectively. With multiple sequences, MCMC-ODPR achieved a lower cost per base covered (0.19) than the programs BatchPrimer3 and PAMPS, which achieved 0.25 and 0.64 primer nucleotides, respectively. Conclusions: MCMC-ODPR is a useful tool for designing primers at various melting temperatures with good target coverage. By combining degeneracy with optimal primer reuse, the user may increase the coverage of sequences amplified by the designed primers at significantly lower cost. Our analyses showed that, overall, MCMC-ODPR outperformed the other primer-design programs in our study in terms of cost per covered base.
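
The Metropolis-Hastings idea named in this abstract can be illustrated with a minimal sketch. It is not the MCMC-ODPR implementation: it ignores degeneracy and primer reuse, and the candidate primers, the `energy` function, the `penalty` weight and the `temperature` are all assumptions chosen for illustration. The state is a subset of candidate primers, a move flips one primer in or out, and worse states are accepted with probability exp(-delta / temperature).

```python
import math
import random

# Hypothetical toy data: (primer length in nt, target positions it covers).
primers = [(20, {0, 1, 2}), (20, {2, 3}), (22, {0, 1, 2, 3}), (18, {3, 4}), (25, {4})]
targets = {0, 1, 2, 3, 4}


def energy(selected, primers, targets, penalty=100):
    """Total primer length plus a penalty for every uncovered target position."""
    covered = set()
    cost = 0
    for i in selected:
        length, cover = primers[i]
        cost += length
        covered |= cover
    return cost + penalty * len(targets - covered)


def metropolis_select(primers, targets, steps=5000, temperature=5.0, seed=0):
    """Sample primer subsets and return the lowest-energy subset seen."""
    rng = random.Random(seed)
    current = set(range(len(primers)))  # start with every candidate selected
    current_e = energy(current, primers, targets)
    best, best_e = set(current), current_e
    for _ in range(steps):
        proposal = set(current)
        proposal ^= {rng.randrange(len(primers))}  # symmetric move: flip one primer
        proposal_e = energy(proposal, primers, targets)
        delta = proposal_e - current_e
        # Metropolis rule: always accept improvements, sometimes accept worse states.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = proposal, proposal_e
            if current_e < best_e:
                best, best_e = set(current), current_e
    return best, best_e


best, best_e = metropolis_select(primers, targets)
```

On this toy instance the cheapest full cover is primers 0 and 3 (38 nt in total); whether a given run reaches it depends on the temperature, the number of steps and the seed.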

    High-Throughput SNP Genotyping by SBE/SBH

    Despite much progress over the past decade, current Single Nucleotide Polymorphism (SNP) genotyping technologies still offer an insufficient degree of multiplexing when required to handle user-selected sets of SNPs. In this paper we propose a new genotyping assay architecture combining multiplexed solution-phase single-base extension (SBE) reactions with sequencing by hybridization (SBH) using universal DNA arrays such as all k-mer arrays. In addition to PCR amplification of genomic DNA, SNP genotyping using SBE/SBH assays involves the following steps: (1) Synthesizing primers complementing the genomic sequence immediately preceding SNPs of interest; (2) Hybridizing these primers with the genomic DNA; (3) Extending each primer by a single base using polymerase enzyme and dideoxynucleotides labeled with 4 different fluorescent dyes; and finally (4) Hybridizing extended primers to a universal DNA array and determining the identity of the bases that extend each primer by hybridization pattern analysis. Our contributions include a study of multiplexing algorithms for SBE/SBH genotyping assays and preliminary experimental results showing the achievable tradeoffs between the number of array probes and primer length on one hand and the number of SNPs that can be assayed simultaneously on the other. Simulation results on datasets both randomly generated and extracted from the NCBI dbSNP database suggest that the SBE/SBH architecture provides a flexible and cost-effective alternative to genotyping assays currently used in the industry, enabling genotyping of up to hundreds of thousands of user-specified SNPs per assay. Comment: 19 pages

    Primer design for SNP genotyping based on allele-specific amplification: Application to organ transplantation pharmacogenomics

    Diagnostic methods based on single nucleotide polymorphism (SNP) biomarkers are essential for the real adoption of personalized medicine. Allele-specific amplification in a homogeneous format or combined with microarray hybridization are powerful approaches for SNP genotyping. However, primers must be properly selected to minimize cross-reactivity, dimer formation and nonspecific hybridization. This study presents a design workflow diagram for the selection of the oligonucleotides required for multiplex assays. Based on thermodynamic restrictions, the oligonucleotide sets are chosen for specific amplification of wild-type and mutant-type templates. Design constraints include the structural stability of primer-template duplexes, template-probe duplexes and self-annealing complexes or hairpins for each targeted gene. The performance of the design algorithm was evaluated for the simultaneous genotyping of three SNPs related to immunosuppressive drugs administered after solid organ transplantation. The assayed polymorphisms were rs1045642 (ABCB1 gene), rs1801133 (MTHFR gene) and rs776746 (CYP3A5 gene). Candidates were confirmed by discriminating homozygote and heterozygote populations using a fluorescence solution method and two colorimetric microarray methods on polycarbonate chips. The analysis of patient samples provided excellent genotyping results compared to those obtained by a reference method. The study demonstrates that the development of allele-specific methods as pharmacogenetic tools can be simplified. (C) 2016 Elsevier B.V. All rights reserved. Tortajada-Genaro, LA.; Puchades Pla, R.; Maquieira Catala, Á. (2017). Primer design for SNP genotyping based on allele-specific amplification: Application to organ transplantation pharmacogenomics. Journal of Pharmaceutical and Biomedical Analysis. 136:14-21. doi:10.1016/j.jpba.2016.12.030

    Computational approaches for improving treatment and prevention of viral infections

    The treatment of infections with HIV or HCV is challenging. Thus, novel drugs and new computational approaches that support the selection of therapies are required. This work presents methods that support therapy selection as well as methods that advance novel antiviral treatments. geno2pheno[ngs-freq] identifies drug resistance from HIV-1 or HCV samples that were subjected to next-generation sequencing by interpreting their sequences either via support vector machines or a rules-based approach. geno2pheno[coreceptor-hiv2] determines the coreceptor that is used for viral cell entry by analyzing a segment of the HIV-2 surface protein with a support vector machine. openPrimeR is capable of finding optimal combinations of primers for multiplex polymerase chain reaction by solving a set cover problem and accessing a new logistic regression model for determining amplification events arising from polymerase chain reaction. geno2pheno[ngs-freq] and geno2pheno[coreceptor-hiv2] enable the personalization of antiviral treatments and support clinical decision making. The application of openPrimeR on human immunoglobulin sequences has resulted in novel primer sets that improve the isolation of broadly neutralizing antibodies against HIV-1. The methods that were developed in this work thus constitute important contributions towards improving the prevention and treatment of viral infectious diseases.
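
The set cover formulation mentioned for openPrimeR can be sketched with the classic greedy approximation. This is an illustrative assumption, not openPrimeR's own solver or API: the function name `greedy_set_cover`, the primer and template identifiers, and the `coverage` mapping (which in practice would come from a model predicting amplification events) are hypothetical.

```python
def greedy_set_cover(coverage, templates):
    """Pick primers greedily until every coverable template is covered.

    coverage: dict mapping a primer id to the set of template ids it is
              predicted to amplify (hypothetical input).
    Returns the chosen primers and any templates left uncovered.
    """
    uncovered = set(templates)
    chosen = []
    while uncovered:
        # Choose the primer covering the most still-uncovered templates;
        # sorting the ids makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda p: len(coverage[p] & uncovered))
        gain = coverage[best] & uncovered
        if not gain:
            break  # no candidate covers any of the remaining templates
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


coverage = {
    "p1": {"t1", "t2", "t3"},
    "p2": {"t3", "t4"},
    "p3": {"t4", "t5"},
    "p4": {"t1", "t5"},
}
chosen, uncovered = greedy_set_cover(coverage, {"t1", "t2", "t3", "t4", "t5"})
```

The greedy rule does not guarantee a minimum-size primer set in general; an exact solution would need an integer programming formulation or exhaustive search.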

    ProbeMaker: an extensible framework for design of sets of oligonucleotide probes

    BACKGROUND: Procedures for genetic analyses based on oligonucleotide probes are powerful tools that can allow highly parallel investigations of genetic material. Such procedures require the design of large sets of probes using application-specific design constraints. RESULTS: ProbeMaker is a software framework for computer-assisted design and analysis of sets of oligonucleotide probe sequences. The tool assists in the design of probes for sets of target sequences, incorporating sequence motifs for purposes such as amplification, visualization, or identification. An extension system allows the framework to be equipped with application-specific components for evaluation of probe sequences, and provides the possibility to include support for importing sequence data from a variety of file formats. CONCLUSION: ProbeMaker is a suitable tool for many different oligonucleotide design and analysis tasks, including the design of probe sets for various types of parallel genetic analyses, experimental validation of design parameters, and in silico testing of probe sequence evaluation algorithms

    Application of an oligonucleotide hybridization model to the optimization of PCR and microarrays (Oligonukleotiidide hübridisatsioonimudeli rakendamine PCR-i ja mikrokiipide optimeerimiseks)

    The electronic version of the dissertation does not contain the publications. Nucleic acids are unique among organic macromolecules in their ability to encode, decode and transmit digital information. This property is used in emerging technologies as diverse as medical diagnosis, nanoscale engineering and information storage. It is nevertheless important to understand that this digital information processing rests on the chemical properties of nucleic acids, the most important being hybridization: the spontaneous formation of a double-stranded helix between complementary or partially complementary single-stranded molecules. Taking the thermodynamic properties of nucleic acid hybridization into account allows researchers to model the process with great accuracy and thus improve many associated technologies. In this thesis the hybridization model is used to optimize multiplex PCR and detection microarrays. We developed an efficient algorithm to distribute PCR primer pairs into multiplex groups based on their compatibility with each other. The algorithm is implemented both as a standalone program and as a web application. We analyzed the probable causes of multiplex PCR failure and demonstrated that a large number of nonspecific hybridization sites in the template DNA is detrimental to PCR quality. Primer pairs with too many nonspecific hybridization sites not only worked poorly themselves but also caused the failure of other primer pairs amplified with them. We developed a computer program that generates an exhaustive list of all possible hybridization probes for the detection of bacterial tmRNA that can distinguish between two groups of source RNA. The probes were evaluated on microarrays, and we showed that keeping the hybridization energy cutoff between target and non-target groups above 4 kcal/mol eliminated all false-positive signals. We analyzed the possibility of increasing the hybridization speed of bacterial tmRNA at low temperatures by adding short specific oligonucleotides that selectively hybridize with the template molecules and break their secondary structure. Using this method, the hybridization speed increased fourfold at 37 °C.
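
Distributing primer pairs into multiplex groups by mutual compatibility can be sketched as a first-fit grouping. This is an assumption made for illustration, not the algorithm developed in the thesis: the function name, the `incompatible` set (which in practice would be derived from predicted primer-primer interactions) and the optional `max_group_size` are hypothetical.

```python
def group_primer_pairs(pair_ids, incompatible, max_group_size=None):
    """Place each primer pair into the first group where it conflicts with nothing.

    incompatible: set of frozenset({a, b}) for primer pairs that must not
                  share a multiplex reaction (hypothetical input).
    """
    groups = []
    for pid in pair_ids:
        for group in groups:
            fits = all(frozenset((pid, other)) not in incompatible for other in group)
            has_room = max_group_size is None or len(group) < max_group_size
            if fits and has_room:
                group.append(pid)
                break
        else:
            # No existing group is compatible, so open a new one.
            groups.append([pid])
    return groups


incompatible = {frozenset(("A", "B")), frozenset(("B", "C"))}
groups = group_primer_pairs(["A", "B", "C", "D"], incompatible)
```

First-fit is fast but depends on input order and does not guarantee the minimum number of groups, since the underlying problem is equivalent to graph coloring.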

    PrimerHunter: a primer design tool for PCR-based virus subtype identification

    Rapid and reliable virus subtype identification is critical for accurate diagnosis of human infections, effective response to epidemic outbreaks and global-scale surveillance of highly pathogenic viral subtypes such as avian influenza H5N1. The polymerase chain reaction (PCR) has become the method of choice for virus subtype identification. However, designing subtype-specific PCR primer pairs is a very challenging task: on one hand, selected primer pairs must result in robust amplification in the presence of a significant degree of sequence heterogeneity within subtypes, on the other, they must discriminate between the subtype of interest and closely related subtypes. In this article, we present a new tool, called PrimerHunter, that can be used to select highly sensitive and specific primers for virus subtyping. Our tool takes as input sets of both target and nontarget sequences. Primers are selected such that they efficiently amplify any one of the target sequences, and none of the nontarget sequences. PrimerHunter ensures the desired amplification properties by using accurate estimates of melting temperature with mismatches, computed based on the nearest neighbor model via an efficient fractional programming algorithm. Validation experiments with three avian influenza HA subtypes confirm that primers selected by PrimerHunter have high sensitivity and specificity for target sequences
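
The target versus nontarget selection described here can be sketched in a heavily simplified form. PrimerHunter itself uses nearest-neighbor melting temperature estimates with mismatches; the sketch below swaps that for plain mismatch counting, which is only a rough stand-in. All names, thresholds and sequences are hypothetical.

```python
def min_mismatches(primer, sequence):
    """Fewest mismatches between the primer and any same-length window of sequence."""
    k = len(primer)
    return min(
        sum(a != b for a, b in zip(primer, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    )


def select_primers(targets, nontargets, k, max_target_mm=0, min_nontarget_mm=1):
    """Keep k-mers of the first target that match every target and no nontarget."""
    first = targets[0]
    # dict.fromkeys removes duplicate k-mers while preserving their order.
    candidates = dict.fromkeys(first[i:i + k] for i in range(len(first) - k + 1))
    selected = []
    for primer in candidates:
        hits_targets = all(min_mismatches(primer, t) <= max_target_mm for t in targets)
        avoids_nontargets = all(min_mismatches(primer, n) >= min_nontarget_mm for n in nontargets)
        if hits_targets and avoids_nontargets:
            selected.append(primer)
    return selected


selected = select_primers(["ACGTTGCA", "ACGTAGCA"], ["ACCTTGCA"], k=4)
```

In this toy input only the 4-mer shared by both targets and absent from the nontarget survives; a realistic tool would also consider both strands, primer pairs and thermodynamic stability.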

    The art of PCR assay development: data-driven multiplexing

    The present thesis describes the discovery and application of a novel methodology, named Data-Driven Multiplexing, which uses artificial intelligence and conventional molecular instruments to develop rapid, scalable and cost-effective clinical diagnostic tests. Detection of genetic material from living organisms is a biologically engineered process where organic molecules interact with each other and with chemical components to generate a meaningful signal of the presence, quantity or quality of target nucleic acids. Nucleic acid detection, such as DNA or RNA detection, identifies a specific organism based on its genetic material. In particular, DNA amplification approaches, such as for antimicrobial resistance (AMR) or COVID-19 detection, are crucial for diagnosing and managing various infectious diseases. One of the most widely used methods is Polymerase Chain Reaction (PCR), which can detect the presence of nucleic acids rapidly and accurately. The unique interaction of the genetic material and synthetic short DNA sequences called primers enable this harmonious biological process. This thesis aims to bioinformatically modulate the interaction between primers and genetic material, enhancing the diagnostic capabilities of conventional PCR instruments by applying artificial intelligence processing to the resulting signals. To achieve the goal mentioned above, experiments and data from several conventional platforms, such as real-time and digital PCR, are used in this thesis, along with state-of-the-art and innovative algorithms for classification problems and final application in real-world clinical scenarios. This work exhibits a powerful technology to optimise the use of the data, conveying the following message: the better use of the data in clinical diagnostics enables higher throughput of conventional instruments without the need for hardware modification, maintaining the standard practice workflows. 
In Part I, a novel method to analyse amplification data is proposed. Using a state-of-the-art digital PCR instrument and multiplex PCR assays, we demonstrate the simultaneous detection of up to nine different nucleic acids in a single-well and single-channel format. This novel concept called Amplification Curve Analysis (ACA) leverages kinetic information encoded in the amplification curve to classify the biological nature of the target of interest. This method is applied to the novel design of PCR assays for multiple detections of AMR genes and further validated with clinical samples collected at Charing Cross Hospital, London, UK. The ACA showed a high classification accuracy of 99.28% among 253 clinical isolates when multiplexing. Similar performance is also demonstrated with isothermal amplification chemistries using synthetic DNA, showing a 99.9% of classification accuracy for detecting respiratory-related infectious pathogens. In Part II, two intelligent mathematical algorithms are proposed to solve two significant challenges when developing a Data-driven multiplex PCR assay. Chapter 7 illustrates the use of filtering algorithms to remove the presence of outliers in the amplification data. This demonstrates that the information contained in the kinetics of the reaction itself provides a novel way to remove non-specific and not efficient reactions. By extracting meaningful features and adding custom selection parameters to the amplification data, we increase the machine learning classifier performance of the ACA by 20% when outliers are removed. In Chapter 8, a patented algorithm called Smart-Plexer is presented. This allows the hybrid development of multiplex PCR assays by computing the optimal single primer set combination in a multiplex assay. The algorithm's effectiveness stands in using experimental laboratory data as input, avoiding heavy computation and unreliable predictions of the sigmoidal shape of PCR curves. 
The output of the Smart-Plexer is an optimal assay for the simultaneous detection of seven coronavirus-related pathogens in a single well, scoring an accuracy of 98.8% in identifying the seven targets correctly among 14 clinical samples. Moreover, Chapter 9 focuses on applying novel multiplex assays in point-of-care devices and developing a new strategy for improving clinical diagnostics. In summary, inspired by the emerging requirement for more accurate, cost-effective and higher throughput diagnostics, this thesis shows that coupling artificial intelligence with assay design pipelines is crucial to address current diagnostic challenges. This requires crossing different fields, such as bioinformatics, molecular biology and data science, to develop an optimal solution and hence to maximise the value of clinical tests for nucleic acid detection, leading to more precise patient treatment and easier management of infectious control.