27 research outputs found

    Microparticle Array on Gel Microstructure Chip for Multiplexed Biochemical Assays

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Development of a quantum dot-encoded microsphere suspension assay for the genotyping of single nucleotide polymorphisms

    Get PDF
    This thesis describes the investigation of quantum dot-doped particle fluorescent technology commercially available for its application to analyte profiling in suspension. The first part of the thesis described the characterisation of the quantum dot-encoded microspheres, QDEMs, developed by Crystalplex (PA, USA). The multiple fluorescence signatures of QDEMs were analysed using microscopy and flow cytometry technology which provided high-content measurements with a single excitation sources and multiple emission wavelength detectors. The sensitivity and stability of the materials was evaluated under typical biomedical conditions encounter in multiple analyte suspension assays. Novel analytical parameters were defined to study QDEM stability and confocal microscopy detection system was used to provide structural and fluorescent imagines of the fluorescent microspheres under various conditions. Composition of the aqueous environment, temperature and physical forces applied to QDEM induced changes in their fluorescent codes and structural properties. Optimal conditions were then defined for the application of the material to biomedical assays. In a second stage, a conjugation method was developed to produce optimised QDEM bioconjugates for the detection of single strand DNA in suspension. The impact of the conjugation buffer, the concentration and the structure of oligonucleotides was evaluated to optimise QDEM bioconjugates. Then, a novel approach was investigated to optimise the hybridisation of ssDNA to QDEM bioconjugates. Experimental design with response surface methodology determined optimum conditions for the hybridisation of oligonucleotides to QDEM surface in suspension array. Finally, the specific hybridisation of ssDNA to QDEM bioconjugates in a small liquid format adapted to single nucleotide polymorphism detection was demonstrated. The work presented here shows the potential of QDEM bioconjugates for suspension array technology and DNA genotyping. Further, this report highlights the challenges that remain for QDEM fluorescent technology to be reliable for biomedical and suspension array applications.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    The mapping task and its various applications in next-generation sequencing

    Get PDF
    The aim of this thesis is the development and benchmarking of computational methods for the analysis of high-throughput data from tiling arrays and next-generation sequencing. Tiling arrays have been a mainstay of genome-wide transcriptomics, e.g., in the identification of functional elements in the human genome. Due to limitations of existing methods for the data analysis of this data, a novel statistical approach is presented that identifies expressed segments as significant differences from the background distribution and thus avoids dataset-specific parameters. This method detects differentially expressed segments in biological data with significantly lower false discovery rates and equivalent sensitivities compared to commonly used methods. In addition, it is also clearly superior in the recovery of exon-intron structures. Moreover, the search for local accumulations of expressed segments in tiling array data has led to the identification of very large expressed regions that may constitute a new class of macroRNAs. This thesis proceeds with next-generation sequencing for which various protocols have been devised to study genomic, transcriptomic, and epigenomic features. One of the first crucial steps in most NGS data analyses is the mapping of sequencing reads to a reference genome. This work introduces algorithmic methods to solve the mapping tasks for three major NGS protocols: DNA-seq, RNA-seq, and MethylC-seq. All methods have been thoroughly benchmarked and integrated into the segemehl mapping suite. First, mapping of DNA-seq data is facilitated by the core mapping algorithm of segemehl. Since the initial publication, it has been continuously updated and expanded. Here, extensive and reproducible benchmarks are presented that compare segemehl to state-of-the-art read aligners on various data sets. The results indicate that it is not only more sensitive in finding the optimal alignment with respect to the unit edit distance but also very specific compared to most commonly used alternative read mappers. These advantages are observable for both real and simulated reads, are largely independent of the read length and sequencing technology, but come at the cost of higher running time and memory consumption. Second, the split-read extension of segemehl, presented by Hoffmann, enables the mapping of RNA-seq data, a computationally more difficult form of the mapping task due to the occurrence of splicing. Here, the novel tool lack is presented, which aims to recover missed RNA-seq read alignments using de novo splice junction information. It performs very well in benchmarks and may thus be a beneficial extension to RNA-seq analysis pipelines. Third, a novel method is introduced that facilitates the mapping of bisulfite-treated sequencing data. This protocol is considered the gold standard in genome-wide studies of DNA methylation, one of the major epigenetic modifications in animals and plants. The treatment of DNA with sodium bisulfite selectively converts unmethylated cytosines to uracils, while methylated ones remain unchanged. The bisulfite extension developed here performs seed searches on a collapsed alphabet followed by bisulfite-sensitive dynamic programming alignments. Thus, it is insensitive to bisulfite-related mismatches and does not rely on post-processing, in contrast to other methods. In comparison to state-of-the-art tools, this method achieves significantly higher sensitivities and performs time-competitive in mapping millions of sequencing reads to vertebrate genomes. Remarkably, the increase in sensitivity does not come at the cost of decreased specificity and thus may finally result in a better performance in calling the methylation rate. Lastly, the potential of mapping strategies for de novo genome assemblies is demonstrated with the introduction of a new guided assembly procedure. It incorporates mapping as major component and uses the additional information (e.g., annotation) as guide. With this method, the complete mitochondrial genome of Eulimnogammarus verrucosus has been successfully assembled even though the sequencing library has been heavily dominated by nuclear DNA. In summary, this thesis introduces algorithmic methods that significantly improve the analysis of tiling array, DNA-seq, RNA-seq, and MethylC-seq data, and proposes standards for benchmarking NGS read aligners. Moreover, it presents a new guided assembly procedure that has been successfully applied in the de novo assembly of a crustacean mitogenome.Diese Arbeit befasst sich mit der Entwicklung und dem Benchmarken von Verfahren zur Analyse von Daten aus Hochdurchsatz-Technologien, wie Tiling Arrays oder Hochdurchsatz-Sequenzierung. Tiling Arrays bildeten lange Zeit die Grundlage für die genomweite Untersuchung des Transkriptoms und kamen beispielsweise bei der Identifizierung funktioneller Elemente im menschlichen Genom zum Einsatz. In dieser Arbeit wird ein neues statistisches Verfahren zur Auswertung von Tiling Array-Daten vorgestellt. Darin werden Segmente als exprimiert klassifiziert, wenn sich deren Signale signifikant von der Hintergrundverteilung unterscheiden. Dadurch werden keine auf den Datensatz abgestimmten Parameterwerte benötigt. Die hier vorgestellte Methode erkennt differentiell exprimierte Segmente in biologischen Daten bei gleicher Sensitivität mit geringerer Falsch-Positiv-Rate im Vergleich zu den derzeit hauptsächlich eingesetzten Verfahren. Zudem ist die Methode bei der Erkennung von Exon-Intron Grenzen präziser. Die Suche nach Anhäufungen exprimierter Segmente hat darüber hinaus zur Entdeckung von sehr langen Regionen geführt, welche möglicherweise eine neue Klasse von macroRNAs darstellen. Nach dem Exkurs zu Tiling Arrays konzentriert sich diese Arbeit nun auf die Hochdurchsatz-Sequenzierung, für die bereits verschiedene Sequenzierungsprotokolle zur Untersuchungen des Genoms, Transkriptoms und Epigenoms etabliert sind. Einer der ersten und entscheidenden Schritte in der Analyse von Sequenzierungsdaten stellt in den meisten Fällen das Mappen dar, bei dem kurze Sequenzen (Reads) auf ein großes Referenzgenom aligniert werden. Die vorliegende Arbeit stellt algorithmische Methoden vor, welche das Mapping-Problem für drei wichtige Sequenzierungsprotokolle (DNA-Seq, RNA-Seq und MethylC-Seq) lösen. Alle Methoden wurden ausführlichen Benchmarks unterzogen und sind in der segemehl-Suite integriert. Als Erstes wird hier der Kern-Algorithmus von segemehl vorgestellt, welcher das Mappen von DNA-Sequenzierungsdaten ermöglicht. Seit der ersten Veröffentlichung wurde dieser kontinuierlich optimiert und erweitert. In dieser Arbeit werden umfangreiche und auf Reproduzierbarkeit bedachte Benchmarks präsentiert, in denen segemehl auf zahlreichen Datensätzen mit bekannten Mapping-Programmen verglichen wird. Die Ergebnisse zeigen, dass segemehl nicht nur sensitiver im Auffinden von optimalen Alignments bezüglich der Editierdistanz sondern auch sehr spezifisch im Vergleich zu anderen Methoden ist. Diese Vorteile sind in realen und simulierten Daten unabhängig von der Sequenzierungstechnologie oder der Länge der Reads erkennbar, gehen aber zu Lasten einer längeren Laufzeit und eines höheren Speicherverbrauchs. Als Zweites wird das Mappen von RNA-Sequenzierungsdaten untersucht, welches bereits von der Split-Read-Erweiterung von segemehl unterstützt wird. Aufgrund von Spleißen ist diese Form des Mapping-Problems rechnerisch aufwendiger. In dieser Arbeit wird das neue Programm lack vorgestellt, welches darauf abzielt, fehlende Read-Alignments mit Hilfe von de novo Spleiß-Information zu finden. Es erzielt hervorragende Ergebnisse und stellt somit eine sinnvolle Ergänzung zu Analyse-Pipelines für RNA-Sequenzierungsdaten dar. Als Drittes wird eine neue Methode zum Mappen von Bisulfit-behandelte Sequenzierungsdaten vorgestellt. Dieses Protokoll gilt als Goldstandard in der genomweiten Untersuchung der DNA-Methylierung, einer der wichtigsten epigenetischen Modifikationen in Tieren und Pflanzen. Dabei wird die DNA vor der Sequenzierung mit Natriumbisulfit behandelt, welches selektiv nicht methylierte Cytosine zu Uracilen konvertiert, während Methylcytosine davon unberührt bleiben. Die hier vorgestellte Bisulfit-Erweiterung führt die Seed-Suche auf einem reduziertem Alphabet durch und verifiziert die erhaltenen Treffer mit einem auf dynamischer Programmierung basierenden Bisulfit-sensitiven Alignment-Algorithmus. Das verwendete Verfahren ist somit unempfindlich gegenüber Bisulfit-Konvertierungen und erfordert im Gegensatz zu anderen Verfahren keine weitere Nachverarbeitung. Im Vergleich zu aktuell eingesetzten Programmen ist die Methode sensitiver und benötigt eine vergleichbare Laufzeit beim Mappen von Millionen von Reads auf große Genome. Bemerkenswerterweise wird die erhöhte Sensitivität bei gleichbleibend guter Spezifizität erreicht. Dadurch könnte diese Methode somit auch bessere Ergebnisse bei der präzisen Bestimmung der Methylierungsraten erreichen. Schließlich wird noch das Potential von Mapping-Strategien für Assemblierungen mit der Einführung eines neuen, Kristallisation-genanntes Verfahren zur unterstützten Assemblierung aufgezeigt. Es enthält Mapping als Hauptbestandteil und nutzt Zusatzinformation (z.B. Annotationen) als Unterstützung. Dieses Verfahren ermöglichte die erfolgreiche Assemblierung des kompletten mitochondrialen Genoms von Eulimnogammarus verrucosus trotz einer vorwiegend aus nukleärer DNA bestehenden genomischen Bibliothek. Zusammenfassend stellt diese Arbeit algorithmische Methoden vor, welche die Analysen von Tiling Array, DNA-Seq, RNA-Seq und MethylC-Seq Daten signifikant verbessern. Es werden zudem Standards für den Vergleich von Programmen zum Mappen von Daten der Hochdurchsatz-Sequenzierung vorgeschlagen. Darüber hinaus wird ein neues Verfahren zur unterstützten Genom-Assemblierung vorgestellt, welches erfolgreich bei der de novo-Assemblierung eines mitochondrialen Krustentier-Genoms eingesetzt wurde

    Fast bead detection and inexact microarray pattern matching for in-situ encoded bead-based array

    No full text
    VISAPP 2012 - Proceedings of the International Conference on Computer Vision Theory and Applications25-1

    Integrative bioinformatics applications for complex human disease contexts

    Get PDF
    This thesis presents new methods for the analysis of high-throughput data from modern sources in the context of complex human diseases, at the example of a bioinformatics analysis workflow. New measurement techniques improve the resolution with which cellular and molecular processes can be monitored. While RNA sequencing (RNA-seq) measures mRNA expression, single-cell RNA-seq (scRNA-seq) resolves this on a per-cell basis. Long-read sequencing is increasingly used in genomics. With imaging mass spectrometry (IMS) the protein level in tissues is measured spatially resolved. All these techniques induce specific challenges, which need to be addressed with new computational methods. Collecting knowledge with contextual annotations is important for integrative data analyses. Such knowledge is available through large literature repositories, from which information, such as miRNA-gene interactions, can be extracted using text mining methods. After aggregating this information in new databases, specific questions can be answered with traceable evidence. The combination of experimental data with these databases offers new possibilities for data integrative methods and for answering questions relevant for complex human diseases. Several data sources are made available, such as literature for text mining miRNA-gene interactions (Chapter 2), next- and third-generation sequencing data for genomics and transcriptomics (Chapters 4.1, 5), and IMS for spatially resolved proteomics (Chapter 4.4). For these data sources new methods for information extraction and pre-processing are developed. For instance, third-generation sequencing runs can be monitored and evaluated using the poreSTAT and sequ-into methods. The integrative (down-stream) analyses make use of these (heterogeneous) data sources. The cPred method (Chapter 4.2) for cell type prediction from scRNA-seq data was successfully applied in the context of the SARS-CoV-2 pandemic. The robust differential expression (DE) analysis pipeline RoDE (Chapter 6.1) contains a large set of methods for (differential) data analysis, reporting and visualization of RNA-seq data. Topics of accessibility of bioinformatics software are discussed along practical applications (Chapter 3). The developed miRNA-gene interaction database gives valuable insights into atherosclerosis-relevant processes and serves as regulatory network for the prediction of active miRNA regulators in RoDE (Chapter 6.1). The cPred predictions, RoDE results, scRNA-seq and IMS data are unified as input for the 3D-index Aorta3D (Chapter 6.2), which makes atherosclerosis related datasets browsable. Finally, the scRNA-seq analysis with subsequent cPred cell type prediction, and the robust analysis of bulk-RNA-seq datasets, led to novel insights into COVID-19. Taken all discussed methods together, the integrative analysis methods for complex human disease contexts have been improved at essential positions.Die Dissertation beschreibt Methoden zur Prozessierung von aktuellen Hochdurchsatzdaten, sowie Verfahren zu deren weiterer integrativen Analyse. Diese findet Anwendung vor allem im Kontext von komplexen menschlichen Krankheiten. Neue Messtechniken erlauben eine detailliertere Beobachtung biomedizinischer Prozesse. Mit RNA-Sequenzierung (RNA-seq) wird mRNA-Expression gemessen, mit Hilfe von moderner single-cell-RNA-seq (scRNA-seq) sogar für (sehr viele) einzelne Zellen. Long-Read-Sequenzierung wird zunehmend zur Sequenzierung ganzer Genome eingesetzt. Mittels bildgebender Massenspektrometrie (IMS) können Proteine in Geweben räumlich aufgelöst quantifiziert werden. Diese Techniken bringen spezifische Herausforderungen mit sich, die mit neuen bioinformatischen Methoden angegangen werden müssen. Für die integrative Datenanalyse ist auch die Gewinnung von geeignetem Kontextwissen wichtig. Wissenschaftliche Erkenntnisse werden in Artikeln veröffentlicht, die über große Literaturdatenbanken zugänglich sind. Mittels Textmining können daraus Informationen extrahiert werden, z.B. miRNA-Gen-Interaktionen, die in eigenen Datenbank aggregiert werden um spezifische Fragen mit nachvollziehbaren Belegen zu beantworten. In Kombination mit experimentellen Daten bieten sich so neue Möglichkeiten für integrative Methoden. Durch die Extraktion von Rohdaten und deren Vorprozessierung werden mehrere Datenquellen erschlossen, wie z.B. Literatur für Textmining von miRNA-Gen-Interaktionen (Kapitel 2), Long-Read- und RNA-seq-Daten für Genomics und Transcriptomics (Kapitel 4.2, 5) und IMS für Protein-Messungen (Kapitel 4.4). So dienen z.B. die poreSTAT und sequ-into Methoden der Vorprozessierung und Auswertung von Long-Read-Sequenzierungen. In der integrativen (down-stream) Analyse werden diese (heterogenen) Datenquellen verwendet. Für die Bestimmung von Zelltypen in scRNA-seq-Experimenten wurde die cPred-Methode (Kapitel 4.2) erfolgreich im Kontext der SARS-CoV-2-Pandemie eingesetzt. Auch die robuste Pipeline RoDE fand dort Anwendung, die viele Methoden zur (differentiellen) Datenanalyse, zum Reporting und zur Visualisierung bereitstellt (Kapitel 6.1). Themen der Benutzbarkeit von (bioinformatischer) Software werden an Hand von praktischen Anwendungen diskutiert (Kapitel 3). Die entwickelte miRNA-Gen-Interaktionsdatenbank gibt wertvolle Einblicke in Atherosklerose-relevante Prozesse und dient als regulatorisches Netzwerk für die Vorhersage von aktiven miRNA-Regulatoren in RoDE (Kapitel 6.1). Die cPred-Methode, RoDE-Ergebnisse, scRNA-seq- und IMS-Daten werden im 3D-Index Aorta3D (Kapitel 6.2) zusammengeführt, der relevante Datensätze durchsuchbar macht. Die diskutierten Methoden führen zu erheblichen Verbesserungen für die integrative Datenanalyse in komplexen menschlichen Krankheitskontexten

    Surface Plasmon Resonance for Biosensing

    Get PDF
    The rise of photonics technologies has driven an extremely fast evolution in biosensing applications. Such rapid progress has created a gap of understanding and insight capability in the general public about advanced sensing systems that have been made progressively available by these new technologies. Thus, there is currently a clear need for moving the meaning of some keywords, such as plasmonic, into the daily vocabulary of a general audience with a reasonable degree of education. The selection of the scientific works reported in this book is carefully balanced between reviews and research papers and has the purpose of presenting a set of applications and case studies sufficiently broad enough to enlighten the reader attention toward the great potential of plasmonic biosensing and the great impact that can be expected in the near future for supporting disease screening and stratification

    Computational approaches to discovering differentiation genes in the peripheral nervous system of drosophila melanogaster

    Get PDF
    In the common fruit fly, Drosophila melanogaster, neural cell fate specification is triggered by a group of conserved transcriptional regulators known as proneural factors. Proneural factors induce neural fate in uncommitted neuroectodermal progenitor cells, in a process that culminates in sensory neuron differentiation. While the role of proneural factors in early fate specification has been described, less is known about the transition between neural specification and neural differentiation. The aim of this thesis is to use computational methods to improve the understanding of terminal neural differentiation in the Peripheral Nervous System (PNS) of Drosophila. To provide an insight into how proneural factors coordinate the developmental programme leading to neural differentiation, expression profiling covering the first 3 hours of PNS development in Drosophila embryos had been previously carried out by Cachero et al. [2011]. The study revealed a time-course of gene expression changes from specification to differentiation and suggested a cascade model, whereby proneural factors regulate a group of intermediate transcriptional regulators which are in turn responsible for the activation of specific differentiation target genes. In this thesis, I propose to select potentially important differentiation genes from the transcriptional data in Cachero et al. [2011] using a novel approach centred on protein interaction network-driven prioritisation. This is based on the insight that biological hypotheses supported by diverse data sources can represent stronger candidates for follow-up studies. Specifically, I propose the usage of protein interaction network data because of documented transcriptome-interactome correlations, which suggest that differentially expressed genes encode products that tend to belong to functionally related protein interaction clusters. Experimental protein interaction data is, however, remarkably sparse. To increase the informative power of protein-level analyses, I develop a novel approach to augment publicly available protein interaction datasets using functional conservation between orthologous proteins across different genomes, to predict interologs (interacting orthologs). I implement this interolog retrieval methodology in a collection of open-source software modules called Bio:: Homology::InterologWalk, the first generalised framework using web-services for “on-the- fly” interolog projection. Bio::Homology::InterologWalk works with homology data for any of the hundreds of genomes in Ensembl and Ensembgenomes Metazoa, and with experimental protein interaction data curated by EBI Intact. It generates putative protein interactions and optionally collates meta-data into a prioritisation index that can be used to help select interologs with high experimental support. The methodology proposed represents a significant advance over existing interolog data sources, which are restricted to specific biological domains with fixed underlying data sources often only accessible through basic web-interfaces. Using Bio::Homology::InterologWalk, I build interolog models in Drosophila sensory neurons and, guided by the transcriptome data, find evidence implicating a small set of genes in a conserved sensory neuronal specialisation dynamic, the assembly of the ciliary dendrite in mechanosensory neurons. Using network community-finding algorithms I obtain functionally enriched communities, which I analyse using an array of novel computational techniques. The ensuing datasets lead to the elucidation of a cluster of interacting proteins encoded by the target genes of one of the intermediate transcriptional regulators of neurogenesis and ciliogenesis, fd3F. These targets are validated in vivo and result in improved knowledge of the important target genes activated by the transcriptional cascade, suggesting a scenario for the mechanisms orchestrating the ordered assembly of the cilium during differentiation
    corecore