262 research outputs found

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.


    Structured data abstractions and interpretable latent representations for single-cell multimodal genomics

    Single-cell multimodal genomics involves simultaneous measurement of multiple types of molecular data, such as gene expression, epigenetic marks and protein abundance, in individual cells. This allows for a comprehensive and nuanced understanding of the molecular basis of cellular identity and function. The large volume of data generated by single-cell multimodal genomics experiments requires specialised methods and tools for handling, storing, and analysing it. This work provides contributions on multiple levels. First, it introduces a single-cell multimodal data standard, MuData, designed to facilitate the handling, storage and exchange of multimodal data. MuData provides interfaces that enable transparent access to multimodal annotations as well as data from individual modalities. This data structure has formed the foundation for the multimodal integration framework, which enables complex and composable workflows that can be naturally integrated with existing omics-specific analysis approaches. Joint analysis of multimodal data can be performed using integration methods. In order to enable integration of single-cell data, an improved multi-omics factor analysis model (MOFA+) has been designed and implemented, building on the canonical dimensionality reduction approach for multi-omics integration. By inferring latent factors that explain variation across multiple modalities of the data, MOFA+ enables the modelling of latent factors with cell group-specific patterns of activity. The MOFA+ model has been implemented as part of the respective multi-omics integration framework, and its utility has been extended by software solutions that facilitate interactive model exploration and interpretation. The newly improved model for multi-omics integration of single cells has been applied to the study of gene expression signatures upon targeted gene activation. 
In a dataset featuring targeted activation of candidate regulators of zygotic genome activation (ZGA), a crucial transcriptional event in early embryonic development, modelling expression of both coding and non-coding loci with MOFA+ made it possible to rank genes by their potency in activating a ZGA-like transcriptional response. With the identification of Patz1, Dppa2 and Smarca5 as potent inducers of ZGA-like transcription in mouse embryonic stem cells, these findings have contributed to the understanding of molecular mechanisms behind ZGA and laid the foundation for future research on ZGA in vivo. In summary, this work’s contributions include the development of data handling and integration methods as well as new biological insights that arose from applying these methods to studying gene expression regulation in early development. This highlights how single-cell multimodal genomics can help generate valuable insights into complex biological systems.

    Monte Carlo simulation studies of DNA hybridization and DNA-directed nanoparticle assembly

    A coarse-grained lattice model of DNA oligonucleotides is proposed to investigate how fundamental thermodynamic processes are encoded by the nucleobase sequence at the microscopic level, and to elucidate the general mechanisms by which single-stranded oligonucleotides hybridize to their complements either in solution or when tethered to nanoparticles. Molecular simulations based on a high-coordination cubic lattice are performed using the Monte Carlo method. The dependence of the model's thermal stability on sequence complementarity is shown to be qualitatively consistent with experiment and statistical mechanical models. From the analysis of the statistical distribution of base-paired states and of the associated free-energy landscapes, two general hybridization scenarios are found. For sequences that do not follow a two-state process, hybridization is weakly cooperative and proceeds in multiple sequential steps involving stable intermediates with an increasing number of paired bases. In contrast, sequences that conform to two-state thermodynamics exhibit moderately rough landscapes, in which multiple metastable intermediates appear over broad free-energy barriers. These intermediates correspond to duplex species that bridge the configurational and energetic gaps between duplex and denatured states with minimal loss of conformational entropy, and lead to a strongly cooperative hybridization. Remarkably, two-state thermodynamic signatures are generally observed in both scenarios. The role of cooperativity in the assembly of nanoparticles tethered with model DNA oligonucleotides is similarly addressed with the Monte Carlo method, where nanoparticles are represented as finely discretized hard-core spheres on a cubic lattice. The energetic and structural mechanisms of self-assembly are investigated by simulating the aggregation of small "satellite" particles from the bulk onto a large "core" particle. 
A remarkable enhancement of the system's thermal stability is attained by increasing the number of strands per satellite particle available to hybridize with those on the core particle. This cooperative process is driven by the formation of multiple bridging duplexes under favorable conditions of reduced translational entropy and the resultant energetic compensation; this behavior rapidly weakens above a certain threshold of linker strands per satellite particle. Cooperativity also enhances the structural organization of the assemblies by systematically narrowing the radial distribution of the satellite particles bound to the core.
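
The "two-state thermodynamics" mentioned in the abstract above can be illustrated with a minimal sketch. This is not the lattice Monte Carlo model of the thesis: it assumes a simplified unimolecular two-state equilibrium (duplex versus denatured) with temperature-independent enthalpy and entropy, and the function names and numerical values are hypothetical, chosen only for illustration.

```python
import math

# Gas constant in kJ/(mol K)
R = 8.314462618e-3


def duplex_fraction(temperature, delta_h, delta_s):
    """Fraction of strands in the duplex state for a two-state equilibrium.

    Assumes a unimolecular two-state model (an illustrative simplification):
    delta_g = delta_h - T * delta_s, K = exp(-delta_g / (R * T)).
    delta_h is in kJ/mol, delta_s in kJ/(mol K), temperature in K.
    """
    delta_g = delta_h - temperature * delta_s
    equilibrium_constant = math.exp(-delta_g / (R * temperature))
    return equilibrium_constant / (1.0 + equilibrium_constant)


def melting_temperature(delta_h, delta_s):
    """Temperature at which delta_g = 0, i.e. half of the strands are paired."""
    return delta_h / delta_s


# Hypothetical parameters, not taken from the thesis.
DELTA_H = -200.0  # kJ/mol
DELTA_S = -0.6    # kJ/(mol K)

if __name__ == "__main__":
    t_m = melting_temperature(DELTA_H, DELTA_S)
    for t in (300.0, t_m, 370.0):
        print(f"T = {t:6.1f} K  duplex fraction = {duplex_fraction(t, DELTA_H, DELTA_S):.4f}")
```

In this simplified picture a larger magnitude of the enthalpy change sharpens the transition around the melting temperature, which is one common way of describing a more cooperative melting curve.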

    Simulations and Experiments: How close can we get?

    The interactions between biomolecules and their environment can be studied by experiments and simulations. Results from experiments and simulations are often interpretations based on the raw data. For an accurate comparison of both approaches, the interpretations of the raw data from experiments and simulations have to be consistent with each other. The design of such simulations and the interpretation of raw data are demonstrated in this thesis for two examples: fluorescence resonance energy transfer (FRET) experiments and surface adsorption of biomolecules on inorganic surfaces like gold. FRET experiments make it possible to probe molecular distances via the distance-dependent energy transfer efficiency from an excited donor dye to its acceptor counterpart. In single-molecule settings, not only average distances, but also distance distributions or even fluctuations can be probed, providing a powerful tool to study flexibilities and structural changes in biomolecules. However, the measured energy transfer efficiency depends not only on the distance between the two dyes, but also on their mutual orientation, which is typically inaccessible to experiments. Thus, assumptions about the orientation distributions and averages have to be employed, which severely limit the accuracy of the distance distributions extracted from FRET experiments alone. In this work, I combined efficiency distributions from FRET experiments with dye orientation statistics from molecular dynamics (MD) simulations to calculate improved estimates of the distance distributions. From the time-dependent mutual dye orientations, the FRET efficiency was calculated and the statistics of individual photo-absorption, FRET, and photo-emission events were determined from subsequent Monte Carlo (MC) simulations. All recorded emission events were then collected into bursts from which efficiencies were calculated in close resemblance to the actual FRET experiment. 
The feasibility of this approach has been tested by direct comparison to experimental data. As my test system, I chose a poly-proline chain with Alexa 488 and Alexa 594 dyes attached. Quantitative agreement of the calculated efficiency distributions from simulations with the experimental ones was obtained. In addition, the presence of cis-isomers and specific dye conformations were identified as the sources of the experimentally observed heterogeneity. This agreement of in silico FRET with experiments allows the dye orientation dynamics from simulations to be employed in the distance reconstruction. The dye orientation dynamics were used in dye orientation models at multiple levels of approximation. At each level, fewer assumptions were applied to the dye orientation model. Each model was then used to reconstruct distance distributions from experimental efficiency distributions. Comparison of reconstructed distance distributions with those from simulations revealed a systematically improved accuracy of the reconstruction as model assumptions were reduced. This result demonstrates that dye orientations from MD simulations, combined with MC photon generation, can indeed be used to improve the accuracy of distance distribution reconstruction from experimental FRET efficiencies. A second example of simulations and interpretation consistent with experiments is the study of protein adsorption on gold surfaces. Interactions between biomolecules and inorganic surfaces, e.g. during the biomineralization of bone, are fundamental for multicellular organisms. Moreover, understanding these interactions is the basis for biotechnological applications such as biochips or nano-sensing. In the framework of the PROSURF project, a multi-scale approach for the simulation of biomolecular adsorption was implemented. First, parameters for MD simulations were derived from ab initio calculations. 
These parameters were then applied to simulate the adsorption of single amino acids and to calculate their adsorption free energy profiles. For the screening of adsorbed protein conformations, rigid-body Brownian dynamics (BD) docking on surfaces was benchmarked against the free energy profiles from the MD simulations. Comparison of the protein adsorption rates from surface plasmon resonance experiments and BD docking yielded good agreement, which justifies the multi-scale approach. Additionally, MD simulations of protein adsorption on gold surfaces revealed an unexpected importance of positively charged residues on the surface for the initial adsorption steps. The multi-scale approach presented here allows the study of biomolecular interactions with inorganic surfaces consistently at multiple levels of theory: atomistic details of the adsorption process can be studied by MD simulations, whereas BD allows the extensive screening of protein libraries or adsorption geometries. In summary, consistency between the simulation and the experimental setup allows the simulation accuracy to be benchmarked by comparison to experiments. In contrast to employing experiments alone, the combination of experiments and simulations enhances the accuracy of results interpreted from experimental raw data.
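
The dependence of the FRET efficiency on both dye distance and mutual dye orientation, as described in the abstract above, can be sketched with the standard Förster expressions. This is a minimal illustration, not the thesis workflow: the function names are hypothetical, the Förster radius of 5.0 nm is an arbitrary illustrative value (not a measured value for the Alexa 488 / Alexa 594 pair), and the isotropic average of 2/3 for the orientation factor is the usual assumption that the abstract says limits accuracy.

```python
import math


def _unit(vector):
    """Return the vector scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return tuple(x / norm for x in vector)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orientation_factor(donor_dipole, acceptor_dipole, separation):
    """Orientation factor kappa^2 for two transition dipoles.

    kappa = d.a - 3 (d.r)(a.r), with d, a, r the unit vectors of the donor
    dipole, the acceptor dipole, and the donor-to-acceptor separation.
    """
    d, a, r = _unit(donor_dipole), _unit(acceptor_dipole), _unit(separation)
    kappa = _dot(d, a) - 3.0 * _dot(d, r) * _dot(a, r)
    return kappa * kappa


def fret_efficiency(distance, r0_isotropic, kappa2=2.0 / 3.0):
    """FRET efficiency E = R0^6 / (R0^6 + r^6).

    r0_isotropic is the Foerster radius quoted for kappa^2 = 2/3; since R0^6
    is proportional to kappa^2, it is rescaled for other orientations.
    """
    r0_sixth = r0_isotropic ** 6 * kappa2 / (2.0 / 3.0)
    return r0_sixth / (r0_sixth + distance ** 6)


if __name__ == "__main__":
    R0 = 5.0  # nm, hypothetical illustrative value
    collinear = orientation_factor((1, 0, 0), (1, 0, 0), (1, 0, 0))
    for label, k2 in (("isotropic", 2.0 / 3.0), ("collinear", collinear)):
        print(label, round(fret_efficiency(5.0, R0, k2), 3))
```

The same distance gives different efficiencies for different orientation factors, which is why orientation statistics, for example from simulated trajectories, matter when converting measured efficiencies back into distances.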