8,901 research outputs found

    Model-based clustering with data correction for removing artifacts in gene expression data

    The NIH Library of Integrated Network-based Cellular Signatures (LINCS) contains gene expression data from over a million experiments, using Luminex Bead technology. Only 500 colors are used to measure the expression levels of the 1,000 landmark genes, and the data for the resulting pairs of genes are deconvolved. The raw data are sometimes inadequate for reliable deconvolution, leading to artifacts in the final processed data. These include the expression levels of paired genes being flipped or given the same value, and clusters of values that are not at the true expression level. We propose a new method called model-based clustering with data correction (MCDC) that is able to identify and correct these three kinds of artifacts simultaneously. We show that MCDC improves the resulting gene expression data in terms of agreement with external baselines, as well as improving results from subsequent analysis. Comment: 28 pages

    Estrogen-dependent dynamic profile of eNOS-DNA associations in prostate cancer

    In previous work we have documented the nuclear translocation of endothelial NOS (eNOS) and its participation in combinatorial complexes with Estrogen Receptor Beta (ERβ) and Hypoxia Inducible Factors (HIFs) that determine localized chromatin remodeling in response to estrogen (E2) and hypoxia stimuli, resulting in transcriptional regulation of genes associated with adverse prognosis in prostate cancer (PCa). To explore the role of nuclear eNOS in the acquisition of an aggressive phenotype in PCa, we performed ChIP-Sequencing on chromatin-associated eNOS from cells from a primary tumor with poor outcome and from metastatic LNCaP cells. We found that: 1. the eNOS-bound regions (peaks) are widely distributed across the genome, encompassing multiple transcription factor binding sites, including Estrogen Response Elements. 2. E2 increased the number of peaks, indicating hormone-dependent eNOS re-localization. 3. Peak distribution was similar with/without E2, with ≈ 55% of them in extragenic DNA regions and an intriguing involvement of the 5′ domain of several miRs deregulated in PCa. Numerous potentially novel eNOS-targeted genes have been identified, suggesting that eNOS participates in the regulation of large gene sets. The parallel finding of downregulation of a cluster of miRs, including miR-34a, in PCa cells associated with poor outcome led us to unveil a molecular link between eNOS and SIRT1, an epigenetic regulator of aging and tumorigenicity, negatively regulated by miR-34a and in turn activating eNOS. E2 potentiates miR-34a downregulation, thus enhancing SIRT1 expression, depicting a novel eNOS/SIRT1 interplay fine-tuned by E2-activated ER signaling, and suggesting that eNOS may play an important role in aggressive PCa.

    Nanomechanics combined with HDX reveals allosteric drug binding sites of CFTR NBD1.

    Cystic fibrosis (CF) is a frequent genetic disease in Caucasians that is caused by the deletion of F508 (ΔF508) in the nucleotide binding domain 1 (NBD1) of the CF transmembrane conductance regulator (CFTR). The ΔF508 mutation compromises the folding energetics of the NBD1, as well as the folding of three other CFTR domains. A combination of FDA-approved corrector molecules can efficiently but incompletely rescue the ΔF508-CFTR folding and stability defect. Thus, new pharmacophores that would reinstate the wild-type-like conformational stability of the ΔF508-NBD1 would be highly beneficial. The most prominent molecule that can thermally stabilize the NBD1, 5-bromoindole-3-acetic acid (BIA), has low potency and efficacy. To gain insights into the NBD1 (un)folding dynamics and BIA binding site localization, we combined molecular dynamics (MD) simulations, atomic force spectroscopy (AFM) and hydrogen-deuterium exchange (HDX) experiments. We found that the NBD1 α-subdomain with three adjacent strands from the β-subdomain plays an important role in early folding steps, when crucial non-native interactions are formed via residue F508. Our AFM and HDX experiments showed that BIA associates with this α-core region and increases the resistance of the ΔF508-NBD1 against mechanical unfolding, a phenomenon that could be exploited in future developments of folding correctors.

    Unravelling the molecular dynamics of c-MYC’s TAD domain: a journey from simulation optimisation to drug discovery

    c-MYC, part of the MYC family of transcription factors, is often deregulated in cancer, and since the early 1980s has been identified as a prime oncogenic factor. Despite much research interest, c-MYC’s structural dynamics remain largely uncharted due to its intrinsic structural disorder. Disordered proteins are challenging to study using solely structural experimental methods, so attention has lately turned towards the development of reliable in-silico methods to obtain an accurate molecular description. Molecular Dynamics simulations, commonly and successfully used to study globular proteins, can also be optimised to correctly reproduce natural protein disorder. The simulation results were assessed for convergence and conformational equilibrium by comparing c-MYC’s Molecular Dynamics conformational landscape to similar data derived from an abundantly sampled probabilistic distribution. After the preparatory and validation work, the efforts turned to the appraisal of c-MYC’s first 88 amino acids. The revelation of its conformational states and structural dynamics opened the door for drug discovery and proof-of-concept that c-MYC should not be considered ‘undruggable’. Further exploration into the protein’s first 150 residues, corresponding to its transactivation domain, uncovered important structural dynamics controlled by key phosphodegron residues. Phosphorylation and mutagenesis studies demonstrated how these control mechanisms, which serve to modulate accessibility to crucial regions, are facilitated by isomerisation events within the phosphodegron. Overall, this study substantiates the robustness of well-parameterised computational simulations, and machine learning methods, in uncovering the workings of otherwise difficult-to-study disordered proteins.

    Theoretical Analysis of Biomolecular Systems: Computational Simulations, Core-set Markov State Models, Clustering, Molecular Docking

    The analysis of the structural and dynamical behavior of biomolecules is very important for understanding their biological function, stability or physico-chemical properties. This thesis highlights how different theoretical methods for characterizing these structural and dynamical properties can be used and combined to obtain kinetic information or to detect biomolecule-ligand interactions. The basis for most of the analyses performed in the course of this work is molecular dynamics simulations sampling the conformational space of the biomolecule of interest. Using molecular dynamics simulations, the remarkably stable water-soluble-binding-protein is examined first. On a theoretical basis, structural modifications that can influence the stability of the protein are discussed. Additionally, by combining the simulations with a QM/MM optimization scheme and quantum chemical calculations, spectroscopic properties can be investigated. Markov State Models are applied frequently to capture the slow dynamics within simulation trajectories. They are based on a discretization of the conformational space. This discretization, however, introduces an error in the outcome of the analysis. The application of a core-set discretization can reduce this error. In this thesis, it is discussed how density-based cluster algorithms can be used to determine these core sets, and the application to linear and cyclic peptides is highlighted. The performance of a promising cluster algorithm is investigated and error sources in the construction of the Markov models are discussed. Finally, it is shown how molecular docking combined with molecular dynamics simulations can be used to determine the binding behavior of ligands towards biomolecules. In this context, the important interactions within the active site of an enzyme, and different binding modes of DNA intercalators, are identified.
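As a rough illustration of the discretization step this abstract refers to, the sketch below estimates a Markov State Model transition matrix from a trajectory that has already been assigned to discrete states. It is a plain full-partition estimator, not the core-set construction the thesis discusses, and the function name `transition_matrix`, the `lag` parameter and the toy trajectory are assumptions made for this example only.

```python
import numpy as np


def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition matrix from a discrete trajectory.

    dtraj    : sequence of integer state labels, one per frame (hypothetical input)
    n_states : number of discrete states
    lag      : lag time in frames at which transitions are counted
    """
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Count every observed transition i -> j separated by `lag` frames.
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that were never left stay all-zero instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Toy two-state trajectory, invented for illustration.
T = transition_matrix([0, 0, 1, 1, 0, 0, 1, 1], n_states=2)
```

The discretization error mentioned in the abstract enters through the state assignment itself: frames near a boundary are forced into one state or the other, which a core-set scheme avoids by assigning only frames inside well-defined cores.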

    Multivariate Statistical Methodologies used in In-vitro Raman Spectroscopy: Simulations and Applications for Drug and Nanoparticle Interactions

    Raman spectroscopy is a growing technology in the fields of in-vitro drug and nanoparticle screening. The label-free capability provided by vibrational spectroscopy, as well as the ability of the technique to probe the chemical nature of samples, makes it a good candidate for use in these fields. Crucial to the progress of these methods is the development and validation of robust and accurate multivariate statistical analysis protocols. In this thesis, both established and novel methods are examined using both real and simulated datasets. In particular, simulated datasets are used to validate and assess the accuracy of these methods in a spectroscopic setting. Firstly, partial least squares regression (PLSR) is examined using a simulated model based on real experimental data. This is used to investigate the application of the algorithm to continuously varying data with known spectral perturbations introduced over a range of concentrations and responses. The results show that, while PLSR is valid for some dose ranges, sub-lethal, low concentrations and thus subtle spectral changes in the data may lead to difficulties in model construction. Multiple trends present in the data were also investigated, and possible model error based on spectral bleedthrough in the regression coefficients (RCs) is explored. Principal component analysis (PCA) was also investigated using simulated datasets based on known changes in the data. Some of the limitations of PCA for data partitioning and trend analysis are overcome by a novel variant termed ‘seeded’ PCA. First- and second-derivative data are also explored for improvements in Raman spectral analysis using seeded PCA.

    Mass spectral imaging of clinical samples using deep learning

    A better interpretation of tumour heterogeneity and variability is vital for the improvement of novel diagnostic techniques and personalized cancer treatments. Tumour tissue heterogeneity is characterized by biochemical heterogeneity, which can be investigated by unsupervised metabolomics. Mass Spectrometry Imaging (MSI) combined with Machine Learning techniques has generated increasing interest as an analytical and diagnostic tool for the analysis of spatial molecular patterns in tissue samples. Considering the high complexity of data produced by the application of MSI, which can consist of many thousands of spectral peaks, statistical analysis and in particular machine learning and deep learning have been investigated as novel approaches to deduce the relationships between the measured molecular patterns and the local structural and biological properties of the tissues. Machine learning has historically been divided into two main categories: supervised and unsupervised learning. In MSI, supervised learning methods may be used to segment tissues into histologically relevant areas, e.g. the classification of tissue regions in H&E (Haematoxylin and Eosin) stained samples. Initial classification by an expert histopathologist through visual inspection enables the development of univariate or multivariate models, based on tissue regions that have significantly up/down-regulated ions. However, complex data may result in underdetermined models, and alternative methods that can cope with high dimensionality and noisy data are required. Here, we describe, apply, and test a novel diagnostic procedure built using a combination of MSI and deep learning with the objective of delineating and identifying biochemical differences between cancerous and non-cancerous tissue in metastatic liver cancer and epithelial ovarian cancer.
    The workflow investigates the robustness of single (1D) to multidimensional (3D) tumour analyses and also highlights possible biomarkers which are not accessible from classical visual analysis of the H&E images. The identification of key molecular markers may provide a deeper understanding of tumour heterogeneity and potential targets for intervention.

    Quantification of Nanomaterials with Spectrally-Resolved Super-Resolution Microscopy


    Rapid Computation of Thermodynamic Properties Over Multidimensional Nonbonded Parameter Spaces using Adaptive Multistate Reweighting

    We show how thermodynamic properties of molecular models can be computed over a large, multidimensional parameter space by combining multistate reweighting analysis with a linear basis function approach. This approach reduces the computational cost to estimate thermodynamic properties from molecular simulations for over 130,000 tested parameter combinations from over a thousand CPU years to tens of CPU days. This speed increase is achieved primarily by computing the potential energy as a linear combination of basis functions, computed from either modified simulation code or as the difference of energy between two reference states, which can be done without any simulation code modification. The thermodynamic properties are then estimated with the Multistate Bennett Acceptance Ratio (MBAR) as a function of multiple model parameters without the need to define a priori how the states are connected by a pathway. Instead, we adaptively sample a set of points in parameter space to create mutual configuration space overlap. The existence of regions of poor configuration space overlap is detected by analyzing the eigenvalues of the sampled states' overlap matrix. The configuration space overlap to sampled states is monitored alongside the mean and maximum uncertainty to determine convergence, as neither the uncertainty nor the configuration space overlap alone is a sufficient metric of convergence. This adaptive sampling scheme is demonstrated by estimating with high precision the solvation free energies of charged particles of Lennard-Jones plus Coulomb functional form. We also compute entropy, enthalpy, and radial distribution functions of unsampled parameter combinations using only the data from these sampled states, and use the free energy estimates to examine the deviation of simulations from the Born approximation to the solvation free energy.
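The linear basis function idea in this abstract can be sketched as follows: if the potential energy is a sum of parameter-dependent coefficients times parameter-independent functions of the coordinates, the basis values are computed once per stored configuration and the energy at any new parameter set reduces to a matrix-vector product. The sketch is a minimal illustration under stated assumptions, not the authors' implementation: it uses reduced units with the Coulomb prefactor set to 1, treats `epsilon` and `sigma` as pair parameters applied directly to each solute-environment distance, and substitutes single-state exponential (Zwanzig) reweighting for MBAR. All function names are hypothetical.

```python
import numpy as np


def basis_energies(r, q_env):
    """Parameter-independent basis functions, one row per stored configuration.

    r     : (n_samples, n_neighbors) solute-environment distances
    q_env : (n_neighbors,) environment charges
    Columns are sum r^-12, sum r^-6 and sum q_j / r.
    """
    h12 = np.sum(r ** -12.0, axis=1)
    h6 = np.sum(r ** -6.0, axis=1)
    hq = np.sum(q_env / r, axis=1)
    return np.stack([h12, h6, hq], axis=1)


def coefficients(epsilon, sigma, q):
    """Coefficients multiplying the basis functions for Lennard-Jones plus Coulomb."""
    return np.array([4.0 * epsilon * sigma ** 12, -4.0 * epsilon * sigma ** 6, q])


def reweighted_free_energy(h, theta_ref, theta_new, beta=1.0):
    """Zwanzig estimate of F(new) - F(ref) from configurations sampled at theta_ref.

    This single-state estimator stands in for MBAR in this sketch.
    """
    du = h @ (np.asarray(theta_new) - np.asarray(theta_ref))  # energy change per sample
    x = -beta * du
    m = x.max()  # shift by the maximum for a numerically stable log-mean-exp
    return -(m + np.log(np.mean(np.exp(x - m)))) / beta
```

Because `basis_energies` is evaluated only once, scanning many parameter combinations costs one small matrix product per combination rather than a re-evaluation of the stored configurations, which is the source of the speed-up the abstract describes.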