14 research outputs found

    Model-driven analysis of gene expression control

    During this PhD, I worked on three different aspects in the broad field of experimental and theoretical analysis of gene regulation. The first part, "Quantifying the strength of miRNA-target interactions", addresses the problem of predicting mRNA targets of miRNAs. I show that biochemical measurements of miRNA-mRNA interactions can be used to optimise the parameter inference of a pre-existing model of miRNA target prediction. This model, named MIRZA, predicts miRNA-mRNA binding using 25 energy parameters that describe the miRNA-mRNA hybrid structure, with 2 base pairing parameters for the AU and GC pairs, 3 configuration parameters for the symmetric and asymmetric loops, and 21 positional parameters for the 21 nucleotides of the miRNA sequence. MIRZA was built to infer these parameters from Argonaute protein CLIP data, which capture potential targets of miRNAs. Upon the publication of precise measurements of chemical kinetic constants of miRNA-mRNA binding interactions between an mRNA target and a set of systematically mutated miRNA sequences, we reasoned that such data could be used to improve the parameter inference of the MIRZA model. After showing that the predictions of the existing model on the set of measured miRNA-mRNA pairs correlate highly with the binding energy calculated from the measurements, I used simulations as a proof of principle of the inference procedure and to design the measurements that would be needed to infer the parameters of the MIRZA model. Staying in the field of miRNA, in "Single cell mRNA profiling reveals the hierarchical response of miRNA targets to miRNA induction", I developed an approach to infer miRNA targets based on scRNA-seq data from cells that express the miRNA at different levels.
A miRNA can target several hundred different mRNAs and is present in the cell in limited quantities, implying that the interaction of a target mRNA with a specific miRNA depends on its concentration and on the interactions of the miRNA with its other targets. In other words, since miRNA binding is exclusive, mRNA targets compete for the same miRNA pool. Therefore, the concentrations of the mRNAs coupled in this way depend not only on the miRNA concentration but also on the concentration of every competing mRNA that is targeted by the same miRNA. To study this, HEK 293 cell lines were constructed to inducibly express a miRNA (hsa-miR-199a) as well as the mRNA encoding a green fluorescent protein. Expressed from the same promoter as the miRNA, this mRNA allows the miRNA concentration to be monitored. The study aimed not only to determine the parameters of individual miRNA-mRNA interactions, but also to assess the degree to which mRNAs act in a competitive manner to influence each other's expression. scRNA-seq was chosen to provide the resolution needed to reach these goals. The effect of the miRNA on a bound target is to increase its decay rate, hence the expression levels of the targets depend on the miRNA concentration and their binding energy. To gain insight into the target binding energy, we constructed a model considering the mRNA transcription rate, the miRNA-mRNA binding/unbinding rate, the mRNA decay rates in the bound and unbound state, and the free/bound concentration of miRNA. We showed that the model can be factored in terms of the miRNA concentrations in individual cells and the miRNA-mRNA target interaction parameters, and we solved the model to obtain estimates of miRNA-mRNA interaction parameters, which we showed explain the mRNA levels in cells more accurately than the sequence-based computationally predicted interaction energies.
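The kinetic model described above can be sketched as a pair of rate equations. This is an illustrative reconstruction under stated assumptions, not the exact formulation of the thesis: the symbols $m$ (free target mRNA), $c$ (miRNA-bound target), $\mu$ (free miRNA, treated as fixed within a cell), $\alpha$ (transcription rate), $\delta_u$ and $\delta_b$ (decay rates of unbound and bound mRNA), and $k_{\mathrm{on}}$, $k_{\mathrm{off}}$ (binding and unbinding rates) are all introduced here for illustration.

```latex
\begin{align}
\frac{dm}{dt} &= \alpha - \delta_u\, m - k_{\mathrm{on}}\,\mu\, m + k_{\mathrm{off}}\, c, \\
\frac{dc}{dt} &= k_{\mathrm{on}}\,\mu\, m - (k_{\mathrm{off}} + \delta_b)\, c .
\end{align}
% Setting both derivatives to zero gives the steady state:
\begin{align}
c &= \kappa\,\mu\, m, \qquad \kappa = \frac{k_{\mathrm{on}}}{k_{\mathrm{off}} + \delta_b}, \\
m + c &= \frac{\alpha\,(1 + \kappa\mu)}{\delta_u + \delta_b\,\kappa\mu} .
\end{align}
```

Under these assumptions the total target level depends on the miRNA only through the product $\kappa\mu$, which is one way a cell-specific miRNA concentration and a target-specific interaction parameter could separate. With $\delta_b > \delta_u$, the total level falls as $\mu$ rises.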
Finally, in "Bayesian inference of the gene expression states from single-cell RNA-seq data" I carried out fundamental technical work on the normalisation of count data obtained in scRNA-seq experiments. As introduced above, multiple strategies have been developed with the aim of reducing the high level of noise present in such data, and estimating a 'true' biological state of expression for each gene in each cell. While the project aimed to reconstruct the Waddington landscape of regulator activity based on the single cell gene expression measurements, at the start of the project we realised that there is no satisfactory solution to gene expression normalisation in single cells in the literature. Thus, we tackled this problem with a Bayesian model, considering each gene independently and inferring a posterior probability of gene expression in each cell. Our model assumes a log-normal distribution of gene expression across cells and additional Poisson noise caused by the stochastic process of gene expression and the sampling process introduced by the mRNA capture in experimental protocols. These normalised gene expression values are the basis of a motif-activity response based approach for inferring the activity of TFs and miRNAs in individual cells, and for reconstructing the underlying landscape. The application of this normalisation algorithm to reconstruct a landscape is presented in the last part, "Realizing Waddington’s metaphor: Inferring regulatory landscapes from single-cell gene expression data". There I present the mathematical principles needed to formally define a landscape following the idea of Waddington from 1957, and I propose two applications of the landscape. First, I show that it defines cell types as local minima, and second, in the case of cells undergoing differentiation, I show how the landscape can be used to find developmental paths and the transcription factors associated with the differentiation process.
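The normalisation model described above (log-normal expression across cells plus Poisson sampling noise) can be illustrated with a small grid-based posterior. This is a minimal sketch, not the thesis's algorithm: the function name `posterior_log_expression`, the parametrisation, and the example numbers are hypothetical, and the prior parameters `mu` and `sigma` are simply assumed known here rather than inferred from the data.

```python
import math

def posterior_log_expression(n, total, mu, sigma, grid_size=2001, width=8.0):
    """Grid posterior over x = log(expression fraction) for one gene in one cell.

    Assumed model (hypothetical parametrisation): x ~ Normal(mu, sigma^2)
    across cells, and the observed count n ~ Poisson(total * exp(x)).
    Returns the posterior mean and standard deviation of x.
    """
    lo, hi = mu - width * sigma, mu + width * sigma
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]
    # Unnormalised log-posterior: Gaussian log-prior plus Poisson log-likelihood
    # (terms that do not depend on x are dropped).
    logp = [-(x - mu) ** 2 / (2 * sigma ** 2) + n * x - total * math.exp(x)
            for x in xs]
    peak = max(logp)
    weights = [math.exp(lp - peak) for lp in logp]  # subtract peak for stability
    norm = sum(weights)
    mean = sum(w * x for w, x in zip(weights, xs)) / norm
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, xs)) / norm
    return mean, math.sqrt(var)

# Hypothetical example: prior centred on a fraction of 1e-4, 5000 captured mRNAs.
mu, sigma, total = math.log(1e-4), 1.0, 5000
for count in (0, 5):
    mean, sd = posterior_log_expression(count, total, mu, sigma)
    print(f"count={count}: posterior mean {mean:.2f}, sd {sd:.2f}")
```

In this sketch a zero count pulls the estimate below the prior mean instead of producing an undefined log value, and a non-zero count is shrunk from the raw log ratio toward the prior, which is the kind of behaviour a posterior-based normalisation is meant to provide.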

    Single Molecule Fluorescence Spectroscopy and Imaging: Advanced Methods and Applications in Life Sciences

    The visualization of biological structures down to the molecular length scale has recently been made possible by the development of super-resolution fluorescence microscopy. These techniques now routinely resolve biological structures down to a few nanometers. Various super-resolution techniques have been developed, the most successful being Stimulated Emission Depletion (STED) microscopy and Single Molecule Localization Microscopy (SMLM). In what follows, I will focus on the latter class of techniques, which is based on the fact that a single-molecule image allows the molecule to be localized with a much higher accuracy than the diffraction-limited resolution of the microscope used. However, a big challenge of SMLM is to achieve a similar super-resolution along the optical axis of a microscope. For this purpose, metal-induced energy transfer (MIET) imaging was recently introduced as an elegant method for axially localizing fluorophores with nanometer precision. The underlying principle of MIET is based on an electromagnetic near-field-mediated energy transfer from an excited fluorescent emitter (donor) to a thin planar metal film (acceptor). This energy transfer leads to a distance-dependent modulation of an emitter’s fluorescence lifetime (quenching), which can be easily measured with conventional fluorescence lifetime measurement techniques. The power of MIET is that it works with any fluorophore, and it only requires a conventional fluorescence lifetime imaging (FLIM) microscope. In this thesis, I present a powerful modification and further development of MIET called graphene-induced energy transfer (GIET). GIET replaces the metal film of MIET with a single sheet of graphene, which reduces the quenching range by one order of magnitude, leading to a tenfold improvement in axial resolution. This enables the localization of fluorophores with sub-nanometer accuracy.
We demonstrate the potential of GIET by quantifying inter-leaflet distances in supported lipid bilayers (SLBs) and discuss the potential of the technique, particularly in membrane biophysics applications. The second line of this thesis is devoted to the complementary topic of fast molecular dynamics. While super-resolution microscopy succeeds in resolving structural details with nanometer resolution, it is too slow for temporally resolving the fast dynamics of the observed molecules. For this purpose, spectroscopic techniques such as single molecule fluorescence spectroscopy (SMFS) have become an important tool that can resolve molecular dynamics down to timescales of nanoseconds. In my thesis, I focus on fluorescence lifetime correlation spectroscopy (FLCS), an advanced variant of fluorescence correlation spectroscopy (FCS). Using FLCS, I could disentangle two emission states in an autofluorescent protein that otherwise have highly overlapping spectra, and I could quantify the microsecond switching rates between these two states. Compared to other existing methods, FLCS offers the unique advantage of probing such fast switching kinetics with nanosecond temporal resolution under equilibrium conditions at room temperature, making it the method of choice for similar studies of complex luminescent emitters. Finally, I will also present another study where I used advanced FCS to study protein self-assembly. In summary, my thesis presents several advanced methods in SMLM and SMFS which significantly enhance the spatial and temporal resolution at the single molecule level. I believe that the presented methods will find a wide range of applications in the life sciences.
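The localization principle behind SMLM mentioned in this abstract, that one molecule's image pins down its position far better than the diffraction limit, can be illustrated with an idealised simulation. This is a sketch under simplifying assumptions that are not taken from the thesis: a one-dimensional Gaussian point spread function, no background, no pixelation, and hypothetical values for the PSF width and photon count.

```python
import math
import random
import statistics

def localize(true_x, psf_sigma, n_photons, rng):
    """Estimate an emitter position as the mean of its detected photon positions."""
    photons = [rng.gauss(true_x, psf_sigma) for _ in range(n_photons)]
    return sum(photons) / n_photons

rng = random.Random(0)   # fixed seed so the run is repeatable
psf_sigma = 100.0        # nm, assumed Gaussian PSF width (hypothetical value)
n_photons = 400          # photons per single-molecule image (hypothetical value)

# Repeat the localization many times and measure the spread of the estimates.
estimates = [localize(0.0, psf_sigma, n_photons, rng) for _ in range(1000)]
empirical = statistics.stdev(estimates)
predicted = psf_sigma / math.sqrt(n_photons)  # photon-limited precision

print(f"empirical {empirical:.2f} nm vs predicted {predicted:.2f} nm")
```

In this idealised setting the spread of the position estimates shrinks as the PSF width divided by the square root of the photon count, so a 100 nm wide spot sampled with 400 photons is localized to roughly 5 nm. Real experiments add background, pixel size, and drift terms that this sketch ignores.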

    Analyzing Enterprise WiFi Session Data for Modeling Building Occupancy, Evacuation, and Energy Consumption

    Buildings are the prime components of office complexes, university campuses, and city centers. They are expensive to build and expensive to operate. Building managers are under constant pressure to keep them efficient and safe. However, they are often stymied by a lack of fine-grained data that can help them optimize occupancy levels so as to make the most efficient use of space, evacuation patterns that can ensure safety in the event of emergencies, and energy usage behavior that can help reduce operating costs. While several modern buildings are increasingly being equipped with sensors for detecting the presence of people, movement patterns, and thermal conditions, such instrumentation can often be expensive and limited in scale. This thesis investigates the potential to use data generated by the pervasive WiFi infrastructure that is present in all buildings. Specifically, we evaluate the use of WiFi data to model room usage, anatomize emergency evacuations, and reduce energy excursion costs associated with evacuation events. We begin this thesis by surveying data-driven approaches for efficient building operation and management, while reviewing existing technologies for measuring occupancy using both existing and purpose-built sensing infrastructure. Central to this thesis is the data we have collected and analyzed on WiFi session logs from a dense wireless network consisting of nearly 5000 access points across 50 buildings in a large university campus over a period of 2 years. For our first contribution, we use this data to develop a machine learning-based method to estimate classroom occupancy in near real-time. The output of our method is compared to that from specialized people-counting sensors, and the symmetric Mean Absolute Percentage Error is no more than 13%. Our second contribution develops a systematic method to evaluate emergency evacuation events using building WiFi session data.
Our systematic analysis of 43 planned and unplanned evacuation events across 14 buildings quantifies important measures such as evacuation speed, number of evacuees, and typicality of occupancy levels, demonstrating that WiFi data enables accurate and scalable evaluation of building evacuations, corroborating current manual records and revealing new insights. For our third and final contribution, we show that evacuations (particularly during summer) can result in HVAC power excursions of up to 150% above the agreed threshold, imposing heavy power tariffs. We develop a cooling strategy that allows the power cost to be traded off against the thermal comfort of occupants after an evacuation in a tunable manner. Application of our algorithm to typical building evacuation scenarios shows that the power excursion costs can be largely mitigated for as little as 5 minutes of delay in achieving ideal indoor temperatures. Taken together, our contributions equip building operators with tools and techniques to improve efficiency and safety by leveraging existing WiFi data with no additional infrastructure costs.
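The symmetric Mean Absolute Percentage Error quoted in this abstract for the occupancy estimates can be computed as sketched below. Several variants of the metric are in use and the abstract does not say which one was applied, so the denominator chosen here, the mean of the absolute actual and predicted values, is an assumption, as are the function name `smape` and the example head counts.

```python
def smape(actual, predicted):
    """Symmetric MAPE in percent, using the (|a| + |p|) / 2 denominator."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # Treat a 0-vs-0 pair as a perfect prediction instead of dividing by zero.
        total += 0.0 if denom == 0 else abs(p - a) / denom
    return 100.0 * total / len(actual)

# Hypothetical head counts: sensor ground truth vs. a WiFi-based estimate.
sensor_counts = [20, 35, 0, 50]
wifi_estimates = [22, 30, 0, 55]
print(f"sMAPE = {smape(sensor_counts, wifi_estimates):.2f}%")
```

With this variant the score is bounded between 0% and 200%, which keeps an empty room that is predicted as occupied from producing an unbounded error.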

    Regularized deconvolution-based approaches for estimating room occupancies

    We address the problem of estimating the number of people in a room using information available in standard HVAC systems. We propose an estimation scheme based on two phases. In the first phase, we assume the availability of pilot data and identify a model for the dynamic relations occurring between occupancy levels, CO2 concentration and room temperature. In the second phase, we make use of the identified model to formulate the occupancy estimation task as a deconvolution problem. In particular, we aim at obtaining an estimated occupancy pattern by trading off between adherence to the current measurements and regularity of the pattern. To achieve this goal, we employ a special instance of the so-called fused lasso estimator, which promotes piecewise constant estimates by including an ℓ1 norm-dependent term in the associated cost function. We extend the proposed estimator to include different sources of information, such as actuation of the ventilation system and door opening/closing events. We also provide conditions under which the occupancy estimator provides correct estimates within a guaranteed probability. We test the estimator by running experiments on a real testbed, in order to compare it with other occupancy estimation techniques and assess the value of having additional information sources.

    Regularized Deconvolution-Based Approaches for Estimating Room Occupancies

    We address the problem of estimating the number of people in a room using information available in standard HVAC systems. We propose an estimation scheme based on two phases. In the first phase, we assume the availability of pilot data and identify a model for the dynamic relations occurring between occupancy levels, CO2 concentration and room temperature. In the second phase, we make use of the identified model to formulate the occupancy estimation task as a deconvolution problem. In particular, we aim at obtaining an estimated occupancy pattern by trading off between adherence to the current measurements and regularity of the pattern. To achieve this goal, we employ a special instance of the so-called fused lasso estimator, which promotes piecewise constant estimates by including an ℓ1 norm-dependent term in the associated cost function. We extend the proposed estimator to include different sources of information, such as actuation of the ventilation system and door opening/closing events. We also provide conditions under which the occupancy estimator provides correct estimates within a guaranteed probability. We test the estimator by running experiments on a real testbed, in order to compare it with other occupancy estimation techniques and assess the value of having additional information sources.
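The deconvolution formulation described in this abstract can be written as a regularized least-squares problem. The form below is a sketch under assumptions, not the paper's exact estimator: $y_t$ (the measured signal, for example CO2 concentration), $h$ (the impulse response of the model identified from pilot data), $u_t$ (the unknown occupancy at time $t$), and the weight $\lambda$ are symbols introduced here, and the "special instance" of the fused lasso is assumed to keep only the penalty on successive differences.

```latex
\hat{u} \;=\; \arg\min_{u \in \mathbb{R}^{T}}
  \sum_{t=1}^{T} \Bigl( y_t - \sum_{s=1}^{t} h_{t-s}\, u_s \Bigr)^{2}
  \;+\; \lambda \sum_{t=2}^{T} \bigl| u_t - u_{t-1} \bigr|
```

The first term rewards adherence to the measurements, and the second is an ℓ1 norm of the differences between consecutive occupancy values. Because the ℓ1 norm drives many of those differences exactly to zero, the minimizer tends to be piecewise constant, with $\lambda$ setting the trade-off between fit and regularity.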

    IN SILICO METHODS FOR DRUG DESIGN AND DISCOVERY

    Computer-aided drug design (CADD) methodologies are playing an ever-increasing role in drug discovery and are critical in the cost-effective identification of promising drug candidates. These computational methods are relevant in limiting the use of animal models in pharmacological research, for aiding the rational design of novel and safe drug candidates, and for repositioning marketed drugs, supporting medicinal chemists and pharmacologists during the drug discovery trajectory. Within this field of research, we launched a Research Topic in Frontiers in Chemistry in March 2019 entitled “In silico Methods for Drug Design and Discovery,” which involved two sections of the journal: Medicinal and Pharmaceutical Chemistry and Theoretical and Computational Chemistry. For the reasons mentioned, this Research Topic attracted the attention of scientists and received a large number of submitted manuscripts. Among them, 27 Original Research articles, five Review articles, and two Perspective articles have been published within the Research Topic. The Original Research articles cover most of the topics in CADD, reporting advanced in silico methods in drug discovery, while the Review articles offer a point of view on some computer-driven techniques applied to drug research. Finally, the Perspective articles provide a vision of specific computational approaches with an outlook on the modern era of CADD.

    [18F]fluorination of biorelevant arylboronic acid pinacol ester scaffolds synthesized by convergence techniques

    Aim: The development of small molecules through convergent multicomponent reactions (MCR) has been boosted during the last decade due to the ability to synthesize, virtually without any side-products, numerous small drug-like molecules with several degrees of structural diversity.(1) The association of positron emission tomography (PET) labeling techniques in line with the “one-pot” development of biologically active compounds has the potential to become relevant not only for the evaluation and characterization of those MCR products through molecular imaging, but also to increase the library of radiotracers available. Therefore, since the [18F]fluorination of arylboronic acid pinacol ester derivatives tolerates electron-poor and electron-rich arenes and various functional groups,(2) the main goal of this research work was to achieve the 18F-radiolabeling of several different molecules synthesized through MCR. Materials and Methods: [18F]Fluorination of boronic acid pinacol esters was first extensively optimized using a benzaldehyde derivative in relation to the ideal amount of Cu(II) catalyst and precursor to be used, as well as the reaction solvent. Radiochemical conversion (RCC) yields were assessed by TLC-SG. The optimized radiolabeling conditions were subsequently applied to several structurally different MCR scaffolds comprising biologically relevant pharmacophores (e.g. β-lactam, morpholine, tetrazole, oxazole) that were synthesized to specifically contain a boronic acid pinacol ester group. Results: Radiolabeling with fluorine-18 was achieved with volumes (800 μl) and activities (≤ 2 GBq) compatible with most radiochemistry techniques and modules. In summary, an increase in the quantities of precursor or Cu(II) catalyst led to higher conversion yields. An optimal amount of precursor (0.06 mmol) and Cu(OTf)2(py)4 (0.04 mmol) was defined for further reactions, with DMA being a preferential solvent over DMF.
RCC yields from 15% to 76%, depending on the scaffold, were reproducibly achieved. Interestingly, it was noticed that the structure of the scaffolds, beyond the arylboronic acid, exerts some influence on the final RCC, with electron-withdrawing groups in the para position apparently enhancing the radiolabeling yield. Conclusion: The developed method, with its high RCC and reproducibility, has the potential to be applied in line with MCR and could also be incorporated at a later stage of this convergent “one-pot” synthesis strategy. Further studies are currently ongoing to apply this radiolabeling concept to fluorine-containing approved drugs whose boronic acid pinacol ester precursors can be synthesized through MCR (e.g. atorvastatin).