365 research outputs found

    Transcriptional regulatory network refinement and quantification through kinetic modeling, gene expression microarray data and information theory

    BACKGROUND: Gene expression microarray and other multiplex data hold promise for addressing the challenges of cellular complexity, refined diagnoses and the discovery of well-targeted treatments. A new approach to the construction and quantification of transcriptional regulatory networks (TRNs) is presented that integrates gene expression microarray data and cell modeling through information theory. Given a partial TRN and time series data, a probability density is constructed that is a functional of the time course of transcription factor (TF) thermodynamic activities at the site of gene control, and is a function of mRNA degradation and transcription rate coefficients, and equilibrium constants for TF/gene binding. RESULTS: Our approach yields more physicochemical information that complements the results of network structure delineation methods, and thereby can serve as an element of a comprehensive TRN discovery/quantification system. The most probable TF time courses and values of the aforementioned parameters are obtained by maximizing the probability obtained through entropy maximization. Observed time delays between mRNA expression and activity are accounted for implicitly since the time course of the activity of a TF is coupled by probability functional maximization, and is not assumed to be proportional to the expression level of the mRNA type that translates into the TF. This allows one to investigate post-translational and TF activation mechanisms of gene regulation. Accuracy and robustness of the method are evaluated. A kinetic formulation is used to facilitate the analysis of phenomena with a strongly dynamical character, while a physically motivated regularization of the TF time course is found to overcome difficulties due to omnipresent noise and data sparsity that plague other methods of gene expression data analysis. An application to Escherichia coli is presented. CONCLUSION: Multiplex time series data can be used for the construction of the network of cellular processes and the calibration of the associated physicochemical parameters. We have demonstrated these concepts in the context of gene regulation understood through the analysis of gene expression microarray time series data. Casting the approach in a probabilistic framework has allowed us to address the uncertainties in gene expression microarray data. Our approach was found to be robust to error in the gene expression microarray data and mistakes in a proposed TRN.

    Data Mining and Analysis on Multiple Time Series Object Data

    A huge amount of data is available in our society, and the need to turn such data into useful information and knowledge is urgent. Data mining is an important field addressing that need, and significant progress has been achieved in the last decade. In several important application areas, data arises in the format of Multiple Time Series Object (MTSO) data, where each data object is an array of time series over a large set of features and each has an associated class or state. Very little research has been conducted on this kind of data. Examples include computational toxicology, where each data object consists of a set of time series over thousands of genes, and operational stress management, where each data object consists of a set of time series over different measuring points on the human body. The purpose of this dissertation is to conduct a systematic data mining study over microarray time series data, with applications in computational toxicology. More specifically, we aim to consider several issues: feature selection algorithms for different classification cases, gene marker or feature set selection for toxic chemical exposure detection, toxic chemical exposure time prediction, wildness concept development and applications, and organizing a diversified and parsimonious committee. We will formalize and analyze these research problems, design algorithms to address them, and perform experimental evaluations of the proposed algorithms. All these studies are based on a microarray time series data set provided by Dr. McDougal.

    2022 SDSU Data Science Symposium Presentation Abstracts

    This document contains abstracts for presentations and posters at the 2022 SDSU Data Science Symposium.

    Modeling Non-Linear Dynamic Phenomena in Biochemical Networks

    Facilitated by the development of high-throughput techniques, the focus of biological research has changed in the last decades from the investigation of single cell components to a system-level approach, which aims at an understanding of interactions between these cell components. This objective requires modeling and analysis methods for these regulatory networks. In this thesis, we investigate mechanisms causing qualitative dynamic behaviors of regulatory subsystems. For this purpose, we introduce a differential equation model based on underlying molecular binding reactions, whose parameters are estimated using time series concentration data. In the first part, the model is applied to subsystems with qualitatively different dynamic behaviors: the response of Mycobacterium tuberculosis to DNA damage is described as the relaxation of a system to its steady state after external perturbation, and the specific repression of genes in Escherichia coli by the global regulator protein H-NS is explained by the interrelation of feedback mechanisms. In order to prevent overfitting, a typical problem in network inference from experimental data, we introduce an approach based on Bayesian statistics, which includes prior knowledge about the system in terms of prior probability distributions. This approach is applied to simulated data and to the regulatory network of the Saccharomyces cerevisiae cell cycle. Motivated by results on the yeast cell cycle, the second part of this thesis investigates the robustness of periodic behavior in regulatory networks. The model presented belongs to a class of differential equations whose solutions tend to converge to a steady state. Accordingly, periodic behavior is not robust with respect to parameter variations. We explain this phenomenon by applying a bifurcation analysis and investigating the stability of steady states. It is shown that large time scale differences and an inclusion of time delays can stabilize sustained oscillations, and we postulate that they are important to maintain oscillations in biological systems.
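    As a rough illustration of the "relaxation of a system to its steady state after external perturbation" mentioned in the abstract above, the sketch below integrates a one-variable toy model, dx/dt = synthesis - degradation * x, with the forward Euler method. This is not the thesis's model: the equation, the function name `simulate_relaxation` and all parameter values are assumptions chosen only to show the qualitative behavior.

```python
def simulate_relaxation(x0, synthesis, degradation, dt=0.01, steps=2000):
    """Forward-Euler integration of the toy model dx/dt = synthesis - degradation * x.

    All names and values here are illustrative assumptions, not taken from the thesis.
    Returns the list of concentrations at each time step, starting with x0.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        # One Euler step: move x along the current slope for a time dt.
        x += dt * (synthesis - degradation * x)
        trajectory.append(x)
    return trajectory


# Hypothetical parameters: the analytic steady state is synthesis / degradation.
synthesis, degradation = 2.0, 0.5
steady_state = synthesis / degradation

# Start from a perturbed concentration well above the steady state.
trajectory = simulate_relaxation(x0=10.0, synthesis=synthesis, degradation=degradation)
print(f"steady state: {steady_state}, final value: {trajectory[-1]:.4f}")
```

    In this toy model every trajectory decays monotonically towards the steady state, which is consistent with the abstract's remark that sustained oscillations need additional ingredients such as time delays.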

    Book of Abstracts XVIII Congreso de Biometría CEBMADRID

    Abstracts of the XVIII Congreso de Biometría CEBMADRID held from 25 to 27 May in Madrid.
    Interactive modelling and prediction of patient evolution via multistate models / Leire Garmendia Bergés, Jordi Cortés Martínez and Guadalupe Gómez Melis : This research was funded by the Ministerio de Ciencia e Innovación (Spain) [PID2019104830RBI00]; and the Generalitat de Catalunya (Spain) [2017SGR622 and 2020PANDE00148].
    Operating characteristics of a model-based approach to incorporate non-concurrent controls in platform trials / Pavla Krotka, Martin Posch, Marta Bofill Roig : EU-PEARL (EU Patient-cEntric clinicAl tRial pLatforms) project has received funding from the Innovative Medicines Initiative (IMI) 2 Joint Undertaking (JU) under grant agreement No 853966. This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA and Children’s Tumor Foundation, Global Alliance for TB Drug Development non-profit organisation, Spring works Therapeutics Inc.
    Modeling COPD hospitalizations using variable domain functional regression / Pavel Hernández Amaro, María Durbán Reguera, María del Carmen Aguilera Morillo, Cristobal Esteban Gonzalez, Inma Arostegui : This work is supported by the grant ID2019-104901RB-I00 from the Spanish Ministry of Science, Innovation and Universities MCIN/AEI/10.13039/501100011033.
    Spatio-temporal quantile autoregression for detecting changes in daily temperature in northeastern Spain / Jorge Castillo-Mateo, Alan E. Gelfand, Jesús Asín, Ana C. Cebrián : This work was partially supported by the Ministerio de Ciencia e Innovación under Grant PID2020-116873GB-I00; Gobierno de Aragón under Research Group E46_20R: Modelos Estocásticos; and JC-M was supported by Gobierno de Aragón under Doctoral Scholarship ORDEN CUS/581/2020.
    Estimation of the area under the ROC curve with complex survey data / Amaia Iparragirre, Irantzu Barrio, Inmaculada Arostegui : This work was financially supported in part by IT1294-19, PID2020-115882RB-I00, KK-2020/00049. The work of AI was supported by PIF18/213.
    INLAMSM: Adjusting multivariate lattice models with R and INLA / Francisco Palmí Perales, Virgilio Gómez Rubio and Miguel Ángel Martínez Beneito : This work has been supported by grants PPIC-2014-001-P and SBPLY/17/180501/000491, funded by Consejería de Educación, Cultura y Deportes (Junta de Comunidades de Castilla-La Mancha, Spain) and FEDER, grant MTM2016-77501-P, funded by Ministerio de Economía y Competitividad (Spain), grant PID2019-106341GB-I00 from Ministerio de Ciencia e Innovación (Spain) and a grant to support research groups by the University of Castilla-La Mancha (Spain). F. Palmí-Perales has been supported by a Ph.D. scholarship awarded by the University of Castilla-La Mancha (Spain).

    Improving outcomes in interstitial lung disease through the application of bioinformatics and systems biology

    Idiopathic pulmonary fibrosis (IPF) and chronic obstructive pulmonary disease (COPD) are two distinct respiratory diseases whose features, including pathogenesis and progression, are not fully understood. However, for both diseases clinicians utilise changes in serial pulmonary function measurements to gain an insight into disease severity and control. More accurate prediction of disease progression would be beneficial, particularly for IPF, whose variable clinical course is an unknown factor at the time of diagnosis. Home-based, real-time monitoring of disease progression by spirometry has provided an opportunity to optimise the delivery of treatment and reduce the length of clinical trials. Therefore, the potential to understand the mechanisms underlying disease progression and generate effective treatment has been improved. In light of this, the motivation for this project is to understand the mathematical features within daily pulmonary function time series generated by IPF patients. The hope is that statistical models of pulmonary function time series will aid the identification of significant clinical events such as acute exacerbation. The mathematical techniques used to identify potentially important features within pulmonary function time series involved the autocorrelation function, critical transitions and detrended fluctuation analysis (DFA). Temporal properties, such as the serial correlation, abrupt changes in trends and complexity, were assessed using time series from the PROFILE clinical trial and London COPD cohort. Forced vital capacity (FVC) measurements were found to be correlated with the previous day’s reading, which may inform the sampling rate of lung function during clinical trials. The presence of short-term memory within FVC time series will influence the management of missing data within clinical trials, particularly methods of imputation. Also, FVC time series exhibit long-term memory and adaptability, supporting the role of FVC as a surrogate marker for IPF disease progression.
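    The abstract above reports that an FVC reading is correlated with the previous day's reading, that is, a non-zero autocorrelation at lag 1. A minimal sketch of the standard sample autocorrelation estimator is given below. The function name `autocorrelation` and the readings in `fvc` are hypothetical, invented purely for illustration; they are not data from the PROFILE trial or the London COPD cohort, and the thesis may use a different estimator.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a sequence at the given lag (standard biased estimator)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    # Denominator: total sum of squared deviations from the mean.
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Numerator: products of deviations separated by `lag` time steps.
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Hypothetical daily FVC readings in litres (illustrative values only).
fvc = [2.9, 2.8, 2.85, 2.7, 2.65, 2.7, 2.6, 2.5, 2.55, 2.4]
r1 = autocorrelation(fvc, lag=1)
print(f"lag-1 autocorrelation: {r1:.2f}")
```

    A value of `r1` well above zero indicates that consecutive readings tend to lie on the same side of the mean, which is the kind of short-term memory the abstract refers to when discussing imputation of missing data.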

    2022 SDSU Data Science Symposium Program
