
    Hierarchy and CIS-Regulation in Drosophila Segmentation: Rules for Pattern Formation and Clues to Evolution

    In few systems is it possible to analyze the global cis-regulatory structure of developmental transcription networks. One system where this is in principle possible is segmentation in Drosophila melanogaster, although to date such an undertaking has not been attempted. Here, such an analysis is carried out using computational algorithms to analyze the transcriptional regulatory regions of genes of the gap and pair-rule classes. Computational analysis, transgenic reporter element assays, site-directed mutagenesis, genetics, and time courses of in situ hybridizations of central genes in carefully staged embryos are combined to understand how the cis-elements function together to achieve patterning of the anterior-posterior axis. The transition from the non-periodic gap patterns to the seven-striped periodic patterns of the pair-rule genes is analyzed in detail. This step in the genetic hierarchy is of particular interest as it generates the segmental pattern that underlies the Drosophila body plan. The analysis clarifies the primary and secondary pair-rule classification system and suggests certain organizational principles in pair-rule cis-regulation.

    MACHINE LEARNING AND DEEP LEARNING APPROACHES FOR GENE REGULATORY NETWORK INFERENCE IN PLANT SPECIES

    The construction of gene regulatory networks (GRNs) is vital for understanding the regulation of metabolic pathways, biological processes, and complex traits during plant growth and responses to environmental cues and stresses. The increasing availability of public databases has facilitated the development of numerous methods for inferring gene regulatory relationships between transcription factors and their targets. However, there is limited research on supervised learning techniques that utilize the regulatory relationships of plant species available in public databases. This study investigates the potential of machine learning (ML), deep learning (DL), and hybrid approaches for constructing GRNs in plant species, specifically Arabidopsis thaliana, poplar, and maize. Challenges arise due to limited training data for gene regulatory pairs, especially in less-studied species such as poplar and maize. Nonetheless, our results demonstrate that hybrid models integrating ML and artificial neural network (ANN) techniques significantly outperformed traditional methods in predicting gene regulatory relationships. The best-performing hybrid models achieved over 95% accuracy on holdout test datasets, surpassing traditional ML and ANN models, and also showed good accuracy in a lignin biosynthesis pathway analysis. Employing transfer learning techniques, this study also successfully transferred existing knowledge of gene regulation from one species to another, substantially improving performance and demonstrating the viability of cross-species learning using deep learning-based approaches. This study contributes to the growing body of knowledge on GRN prediction and construction methodology for plant species, highlighting the value of adopting hybrid models and transfer learning techniques. The study and its results will help pave the way for future research on how to learn from the known to the unknown and will be conducive to the advancement of modern genomics and bioinformatics.

    Topology and dynamics of an artificial genetic regulatory network model

    This thesis presents some of the methods of studying models of regulatory networks using mathematical and computational formalisms. A basic review of the biology behind gene regulation is introduced along with the formalisms used for modelling networks of such regulatory interactions. Topological measures of large-scale complex networks are discussed and then applied to a specific artificial regulatory network model created through a duplication and divergence mechanism. Such networks share topological features with natural transcriptional regulatory networks. Thus, the topologies inherent in natural networks may be primarily due to their method of creation rather than being exclusively shaped by subsequent evolution under selection. The evolvability of the dynamics of these networks is also examined by evolving networks in simulation to obtain three simple types of output dynamics. The networks obtained from this process show a wide variety of topologies and numbers of genes, indicating that it is relatively easy to evolve these classes of dynamics in this model.

    Highly Accurate Fragment Library for Protein Fold Recognition

    Proteins play a crucial role in living organisms as they perform many vital tasks in every living cell. Knowledge of protein folding has a deep impact on understanding the heterogeneity and molecular functions of proteins. Such information leads to crucial advances in drug design and disease understanding. Fold recognition is a key step in the protein structure discovery process, especially when traditional computational methods fail to yield convincing structural homologies. In this work, we present a new protein fold recognition approach using machine learning and data mining methodologies. First, we identify a protein structural fragment library (Frag-K) composed of a set of backbone fragments ranging from 4 to 20 residues as the structural “keywords” that can effectively distinguish between major protein folds. We first apply randomized spectral clustering and random forest algorithms to construct representative and sensitive protein fragment libraries from a large set of high-quality, non-homologous protein structures available in PDB. We analyze the impacts of clustering cut-offs on the performance of the fragment libraries. Then, the Frag-K fragments are employed as structural features to classify protein structures into major protein folds defined by SCOP (Structural Classification of Proteins). Our results show that a structural dictionary with ~400 4- to 20-residue Frag-K fragments is capable of classifying major SCOP folds with high accuracy. Then, based on Frag-K, we design a novel deep learning architecture, called DeepFrag-k, which identifies fold-discriminative features to improve the accuracy of protein fold recognition. DeepFrag-k is composed of two stages: the first stage employs a multimodal Deep Belief Network (DBN) to predict the potential structural fragments given a sequence, represented as a fragment vector, and the second stage uses a deep convolutional neural network (CNN) to classify the fragment vectors into the corresponding folds. Our results show that DeepFrag-k yields 92.98% accuracy in predicting the top-100 most popular fragments, which can be used to generate discriminative fragment feature vectors to improve protein fold recognition.

    Doctor of Philosophy

    Kidney toxicity is the second highest cause of new drug candidate failure, after liver toxicity, leading to drug withdrawals from the market and failed clinical trials. Therefore, development of more reliable and accurate in vitro screening systems for assessing drug nephrotoxicity is of high importance. Ideally, future new drug toxicity testing models will be highly sensitive, minimize animal use, and allow for high-throughput screening modalities. Recent technologies emphasize a shift away from monolayer-cultured and immortalized cells on plastic plates to novel three-dimensional (3D) matrix-based culture systems more similar to actual tissue constructs (i.e., organoids, micro-tissues, spheroids and cell/gels). This dissertation focuses on characterization of a recently developed ex vivo 3D murine proximal tubule model and comparison to the two-dimensional (2D) kidney cell monocultures currently used for nephrotoxicity drug screening. The overall goal of this dissertation is to establish a scientific basis for justifying this ex vivo 3D model as a more reliable and predictive toxicity testing system for high-throughput drug screening in early drug development stages than those currently available. Therefore, expression levels and profiles for key kidney proximal tubule markers and important proximal tubule transporters with known roles in drug transport were obtained from the ex vivo 3D proximal tubule model. These data demonstrate more abundant and enduring apical and basolateral transporter expression in the 3D model than in 2D cell lines, suggesting that 3D cultures will provide a more reliable response to pharmaceutical compounds. To test the transporter functional ability and predictive toxicity, the model was further analyzed for its capability to detect nephrotoxicity in early stages using clinically relevant known nephrotoxic and non-nephrotoxic drugs. The obtained assay responses were compared to patient drug toxicity data from the clinic to assess the model’s translational capacity. Data collected during the concentration-response study demonstrate that the model is capable of accurately predicting nephrotoxicity for all tested compounds, producing toxicity responses highly correlated to known human clinical experience with the tested drugs. The 3D murine proximal tubule model’s predictive values were found to be highly improved compared with predictive values from the current industrial “gold standard” 2D kidney cell cultures still widely used for similar toxicity assessments. Furthermore, this model could be used to gain insight into specific drug structural mechanisms that drive toxicity and facilitate rapid assay of different chemical moieties during structure-activity relationship screening. The knowledge gained through this study provides greater understanding of how drug toxicity testing results compare between standard 2D cell assays, 3D assays, and human clinical data. The data underscore the 3D proximal tubule model’s promising future as an improved, more reliable toxicity testing system for high-throughput screening during early phases of drug development.

    Computational Methods for the Analysis of Genomic Data and Biological Processes

    In recent decades, new technologies have made remarkable progress in helping to understand biological systems. Rapid advances in genomic profiling techniques such as microarrays or high-performance sequencing have brought new opportunities and challenges in the fields of computational biology and bioinformatics. Such genetic sequencing techniques allow large amounts of data to be produced, whose analysis and cross-integration could provide a complete view of organisms. As a result, it is necessary to develop new techniques and algorithms that analyze these data reliably and efficiently. This Special Issue collected the latest advances in the field of computational methods for the analysis of gene expression data and, in particular, the modeling of biological processes. Here we present eleven works selected for publication in this Special Issue for their interest, quality, and originality.