59 research outputs found

    Knowledge discovery in biological databases : a neural network approach

    Knowledge discovery in databases, also known as data mining, aims to find significant information in a set of data. The knowledge to be mined from the dataset may take the form of patterns, association rules, classification and clustering rules, and so forth. In this dissertation, we present a neural network approach to finding knowledge in biological databases. Specifically, we propose new methods to process biological sequences in two case studies: the classification of protein sequences and the prediction of E. coli promoters in DNA sequences. Our proposed methods, based on neural network architectures, combine techniques ranging from Bayesian inference, coding theory, feature selection, and dimensionality reduction to dynamic programming and machine learning algorithms. Empirical studies show that the proposed methods outperform previously published methods and have excellent performance on the latest dataset. We have implemented the proposed algorithms in an infrastructure, called Genome Mining, developed for biosequence classification and recognition.

    Clustering protein sequences with a novel metric transformed from sequence similarity scores and sequence alignments with neural networks

    BACKGROUND: The sequencing of the human genome has enabled us to access a comprehensive list of genes (both experimental and predicted) for further analysis. While a majority of the approximately 30000 known and predicted human coding genes are characterized and have been assigned at least one function, there remains a fair number of genes (about 12000) for which no annotation has been made. The recent sequencing of other genomes has provided us with a huge amount of auxiliary sequence data which could help in the characterization of the human genes. Clustering these sequences into families is one of the first steps to perform comparative studies across several genomes. RESULTS: Here we report a novel clustering algorithm (CLUGEN) that has been used to cluster sequences of experimentally verified and predicted proteins from all sequenced genomes using a novel distance metric which is a neural network score between a pair of protein sequences. This distance metric is based on the pairwise sequence similarity score and the similarity between their domain structures. The distance metric is the probability that a pair of protein sequences are of the same Interpro family/domain, which facilitates the modelling of transitive homology closure to detect remote homologues. The hierarchical average clustering method is applied with the new distance metric. CONCLUSION: Benchmarking studies of our algorithm versus those reported in the literature show that our algorithm provides clustering results with lower false positive and false negative rates. The clustering algorithm is applied to cluster several eukaryotic genomes and several dozen prokaryotic genomes.
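
    As a rough illustration of the hierarchical average clustering step mentioned in this abstract, the sketch below runs average-linkage agglomerative clustering over pairwise same-family probabilities. It is not the CLUGEN implementation: the function name `average_linkage`, the conversion of a probability to a distance as 1 - p, the stopping threshold, and the toy scores are all assumptions made for the example.

```python
from itertools import combinations


def average_linkage(names, prob, threshold):
    """Average-linkage agglomerative clustering (illustrative sketch).

    prob maps frozenset({a, b}) to an assumed probability that sequences
    a and b belong to the same family; missing pairs default to 0.0.
    Distance is taken as 1 - probability, which is an assumption here.
    Merging stops once the closest pair of clusters exceeds threshold.
    """
    def dist(a, b):
        return 1.0 - prob.get(frozenset((a, b)), 0.0)

    def cluster_dist(c1, c2):
        # Mean of all pairwise distances between members of the two clusters.
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[n] for n in names]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        if cluster_dist(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return sorted(sorted(c) for c in clusters)


# Toy, made-up scores for four hypothetical sequences.
toy_prob = {
    frozenset(("P1", "P2")): 0.95,
    frozenset(("P3", "P4")): 0.90,
    frozenset(("P1", "P3")): 0.10,
}
print(average_linkage(["P1", "P2", "P3", "P4"], toy_prob, threshold=0.5))
```

    With these toy scores the two high-probability pairs merge first, and the remaining average distance between the two resulting clusters is above the threshold, so merging stops at two families.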

    Activation of a Metabolic Gene Regulatory Network Downstream of mTOR Complex 1

    Aberrant activation of the mammalian target of rapamycin complex 1 (mTORC1) is a common molecular event in a variety of pathological settings, including genetic tumor syndromes, cancer, and obesity. However, the cell-intrinsic consequences of mTORC1 activation remain poorly defined. Through a combination of unbiased genomic, metabolomic, and bioinformatic approaches, we demonstrate that mTORC1 activation is sufficient to stimulate specific metabolic pathways, including glycolysis, the oxidative arm of the pentose phosphate pathway, and de novo lipid biosynthesis. This is achieved through the activation of a transcriptional program affecting metabolic gene targets of hypoxia-inducible factor (HIF1α) and sterol regulatory element-binding protein (SREBP1 and SREBP2). We find that SREBP1 and 2 promote proliferation downstream of mTORC1, and the activation of these transcription factors is mediated by S6K1. Therefore, in addition to promoting protein synthesis, mTORC1 activates specific bioenergetic and anabolic cellular processes that are likely to contribute to human physiology and disease

    Observation of Gigahertz Topological Valley Hall Effect in Nanoelectromechanical Phononic Crystals

    Topological phononics offers numerous opportunities in manipulating elastic waves that can propagate in solids without being backscattered. Due to the lack of nanoscale imaging tools that aid the system design, however, acoustic topological metamaterials have been mostly demonstrated in macroscale systems operating at low (kilohertz to megahertz) frequencies. Here, we report the realization of gigahertz topological valley Hall effect in nanoelectromechanical AlN membranes. Propagation of elastic wave through phononic crystals is directly visualized by microwave microscopy with unprecedented sensitivity and spatial resolution. The valley Hall edge states, protected by band topology, are vividly seen in both real- and momentum-space. The robust valley-polarized transport is evident from the wave transmission across local disorder and around sharp corners, as well as the power distribution into multiple edge channels. Our work paves the way to exploit topological physics in integrated acousto-electronic systems for classical and quantum information processing in the microwave regime.
    This work was supported by the NSF through the Laboratory for Research on the Structure of Matter, an NSF Materials Research Science & Engineering Center (MRSEC; DMR-1720530). The TMIM work was supported by NSF Division of Materials Research Award DMR-2004536 and Welch Foundation Grant F-1814. The data analysis was partially supported by the NSF through the Center for Dynamics and Control of Materials, an NSF MRSEC under Cooperative Agreement DMR-1720595. This work was carried out in part at the Singh Center for Nanotechnology, which is supported by the NSF National Nanotechnology Coordinated Infrastructure Program under grant NNCI-2025608. The metamaterial design and simulation work was supported by the US Office of Naval Research (ONR) Multidisciplinary University Research Initiative (MURI) grant N00014-20-1-2325 on Robust Photonic Materials with High-Order Topological Protection and grant N00014-21-1-2703. We would like to express our appreciation for useful discussions with Prof. Troy Olsson and Dr. Qian Niu.

    Uncovering mechanisms of transcriptional regulations by systematic mining of cis regulatory elements with gene expression profiles

    BACKGROUND: Contrary to the traditional biology approach, where the expression patterns of a handful of genes are studied at a time, microarray experiments enable biologists to study the expression patterns of many genes simultaneously from gene expression profile data and decipher the underlying hidden biological mechanism from the observed gene expression changes. While the statistical significance of the gene expression data can be deduced by various methods, the biological interpretation of the data presents a challenge. RESULTS: A method, called CisTransMine, is proposed to help infer the underlying biological mechanisms for the observed gene expression changes in microarray experiments. Specifically, this method will predict potential cis-regulatory elements in promoter regions which could regulate gene expression changes. This approach builds on the MotifADE method published in 2004 and extends it with two modifications: up-regulated genes and down-regulated genes are tested separately and, in addition, tests have been implemented to identify combinations of transcription factors that work synergistically. The method has been applied to a genome-wide expression dataset intended to study myogenesis in a mouse C2C12 cell differentiation model. The results shown here both confirm the prior biological knowledge and facilitate the discovery of new biological insights. CONCLUSION: The results validate that the CisTransMine approach is a robust method to uncover the hidden transcriptional regulatory mechanisms that can facilitate the discovery of mechanisms of transcriptional regulation.

    A mosquito small RNA genomics resource reveals dynamic evolution and host responses to viruses and transposons

    Although mosquitoes are major transmission vectors for pathogenic arboviruses, viral infection has little impact on mosquito health. This immunity is due in part to mosquito RNA interference (RNAi) pathways that generate antiviral small interfering RNAs (siRNAs) and Piwi-interacting RNAs (piRNAs). RNAi also maintains genome integrity by potently repressing mosquito transposon activity in the germline and soma. However, viral and transposon small RNA regulatory pathways have not been systematically examined together in mosquitoes. Therefore, we developed an integrated Mosquito Small RNA Genomics (MSRG) resource that analyzes the transposon and virus small RNA profiles in mosquito cell cultures and somatic and gonadal tissues across four medically important mosquito species. Our resource captures both somatic and gonadal small RNA expression profiles within mosquito cell cultures, and we report the evolutionary dynamics of a novel Mosquito-Conserved piRNA Cluster Locus (MCpiRCL) composed of satellite DNA repeats. In the larger culicine mosquito genomes we detected highly regular periodicity in piRNA biogenesis patterns coinciding with the expansion of Piwi pathway genes. Finally, our resource enables detection of crosstalk between piRNA and siRNA populations in mosquito cells during a response to virus infection. The MSRG resource will aid efforts to dissect and combat the capacity of mosquitoes to tolerate and spread arboviruses