432 research outputs found

    Neural Network Feature Selection in Complex Trait Analysis

    Neural networks are well-established tools in the pattern recognition community. They have previously been suggested for the analysis of complex genetic traits [Lucek et al., 1998], but with mixed results. While the method produced interesting results and even pointed to two genes not previously identified, a later analysis raised doubts about the stability of the results [Marinov and Weeks, 2001]. We give a brief overview of neural networks and their application to gene finding in affected-sibling-pair studies. We then show that the method is indeed unstable, identifying different genes and ranking them differently in multiple runs. Worse yet, we show that the method suffers from a high prediction error when the trained network is used to predict affection status from previously unseen marker data, giving an error rate higher than would be obtained from mere guessing. An analysis of the causes is given, identifying dataset sparsity and marker dimensionality as the two main concerns. We discuss pruning as a means to control these problems, and then show how the method can be combined with a marker subset selection step carried out by a genetic algorithm. Results are given on a simulated dataset.

    An Algorithm to Select Target Specific Probes for DNA Chips

    Motivation: The selection of target-specific probes is a relevant problem in the design of DNA chips. Given a set S of genomic sequences, the task is to find at least one oligonucleotide, called a probe, for each target sequence in S. This probe will be attached to the chip surface, and must be chosen so that it does not hybridize to any sequence other than the intended target. Furthermore, all probes on the chip must hybridize to their intended targets under the same reaction conditions, most importantly at the temperature T at which the experiment is conducted. Results: We present an efficient algorithm for the probe design problem. Melting temperatures are calculated for all possible probe-target interactions using an extended nearest-neighbor model, allowing for both non-Watson-Crick base-pairing and unpaired bases within a duplex. To compute temperatures efficiently, a combination of suffix trees and dynamic-programming-based alignment algorithms is introduced. Additional filtering steps during preprocessing increase the speed of the computation. An algorithm to select the actual probes from the set of candidates is also presented. The practicability of the algorithms is demonstrated by two case studies: the computation of probes for the identification of different HIV-1 subtypes, and the search for probes for 28S rDNA sequences from over 400 organisms.
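The probe design task described in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: the Wallace rule stands in for the extended nearest-neighbor model, and an exact-substring check stands in for the hybridization test (it ignores mismatches and reverse complements). The names `wallace_tm` and `select_probes` are hypothetical.

```python
from collections import defaultdict


def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: a simple stand-in for a nearest-neighbor thermodynamic model.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def select_probes(targets, length, tm_min, tm_max):
    """Return, per target name, the sorted substrings of the given length that
    occur in no other target and whose estimated Tm lies in [tm_min, tm_max]."""
    # Map every candidate oligo to the set of targets containing it.
    owners = defaultdict(set)
    for name, seq in targets.items():
        seq = seq.upper()
        for i in range(len(seq) - length + 1):
            owners[seq[i:i + length]].add(name)

    probes = {name: [] for name in targets}
    for oligo, names in owners.items():
        # Keep oligos unique to one target and inside the common Tm window.
        if len(names) == 1 and tm_min <= wallace_tm(oligo) <= tm_max:
            probes[next(iter(names))].append(oligo)
    for name in probes:
        probes[name].sort()
    return probes


if __name__ == "__main__":
    toy_targets = {"a": "ACGTACGG", "b": "ACGTTTTT"}  # made-up sequences
    print(select_probes(toy_targets, length=4, tm_min=12, tm_max=14))
```

Requiring every probe to fall in one shared Tm window mirrors the abstract's constraint that all probes must work under the same reaction conditions; a realistic tool would replace both stand-ins with thermodynamic calculations.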

    Reconstructing nonlinear dynamic models of gene regulation using stochastic sampling

    Background: The reconstruction of gene regulatory networks from time series gene expression data is one of the most difficult problems in systems biology. This is due to several reasons, among them the combinatorial explosion of possible network topologies, the limited information content of experimental data with high levels of noise, and the complexity of gene regulation at the transcriptional, translational and post-translational levels. At the same time, quantitative, dynamic models, ideally with probability distributions over model topologies and parameters, are highly desirable. Results: We present a novel approach to infer such models from data, based on nonlinear differential equations, which we embed into a stochastic Bayesian framework. We thus address both the stochasticity of experimental data and the need for quantitative dynamic models. Furthermore, the Bayesian framework makes it easy to integrate prior knowledge into the inference process. Using stochastic sampling from the Bayesian posterior distribution, our approach can infer different likely network topologies and model parameters, along with their respective probabilities, from given data. We evaluate our approach on simulated data and the challenge #3 data from the DREAM 2 initiative. On the simulated data, we study the effects of different levels of noise and dataset sizes. Results on real data show that the dynamics and main regulatory interactions are correctly reconstructed. Conclusions: Our approach combines dynamic modeling using differential equations with a stochastic learning framework, thus bridging the gap between biophysical modeling and stochastic inference approaches. Results show that the method can reap the advantages of both worlds, and allows the reconstruction of biophysically accurate dynamic models from noisy data. In addition, the stochastic learning framework permits the computation of probability distributions over models and model parameters, which holds interesting prospects for experimental design purposes.
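As a rough illustration of embedding a nonlinear differential equation in a Bayesian sampling scheme, the sketch below infers a single degradation rate for one self-repressing gene with a random-walk Metropolis sampler. Everything here is an assumption made for illustration (the model dx/dt = 1/(1 + x^2) - k*x, the Gaussian noise level, the flat prior, and all function names); the paper's own model, priors, and sampler are not given in the abstract.

```python
import math
import random


def simulate(k, x0=0.0, dt=0.1, steps=50):
    """Euler-integrate the toy model dx/dt = 1/(1 + x^2) - k*x."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * (1.0 / (1.0 + x * x) - k * x))
    return xs


def log_posterior(k, data, sigma=0.05):
    """Gaussian log-likelihood around the simulated trajectory,
    with a flat prior on 0 < k < 5 (assumed for illustration)."""
    if not 0.0 < k < 5.0:
        return -math.inf
    sim = simulate(k)
    return -sum((d - s) ** 2 for d, s in zip(data, sim)) / (2.0 * sigma ** 2)


def metropolis(data, n_samples=5000, step=0.05, seed=0):
    """Random-walk Metropolis sampling of the posterior over k."""
    rng = random.Random(seed)
    k = 1.0
    lp = log_posterior(k, data)
    samples = []
    for _ in range(n_samples):
        cand = k + rng.gauss(0.0, step)
        lp_cand = log_posterior(cand, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_cand - lp)):
            k, lp = cand, lp_cand
        samples.append(k)
    return samples


if __name__ == "__main__":
    data = simulate(0.8)               # synthetic, noise-free observations
    kept = metropolis(data)[1000:]     # discard burn-in
    print(sum(kept) / len(kept))       # posterior mean estimate of k
```

The retained samples approximate a probability distribution over the parameter rather than a single point estimate, which is the property the abstract highlights; sampling over network topologies as well would require a considerably richer proposal scheme.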

    A Coupled Mathematical Model of the Intracellular Replication of Dengue Virus and the Host Cell Immune Response to Infection

    Dengue virus (DV) is a positive-strand RNA virus of the Flavivirus genus. It is one of the most prevalent mosquito-borne viruses, infecting 390 million individuals globally per year. The clinical spectrum of DV infection ranges from an asymptomatic course to severe complications such as dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS), the latter because of severe plasma leakage. Given that the outcome of infection is likely determined by the kinetics of viral replication and the antiviral host cell immune response (HIR), it is important to understand the interaction between these two processes. In this study, we use mathematical modeling to characterize and understand the complex interplay between intracellular DV replication and the host cells' defense mechanisms. We first measured viral RNA, viral protein, and virus particle production in Huh7 cells, which exhibit a notoriously weak intrinsic antiviral response. Based on these measurements, we developed a detailed intracellular DV replication model. We then measured replication in IFN-competent A549 cells and used these data to couple the replication model with a model describing IFN activation and production of IFN-stimulated genes (ISGs), as well as their interplay with DV replication. By comparing cell-line-specific DV replication, we found that host factors involved in replication complex formation and virus particle production are crucial for replication efficiency. Regarding possible modes of action of the HIR, our model fits suggest that the HIR mainly affects DV RNA translation initiation, cytosolic DV RNA degradation, and naïve cell infection. We further analyzed in silico the potential of direct-acting antiviral drugs targeting different processes of the DV lifecycle and found that RNA synthesis and virus assembly and release are the most promising anti-DV drug targets.

    Rif2 Promotes a Telomere Fold-Back Structure through Rpd3L Recruitment in Budding Yeast

    Using a genome-wide screening approach, we have established the genetic requirements for proper telomere structure in Saccharomyces cerevisiae. We uncovered 112 genes, many of which have not previously been implicated in telomere function, that are required to form a fold-back structure at chromosome ends. Among other biological processes, lysine deacetylation, through the Rpd3L, Rpd3S, and Hda1 complexes, emerged as a critical regulator of telomere structure. The telomere-bound protein Rif2 was also found to promote a telomere fold-back through the recruitment of Rpd3L to telomeres. In the absence of Rpd3 function, telomeres show increased susceptibility to nucleolytic degradation, telomere loss, and the initiation of premature senescence, suggesting that an Rpd3-mediated structure may have protective functions. Together these data reveal that multiple genetic pathways may directly or indirectly impinge on telomere structure, thus broadening the potential targets available to manipulate telomere function.

    Proteome analysis of the HIV-1 Gag interactome

    Human immunodeficiency virus Gag drives assembly of virions in infected cells and interacts with host factors which facilitate or restrict viral replication. Although several Gag-binding proteins have been characterized, understanding of virus–host interactions remains incomplete. In a series of six affinity purification screens, we have identified protein candidates for interaction with HIV-1 Gag. Proteins previously found in virions or identified in siRNA screens for host factors influencing HIV-1 replication were recovered. Helicases, translation factors, cytoskeletal and motor proteins, factors involved in RNA degradation and RNA interference were enriched in the interaction data. Cellular networks of cytoskeleton, SR proteins and tRNA synthetases were identified. Most prominently, components of cytoplasmic RNA transport granules were co-purified with Gag. This study provides a survey of known Gag–host interactions and identifies novel Gag binding candidates. These factors are associated with distinct molecular functions and cellular pathways relevant in host–pathogen interactions.

    Improved assay-dependent searching of nucleic acid sequence databases

    Nucleic acid-based biochemical assays are crucial to modern biology. Key applications, such as detection of bacterial, viral and fungal pathogens, require detailed knowledge of assay sensitivity and specificity to obtain reliable results. Improved methods to predict assay performance are needed for exploiting the exponentially growing amount of DNA sequence data and for reducing the experimental effort required to develop robust detection assays. Toward this goal, we present an algorithm for the calculation of sequence similarity based on DNA thermodynamics. In our approach, search queries consist of one to three oligonucleotide sequences representing either a hybridization probe, a pair of Padlock probes or a pair of PCR primers with an optional TaqMan™ probe (i.e. in silico or ‘virtual’ PCR). Matches are reported if the query and target satisfy both the thermodynamics of the assay (binding at a specified hybridization temperature and/or change in free energy) and the relevant biological constraints (assay sequences binding to the correct target duplex strands in the required orientations). The sensitivity and specificity of our method are evaluated by comparing predicted to known sequence tagged sites in the human genome. Free energy is shown to be a more sensitive and specific match criterion than hybridization temperature.
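The two match criteria named in this abstract, free energy change and hybridization temperature, can both be derived from a duplex's enthalpy and entropy under a standard two-state model. The sketch below shows that relationship only; it is not the paper's algorithm, and the function names, the example values of ΔH and ΔS, and the strand concentration are all hypothetical.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)


def delta_g_kcal(dh_kcal, ds_cal, temp_c):
    """Free energy of duplex formation, dG = dH - T*dS, in kcal/mol.

    dh_kcal is in kcal/mol, ds_cal in cal/(mol*K), temp_c in deg C.
    """
    return dh_kcal - (temp_c + 273.15) * ds_cal / 1000.0


def melting_temp_c(dh_kcal, ds_cal, strand_conc_molar):
    """Two-state melting temperature in deg C for a non-self-complementary
    duplex with equal strand concentrations:
    Tm = dH / (dS + R * ln(C_total / 4))."""
    tm_kelvin = 1000.0 * dh_kcal / (ds_cal + R_CAL * math.log(strand_conc_molar / 4.0))
    return tm_kelvin - 273.15


if __name__ == "__main__":
    dh, ds = -60.0, -160.0  # made-up duplex parameters
    print(delta_g_kcal(dh, ds, 37.0))
    print(melting_temp_c(dh, ds, 1e-6))
```

A search tool could threshold either quantity: a free-energy criterion compares binding strength at the fixed assay temperature, whereas a melting-temperature criterion also depends on the assumed strand concentration.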