1,138 research outputs found

    A Molecular Biology Database Digest

    Computational Biology, or Bioinformatics, has been defined as the application of mathematical and Computer Science methods to solving problems in Molecular Biology that require large-scale data, computation, and analysis [18]. As expected, Molecular Biology databases play an essential role in Computational Biology research and development. This paper introduces current Molecular Biology databases, stressing data modeling, data acquisition, data retrieval, and the integration of Molecular Biology data from different sources. It is primarily intended for an audience of computer scientists with a limited background in Biology.

    Assessing the effect of dynamics on the closed-loop protein-folding hypothesis

    The closed-loop (loop-n-lock) hypothesis of protein folding suggests that loops of about 25 residues, closed through interactions between the loop ends (locks), play an important role in protein structure. Coarse-grain elastic network simulations, and examination of loop lengths in a diverse set of proteins, each support a bias towards loops of close to 25 residues in length between residues of high stability. Previous studies have established a correlation between total contact distance (TCD), a metric of sequence distances between contacting residues (cf. contact order), and the log-folding rate of a protein. In a set of 43 proteins, we identify an improved correlation (r² = 0.76) when the metric is restricted to residues contacting the locks, compared to the equivalent result when all residues are considered (r² = 0.65). This provides qualified support for the hypothesis, albeit with an increased emphasis upon the importance of a much larger set of residues surrounding the locks. Evidence of a similar-sized protein core/extended nucleus (with significant overlap) was obtained from TCD calculations in which residues were successively eliminated according to their hydrophobicity and connectivity, and from molecular dynamics simulations. Our results suggest that while folding is determined by a subset of residues that can be predicted by application of the closed-loop hypothesis, the original hypothesis is too simplistic; efficient protein folding is dependent on a considerably larger subset of residues than those involved in lock formation.
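    As a rough illustration of the TCD metric mentioned in this abstract, the sketch below sums the sequence separations |i - j| over contacting residue pairs. Several details are assumptions not stated in the abstract: the 8.0 distance cutoff, the minimum sequence separation of 2, the normalisation by the squared chain length, and the reading of "restricted to residues contacting the locks" as keeping only contacts that involve a chosen residue set. The function names are hypothetical.

```python
from math import dist


def contact_pairs(coords, cutoff=8.0, min_separation=2):
    """Return index pairs (i, j), i < j, whose coordinates lie within `cutoff`.

    `cutoff` and `min_separation` are illustrative assumptions.
    """
    pairs = []
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                pairs.append((i, j))
    return pairs


def total_contact_distance(pairs, n_residues, restrict_to=None):
    """Sum |i - j| over contacts, divided by n_residues**2 (assumed normalisation).

    If `restrict_to` is given, only contacts involving at least one residue
    in that set are counted.
    """
    if restrict_to is not None:
        pairs = [(i, j) for i, j in pairs if i in restrict_to or j in restrict_to]
    return sum(abs(i - j) for i, j in pairs) / n_residues ** 2


# Toy coordinates, one point per residue (not real protein data).
coords = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (4, 4, 0), (0, 4, 0)]
pairs = contact_pairs(coords)
tcd_all = total_contact_distance(pairs, len(coords))
tcd_restricted = total_contact_distance(pairs, len(coords), restrict_to={4})
```

    Comparing `tcd_all` with `tcd_restricted` across a set of proteins mirrors, in spirit only, the all-residue versus lock-restricted comparison the abstract reports.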

    Bioinformatics of Phosphoproteomics


    Statistical Methods for Conservation and Alignment Quality in Proteins

    Construction of multiple sequence alignments is a fundamental task in Bioinformatics. Multiple sequence alignments are a prerequisite for many Bioinformatics methods, so the quality of such methods can depend critically on the quality of the alignment. However, automatic construction of a multiple sequence alignment for a set of remotely related sequences does not always provide biologically relevant alignments. Therefore, there is a need for an objective approach for evaluating the quality of automatically aligned sequences. The profile hidden Markov model is a powerful approach in comparative genomics. In the profile hidden Markov model, the symbol probabilities are estimated at each conserved alignment position. This can increase the dimension of the parameter space and cause an overfitting problem. These two research problems are both related to conservation. We have developed statistical measures for quantifying the conservation of multiple sequence alignments. Two types of methods are considered: those identifying conserved residues in an alignment position, and those calculating positional conservation scores. The positional conservation score was exploited in a statistical prediction model for assessing the quality of multiple sequence alignments. The residue conservation score was used as part of the emission probability estimation method proposed for profile hidden Markov models. The predicted alignment quality scores correlated highly with the correct alignment quality scores, indicating that our method is reliable for assessing the quality of any multiple sequence alignment. The comparison of the emission probability estimation method with the maximum likelihood method showed that the number of estimated parameters in the model was dramatically decreased, while the same level of accuracy was maintained. To conclude, we have shown that conservation can be successfully used in the statistical model for alignment quality assessment and in the estimation of emission probabilities in profile hidden Markov models.
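    To make the idea of a positional conservation score concrete, here is a minimal sketch that scores each alignment column by one minus its normalised Shannon entropy. This is a standard swapped-in measure, not the statistical measures developed in the thesis, which the abstract does not specify. The function names, the gap handling, and the 20-letter alphabet size are assumptions.

```python
from collections import Counter
from math import log2


def column_conservation(column, alphabet_size=20):
    """Score one alignment column in [0, 1]; 1.0 means fully conserved.

    Gaps ('-') are ignored, which is an illustrative choice.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    return 1.0 - entropy / log2(alphabet_size)


def alignment_conservation(alignment):
    """Return one conservation score per column of equal-length aligned rows."""
    return [column_conservation(col) for col in zip(*alignment)]


# Toy alignment of three sequences (not real data).
scores = alignment_conservation(["ACD", "ACE", "A-F"])
```

    In this toy case the first two columns are fully conserved and the third, with three distinct residues, scores lower.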

    MicroRNA target prediction by constraint programming

    MicroRNAs (miRNAs) are small regulatory RNAs, about 22 nucleotides long, that perform important functions such as larval development switches, cell proliferation and differentiation, apoptosis, fat metabolism, and control of leaf and flower development. MicroRNA sequences are highly conserved even across unrelated species, a fact which suggests a key role in evolutionary development. MicroRNAs are transcribed in the nucleus and perform their functions in the cytoplasm by binding to complementary target mRNAs. MicroRNAs modulate gene expression either by suppressing translation or by mRNA cleavage and degradation. Plant microRNAs bind almost perfectly to the coding region of their target mRNA and act by cleaving the mRNA, while animal microRNAs bind imperfectly to the 3' UTR of their target mRNA and act by suppressing translation. MicroRNAs are discovered both by mutational studies and by computational methods. Hundreds of microRNAs have been cloned and sequenced in several organisms, including humans, but to date only a few of them have known functions. The experimental techniques for understanding the functions of miRNAs are time-consuming and expensive, which makes computational methods necessary. The identification of targets of plant microRNAs is straightforward due to near-perfect binding, but the imperfect binding of animal miRNAs to target mRNAs makes computational target prediction rather difficult. In this thesis, a new method is proposed for microRNA target prediction in animals using Constraint Logic Programming. With the established method, a package, micTar, was developed to identify targets in the Drosophila genome.

    Automatically exploiting genomic and metabolic contexts to aid the functional annotation of prokaryote genomes

    The subject of this thesis is the development of bioinformatic strategies exploiting genomic and metabolic contextual information in order to generate functional annotations for prokaryote genes. Two main projects were involved in this work. The first focuses on sequence-orphan enzymatic activities. Today, roughly 27% of the activities defined by the International Union of Biochemistry and Molecular Biology are sequence-orphans. For these, traditional bioinformatic approaches cannot propose candidate genes, so it is imperative to use alternative, context-based approaches in such cases. The CanOE strategy (fishing Candidate genes for Orphan Enzymes) was developed and added to the MicroScope bioinformatics platform for this purpose. It integrates genomic and metabolic information across thousands of prokaryote genomes in order to locate promising gene candidates for orphan activities. The mirror project focuses on protein families of unknown function. A collaborative project has been set up at the Genoscope in the hope of formalising functional exploration strategies for prokaryote protein families. A pilot version was created on the DUF849 Pfam family, for which a single activity had recently been elucidated. Strategies for proposing novel functions and activities and for creating isofunctional sub-families were researched, so as to guide biochemical experiments and to analyse their results.

    Knowledge and knowers of the past: A study in the philosophy of evolutionary biology.

    This dissertation explores a variety of themes in the philosophy of science through the lens of a case study in evolutionary biology. It draws on a careful analysis and comparison of the hypotheses of Bill Martin and Tom Cavalier-Smith. These two scientists produced contrasting and competing accounts of one of the main events in the history of life, the origin of eukaryotic cells. This case study feeds four main philosophical themes around which this dissertation is articulated. (1) Theorizing: What kind of theory are hypotheses about unique events in the past? (2) Representation: How do hypotheses about the past represent their target? (3) Evidential claims: What kind of evidence is employed and how does it constrain these hypotheses? (4) Pluralism: What are the benefits and the risks associated with the coexistence of rival hypotheses? This work seeks both to rearticulate traditional debates in the philosophy of science in the light of a lesser-known case of scientific practice and to enrich the catalogue of existing case studies in the philosophy of historical sciences.

    Report on three Genomes to Life Workshops: Data Infrastructure, Modeling and Simulation, and Protein Structure Prediction
