A general framework for optimization of probes for gene expression microarray and its application to the fungus Podospora anserina

Author: Berteaux-Lecellier Véronique
Bidard Frédérique
Clavé Corinne
Debuchy Robert
Delacroix Hervé
Imbeaud Sandrine
Lespinet Olivier
Reymond Nancie
Silar Philippe
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract Background: The development of new microarray technologies makes custom long oligonucleotide arrays affordable for many experimental applications, notably gene expression analyses. Reliable results depend on probe design quality and selection. Probe design strategy should cope with the limited accuracy of de novo gene prediction programs, and with annotation updating. We present a novel in silico procedure which addresses these issues and includes experimental screening, as an empirical approach is the best strategy to identify optimal probes in the in silico outcome. Findings: We used four criteria for in silico probe selection: cross-hybridization, hairpin stability, probe location relative to coding sequence end and intron position. This latter criterion is critical when exon-intron gene structure predictions for intron-rich genes are inaccurate. For each coding sequence (CDS), we selected a subset of four probes. These probes were included in a test microarray, which was used to evaluate the hybridization behavior of each probe. The best probe for each CDS was selected according to three experimental criteria: signal-to-noise ratio, signal reproducibility, and representative signal intensities. This procedure was applied for the development of a gene expression Agilent platform for the filamentous fungus Podospora anserina and the selection of a single 60-mer probe for each of the 10,556 P. anserina CDS. Conclusions: A reliable gene expression microarray version based on the Agilent 44K platform was developed with four spot replicates of each probe to increase statistical significance of analysis.
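The experimental selection step described in the abstract (four candidate probes per CDS, one winner chosen from test-array measurements) can be illustrated with a minimal sketch. The scoring formula, the function names, and the data layout below are assumptions made for illustration and are not taken from the paper; only signal-to-noise ratio and reproducibility are modelled, and the third criterion (representative signal intensities) is left out.

```python
from statistics import mean, stdev

def probe_score(signals, background):
    """Score one probe from replicate intensities and a background level.

    Hypothetical scoring: signal-to-noise ratio, penalised by the
    coefficient of variation across replicates (reproducibility).
    """
    avg = mean(signals)
    snr = avg / background
    cv = stdev(signals) / avg if avg > 0 else float("inf")
    return snr / (1.0 + cv)

def best_probe_per_cds(measurements, background):
    """Pick the highest-scoring probe for each CDS.

    measurements: {cds_id: {probe_id: [replicate intensities]}}
    """
    return {
        cds: max(probes, key=lambda p: probe_score(probes[p], background))
        for cds, probes in measurements.items()
    }

# Made-up intensities for one CDS with four candidate probes.
example = {
    "CDS_1": {
        "p1": [100, 110, 90, 105],   # weak but stable
        "p2": [400, 120, 800, 50],   # strong but erratic
        "p3": [300, 310, 295, 305],  # strong and stable
        "p4": [20, 22, 19, 21],      # at background level
    }
}
print(best_probe_per_cds(example, background=20.0))
```

Combining the criteria into a single score is only one possible design; ranking on each criterion separately and applying thresholds would be an equally plausible reading of the abstract.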

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HAL Descartes

FoxO gene family evolution in vertebrates

Author: Pan Yuchun
Wang Minghui
Wang Qishan
Zhang Xiangzhe
Zhao Hongbo
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract Background: Forkhead box, class O (FoxO) belongs to the large family of forkhead transcription factors that are characterized by a conserved forkhead box DNA-binding domain. To date, the FoxO group has four mammalian members: FoxO1, FoxO3a, FoxO4 and FoxO6, which are orthologs of DAF16, an insulin-responsive transcription factor involved in regulating longevity of worms and flies. The degree of homology between these four members is high, especially in the forkhead domain, which contains the DNA-binding interface. Yet, mouse FoxO knockouts have revealed that each FoxO gene has its unique role in the physiological process. Whether the functional divergences are primarily due to adaptive selection pressure or relaxed selective constraint remains an open question. As such, this study aims to address the evolutionary mode of FoxO, which may lead to the functional divergence. Results: Sequence similarity searches were performed in genome and scaffold data to identify homologues of FoxO in vertebrates. Phylogenetic analysis was used to characterize the evolutionary history of the family, identifying two duplications early in vertebrate evolution. To determine the mode of evolution in vertebrates, we performed a rigorous statistical analysis with FoxO gene sequences, including relative rate ratio tests, branch-specific dN/dS ratio tests, site-specific dN/dS ratio tests, branch-site dN/dS ratio tests and clade-level amino acid conservation/variation pattern analysis. Our results suggest that FoxO is constrained by strong purifying selection except at four sites in FoxO6, which have undergone positive Darwinian selection. The functional divergence in this family is best explained by either relaxed purifying selection or positive selection. Conclusion: We present a phylogeny describing the evolutionary history of the FoxO gene family and show that the genes have evolved through duplications followed by purifying selection, except for four sites in FoxO6 fixed by positive selection, which lie mostly within the non-conserved optimal PKB motif in the C-terminal part. Relaxed selection may also play an important role in the functional differentiation that followed gene duplication.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Benchmark of algorithms for multiple DNA sequence alignment across livestock species

Author: Bąk Artur
Migdałek Grzegorz
Pareek Chandra Shekhar
Żukowski Kacper
Publication venue: Nicolaus Copernicus University in Toruń
Publication date: 24/01/2021

Background: Due to the growing amount of biological data, it is often necessary to select the optimal estimation method for DNA sequence alignment across livestock species. One of the most important branches of genomics is modelling homology between the DNA sequences considered. A multiple sequence alignment is a potent tool for molecular and evolutionary biology, and there are several programs and algorithms applicable for this purpose. The purpose of this paper was to study the most commonly used DNA alignment algorithms to select the optimal tool dedicated to short sequences. Methods: Four steps of bioinformatics pipelines were considered to benchmark the algorithms for multiple DNA sequence alignment across livestock species: 1) selection of reference genome sequences of ARS1.2 for cattle, EquCab3.0 for horse and vicPac2 for alpaca with a low E-value using TBLASTn; 2) removing gaps from these sequences; 3) alignment of the obtained sequences using the examined algorithms; 4) matching the quality of aligned sequences with sequences of reference genomes by more software. The time of computation was recorded for the whole analysis. Seven programs were utilized, each based on a different alignment algorithm, namely: ClustalO, ClustalW, Kalign, MAFFT, MUSCLE, Probcons and T-Coffee. Results: The results obtained in this study showed that the fastest are progressive algorithms such as Kalign or MUSCLE-FAST. Moreover, iterative algorithms like MAFFT and MUSCLE revealed a higher quality of the alignment. The T-Coffee and Probcons programs were computational cost-effective; simultaneously, they generated a medium-quality calculation in a relatively long time. The best quality of alignment was shown by iterative variants of the MAFFT program; however, the speed of the calculations was relatively low. The fastest algorithm was Kalign, which aligned much faster than its competitors but achieved average results in the quality of the alignment. The average ratio of speed to quality among the analyzed algorithms was obtained by the progressive version of MAFFT, NS1. Conclusions: We conclude that the results of this study can be used for re-alignment of variant primers in new livestock genome releases.
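Two of the pipeline steps named in the abstract, removing gaps and judging the quality of an alignment, can be sketched in a few lines. The quality measure used here (mean pairwise identity over ungapped columns) and the function names are assumptions chosen for illustration; the abstract does not state which quality metric or software the authors used.

```python
from itertools import combinations

def strip_gaps(seq):
    """Remove gap characters so a sequence can be re-aligned from scratch."""
    return seq.replace("-", "")

def mean_pairwise_identity(alignment):
    """Average fraction of identical residues over all sequence pairs.

    Columns in which either sequence of a pair has a gap are skipped.
    This is an illustrative metric, not the one used in the study.
    """
    scores = []
    for a, b in combinations(alignment, 2):
        pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
        if pairs:
            scores.append(sum(x == y for x, y in pairs) / len(pairs))
    return sum(scores) / len(scores) if scores else 0.0

# Toy alignment of three short sequences (made-up data).
toy = ["ACGT-A", "ACCTTA", "AC-TTA"]
print([strip_gaps(s) for s in toy])
print(round(mean_pairwise_identity(toy), 4))
```

To compare tools on speed as well, each aligner run could be wrapped in `time.perf_counter()` calls; that part is omitted because it depends on how the external programs are invoked.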

Akademicka Platforma Czasopism

Role of APOBEC3 in Genetic Diversity among Endogenous Murine Leukemia Viruses

Author: CSAC
International Chicken Genome Sequencing Consortium
John M Coffin
Jonathan P Stoye
Patric Jern
Wayne N Frankel
Publication venue: Public Library of Science
Publication date: 01/01/2007

The ability of human and murine APOBECs (specifically, APOBEC3) to inhibit infecting retroviruses and retrotransposition of some mobile elements is becoming established. Less clear is the effect that they have had on the establishment of the endogenous proviruses resident in the human and mouse genomes. We used the mouse genome sequence to study diversity and genetic traits of nonecotropic murine leukemia viruses (polytropic [Pmv], modified polytropic [Mpmv], and xenotropic [Xmv] subgroups), the best-characterized large set of recently integrated proviruses. We identified 49 proviruses. In phylogenetic analyses, Pmvs and Mpmvs were monophyletic, whereas Xmvs were divided into several clades, implying a greater number of replication cycles between the integration events. Four distinct primer binding site types (Pro, Gln1, Gln2 and Thr) were dispersed within the phylogeny, indicating frequent mispriming. We analyzed the frequency and context of G-to-A mutations for the role of mA3 in formation of these proviruses. In the Pmv and Mpmv (but not Xmv) groups, mutations attributable to mA3 constituted a large fraction of the total. A significant number of nonsense mutations suggests the absence of purifying selection following mutation. A strong bias of G-to-A relative to C-to-T changes was seen, implying a strand specificity that can only have occurred prior to integration. The optimal sequence context of G-to-A mutations, TTC, was consistent with mA3. At least in the Pmv group, a significant 5′ to 3′ gradient of G-to-A mutations was consistent with mA3 editing. Altogether, our results for the first time suggest mA3 editing immediately preceding the integration event that led to retroviral endogenization, contributing to inactivation of infectivity
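The mutation analysis described in the abstract (counting G-to-A versus C-to-T changes and tallying the sequence context of the G-to-A changes) can be sketched as follows. The function names and the toy sequences are hypothetical. The sketch also assumes that the context is reported on the strand complementary to the aligned one, so a plus-strand `GAA` with the G mutated is tallied as `TTC`; that convention is an assumption made here, not something the abstract spells out.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def tally_substitutions(reference, provirus):
    """Count G>A and C>T changes between two aligned, equal-length sequences.

    For each G>A change, the mutated G plus its two downstream bases are
    reverse-complemented and tallied as the (assumed) minus-strand context.
    """
    if len(reference) != len(provirus):
        raise ValueError("sequences must be aligned to the same length")
    counts, contexts = Counter(), Counter()
    for i, (ref_base, obs_base) in enumerate(zip(reference, provirus)):
        if ref_base == "G" and obs_base == "A":
            counts["G>A"] += 1
            plus = reference[i:i + 3]
            # Skip contexts that run off the end or contain gaps/ambiguity codes.
            if len(plus) == 3 and set(plus) <= set("ACGT"):
                contexts[reverse_complement(plus)] += 1
        elif ref_base == "C" and obs_base == "T":
            counts["C>T"] += 1
    return counts, contexts

# Made-up aligned sequences: two G>A changes in a GAA context, one C>T change.
counts, contexts = tally_substitutions("ATGAACCGAAT", "ATAAACTAAAT")
print(dict(counts), dict(contexts))
```

A strand bias would then show up as `counts["G>A"]` clearly exceeding `counts["C>T"]` across many proviruses; testing whether that excess is significant is outside this sketch.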

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Context based mixture model for cell phase identification in automated fluorescence microscopy

Author: King Randy W
Wang Meng
Wong Stephen TC
Zhou Xiaobo
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Automated identification of cell cycle phases of individual live cells in a large population captured via automated fluorescence microscopy is important for cancer drug discovery and cell cycle studies. Time-lapse fluorescence microscopy images provide an important method to study the cell cycle process under different conditions of perturbation. Existing methods are limited in dealing with such time-lapse data sets, while manual analysis is not feasible. This paper presents statistical data analysis and statistical pattern recognition to perform this task. RESULTS: The data is generated from HeLa H2B GFP cells imaged during a 2-day period with images acquired 15 minutes apart using automated time-lapse fluorescence microscopy. The patterns are described with four kinds of features, including twelve general features, Haralick texture features, Zernike moment features, and wavelet features. To generate a new set of features with more discriminative power, commonly used feature reduction techniques are applied, which include Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Maximum Margin Criterion (MMC), Stepwise Discriminate Analysis based Feature Selection (SDAFS), and Genetic Algorithm based Feature Selection (GAFS). Then, we propose a Context Based Mixture Model (CBMM) for dealing with the time-series cell sequence information and compare it to other traditional classifiers: Support Vector Machine (SVM), Neural Network (NN), and K-Nearest Neighbor (KNN). As is standard practice in machine learning, we systematically compare the performance of a number of common feature reduction techniques and classifiers to select an optimal combination of a feature reduction technique and a classifier. A cellular database containing 100 manually labelled subsequences is built for evaluating the performance of the classifiers. The generalization error is estimated using the cross-validation technique. The experimental results show that CBMM outperforms all other classifiers in identifying prophase and has the best overall performance. CONCLUSION: The application of feature reduction techniques can improve the prediction accuracy significantly. CBMM can effectively utilize the contextual information and has the best overall performance when combined with any of the previously mentioned feature reduction techniques.
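The evaluation protocol mentioned in the abstract, estimating generalization error by cross-validation, can be illustrated with one of the baseline classifiers it names (K-Nearest Neighbor). This is a minimal standard-library sketch on made-up two-dimensional features; it is not the CBMM, the fold layout and parameter values are assumptions, and the feature extraction and reduction steps are omitted.

```python
import math

def knn_predict(train, query, k=3):
    """Majority vote among the k training samples closest to the query.

    train: list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def cross_validated_error(samples, folds=5, k=3):
    """Estimate the misclassification rate with interleaved k-fold splits."""
    errors = 0
    for fold in range(folds):
        test = samples[fold::folds]
        train = [s for i, s in enumerate(samples) if i % folds != fold]
        errors += sum(knn_predict(train, x, k) != y for x, y in test)
    return errors / len(samples)

# Two well-separated synthetic clusters standing in for two cell phases.
samples = (
    [((float(i % 5), 0.0), "interphase") for i in range(10)]
    + [((10.0 + i % 5, 5.0), "prophase") for i in range(10)]
)
print(cross_validated_error(samples))
```

Running the same loop over each pairing of feature reduction technique and classifier, and keeping the pairing with the lowest estimated error, would mirror the systematic comparison the abstract describes.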

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A low bit-rate video-coding algorithm based upon variable pattern selection

Author: Dooley L.
Murshed M.
Paul M.
Publication venue
Publication date: 01/08/2002

Recent research into pattern representation of moving regions in block-based motion estimation and compensation in video sequences has focused mainly upon using a fixed number of regular shaped patterns. These are used to match the macroblocks in a frame that have two distinct regions involving static background and moving objects. In this paper a new Variable Pattern Selection (VPS) algorithm is presented which selects a preset number of best-matched patterns from a pattern codebook of regular shaped patterns. While more patterns are used than in the previous work, the performance of the VPS algorithm in using variable length coding, by exploiting the frequency of the best-matched patterns, leads to a higher compression ratio without degrading the overall image quality.

Open Research Online (The Open University)

Neural Network and Bioinformatic Methods for Predicting HIV-1 Protease Inhibitor Resistance

Author: Carpenter Gail A.
Woods Matthew
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/02/2007

This article presents a new method for predicting viral resistance to seven protease inhibitors from the HIV-1 genotype, and for identifying the positions in the protease gene at which the specific nature of the mutation affects resistance. The neural network Analog ARTMAP predicts protease inhibitor resistance from viral genotypes. A feature selection method detects genetic positions that contribute to resistance both alone and through interactions with other positions. This method has identified positions 35, 37, 62, and 77, where traditional feature selection methods have not detected a contribution to resistance. At several positions in the protease gene, mutations confer differing degrees of resistance, depending on the specific amino acid to which the sequence has mutated. To find these positions, an Amino Acid Space is introduced to represent genes in a vector space that captures the functional similarity between amino acid pairs. Feature selection identifies several new positions, including 36, 37, and 43, with amino acid-specific contributions to resistance. Analog ARTMAP networks applied to inputs that represent specific amino acids at these positions perform better than networks that use only mutation locations. Air Force Office of Scientific Research (F49620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-01-1-2016); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624)

Boston University Institutional Repository (OpenBU)