181,302 research outputs found

    Pattern-based phylogenetic distance estimation and tree reconstruction

    We have developed an alignment-free method that calculates phylogenetic distances using a maximum likelihood approach for a model of sequence change on patterns that are discovered in unaligned sequences. To evaluate the phylogenetic accuracy of our method, and to conduct a comprehensive comparison of existing alignment-free methods (freely available as Python package decaf+py at http://www.bioinformatics.org.au), we have created a dataset of reference trees covering a wide range of phylogenetic distances. Amino acid sequences were evolved along the trees and input to the tested methods; from their calculated distances we inferred trees whose topologies we compared to the reference trees. We find our pattern-based method statistically superior to all other tested alignment-free methods on this dataset. We also demonstrate the general advantage of alignment-free methods over an approach based on automated alignments when sequences violate the assumption of collinearity. Similarly, we compare methods on empirical data from an existing alignment benchmark set that we used to derive reference distances and trees. Our pattern-based approach yields distances that show a linear relationship to reference distances over a substantially longer range than other alignment-free methods. The pattern-based approach outperforms other alignment-free methods and its phylogenetic accuracy is statistically indistinguishable from alignment-based distances. Comment: 21 pages, 3 figures, 2 tables
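As a rough illustration of what "alignment-free" means in the abstract above, the sketch below computes a simple word-composition (k-mer) distance between two unaligned sequences. This is a generic textbook-style measure, not the pattern-based maximum likelihood method the abstract describes and not the decaf+py API; the function names, the default word length `k=2`, and the example sequences are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=2):
    """Count overlapping k-mers (words of length k) in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(seq_a, seq_b, k=2):
    """Euclidean distance between relative k-mer frequency profiles.

    No alignment is computed: only word composition is compared.
    """
    prof_a, prof_b = kmer_profile(seq_a, k), kmer_profile(seq_b, k)
    total_a, total_b = sum(prof_a.values()), sum(prof_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("sequences must be at least k residues long")
    words = set(prof_a) | set(prof_b)
    # Counter returns 0 for words absent from one of the profiles.
    return sqrt(sum((prof_a[w] / total_a - prof_b[w] / total_b) ** 2
                    for w in words))


# Hypothetical amino acid fragments, not taken from the paper's dataset.
seq_x = "MKVLAAGIVGLL"
seq_y = "MKVLSAGIVALL"
print(kmer_distance(seq_x, seq_y))
```

A matrix of such pairwise distances could then be passed to a distance-based tree-building method such as neighbor joining, which is the general workflow the abstract evaluates.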

    Aptamer panning against gold

    Oligonucleotide aptamers are single-stranded sequences that exhibit high affinity and specificity for a particular non-nucleotide target including, but not limited to, small molecules, proteins, and even whole cells. Aptamers are conventionally isolated and identified using a multi-round screening approach called Systematic Evolution of Ligands by Exponential Enrichment (SELEX), in which a pool of approximately 10^9 random candidate sequences is continuously enriched with amplified copies of “winning” sequences or adsorbates from prior selection rounds. While SELEX has revolutionized the discovery of numerous DNA- and RNA-based aptamers for a variety of targets and dominated the field for two decades as a screening approach, we have developed a non-SELEX screening approach we call CISL (Competition-Induced Selection of Ligands) to identify single-stranded DNA aptamers for gold substrates. One of the key differences in our competition-based screening approach is the elimination of intermittent, time-intensive elution and amplification steps of random sequences that (1) can introduce undesired PCR side products (e.g. partially elongated duplexes) into the candidate pool and (2) bias the candidate pool towards early winners that may simply outnumber higher-affinity aptamer candidates introduced at later selection rounds. Following aptamer selection against our gold-based target (e.g. planar crystalline gold, gold nanospheres, gold nanorods), we then evaluate sequences to identify base consensus as well as shared structural elements such as hairpins, internal loops, and multi-branched loops to reveal any shared patterns in the identified primary and predicted secondary structures of the ~20 identified aptamer sequences for a given gold target. Lastly, we have ranked our aptamer sequences for one of our gold targets in terms of their frequency as a bound species using a high-throughput sequencing method known as “deep sequencing” or next generation sequencing (NGS).
As aptamers continue to be pursued as potential analogs and even substitutes for antibodies and other ligands in the broader materials community, we continue to adapt our unconventional screening approach in the hope of enabling faster and easier aptamer identification for a rich range of material targets.

    Automatic Discovery of Non-Compositional Compounds in Parallel Data

    Automatic segmentation of text into minimal content-bearing units is an unsolved problem even for languages like English. Spaces between words offer an easy first approximation, but this approximation is not good enough for machine translation (MT), where many word sequences are not translated word-for-word. This paper presents an efficient automatic method for discovering sequences of words that are translated as a unit. The method proceeds by comparing pairs of statistical translation models induced from parallel texts in two languages. It can discover hundreds of non-compositional compounds on each iteration, and constructs longer compounds out of shorter ones. Objective evaluation on a simple machine translation task has shown the method's potential to improve the quality of MT output. The method makes few assumptions about the data, so it can be applied to parallel data other than parallel texts, such as word spellings and pronunciations. Comment: 12 pages; uses natbib.sty, here.sty

    Analysis of circadian pattern reveals tissue-specific alternative transcription in leptin signaling pathway

    *Background*
It has been previously reported that most mammalian genes display a circadian oscillation in their baseline expression. Consequently, the phase and amplitude of each component of a signal transduction cascade have downstream consequences.

*Results*
We report our analysis of alternative transcripts in the leptin signaling pathway, which is responsible for the systemic regulation of macronutrient storage and energy balance. We focused on the circadian expression pattern of a critical component of the leptin signaling system, suppressor of cytokine signaling 3 (SOCS3). On an Affymetrix GeneChip 430A2 microarray, this gene is represented by three probe sets targeting different regions within the 3’ end of the last exon. We demonstrate that in murine brown adipose tissue two downstream 3’ probe sets experience circadian baseline oscillation in counter-phase to the upstream probe set. Such differences in expression patterns are a telltale sign of alternative splicing within the last exon of SOCS3. In contrast, all three probe sets oscillated in a common phase in murine liver and white adipose tissue. This suggests that the regulation of SOCS3 expression in brown fat is tissue-specific. Another component of the signaling pathway, Janus kinase (JAK), is directly regulated by SOCS and has alternative transcript probe sets oscillating in counter-phase in a white adipose tissue-specific manner.
 
*Conclusion*
We hypothesize that differential oscillation of alternative transcripts may provide a mechanism to maintain steady levels of expression in spite of circadian baseline variation.
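The hypothesis above can be illustrated with a small numerical sketch: two transcripts oscillating with a 24-hour period in exact counter-phase sum to a constant level. All values below (baseline, amplitude, sampling times) are hypothetical and chosen only for illustration; they are not measurements from the study, and the equal-amplitude cosine model is an assumption.

```python
from math import cos, pi

PERIOD_H = 24.0  # assumed circadian period in hours


def oscillation(t_hours, baseline, amplitude, phase):
    """Cosine model of expression level at time t_hours."""
    return baseline + amplitude * cos(2 * pi * t_hours / PERIOD_H + phase)


# Hypothetical sampling every 4 hours over one day.
times = range(0, 24, 4)

# Two alternative transcripts with equal amplitude, shifted by half a period.
upstream = [oscillation(t, 10.0, 3.0, 0.0) for t in times]
downstream = [oscillation(t, 10.0, 3.0, pi) for t in times]

# Each transcript varies over the day, but their sum stays flat.
total = [u + d for u, d in zip(upstream, downstream)]
for t, u, d, s in zip(times, upstream, downstream, total):
    print(f"t={t:2d}h  upstream={u:5.2f}  downstream={d:5.2f}  total={s:5.2f}")
```

With unequal amplitudes or a phase difference other than half a period, the sum would retain a residual oscillation, so the cancellation here is the idealized case.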

    CanICA: Model-based extraction of reproducible group-level ICA patterns from fMRI time series

    Spatial Independent Component Analysis (ICA) is an increasingly used data-driven method to analyze functional Magnetic Resonance Imaging (fMRI) data. To date, it has been used to extract meaningful patterns without prior information. However, ICA is not robust to mild data variation and remains a parameter-sensitive algorithm. The validity of the extracted patterns is hard to establish, as is the significance of differences between patterns extracted from different groups of subjects. We start from a generative model of the fMRI group data to introduce a probabilistic ICA pattern-extraction algorithm, called CanICA (Canonical ICA). Thanks to an explicit noise model and canonical correlation analysis, our method is auto-calibrated and identifies the group-reproducible data subspace before performing ICA. We compare our method to state-of-the-art multi-subject fMRI ICA methods and show that the features extracted are more reproducible.

    Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae

    © 2018 Stotz et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Genes coding for nucleotide-binding leucine-rich repeat (LRR) receptors (NLRs) control resistance against intracellular (cell-penetrating) pathogens. However, evidence for a role of genes coding for proteins with LRR domains in resistance against extracellular (apoplastic) fungal pathogens is limited. Here, the distribution of genes coding for proteins with eLRR domains but lacking kinase domains was determined for the Brassica napus genome. Predictions of signal peptide and transmembrane regions divided these genes into 184 coding for receptor-like proteins (RLPs) and 121 coding for secreted proteins (SPs). Together with previously annotated NLRs, a total of 720 LRR genes were found. Leptosphaeria maculans-induced expression during a compatible interaction with cultivar Topas differed between RLP, SP and NLR gene families; NLR genes were induced relatively late, during the necrotrophic phase of pathogen colonization. Seven RLP, one SP and two NLR genes were found in Rlm1 and Rlm3/Rlm4/Rlm7/Rlm9 loci for resistance against L. maculans on chromosome A07 of B. napus. One NLR gene at the Rlm9 locus was positively selected, as was the RLP gene on chromosome A10 with LepR3 and Rlm2 alleles conferring resistance against L. maculans races with corresponding effectors AvrLm1 and AvrLm2, respectively. Known loci for resistance against L. maculans (extracellular hemi-biotrophic fungus), Sclerotinia sclerotiorum (necrotrophic fungus) and Plasmodiophora brassicae (intracellular, obligate biotrophic protist) were examined for presence of RLPs, SPs and NLRs in these regions. Whereas loci for resistance against P. brassicae were enriched for NLRs, no such signature was observed for the other pathogens.
These findings demonstrate involvement of (i) NLR genes in resistance against the intracellular pathogen P. brassicae and (ii) a putative NLR gene in Rlm9-mediated resistance against the extracellular pathogen L. maculans. Peer reviewed