Finding all maximal perfect haplotype blocks in linear time
Recent large-scale community sequencing efforts allow the identification, at an unprecedented level of detail, of genomic regions that show signatures of natural selection. Traditional methods for identifying such regions from individuals' haplotype data, however, require excessive computing times and are therefore not applicable to current datasets. In 2019, Cunha et al. (Advances in bioinformatics and computational biology: 11th Brazilian symposium on bioinformatics, BSB 2018, Niteroi, Brazil, October 30 - November 1, 2018, Proceedings, 2018. 10.1007/978-3-030-01722-4_3) suggested the maximal perfect haplotype block as a very simple combinatorial pattern, forming the basis of a new method to perform rapid genome-wide selection scans. The algorithm they presented for identifying these blocks, however, had a worst-case running time quadratic in the genome length. It was posed as an open problem whether an optimal, linear-time algorithm exists. In this paper we give two algorithms that achieve this time bound: a conceptually very simple one using suffix trees, and a second one using the positional Burrows-Wheeler Transform, which is also very efficient in practice. Peer reviewed
r2cat: synteny plots and comparative assembly
Summary: Recent parallel pyrosequencing methods and the increasing number of finished genomes encourage the sequencing and investigation of closely related strains. Although the sequencing itself becomes easier and cheaper with each machine generation, the finishing of the genomes remains difficult. Instead of the desired whole genomic sequence, a set of contigs is the result of the assembly. In this applications note, we present the tool r2cat (related reference contig arrangement tool) that helps in the task of comparative assembly and also provides an interactive visualization for synteny inspection.
Porcine endogenous retroviruses PERV A and A/C recombinant are insensitive to a range of divergent mammalian TRIM5α proteins including human TRIM5α
The potential risk of cross-species transmission of porcine endogenous retroviruses (PERV) to humans has slowed the development of xenotransplantation using pigs as organ donors. Here, we show that PERVs are insensitive to restriction by divergent TRIM5α molecules, even though these molecules strongly restrict a variety of divergent lentiviruses. We also show that the human PERV A/C recombinant clone 14/220 reverse transcribes with increased efficiency in human cells, leading to significantly higher infectivity. We conclude that xenotransplantation studies should consider the danger of highly infectious TRIM5α-insensitive human-tropic PERV recombinants.
A Minimal Periods Algorithm with Applications
Kosaraju, in "Computation of squares in a string", briefly described a linear-time algorithm for computing the minimal squares starting at each position in a word. Using the same construction of suffix trees, we generalize his result and describe in detail how to compute, in O(k|w|) time, the minimal k-th power with period of length larger than s starting at each position in a word w, for arbitrary exponent k and integer s. We provide the complete proof of correctness of the algorithm, which is somewhat unclear in Kosaraju's original paper. The algorithm can be used as a subroutine to detect certain types of pseudo-patterns in words, which was our original motivation for studying the generalization. Comment: 14 pages
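To make the quantity described in this abstract concrete, here is a minimal brute-force sketch in Python. It is not the suffix-tree algorithm of the paper and does not run in O(k|w|) time; it only illustrates what "the minimal k-th power with period larger than s starting at each position" means. The function name `minimal_kth_powers` and its return convention (period length or `None`) are assumptions made for illustration.

```python
from typing import List, Optional


def minimal_kth_powers(w: str, k: int, s: int = 0) -> List[Optional[int]]:
    """Brute-force illustration (hypothetical helper, not the paper's method).

    For each start position i, return the smallest period p > s such that
    w[i:i + k*p] is a block of length p repeated k times, or None if no
    such k-th power starts at i.
    """
    n = len(w)
    result: List[Optional[int]] = []
    for i in range(n):
        found = None
        # A k-th power with period p needs k*p characters starting at i.
        for p in range(s + 1, (n - i) // k + 1):
            block = w[i:i + p]
            if w[i:i + k * p] == block * k:
                found = p  # periods are tried in increasing order
                break
        result.append(found)
    return result


if __name__ == "__main__":
    # Minimal squares (k = 2) starting at each position of "aabab".
    print(minimal_kth_powers("aabab", 2))
```

With `s = 0` and `k = 2` this reduces to the minimal-squares problem attributed to Kosaraju in the abstract; raising `s` excludes short periods.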
Negative Selection by an Endogenous Retrovirus Promotes a Higher-Avidity CD4+ T Cell Response to Retroviral Infection
Effective T cell responses can decisively influence the outcome of retroviral infection. However, what constitutes protective T cell responses or determines the ability of the host to mount such responses is incompletely understood. Here we studied the requirements for development and induction of CD4+ T cells that were essential for immunity to Friend virus (FV) infection of mice, according to their TCR avidity for an FV-derived epitope. We showed that a self peptide, encoded by an endogenous retrovirus, negatively selected a significant fraction of polyclonal FV-specific CD4+ T cells and diminished the response to FV infection. Surprisingly, however, CD4+ T cell-mediated antiviral activity was fully preserved. Detailed repertoire analysis revealed that clones with low avidity for FV-derived peptides were more cross-reactive with self peptides and were consequently preferentially deleted. Negative selection of low-avidity FV-reactive CD4+ T cells was responsible for the dominance of high-avidity clones in the response to FV infection, suggesting that protection against the primary infecting virus was mediated exclusively by high-avidity CD4+ T cells. Thus, although negative selection reduced the size and cross-reactivity of the available FV-reactive naïve CD4+ T cell repertoire, it increased the overall avidity of the repertoire that responded to infection. These findings demonstrate that self proteins expressed by replication-defective endogenous retroviruses can heavily influence the formation of the TCR repertoire reactive with exogenous retroviruses and determine the avidity of the response to retroviral infection. Given the overabundance of endogenous retroviruses in the human genome, these findings also suggest that endogenous retroviral proteins, presented by products of highly polymorphic HLA alleles, may shape the human TCR repertoire that reacts with exogenous retroviruses or other infecting pathogens, leading to interindividual heterogeneity.
Constraint qualifications in partial identification
The literature on stochastic programming typically regularizes problems using so-called Constraint Qualifications. The literature on estimation and inference under partial identification frequently restricts the geometry of identified sets with diverse high-level assumptions. These superficially appear to be different approaches to closely related problems. We extensively analyze their relation. Among other things, we show that for partial identification through pure moment inequalities, numerous regularization assumptions from the literature essentially coincide with the Mangasarian-Fromowitz Constraint Qualification. This clarifies the relation between well-known contributions, including within econometrics, and elucidates stringency, as well as ease of verification, of some high-level assumptions in seminal papers. First author draft
trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses
Summary: Multiple sequence alignments are central to many areas of bioinformatics. It has been shown that the removal of poorly aligned regions from an alignment increases the quality of subsequent analyses. Such an alignment trimming phase is complicated in large-scale phylogenetic analyses that deal with thousands of alignments. Here, we present trimAl, a tool for automated alignment trimming, which is especially suited for large-scale phylogenetic analyses. trimAl can consider several parameters, alone or in multiple combinations, for selecting the most reliable positions in the alignment. These include the proportion of sequences with a gap, the level of amino acid similarity and, if several alignments for the same set of sequences are provided, the level of consistency across different alignments. Moreover, trimAl can automatically select the parameters to be used in each specific alignment so that the signal-to-noise ratio is optimized.
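As a rough illustration of one criterion mentioned in this abstract, the proportion of sequences with a gap, the following Python sketch drops alignment columns that are too gappy. It is not trimAl's implementation or interface: the function `trim_gappy_columns`, the `max_gap_fraction` parameter and its default of 0.5 are hypothetical choices made for this example, and the similarity and consistency criteria are not modelled.

```python
from typing import List


def trim_gappy_columns(
    alignment: List[str],
    max_gap_fraction: float = 0.5,  # hypothetical default, not a trimAl setting
    gap_char: str = "-",
) -> List[str]:
    """Keep only the columns whose gap fraction is at most max_gap_fraction."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n = len(alignment)
    # Indices of columns that pass the gap-fraction criterion.
    keep = [
        col
        for col in range(length)
        if sum(seq[col] == gap_char for seq in alignment) / n <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


if __name__ == "__main__":
    toy = ["AC-GT", "A--GT", "ACTG-"]
    # The third column has gaps in 2 of 3 sequences and is removed.
    print(trim_gappy_columns(toy))
```

A stricter threshold keeps fewer columns; a real pipeline would combine such a filter with the other criteria the abstract lists.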
What will the cardiovascular disease slowdown cost? Modelling the impact of CVD trends on dementia, disability, and economic costs in England and Wales from 2020-2029.
To model the health impact and economic costs of the recent slowing of the historical decline in cardiovascular disease (CVD) incidence