Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3
Background:
Alternative splicing (AS) of maturing mRNA can generate structurally and functionally distinct transcripts from the same gene. Recent bioinformatic analyses of available genome databases inferred a positive correlation between intron length and AS. To study the interplay between intron length and AS empirically and in more detail, we analyzed the diversity of alternatively spliced transcripts (ASTs) in the Drosophila RNA-binding Bruno-3 (Bru-3) gene. This gene was known to encode thirteen exons separated by introns of diverse sizes, ranging from 71 to 41,973 nucleotides in D. melanogaster. Although Bru-3's structure is expected to be conducive to AS, only two ASTs of this gene were previously described.
Results:
Cloning of RT-PCR products of the entire ORF from four species representing three diverged Drosophila lineages provided an evolutionary perspective, high sensitivity, and long-range contiguity of splice choices currently unattainable by high-throughput methods. Consequently, we identified three new exons, a new exon fragment and thirty-three previously unknown ASTs of Bru-3. All exon-skipping events in the gene were mapped to the exons surrounded by introns of at least 800 nucleotides, whereas exons split by introns of less than 250 nucleotides were always spliced contiguously in mRNA. Cases of exon loss and creation during Bru-3 evolution in Drosophila were also localized within large introns. Notably, we identified a true de novo exon gain: exon 8 was created along the lineage of the obscura group from intronic sequence between cryptic splice sites conserved among all Drosophila species surveyed. Exon 8 was included in mature mRNA by the species representing all the major branches of the obscura group. To our knowledge, the origin of exon 8 is the first documented case of exonization of intronic sequence outside vertebrates.
Conclusion:
We found that large introns can promote AS via exon-skipping and exon turnover during evolution, likely due to frequent errors in their removal from maturing mRNA. Large introns could be a reservoir of genetic diversity, because they have a greater number of mutable sites than short introns. Taken together, our results indicate that gene structure can constrain and/or promote gene evolution.
Measurement of the inclusive and dijet cross-sections of b-jets in pp collisions at sqrt(s) = 7 TeV with the ATLAS detector
The inclusive and dijet production cross-sections have been measured for jets
containing b-hadrons (b-jets) in proton-proton collisions at a centre-of-mass
energy of sqrt(s) = 7 TeV, using the ATLAS detector at the LHC. The
measurements use data corresponding to an integrated luminosity of 34 pb^-1.
The b-jets are identified using either a lifetime-based method, where secondary
decay vertices of b-hadrons in jets are reconstructed using information from
the tracking detectors, or a muon-based method where the presence of a muon is
used to identify semileptonic decays of b-hadrons inside jets. The inclusive
b-jet cross-section is measured as a function of transverse momentum in the
range 20 < pT < 400 GeV and rapidity in the range |y| < 2.1. The bbbar-dijet
cross-section is measured as a function of the dijet invariant mass in the
range 110 < m_jj < 760 GeV, the azimuthal angle difference between the two jets
and the angular variable chi in two dijet mass regions. The results are
compared with next-to-leading-order QCD predictions. Good agreement is observed
between the measured cross-sections and the predictions obtained using POWHEG +
Pythia. MC@NLO + Herwig shows good agreement with the measured bbbar-dijet
cross-section. However, it does not reproduce the measured inclusive
cross-section well, particularly for central b-jets with large transverse
momenta.
Comment: 10 pages plus author list (21 pages total), 8 figures, 1 table, final version published in European Physical Journal
HER2-family signalling mechanisms, clinical implications and targeting in breast cancer.
Approximately 20% of human breast cancers (BC) overexpress HER2 protein, and HER2-positivity is associated with a worse prognosis. Although HER2-targeted therapies have significantly improved outcomes for HER2-positive BC patients, resistance to trastuzumab-based therapy remains a clinical problem. In order to better understand resistance to HER2-targeted therapies in HER2-positive BC, it is necessary to examine HER family signalling as a whole. An extensive literature search was carried out to critically assess the current knowledge of HER family signalling in HER2-positive BC and response to HER2-targeted therapy. Known mechanisms of trastuzumab resistance include reduced receptor-antibody binding (MUC4, p95HER2), increased signalling through alternative HER family receptor tyrosine kinases (RTK), altered intracellular signalling involving loss of PTEN, reduced p27kip1, or increased PI3K/AKT activity, and altered signalling via non-HER family RTKs such as IGF1R. Emerging strategies to circumvent resistance to HER2-targeted therapies in HER2-positive BC include co-targeting HER2/PI3K, pan-HER family inhibition, and novel therapies such as T-DM1. There is evidence that immunity plays a key role in the efficacy of HER-targeted therapy, and efforts are being made to exploit the immune system in order to improve the efficacy of current anti-HER therapies. With our rapidly expanding understanding of HER2 signalling mechanisms along with the repertoire of HER family and other targeted therapies, it is likely that the near future holds further dramatic improvements to the prognosis of women with HER2-positive BC.
Expedited batch processing and analysis of transposon insertions
Background:
With advances in sequencing technology, greater and greater amounts of eukaryotic genome data are becoming available. Often, large portions of these genomes consist of transposable elements, frequently accounting for 50% or more in vertebrates. Each transposable element family may have thousands or tens of thousands of individual copies within a given genome, and therefore it can take an exorbitant amount of time and effort to process data in a meaningful fashion.
Findings:
In order to combat this problem, we developed a set of bioinformatics techniques and programs to streamline the analysis. This includes a unique Perl script which automates the process of taking BLAST, Repeatmasker and similar data to extract and manipulate the hit sequences from the genome. This script, called Process_hits, uses an object-oriented methodology to compile all hit locations from a given file for processing, organize this data into useable categories, and output it in multiple formats.
Conclusions:
The program proved capable of handling large amounts of transposon data in an efficient fashion. It is equipped with a number of useful sub-functions, each of which is contained within its own sub-module to allow for greater expandability and as a foundation for future program design.
Linking microarray reporters with protein functions
Background:
The analysis of microarray experiments requires accurate and up-to-date functional annotation of the microarray reporters to optimize the interpretation of the biological processes involved. Pathway visualization tools are used to connect gene expression data with existing biological pathways by using specific database identifiers that link reporters with elements in the pathways.
Results:
This paper proposes a novel method that aims to improve microarray reporter annotation by BLASTing the original reporter sequences against a species-specific EMBL subset that was derived from and crosslinked back to the highly curated UniProt database. The resulting alignments were filtered using high quality alignment criteria and further compared with the outcome of a more traditional approach, where reporter sequences were BLASTed against EnsEMBL followed by locating the corresponding protein (UniProt) entry for the high quality hits. Combining the results of both methods resulted in successful annotation of > 58% of all reporter sequences with UniProt IDs on two commercial array platforms, increasing the number of Incyte reporters that could be coupled to Gene Ontology terms from 32.7% to 58.3% and to a local GenMAPP pathway from 9.6% to 16.7%. For Agilent, 35.3% of the total reporters are now linked to GO nodes and 7.1% to local pathways.
Conclusion:
Our methods increased the annotation quality of microarray reporter sequences and allowed us to visualize more reporters using pathway visualization tools. Even in cases where the original reporter annotation showed the correct description, the new identifiers often allowed improved pathway and Gene Ontology linking. These methods are freely available at http://www.bigcat.unimaas.nl/public/publications/Gaj_Annotation/.
Reliability analysis of the Ahringer Caenorhabditis elegans RNAi feeding library: a guide for genome-wide screens
Background:
The Ahringer C. elegans RNAi feeding library prepared by cloning genomic DNA fragments has been widely used in genome-wide analysis of gene function. However, the library has not been thoroughly validated by direct sequencing, and there are potential errors, including: 1) mis-annotation (the clone with the retired gene name should be remapped to the actual target gene); 2) nonspecific PCR amplification; 3) cross-RNAi; 4) mis-operation such as sample loading error, etc.
Results:
Here we performed a reliability analysis on the Ahringer C. elegans RNAi feeding library, which contains 16,256 bacterial strains, using a bioinformatics approach. Results demonstrated that most (98.3%) of the bacterial strains in the library are reliable. However, we also found that 2,851 (17.54%) bacterial strains need to be re-annotated even though they are reliable. Most of these bacterial strains are clones carrying retired gene names. In addition, 28 strains were classified as unreliable and 226 strains as marginal because they probably express unrelated double-stranded RNAs (dsRNAs). The accuracy of the prediction was further confirmed by direct sequencing analysis of 496 bacterial strains. Finally, a freely accessible database named CelRNAi (http://biocompute.bmi.ac.cn/CelRNAi/) was developed as a valuable complementary resource for the feeding RNAi library by providing the predicted information on all bacterial strains. Moreover, submission of direct sequencing results or any other annotations for the bacterial strains to the database is allowed, and these will be integrated into the CelRNAi database to improve the accuracy of the library. In addition, we provide five candidate primer sets for each of the unreliable and marginal bacterial strains for users to construct an alternative vector for their own RNAi studies.
Conclusions:
Because of the potential unreliability of the Ahringer C. elegans RNAi feeding library, we strongly suggest that users examine the reliability information of the bacterial strains in the CelRNAi database before performing RNAi experiments, as well as during post-RNAi experiment analysis.
Linkage mapping bovine EST-based SNP
BACKGROUND: Existing linkage maps of the bovine genome primarily contain anonymous microsatellite markers. These maps have proved valuable for mapping quantitative trait loci (QTL) to broad regions of the genome, but more closely spaced markers are needed to fine-map QTL, and markers associated with genes and annotated sequence are needed to identify genes and sequence variation that may explain QTL. RESULTS: Bovine expressed sequence tag (EST) and bacterial artificial chromosome (BAC) sequence data were used to develop 918 single nucleotide polymorphism (SNP) markers to map genes on the bovine linkage map. DNA of sires from the MARC reference population was used to detect SNPs, and progeny and mates of heterozygous sires were genotyped. Chromosome assignments for 861 SNPs were determined by two-point analysis, and positions for 735 SNPs were established by multipoint analyses. Linkage maps of bovine autosomes with these SNPs represent 4585 markers in 2475 positions spanning 3058 cM. Markers include 3612 microsatellites, 913 SNPs and 60 other markers. Mean separation between marker positions is 1.2 cM. New SNP markers appear in 511 positions, with mean separation of 4.7 cM. Multi-allelic markers, mostly microsatellites, had a mean (maximum) of 216 (366) informative meioses, and a mean 3-lod confidence interval of 3.6 cM. Bi-allelic markers, including SNP and other marker types, had a mean (maximum) of 55 (191) informative meioses, and were placed within a mean 8.5 cM 3-lod confidence interval. Homologous human sequences were identified for 1159 markers, including 582 newly developed and mapped SNPs. CONCLUSION: Addition of these EST- and BAC-based SNPs to the bovine linkage map not only increases marker density, but also provides connections to gene-rich physical maps, including annotated human sequence.
The map provides a resource for fine-mapping quantitative trait loci and identification of positional candidate genes, and can be integrated with other data to guide and refine assembly of bovine genome sequence. Even after the bovine genome is completely sequenced, the map will continue to be a useful tool to link observable phenotypes and animal genotypes to underlying genes and molecular mechanisms influencing economically important beef and dairy traits.
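The headline map statistics quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and it assumes that mean separation is the total map length divided by the number of marker positions, which the abstract does not state explicitly.

```python
# Figures taken from the abstract; variable names are illustrative only.
marker_counts = {"microsatellite": 3612, "SNP": 913, "other": 60}
map_length_cm = 3058      # total span of the autosomal linkage maps, in cM
marker_positions = 2475   # distinct map positions occupied by markers

# The three marker classes should add up to the reported total of 4585.
total_markers = sum(marker_counts.values())

# Assumed definition: average spacing = map length / number of positions.
mean_separation_cm = map_length_cm / marker_positions

print(f"total markers: {total_markers}")
print(f"mean separation: {mean_separation_cm:.1f} cM")
```

Under this assumption the marker total and the 1.2 cM mean separation agree with the reported values.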
Genomorama: genome visualization and analysis
Background:
The ability to visualize genomic features and design experimental assays that can target specific regions of a genome is essential for modern biology. To assist in these tasks, we present Genomorama, a software program for interactively displaying multiple genomes and identifying potential DNA hybridization sites for assay design.
Results:
Useful features of Genomorama include genome search by DNA hybridization (probe binding and PCR amplification), efficient multi-scale display and manipulation of multiple genomes, support for many genome file types and the ability to search for and retrieve data from the National Center for Biotechnology Information (NCBI) Entrez server.
Conclusion:
Genomorama provides an efficient computational platform for visualizing and analyzing multiple genomes.
Structural Heterogeneity and Quantitative FRET Efficiency Distributions of Polyprolines through a Hybrid Atomistic Simulation and Monte Carlo Approach
Förster Resonance Energy Transfer (FRET) experiments probe molecular distances via distance dependent energy transfer from an excited donor dye to an acceptor dye. Single molecule experiments not only probe average distances, but also distance distributions or even fluctuations, and thus provide a powerful tool to study biomolecular structure and dynamics. However, the measured energy transfer efficiency depends not only on the distance between the dyes, but also on their mutual orientation, which is typically inaccessible to experiments. Thus, assumptions on the orientation distributions and averages are usually made, limiting the accuracy of the distance distributions extracted from FRET experiments. Here, we demonstrate that by combining single molecule FRET experiments with the mutual dye orientation statistics obtained from Molecular Dynamics (MD) simulations, improved estimates of distances and distributions are obtained. From the simulated time-dependent mutual orientations, FRET efficiencies are calculated and the full statistics of individual photon absorption, energy transfer, and photon emission events is obtained from subsequent Monte Carlo (MC) simulations of the FRET kinetics. All recorded emission events are collected to bursts from which efficiency distributions are calculated in close resemblance to the actual FRET experiment, taking shot noise fully into account. Using polyproline chains with attached Alexa 488 and Alexa 594 dyes as a test system, we demonstrate the feasibility of this approach by direct comparison to experimental data. We identified cis-isomers and different static local environments as sources of the experimentally observed heterogeneity. Reconstructions of distance distributions from experimental data at different levels of theory demonstrate how the respective underlying assumptions and approximations affect the obtained accuracy. 
Our results show that dye fluctuations obtained from MD simulations, combined with MC single photon kinetics, provide a versatile tool to improve the accuracy of distance distributions that can be extracted from measured single molecule FRET efficiencies.
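The dependence of transfer efficiency on both dye distance and mutual orientation, as described in this abstract, can be illustrated with the standard Förster relations: E = 1 / (1 + (r / R0)^6), where R0^6 is proportional to the orientation factor kappa^2. The following is a minimal Python sketch of those textbook formulas only. It is not the authors' MD/MC pipeline, and the function names and the 5.4 nm Förster radius are assumptions chosen for illustration.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def kappa_squared(donor_dipole, acceptor_dipole, separation):
    """Orientation factor kappa^2 = (d.a - 3 (d.r)(a.r))^2 for unit vectors.

    d and a are the donor and acceptor transition dipole directions and
    r is the direction of the donor-acceptor separation vector.
    """
    d, a, r = _unit(donor_dipole), _unit(acceptor_dipole), _unit(separation)
    kappa = _dot(d, a) - 3.0 * _dot(d, r) * _dot(a, r)
    return kappa * kappa


def fret_efficiency(distance, r0_iso, kappa2=2.0 / 3.0):
    """Foerster efficiency for a given distance and orientation factor.

    r0_iso is the Foerster radius valid for the isotropic average
    kappa^2 = 2/3; it is rescaled because R0^6 is proportional to kappa^2.
    """
    r0_6 = r0_iso ** 6 * kappa2 / (2.0 / 3.0)
    return r0_6 / (r0_6 + distance ** 6)


if __name__ == "__main__":
    r0 = 5.4  # nm, assumed illustrative Foerster radius
    r = 5.4   # nm, dye-dye distance
    perpendicular = kappa_squared((0, 1, 0), (0, 0, 1), (1, 0, 0))
    collinear = kappa_squared((1, 0, 0), (1, 0, 0), (1, 0, 0))
    for label, k2 in [("isotropic", 2.0 / 3.0),
                      ("perpendicular", perpendicular),
                      ("collinear", collinear)]:
        print(f"{label}: kappa^2={k2:.3f}, E={fret_efficiency(r, r0, k2):.3f}")
```

At a fixed distance, the efficiency ranges from zero for mutually perpendicular dipoles to well above the isotropic value for collinear ones, which is why assuming kappa^2 = 2/3 can bias distances recovered from measured efficiencies.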