145 research outputs found
The phage Mu repressor c and IS30 transposase proteins are significantly related
Abstract: The IS30 transposase exhibits significant amino acid sequence homology to the phage Mu repressor c in the amino- and carboxy-terminal regions of the proteins. The conserved sequences include the proposed Mu repressor DNA binding site, which is also related to the proposed Mu and D108 transposase DNA binding sites. The carboxy-terminal homologies are characterised by two almost complete, and one partial, somewhat diverged amino acid sequence repeats. Only weak homologies to this domain are present in the Mu transposase (Mu A). Nevertheless, a clear link between an insertion sequence and a bacteriophage has been established.
Quality control of the sheep bacterial artificial chromosome library, CHORI-243
Background: The sheep CHORI-243 bacterial artificial chromosome (BAC) library is being used in the construction of the virtual sheep genome, the sequencing and construction of the actual sheep genome assembly and as a source of DNA for regions of the genome of biological interest. The objective of our study is to assess the integrity of the clones and plates which make up the CHORI-243 library using the virtual sheep genome. Findings: A series of analyses was undertaken based on mapping the sheep BAC-end sequences (BESs) to the virtual sheep genome. Overall, very few plate-specific biases were identified, with only three of the 528 plates in the library significantly affected. The analysis of the number of tail-to-tail (concordant) BACs on the plates identified a number of plates with lower than average numbers of such BACs. For plates 198 and 213, a partial swap of the BESs determined with one of the two primers appears to have occurred. A third plate, 341, also with a significant deficit in tail-to-tail BACs, appeared to contain a substantial number of sequences determined from contaminating eubacterial 16S rRNA DNA. Additionally, a small number of eubacterial 16S rRNA DNA sequences were present on two other plates in the library, 111 and 338. Conclusions: The comparative genomic approach can be used to assess BAC library integrity in the absence of fingerprinting. The sequences of the sheep CHORI-243 library BACs have high integrity, especially with the corrections detailed above. The library represents a high quality resource for use by the sheep genomics community.
Mining tissue specificity, gene connectivity and disease association to reveal a set of genes that modify the action of disease causing genes
Background: The tissue specificity of gene expression has been linked to a number of significant outcomes including level of expression, and differential rates of polymorphism, evolution and disease association. Recent studies have also shown the importance of exploring differential gene connectivity and sequence conservation in the identification of disease-associated genes. However, no study relates gene interactions with tissue specificity and disease association. Methods: We adopted an a priori approach, making as few assumptions as possible, to analyse the interplay between gene-gene interactions and tissue specificity, and the subsequent likelihood of association with disease. We mined three large datasets comprising expression data drawn from massively parallel signature sequencing across 32 tissues, describing a set of 55,606 true positive interactions for 7,197 genes, and microarray expression results generated during the profiling of systemic inflammation, from which 126,543 interactions among 7,090 genes were reported. Results: Among the myriad complex relationships identified between expression, disease, connectivity and tissue specificity, some interesting patterns emerged. These include elevated rates of expression and network connectivity in housekeeping and disease-associated tissue-specific genes. We found that disease-associated genes are more likely to show tissue-specific expression and most frequently interact with other disease genes. Using the thresholds defined in these observations, we develop a guilt-by-association algorithm and discover a group of 112 non-disease annotated genes that predominantly interact with disease-associated genes, impacting on disease outcomes. Conclusion: We conclude that parameters such as tissue specificity and network connectivity can be used in combination to identify a group of genes, not previously confirmed as disease causing, that are involved in interactions with disease causing genes. Our guilt-by-association algorithm should be useful for the discovery of additional modifiers of genetic diseases, and more generally, for the ability to associate genes of unknown function to clusters of genes with defined functions, allowing for novel biological inference that can be subsequently validated.
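The guilt-by-association idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name, the `min_fraction` and `min_degree` thresholds, and the toy gene identifiers are all hypothetical, and the abstract does not state the thresholds actually used. The sketch only shows the general shape of the idea, which is to flag genes without a disease annotation whose interaction partners are predominantly disease-associated.

```python
from collections import defaultdict


def guilt_by_association(edges, disease_genes, min_fraction=0.5, min_degree=2):
    """Return {gene: fraction} for non-disease genes whose partners are mostly disease genes.

    edges          -- iterable of (gene_a, gene_b) interaction pairs
    disease_genes  -- genes already annotated as disease-associated
    min_fraction   -- hypothetical cut-off: fraction of partners that must be disease genes
    min_degree     -- hypothetical cut-off: minimum number of partners to be considered
    """
    disease_genes = set(disease_genes)

    # Build an undirected adjacency map, ignoring self-interactions.
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)

    candidates = {}
    for gene, partners in neighbours.items():
        if gene in disease_genes or len(partners) < min_degree:
            continue
        fraction = len(partners & disease_genes) / len(partners)
        if fraction > min_fraction:
            candidates[gene] = fraction
    return candidates


# Toy network (hypothetical identifiers): D* are disease genes, G* are unannotated.
toy_edges = [("G1", "D1"), ("G1", "D2"), ("G1", "G2"), ("G2", "G3"), ("G2", "D1")]
toy_candidates = guilt_by_association(toy_edges, {"D1", "D2"})
```

In the toy network only `G1` is flagged, because two of its three partners are disease genes. Real data would call for thresholds derived from the observed distributions, as the abstract describes.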
Using paired-end sequences to optimise parameters for alignment of sequence reads against related genomes
Background: The advent of cheap high-throughput sequencing methods has facilitated low coverage skims of a large number of organisms. To maximise the utility of the sequences, assembly into contigs and then ordering of those contigs is required. Whilst sequences can be assembled into contigs de novo, using assembled genomes of closely related organisms as a framework can considerably aid the process. However, the preferred search programs and parameters that will optimise the sensitivity and specificity of the alignments between the sequence reads and the framework genome(s) are not necessarily obvious. Here we demonstrate a process that uses paired-end sequence reads to choose an optimal program and alignment parameters. Results: Unlike two single fragment reads, in paired-end sequence reads, such as BAC-end sequences, the two sequences in the pair have a known positional relationship in the original genome. This provides an additional level of confidence, over match scores and e-values, in the accuracy of the positional assignment of the reads in the comparative genome. Three commonly used sequence alignment programs (MegaBLAST, Blastz and PatternHunter) were used to align a set of ovine BAC-end sequences against the equine genome assembly. A range of different search parameters, with a particular focus on contiguous and discontiguous seeds, were used for each program. The number of reads with a hit and the number of read pairs with hits for the two end sequences in the tail-to-tail paired-end configuration were plotted relative to the theoretical maximum expected curve. Of the programs tested, MegaBLAST with short contiguous seed lengths (word size 8-11) performed best in this particular task. In addition, the data provide estimates of the false positive and false negative rates, which can be used to determine the appropriate values of additional parameters, such as score cut-off, to balance sensitivity and specificity. To determine whether the approach also worked for the alignment of shorter reads, the first 240 bases of each BAC-end sequence were also aligned to the equine genome. Again, contiguous MegaBLAST performed the best in optimising the sensitivity and specificity with which sheep BAC-end reads map to the equine and bovine genomes. Conclusions: Paired-end reads, such as BAC-end sequences, provide an efficient mechanism to optimise sequence alignment parameters, for example for comparative genome assemblies, by providing an objective standard to evaluate performance.
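The two quantities that the abstract above compares (reads with a hit, and read pairs whose hits are positionally consistent) can be tallied with a short sketch such as the one below. It is an illustration under stated assumptions, not the authors' pipeline: the `Hit` record, the `max_span` of 300,000 bases, and the simplified concordance rule (same chromosome, opposite strands, within `max_span`) are hypothetical stand-ins for the tail-to-tail criterion, whose exact definition the abstract does not give.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """Best alignment of one end read on the framework genome (hypothetical record)."""
    chrom: str
    pos: int
    strand: str  # "+" or "-"


def is_concordant(a: Hit, b: Hit, max_span: int = 300_000) -> bool:
    """Simplified stand-in for a tail-to-tail check: same chromosome,
    opposite strands, and no further apart than max_span bases."""
    return a.chrom == b.chrom and a.strand != b.strand and abs(a.pos - b.pos) <= max_span


def summarise(pairs, max_span: int = 300_000) -> dict:
    """Count reads with a hit, pairs with both ends hit, and concordant pairs."""
    reads_with_hit = pairs_both_hit = concordant = 0
    for a, b in pairs:
        reads_with_hit += (a is not None) + (b is not None)
        if a is not None and b is not None:
            pairs_both_hit += 1
            if is_concordant(a, b, max_span):
                concordant += 1
    return {
        "reads_with_hit": reads_with_hit,
        "pairs_both_hit": pairs_both_hit,
        "concordant_pairs": concordant,
    }


# Toy input: each tuple holds the hits (or None) for the two ends of one clone.
toy_pairs = [
    (Hit("chr1", 1_000, "+"), Hit("chr1", 150_000, "-")),  # concordant
    (Hit("chr1", 5_000, "+"), Hit("chr7", 9_000, "-")),    # different chromosomes
    (Hit("chr2", 100, "+"), None),                         # only one end aligned
    (None, None),                                          # neither end aligned
]
toy_summary = summarise(toy_pairs)
```

Running the same tally over the output of each program and parameter setting gives one point per setting, which is the kind of comparison the abstract describes. Pairs with both ends hit but not concordant give a rough handle on false positives.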
Transcription profiling provides insights into gene pathways involved in horn and scurs development in cattle
Background: Two types of horns are evident in cattle: fixed horns attached to the skull, and a variation called scurs, which refers to small loosely attached horns. Cattle lacking horns are referred to as polled. Although the Poll and Scurs loci have been mapped to BTA1 and BTA19 respectively, the underlying genetic basis of these phenotypes is unknown, and so far, no candidate genes regulating these developmental processes have been described. This study is the first reported attempt at transcript profiling to identify genes and pathways contributing to horn and scurs development in Brahman cattle, relative to polled counterparts. Results: Expression patterns in polled, horned and scurs tissues were obtained using the Agilent 44k bovine array. The most notable feature when comparing transcriptional profiles of developing horn tissues against polled was the down-regulation of genes coding for elements of the cadherin junction as well as those involved in epidermal development. We hypothesize that this is a key event involved in keratinocyte migration and subsequent horn development. In the polled-scurs comparison, the most prevalent differentially expressed transcripts code for genes involved in extracellular matrix remodelling, which were up-regulated in scurs tissues relative to polled. Conclusion: For the first time we describe networks of genes involved in horn and scurs development. Interestingly, we did not observe differential expression in any of the genes present in the fine-mapped region of BTA1 known to contain the Poll locus.
Analysis of the complement and molecular evolution of tRNA genes in cow
Background: Detailed information regarding the number and organization of transfer RNA (tRNA) genes at the genome level is becoming readily available with the increase in DNA sequencing of whole genomes. However, the identification of functional tRNA genes is challenging for species that have large numbers of repetitive elements containing tRNA-derived sequences, such as Bos taurus. Reliable identification and annotation of entire sets of tRNA genes allows the evolution of tRNA genes to be understood on a genomic scale. Results: In this study, we explored the B. taurus genome using bioinformatics and comparative genomics approaches to catalogue and analyze cow tRNA genes. The initial analysis of the cow genome using tRNAscan-SE identified 31,868 putative tRNA genes and 189,183 pseudogenes, where 28,830 of the 31,868 predicted tRNA genes were classified as repetitive elements by the RepeatMasker program. We then used comparative genomics to further discriminate between functional tRNA genes and tRNA-derived sequences for the remaining set of 3,038 putative tRNA genes. For our analysis, we used the human, chimpanzee, mouse, rat, horse, dog, chicken and fugu genomes to predict that the number of active tRNA genes in cow lies in the vicinity of 439. Of this set, 150 tRNA genes were 100% identical in their sequences across all nine vertebrate genomes studied. Using clustering analyses, we identified a new tRNA-Gly(CCC) subfamily present in all analyzed mammalian genomes. We suggest that this subfamily originated from an ancestral tRNA-Gly(GCC) gene via a point mutation prior to the radiation of the mammalian lineages. Lastly, in a separate analysis we created phylogenetic profiles for each putative cow tRNA gene using a representative set of genomes to gain an overview of common evolutionary histories of tRNA genes. Conclusion: The use of a combination of bioinformatics and comparative genomics approaches has allowed the confident identification of a set of cow tRNA genes that will facilitate further studies in understanding the molecular evolution of cow tRNA genes.
Characterisation and application of a bovine U6 promoter for expression of short hairpin RNAs
Background: The use of small interfering RNA (siRNA) molecules in animals to achieve double-stranded RNA-mediated interference (RNAi) has recently emerged as a powerful method of sequence-specific gene knockdown. As DNA-based expression of short hairpin RNA (shRNA) for RNAi may offer some advantages over chemical and in vitro synthesised siRNA, a number of vectors for expression of shRNA have been developed. These often feature polymerase III (pol. III) promoters of either mouse or human origin. Results: To develop an shRNA expression vector specifically for bovine RNAi applications, we identified and characterised a novel bovine U6 small nuclear RNA (snRNA) promoter from bovine sequence data. This promoter is the putative bovine homologue of the human U6-8 snRNA promoter, and features a number of functional sequence elements that are characteristic of these types of pol. III promoters. A PCR-based cloning strategy was used to incorporate this promoter sequence into plasmid vectors along with shRNA sequences for RNAi. The promoter was then used to express shRNAs, which resulted in the efficient knockdown of an exogenous reporter gene and an endogenous bovine gene. Conclusion: We have mined data from the bovine genome sequencing project to identify a functional bovine U6 promoter and used the promoter sequence to construct an shRNA expression vector. The use of this native bovine promoter in shRNA expression is an important component of our future development of RNAi therapeutic and transgenic applications in bovine species.
An Always Correlated gene expression landscape for ovine skeletal muscle, lessons learnt from comparison with an “equivalent” bovine landscape
BACKGROUND: We have recently described a method for the construction of an informative gene expression correlation landscape for a single tissue, longissimus muscle (LM) of cattle, using a small number (less than a hundred) of diverse samples. Does this approach facilitate interspecies comparison of networks? FINDINGS: Using gene expression datasets from LM samples from a single postnatal time point for high and low muscling sheep, and from a developmental time course (prenatal to postnatal) for normal sheep and sheep exhibiting the Callipyge muscling phenotype, gene expression correlations were calculated across subsets of the data comparable to the bovine analysis. An “Always Correlated” gene expression landscape was constructed by integrating the correlations from the subsets of data and was compared to the equivalent landscape for bovine LM muscle. Whilst at the high level apparently equivalent modules were identified in the two species, at the detailed level overlap between genes in the equivalent modules was limited and generally not significant. Indeed, only 395 genes and 18 edges were in common between the two landscapes. CONCLUSIONS: Since it is unlikely that the equivalent muscles of two closely related species are as different as this analysis suggests, within-tissue gene expression correlations appear to be very sensitive to the samples chosen for their construction, compounded by the different platforms used. Thus users need to be very cautious in interpreting the differences. In future experiments, attention will be required to ensure equivalent experimental designs and the use of a cross-species gene expression platform to enable the identification of true differences between species.
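One way to read "integrating the correlations from the subsets of data" is to keep only gene pairs that are strongly correlated in every subset. The sketch below illustrates that reading and nothing more: the 0.8 threshold, the use of absolute Pearson correlation, the function names, and the toy expression values are all assumptions, and the published method may differ in each of these respects.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def always_correlated(subsets, threshold=0.8):
    """Return gene pairs whose |correlation| meets the threshold in every subset.

    subsets -- list of dicts mapping gene name -> list of expression values
    """
    # Only genes measured in every subset can be compared.
    genes = sorted(set.intersection(*(set(s) for s in subsets)))
    edges = set()
    for g1, g2 in combinations(genes, 2):
        if all(abs(pearson(s[g1], s[g2])) >= threshold for s in subsets):
            edges.add((g1, g2))
    return edges


# Toy data (hypothetical): two sample subsets, three genes.
subset_1 = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
subset_2 = {"A": [1, 3, 2, 5], "B": [2, 6, 4, 10], "C": [1, 3, 2, 5.5]}
toy_edges = always_correlated([subset_1, subset_2])

# Comparing two landscapes then reduces to a set intersection of their edges.
other_landscape = {("A", "B"), ("B", "C")}
shared_edges = toy_edges & other_landscape
```

In the toy data, genes A and C are strongly correlated in the second subset but not in the first, so that pair is dropped. This also shows why the resulting edge set depends heavily on which samples make up each subset, the sensitivity that the conclusions above warn about.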