An exploratory classification of ecological incubator environments in Wales
School of Managemen
Multiple structural alignment for distantly related all β structures using TOPS pattern discovery and simulated annealing
Topsalign is a method that will structurally align diverse protein structures, for example, structural alignment of protein superfolds. All proteins within a superfold share the same fold but often have very low sequence identity and different biological and biochemical functions. There is often significant structural diversity around the common scaffold of secondary structure elements of the fold. Topsalign uses topological descriptions of proteins. A pattern discovery algorithm identifies equivalent secondary structure elements between a set of proteins and these are used to produce an initial multiple structure alignment. Simulated annealing is used to optimize the alignment. The output of Topsalign is a multiple structure-based sequence alignment and a 3D superposition of the structures. This method has been tested on three superfolds: the β jelly roll, TIM (α/β) barrel and the OB fold. Topsalign outperforms established methods on very diverse structures. Despite the pattern discovery working only on β strand secondary structure elements, Topsalign is shown to align TIM (α/β) barrel superfamilies, which contain both α helices and β strands
TmaDB: a repository for tissue microarray data
Background: Tissue microarray (TMA) technology has been developed to facilitate large, genome-scale molecular pathology studies. This technique provides a high-throughput method for analyzing a large cohort of clinical specimens in a single experiment, thereby permitting the parallel analysis of molecular alterations (at the DNA, RNA, or protein level) in thousands of tissue specimens. As a vast quantity of data can be generated in a single TMA experiment, a systematic approach is required for the storage and analysis of such data.
Description: To analyse TMA output, a relational database (known as TmaDB) has been developed to collate all aspects of information relating to TMAs. These data include the TMA construction protocol, experimental protocol and results from the various immunocytological and histochemical staining experiments, including the scanned images for each of the TMA cores. Furthermore, the database contains pathological information associated with each of the specimens on the TMA slide, the location of the various TMAs and the individual specimen blocks (from which cores were taken) in the laboratory, and their current status, i.e. if they can be sectioned into further slides or if they are exhausted. TmaDB has been designed to incorporate and extend many of the published common data elements and the XML format for TMA experiments and is therefore compatible with the TMA data exchange specifications developed by the Association for Pathology Informatics community. Finally, the design of the database is made flexible such that TMA experiments from several types of cancer can be stored in a single database, which incorporates the national minimum data set required for pathology reports supported by the Royal College of Pathologists (UK).
Conclusion: TmaDB will provide a comprehensive repository for TMA data such that a large number of results from the numerous immunostaining experiments can be efficiently compared for each of the TMA cores. This will allow a systematic, large-scale comparison of tumour samples to facilitate the identification of gene products of clinical importance such as therapeutic or prognostic markers. In addition this work will contribute to the establishment of a standard for reporting TMA data analogous to MIAME in the description of microarray dat
A Framework for Modeling Subgrid Effects for Two-Phase Flows in Porous Media
In this paper, we study upscaling for two-phase flows in strongly heterogeneous porous media. Upscaling a hyperbolic convection equation is known to be very difficult due to the presence of nonlocal memory effects. Even for a linear hyperbolic equation with a shear velocity field, the upscaled equation involves a nonlocal history dependent diffusion term, which is not amenable to computation. By performing a systematic multiscale analysis, we derive coupled equations for the average and the fluctuations for the two-phase flow. The homogenized equations for the coupled system are obtained by projecting the fluctuations onto a suitable subspace. This projection corresponds exactly to averaging along streamlines of the flow. Convergence of the multiscale analysis is verified numerically. Moreover, we show how to apply this multiscale analysis to upscale two-phase flows in practical applications
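The coupling between average and fluctuations described in this abstract can be illustrated in the simplest linear setting. The following is only a sketch under stated assumptions: the symbols (S for the transported quantity, v for a divergence-free velocity, an overbar for the average and a prime for the fluctuation) are notation chosen here, not taken from the paper, and the averaging operator is assumed to commute with differentiation and to satisfy the usual Reynolds rules.

```latex
\begin{align}
  % linear convection with a divergence-free velocity field (assumed model problem)
  \partial_t S + \mathbf{v}\cdot\nabla S &= 0, \qquad \nabla\cdot\mathbf{v} = 0,\\
  % split into average and fluctuation
  S = \bar{S} + S', \qquad \mathbf{v} &= \bar{\mathbf{v}} + \mathbf{v}', \qquad
  \overline{S'} = 0, \quad \overline{\mathbf{v}'} = 0,\\
  % equation for the average: the last term is the unclosed subgrid contribution
  \partial_t \bar{S} + \bar{\mathbf{v}}\cdot\nabla\bar{S}
    + \overline{\mathbf{v}'\cdot\nabla S'} &= 0,\\
  % equation for the fluctuation, obtained by subtracting the averaged equation
  \partial_t S' + \bar{\mathbf{v}}\cdot\nabla S' + \mathbf{v}'\cdot\nabla\bar{S}
    + \mathbf{v}'\cdot\nabla S' - \overline{\mathbf{v}'\cdot\nabla S'} &= 0.
\end{align}
```

The term $\overline{\mathbf{v}'\cdot\nabla S'}$ is where the subgrid effects enter the averaged equation; expressing it through the solution of the fluctuation equation is what gives rise to history-dependent terms of the kind the abstract mentions.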
A computer system to perform structure comparison using TOPS representations of protein structure
We describe the design and implementation of a fast topology-based method for protein structure comparison. The approach uses the TOPS topological representation of protein structure, aligning two structures using a common discovered pattern and generating a measure of distance derived from an insert score. Heavy use is made of a constraint-based pattern matching algorithm for TOPS diagrams that we have designed and described elsewhere (Gilbert et al., 1999). The comparison system is maintained at the European Bioinformatics Institute and is available over the Web at tops.ebi.ac.uk/tops. Users submit a structure description in Protein Data Bank (PDB) format and can compare it with structures in the entire PDB or a representative subset of protein domains, receiving the results by email
Natural antisense transcripts with coding capacity in Arabidopsis may have a regulatory role that is not linked to double-stranded RNA degradation
BACKGROUND:
Overlapping transcripts in antisense orientation have the potential to form double-stranded RNA (dsRNA), a substrate for a number of different RNA-modification pathways. One prominent route for dsRNA is its breakdown by Dicer enzyme complexes into small RNAs, a pathway that is widely exploited by RNA interference technology to inactivate defined genes in transgenic lines. The significance of this pathway for endogenous gene regulation remains unclear.
RESULTS:
We have examined transcription data for overlapping gene pairs in Arabidopsis thaliana. On the basis of an analysis of transcripts with coding regions, we find the majority of overlapping gene pairs to be convergently overlapping pairs (COPs), with the potential for dsRNA formation. In all tissues, COP transcripts are present at a higher frequency compared to the overall gene pool. The probability that both the sense and antisense copy of a COP are co-transcribed matches the theoretical value for coexpression under the assumption that the expression of one partner does not affect the expression of the other. Among COPs, we observe an over-representation of spliced (intron-containing) genes (90%) and of genes with alternatively spliced transcripts. For loci where antisense transcripts overlap with sense transcript introns, we also find a significant bias in favor of alternative splicing and variation of polyadenylation.
CONCLUSION:
The results argue against a predominant RNA degradation effect induced by dsRNA formation. Instead, our data support alternative roles for dsRNAs. They suggest that at least for a subgroup of COPs, antisense expression may induce alternative splicing or polyadenylation
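The independence baseline referred to in the results can be illustrated with a small sketch. If the two partners of a COP are expressed independently, the expected fraction of pairs with both transcripts present is the product of the two marginal expression frequencies. The function names and the toy data below are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def expected_coexpression(p_sense: float, p_antisense: float) -> float:
    """Expected probability that both partners are expressed, assuming independence."""
    for p in (p_sense, p_antisense):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_sense * p_antisense


def observed_coexpression(pairs: List[Tuple[bool, bool]]) -> float:
    """Fraction of pairs in which both the sense and antisense partner are expressed."""
    if not pairs:
        raise ValueError("at least one pair is required")
    both = sum(1 for sense, antisense in pairs if sense and antisense)
    return both / len(pairs)


# Hypothetical toy data: (sense expressed, antisense expressed) for four pairs.
pairs = [(True, True), (True, False), (False, True), (False, False)]

# Marginal expression frequencies estimated from the same toy data.
p_sense = sum(1 for sense, _ in pairs if sense) / len(pairs)
p_antisense = sum(1 for _, antisense in pairs if antisense) / len(pairs)

expected = expected_coexpression(p_sense, p_antisense)
observed = observed_coexpression(pairs)
print(f"expected under independence: {expected:.2f}, observed: {observed:.2f}")
```

An observed value close to the expected one is consistent with the partners not influencing each other's expression; a real analysis would also need a statistical test, which is omitted here.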
Topology-based protein structure comparison using a pattern discovery technique
metaSHARK: software for automated metabolic network prediction from DNA sequence and its application to the genomes of Plasmodium falciparum and Eimeria tenella
The metabolic SearcH And Reconstruction Kit (metaSHARK) is a new fully automated software package for the detection of enzyme-encoding genes within unannotated genome data and their visualization in the context of the surrounding metabolic network. The gene detection package (SHARKhunt) runs on a Linux system and requires only a set of raw DNA sequences (genomic, expressed sequence tag and/or genome survey sequence) as input. Its output may be uploaded to our web-based visualization tool (SHARKview) for exploring and comparing data from different organisms. We first demonstrate the utility of the software by comparing its results for the raw Plasmodium falciparum genome with the manual annotations available at the PlasmoDB and PlasmoCyc websites. We then apply SHARKhunt to the unannotated genome sequences of the coccidian parasite Eimeria tenella and observe that, at an E-value cut-off of 10^-20, our software makes 142 additional assertions of enzymatic function compared with a recent annotation package working with translated open reading frame sequences. The ability of the software to cope with low levels of sequence coverage is investigated by analyzing assemblies of the E. tenella genome at estimated coverages from 0.5x to 7.5x. Lastly, as an example of how metaSHARK can be used to evaluate the genomic evidence for specific metabolic pathways, we present a study of coenzyme A biosynthesis in P. falciparum and E. tenella
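The effect of an E-value cut-off such as the one quoted above can be sketched as a simple filter over candidate hits. This is an illustration only: the `Hit` record, the `filter_hits` function and the example identifiers and scores are hypothetical and do not reflect the actual SHARKhunt output format.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one candidate enzyme assignment."""
    sequence_id: str
    ec_number: str
    evalue: float


def filter_hits(hits: List[Hit], cutoff: float = 1e-20) -> List[Hit]:
    """Keep hits whose E-value is at or below the cutoff (smaller is more significant)."""
    return [hit for hit in hits if hit.evalue <= cutoff]


# Hypothetical example data, invented for illustration.
hits = [
    Hit("contig1_region3", "2.7.1.33", 3e-45),
    Hit("contig2_region9", "6.3.2.5", 1e-12),
    Hit("contig5_region1", "2.7.7.3", 1e-20),
]

kept = filter_hits(hits)
for hit in kept:
    print(hit.sequence_id, hit.ec_number, hit.evalue)
```

A stricter (smaller) cut-off yields fewer but more reliable assertions; the choice of threshold is a trade-off between sensitivity and false positives.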