Classification of protein quaternary structure by functional domain composition
BACKGROUND: The number and the arrangement of subunits that form a protein are referred to as quaternary structure. Quaternary structure is an important protein attribute that is closely related to its function. Proteins with quaternary structure are called oligomeric proteins. Oligomeric proteins are involved in various biological processes, such as metabolism, signal transduction, and chromosome replication. Thus, it is highly desirable to develop computational methods that automatically classify the quaternary structure of proteins from their sequences. RESULTS: To explore this problem, we adopted an approach based on the functional domain composition of proteins. Every protein was represented by a vector calculated from the domains in the PFAM database. The nearest neighbor algorithm (NNA) was used to classify the quaternary structure of proteins from this information. The jackknife cross-validation test was performed on a non-redundant protein dataset in which the sequence identity was less than 25%. The overall success rate obtained was 75.17%. Additionally, to demonstrate the effectiveness of this method, we predicted the proteins in an independent dataset and achieved an overall success rate of 84.11%. CONCLUSION: Compared with the amino acid composition method and BLAST, the results indicate that the domain composition approach may be a more effective and promising high-throughput method for dealing with this complicated problem in bioinformatics.
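The classification scheme described in this abstract (domain-composition vectors, nearest neighbor prediction, jackknife testing) can be illustrated with a minimal sketch. The abstract does not state which similarity measure was used, so the cosine-style score below is an assumption, and the domain identifiers, class labels, and function names are hypothetical toy values rather than the authors' data or code.

```python
from math import sqrt


def similarity(a, b):
    """Cosine similarity between two binary domain vectors stored as sets.

    Assumption: the abstract does not name the similarity measure.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def nearest_neighbor_label(query, training):
    """Return the class of the training protein most similar to the query."""
    best_label, best_score = None, -1.0
    for domains, label in training:
        score = similarity(query, domains)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def jackknife_success_rate(dataset):
    """Leave each protein out in turn, predict it from the rest, return accuracy."""
    correct = 0
    for i, (domains, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if nearest_neighbor_label(domains, rest) == label:
            correct += 1
    return correct / len(dataset)


# Toy dataset: (set of domain identifiers, quaternary-structure class).
# All identifiers and labels are invented for illustration.
toy = [
    (frozenset({"domA", "domB"}), "monomer"),
    (frozenset({"domA"}), "monomer"),
    (frozenset({"domX", "domY"}), "homodimer"),
    (frozenset({"domX"}), "homodimer"),
]

rate = jackknife_success_rate(toy)
print(f"jackknife success rate: {rate:.2%}")
```

Storing each vector as a set of present domains is equivalent to a binary vector over all database domains and avoids allocating mostly-zero arrays.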
A Brief Review of Computational Gene Prediction Methods
With the development of genome sequencing for many organisms, more and more raw sequences need to be annotated. Gene prediction by computational methods, which locates protein-coding regions, is one of the essential issues in bioinformatics. Two classes of methods are generally adopted: similarity-based searches and ab initio prediction. Here, we review the development of gene prediction methods, summarize the measures for evaluating predictor quality, highlight open problems in this area, and discuss future research directions.
CHSMiner: a GUI tool to identify chromosomal homologous segments
BACKGROUND: The identification of chromosomal homologous segments (CHSs) within and between genomes is essential for comparative genomics. Various processes, including insertion/deletion and inversion, can cause the degeneration of CHSs. RESULTS: Here we present CHSMiner, a Java program that detects CHSs based on shared gene content alone. It implements a fast greedy search algorithm and rigorous statistical validation, and its graphical interface allows interactive visualization of the results. We tested the software on both simulated and realistic biological data and compared its performance with similar existing software and data sources. CONCLUSION: CHSMiner is characterized by its integrated workflow, speed, and ease of use. It will be useful for both experimentalists and bioinformaticians interested in the structure and evolution of genomes.
Exploring photosynthesis evolution by comparative analysis of metabolic networks between chloroplasts and photosynthetic bacteria
BACKGROUND: Chloroplasts descended from cyanobacteria and have a drastically reduced genome following an endosymbiotic event. Many genes of the ancestral cyanobacterial genome have been transferred to the plant nuclear genome by horizontal gene transfer. However, a selective set of metabolic pathways is maintained in chloroplasts using both chloroplast genome encoded and nuclear genome encoded enzymes. As an organelle specialized for carrying out photosynthesis, does the chloroplast metabolic network have properties adapted for higher efficiency of photosynthesis? We compared metabolic network properties of chloroplasts and prokaryotic photosynthetic organisms, mostly cyanobacteria, based on metabolic maps derived from genome data, to identify features of the chloroplast network that differ from those of cyanobacteria and to analyze the possible functional significance of those features. RESULTS: The properties of the entire metabolic network and of the sub-network that consists of reactions directly connected to the Calvin Cycle were analyzed using a hypergraph representation. The results showed that the whole metabolic networks in chloroplasts and cyanobacteria both possess small-world network properties. Although the number of compounds and reactions in chloroplasts is smaller than that in cyanobacteria, the chloroplast metabolic network has a longer average path length and a larger diameter, and is Calvin Cycle-centered, indicating an overall less dense network structure with specific, local high-density areas in chloroplasts. Moreover, the chloroplast metabolic network exhibits a better modular organization than the cyanobacterial ones. Enzymes involved in the same metabolic processes tend to cluster into the same module in chloroplasts. CONCLUSION: In summary, the differences in metabolic network properties may reflect the evolutionary changes during endosymbiosis that led to the improvement of photosynthesis efficiency in higher plants. Our findings are consistent with the notion that, since light energy absorption, transfer, and conversion are highly efficient even in photosynthetic bacteria, further improvements in photosynthetic efficiency in higher plants may rely on changes in metabolic network properties.
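The two network measures compared in this abstract, average path length and diameter, can be computed as in the following minimal sketch. It assumes a plain undirected, unweighted, connected graph rather than the hypergraph representation used in the study, and the two toy networks and all function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def path_stats(graph):
    """Return (average shortest path length, diameter) over reachable ordered pairs."""
    total, pairs, diameter = 0, 0, 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    if pairs == 0:
        return 0.0, 0
    return total / pairs, diameter


# Toy networks with the same number of nodes: a chain and a hub-centered star.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
star = {"H": ["X", "Y", "Z"], "X": ["H"], "Y": ["H"], "Z": ["H"]}

chain_avg, chain_diam = path_stats(chain)
star_avg, star_diam = path_stats(star)
print(f"chain: average path length {chain_avg:.3f}, diameter {chain_diam}")
print(f"star:  average path length {star_avg:.3f}, diameter {star_diam}")
```

The chain has both a longer average path length and a larger diameter than the star, which shows how two networks of equal size can differ in these measures.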
Tree of Life Based on Genome Context Networks
Efforts in phylogenomics have greatly improved our understanding of the backbone tree of life. However, due to systematic error in sequence data, sequence-based phylogenomic approaches lead to well-resolved but statistically significant incongruence. Thus, an independent test of current phylogenetic knowledge is required. Here, we have devised a distance-based strategy to reconstruct a highly resolved backbone tree of life on the basis of the genome context networks of 195 fully sequenced representative species. Along with strongly supporting the monophyly of the three superkingdoms and of most taxonomic sub-divisions, the derived tree also suggests some intriguing results, such as a high G+C Gram-positive origin of Bacteria and the classification of Symbiobacterium thermophilum and Alcanivorax borkumensis in Firmicutes. Furthermore, simulation analyses indicate that the addition of more gene relationships with high accuracy can greatly improve the resolution of the phylogenetic tree. Our results demonstrate the feasibility of reconstructing a highly resolved phylogenetic tree with extensible gene networks across all three domains of life. This strategy also implies that the relationships between genes (the gene network) can define what kind of species an organism is.
Ab-origin: an enhanced tool to identify the sourcing gene segments in germline for rearranged antibodies
BACKGROUND: In the adaptive immune system, variable regions of immunoglobulin (IG) are encoded by random recombination of variable (V), diversity (D), and joining (J) gene segments in the germline. Partitioning functional antibody sequences into their source germline gene segments is vital not only for understanding antibody maturation but also for promoting the potential engineering of therapeutic antibodies. To date, several tools have been developed to perform such "trace-back" calculations. Yet the predictive ability and processing volume of those tools vary significantly across different sets of data. Moreover, none of them gives a confidence level for immunoglobulin heavy diversity (IGHD) identification. With the rapid growth of immunological data, fast, efficient, and enhanced tools are needed. RESULTS: Here, a program named Ab-origin is presented. It performs batch queries against germline databases using empirical knowledge, an optimized scoring scheme, and appropriate parameters. Special effort has been made to improve the identification accuracy of the short and volatile region, IGHD. In particular, a threshold score for a given sensitivity and specificity is provided to give the confidence level of the IGHD identification. CONCLUSION: When evaluated using different sets of both simulated and experimental data, Ab-origin outperformed all five other popular tools in terms of prediction accuracy. The batch query feature and the confidence indication for IGHD identification provide extra help to users. The program is freely available at http://mpsq.biosino.org/ab-origin/supplementary.html.
Essential role of liquid phase on melt-processed GdBCO single-grain superconductors
RE-Ba-Cu-O (RE denotes rare earth elements) single-grain superconductors have
garnered considerable attention owing to their ability to trap strong magnetic
fields and their self-stability for maglev. Here, we employed a modified melt-growth
method by adding liquid source (LS) to provide a liquid rich environment during
crystal growth. It further enables a significantly low maximum processing
temperature (Tmax), even approaching the peritectic decomposition temperature. This
method is referred to as the liquid-source-rich low-Tmax (LS+LTmax) growth method,
which combines the advantage of Top Seeded Infiltration Growth (TSIG) into Top
Seeded Melt-texture Growth (TSMG). The LS+LTmax method synergistically
regulates the perfect appearance and high superconducting performance in REBCO
single grains. The complementary role of liquid source and low Tmax on the
crystallization has been carefully investigated. Microstructure analysis
demonstrates that the LS+LTmax processed GdBCO single grains show clear
advantages of uniform distribution of RE3+ ions as well as RE211 particles. The
inhibition of Gd211 coarsening leads to improved pinning properties. GdBCO
single-grain superconductors with diameters of 18 mm and 25 mm show maximum
trapped magnetic fields of 0.746 T and 1.140 T, respectively, at 77 K. These trapped fields are
significantly higher than those of conventional TSMG samples. Particularly, at
grain boundaries with reduced RE211 density superior flux pinning performance
has been observed. It indicates the existence of multiple pinning mechanisms at
these areas. The presented strategy provides essential LS+LTmax technology for
processing high performance single-grain superconductors with improved
reliability, which is considered important for engineering applications.