Classification of Camellia (Theaceae) Species Using Leaf Architecture Variations and Pattern Recognition Techniques
Leaf characters have been successfully utilized to classify Camellia (Theaceae) species; however, leaf characters combined with supervised pattern recognition techniques have not been previously explored. We present results of using leaf morphological and venation characters of 93 species from five sections of genus Camellia to assess the effectiveness of several supervised pattern recognition techniques for classifications and compare their accuracy. Clustering approach, Learning Vector Quantization neural network (LVQ-ANN), Dynamic Architecture for Artificial Neural Networks (DAN2), and C-support vector machines (SVM) are used to discriminate 93 species from five sections of genus Camellia (11 in sect. Furfuracea, 16 in sect. Paracamellia, 12 in sect. Tuberculata, 34 in sect. Camellia, and 20 in sect. Theopsis). DAN2 and SVM show excellent classification results for genus Camellia with DAN2's accuracy of 97.92% and 91.11% for training and testing data sets respectively. The RBF-SVM results of 97.92% and 97.78% for training and testing offer the best classification accuracy. A hierarchical dendrogram based on leaf architecture data has confirmed the morphological classification of the five sections as previously proposed. The overall results suggest that leaf architecture-based data analysis using supervised pattern recognition techniques, especially DAN2 and SVM discrimination methods, is excellent for identification of Camellia species
The Antioxidant Protein Peroxiredoxin 4 Is Epigenetically Down Regulated in Acute Promyelocytic Leukemia
The antioxidant peroxiredoxin (PRDX) protein family comprises 6 members, which are implicated in a variety of cellular responses, including growth factor signal transduction. PRDX4 resides in the endoplasmic reticulum (ER), where it locally controls oxidative stress by reducing H2O2 levels. We recently provided evidence for a regulatory function of PRDX4 in signal transduction from a myeloid growth factor receptor, the granulocyte colony-stimulating factor receptor (G-CSFR). Upon activation, the ligand-induced G-CSFR undergoes endocytosis and routes via the early endosomes where it physically interacts with ER-resident PRDX4. PRDX4 negatively regulates G-CSFR mediated signaling. Here, we investigated whether PRDX4 is affected in acute myeloid leukemia (AML); genomic alterations and expression levels of PRDX4 were investigated. We show that genomic abnormalities involving PRDX4 are rare in AML. However, we find a strong reduction in PRDX4 expression levels in acute promyelocytic leukemia (APL) compared to normal promyelocytes and different molecular subtypes of AML. Subsequently, the possible role of DNA methylation and histone modifications in silencing of PRDX4 in APLs was investigated. We show that the reduced expression is not due to methylation of the CpG island in the promoter region of PRDX4 but correlates with increased trimethylation of histone 3 lysine residue 27 (H3K27me3) and lysine residue 4 (H3K4me3) at the transcriptional start site (TSS) of PRDX4, indicative of a bivalent histone code involved in transcriptional silencing. These findings suggest that the control of G-CSF responses by the antioxidant protein PRDX4 may be perturbed in APL.
Genetic Diversity and Linkage Disequilibrium in Chinese Bread Wheat (Triticum aestivum L.) Revealed by SSR Markers
Two hundred and fifty bread wheat lines, mainly Chinese mini core accessions, were assayed for polymorphism and linkage disequilibrium (LD) based on 512 whole-genome microsatellite loci representing a mean marker density of 5.1 cM. A total of 6,724 alleles ranging from 1 to 49 per locus were identified in all collections. The mean PIC value was 0.650, ranging from 0 to 0.965. Population structure and principal coordinate analysis revealed that landraces and modern varieties were two relatively independent genetic sub-groups. Landraces had a higher allelic diversity than modern varieties with respect to both genomes and chromosomes in terms of total number of alleles and allelic richness. 3,833 (57.0%) and 2,788 (41.5%) rare alleles with frequencies of <5% were found in the landrace and modern variety gene pools, respectively, indicating greater numbers of rare variants, or likely new alleles, in landraces. Analysis of molecular variance (AMOVA) showed that the A genome had the largest genetic differentiation and the D genome the lowest. In contrast to genetic diversity, modern varieties displayed a wider average LD decay across the whole genome for locus pairs with r² > 0.05 (P < 0.001) than the landraces. Mean LD decay distance for the landraces at the whole genome level was <5 cM, while modern varieties showed a higher LD decay distance of 5–10 cM. LD decay distances were also somewhat different for each of the 21 chromosomes, being higher for most of the chromosomes in modern varieties (<5∼25 cM) compared to landraces (<5∼15 cM), presumably indicating the influences of domestication and breeding. This study facilitates predicting the marker density required to effectively associate genotypes with traits in Chinese wheat genetic resources.
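The abstract reports PIC values per locus but does not state the formula used. As a minimal sketch, assuming the widely used Botstein-style definition PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², the per-locus value could be computed from allele frequencies as follows. The function name `pic` and the example frequencies are illustrative, not taken from the study.

```python
def pic(allele_freqs):
    """Polymorphism information content for one locus.

    Assumes the Botstein-style formula; the study may have used a
    different variant (for example 1 - sum(p_i^2)).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = 0.0
    for i in range(len(allele_freqs)):
        for j in range(i + 1, len(allele_freqs)):
            cross += 2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
    return 1.0 - homozygosity - cross


# A monomorphic locus carries no information, matching the reported minimum of 0.
print(pic([1.0]))
# Four equally frequent alleles (hypothetical locus).
print(pic([0.25, 0.25, 0.25, 0.25]))
```

Under this definition PIC rises with both the number of alleles and the evenness of their frequencies, which fits loci with up to 49 alleles approaching the reported maximum of 0.965.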
Phylogenetic Reconstruction and DNA Barcoding for Closely Related Pine Moth Species (Dendrolimus) in China with Multiple Gene Markers
Unlike distinct species, closely related species offer a great challenge for phylogeny reconstruction and species identification with DNA barcoding due to their often overlapping genetic variation. We tested a sibling species group of pine moth pests in China with a standard cytochrome c oxidase subunit I (COI) gene and two alternative internal transcribed spacer (ITS) genes (ITS1 and ITS2). Five different phylogenetic/DNA barcoding analysis methods (Maximum likelihood (ML)/Neighbor-joining (NJ), “best close match” (BCM), Minimum distance (MD), and BP-based method (BP)), representing commonly used methodology (tree-based and non-tree based) in the field, were applied to both single-gene and multiple-gene analyses. Our results demonstrated clear reciprocal species monophyly for three relatively distant related species, Dendrolimus superans, D. houi, D. kikuchii, as recovered by both single and multiple genes while the phylogenetic relationship of three closely related species, D. punctatus, D. tabulaeformis, D. spectabilis, could not be resolved with the traditional tree-building methods. Additionally, we find the standard COI barcode outperforms two nuclear ITS genes, whatever the methods used. On average, the COI barcode achieved a success rate of 94.10–97.40%, while ITS1 and ITS2 obtained a success rate of 64.70–81.60%, indicating ITS genes are less suitable for species identification in this case. We propose the use of an overall success rate of species identification that takes both sequencing success and assignation success into account, since species identification success rates with multiple-gene barcoding system were generally overestimated, especially by tree-based methods, where only successfully sequenced DNA sequences were used to construct a phylogenetic tree. Non-tree based methods, such as MD, BCM, and BP approaches, presented advantages over tree-based methods by reporting the overall success rates with statistical significance. 
In addition, our results indicate that the most closely related species, D. punctatus, D. tabulaeformis, and D. spectabilis, may still be in the process of incomplete lineage sorting, with occasional hybridizations occurring among them.
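The proposed overall success rate combines sequencing success with assignment success. The abstract does not give the exact formula, so the sketch below assumes the simplest reading, the product of the two rates. The function name and the sample counts are hypothetical.

```python
def overall_success_rate(n_samples, n_sequenced, n_correct):
    """Overall identification success, assumed here to be
    (sequencing success) x (assignment success among sequenced samples)."""
    if n_samples == 0 or not 0 <= n_correct <= n_sequenced <= n_samples:
        raise ValueError("need 0 <= n_correct <= n_sequenced <= n_samples, n_samples > 0")
    sequencing_success = n_sequenced / n_samples
    # If nothing was sequenced, nothing can be assigned.
    assignment_success = n_correct / n_sequenced if n_sequenced else 0.0
    return sequencing_success * assignment_success


# Hypothetical marker: 100 specimens, 80 sequenced, 76 correctly assigned.
# Assignment success alone is 95%, but the overall rate is lower.
print(overall_success_rate(100, 80, 76))
```

The example illustrates the overestimation the abstract describes: a rate computed only over successfully sequenced specimens ignores those that never yielded a sequence.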
The Dynamics of Supply and Demand in mRNA Translation
We study the elongation stage of mRNA translation in eukaryotes and find that, in contrast to the assumptions of previous models, both the supply and the demand for tRNA resources are important for determining elongation rates. We find that increasing the initiation rate of translation can lead to the depletion of some species of aa-tRNA, which in turn can lead to slow codons and queueing. Particularly striking “competition” effects are observed in simulations of multiple species of mRNA which are reliant on the same pool of tRNA resources. These simulations are based on a recent model of elongation which we use to study the translation of mRNA sequences from the Saccharomyces cerevisiae genome. This model includes the dynamics of the use and recharging of amino acid tRNA complexes, and we show via Monte Carlo simulation that this has a dramatic effect on the protein production behaviour of the system.
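The supply-and-demand idea can be illustrated with a toy Monte Carlo sketch. This is not the authors' model: it assumes a single mRNA, ribosomes that occupy one codon each, a finite pool of charged tRNA per codon type, and a fixed per-step recharging probability. All names and parameter values are hypothetical.

```python
import random


def simulate(codons, trna_total, alpha, recharge_prob, steps, seed=0):
    """Toy exclusion-process simulation of elongation with finite tRNA pools.

    codons        : list of tRNA-type labels, one per codon
    trna_total    : dict mapping tRNA type -> total number of tRNAs
    alpha         : initiation probability per attempt
    recharge_prob : per-step probability that one uncharged tRNA of a type is recharged
    Returns (number of completed proteins, dict of charged tRNA counts).
    """
    rng = random.Random(seed)
    length = len(codons)
    occupied = [False] * length
    charged = dict(trna_total)  # every tRNA starts charged
    proteins = 0
    for _ in range(steps):
        # Supply side: recharge at most one tRNA of each type per step.
        for t in charged:
            if charged[t] < trna_total[t] and rng.random() < recharge_prob:
                charged[t] += 1
        # Random sequential update; -1 stands for an initiation attempt.
        site = rng.randrange(-1, length)
        if site == -1:
            if not occupied[0] and rng.random() < alpha:
                occupied[0] = True
            continue
        if not occupied[site]:
            continue
        is_last = site == length - 1
        if not is_last and occupied[site + 1]:
            continue  # blocked by the ribosome ahead (queueing)
        t = codons[site]
        # Demand side: the hop succeeds with probability equal to the charged fraction.
        if rng.random() < charged[t] / trna_total[t]:
            charged[t] -= 1  # the tRNA is used and must be recharged
            occupied[site] = False
            if is_last:
                proteins += 1
            else:
                occupied[site + 1] = True
    return proteins, charged


codons = ["a", "b", "a", "c", "a", "b", "a", "a"]
pools = {"a": 5, "b": 5, "c": 5}
for alpha in (0.1, 0.9):
    print(alpha, simulate(codons, pools, alpha, recharge_prob=0.05, steps=20000))
```

In this sketch, raising `alpha` increases demand on the heavily used type `"a"`, so its charged pool can run low and codons reading it become slow, which is the qualitative effect the abstract describes.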
Analysis and Prediction of the Metabolic Stability of Proteins Based on Their Sequential Features, Subcellular Locations and Interaction Networks
Metabolic stability is a very important idiosyncrasy of proteins that is related to their global flexibility, intramolecular fluctuations, various internal dynamic processes, as well as many marvelous biological functions. Determining a protein's metabolic stability would provide useful information for in-depth understanding of the dynamic action mechanisms of proteins. Although several experimental methods have been developed to measure a protein's metabolic stability, they are time-consuming and expensive. Reported in this paper is a computational method, which is characterized by (1) integrating various properties of proteins, such as biochemical and physicochemical properties, subcellular locations, network properties and protein complex property, (2) using the mRMR (Maximum Relevance & Minimum Redundancy) principle and the IFS (Incremental Feature Selection) procedure to optimize the prediction engine, and (3) being able to identify proteins among the four types: “short”, “medium”, “long”, and “extra-long” half-life spans. Our analysis revealed that the following seven characters played major roles in determining the stability of proteins: (1) KEGG enrichment scores of the protein and its neighbors in the network, (2) subcellular locations, (3) polarity, (4) amino acid composition, (5) hydrophobicity, (6) secondary structure propensity, and (7) the number of protein complexes in which the protein is involved. It was observed that there was an intriguing correlation between the predicted metabolic stability of some proteins and the real half-life of the drugs designed to target them. These findings might provide useful insights for designing protein-stability-relevant drugs. The computational method can also be used as a large-scale tool for annotating the metabolic stability for the avalanche of protein sequences generated in the post-genomic age.
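The IFS procedure mentioned above can be sketched generically: features already ranked (for example by mRMR) are added one at a time, and the prefix that scores best is kept. The sketch assumes a caller-supplied `evaluate` function returning a score such as cross-validated accuracy. The feature names and the toy scoring rule are hypothetical, not the study's data or classifier.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Return the best-scoring prefix of a ranked feature list.

    ranked_features : features ordered from most to least important
    evaluate        : callable taking a list of features, returning a score
    """
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])
        score = evaluate(subset)
        # Keep the smallest prefix that reaches the highest score.
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Toy stand-in for a classifier: informative features help, every feature costs a little.
informative = {"kegg_score", "location"}
ranked = ["kegg_score", "location", "noise_1", "noise_2"]


def toy_evaluate(subset):
    return len(set(subset) & informative) - 0.1 * len(subset)


print(incremental_feature_selection(ranked, toy_evaluate))
```

In practice `evaluate` would train and validate the actual predictor on each candidate subset, so the cost of IFS grows with the number of ranked features.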
Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences
Background: Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences.
Results: The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozen modern competing predictors show that MODAS provides the best overall accuracy that ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reduction when compared against the best performing competing method on the two largest datasets. The proposed predictor achieves 58% accuracy for the membrane proteins class, which is not considered by the majority of existing methods, even though this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes.
Conclusions: The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/.
Prevalence and trend of hepatitis C virus infection among blood donors in Chinese mainland: a systematic review and meta-analysis
Background: Blood transfusion is one of the most common transmission pathways of hepatitis C virus (HCV). This paper aims to provide a comprehensive and reliable tabulation of available data on the epidemiological characteristics and risk factors for HCV infection among blood donors in Chinese mainland, so as to help make prevention strategies and guide further research.
Methods: A systematic review was constructed based on the computerized literature database. Infection rates and 95% confidence intervals (95% CI) were calculated using the approximate normal distribution model. Odds ratios and 95% CI were calculated by fixed or random effects models. Data manipulation and statistical analyses were performed using STATA 10.0, and ArcGIS 9.3 was used for map construction.
Results: Two hundred and sixty-five studies met our inclusion criteria. The pooled prevalence of HCV infection among blood donors in Chinese mainland was 8.68% (95% CI: 8.01%-9.39%), and the epidemic was more severe in North and Central China, especially in Henan and Hebei, while a significantly lower rate was found in Yunnan. Notably, before 1998 the pooled prevalence of HCV infection was 12.87% (95% CI: 11.25%-14.56%) among blood donors, but decreased to 1.71% (95% CI: 1.43%-1.99%) after 1998. No significant difference was found in HCV infection rates between male and female blood donors, or among donors of different blood types. The prevalence of HCV infection was found to increase with age. During 1994-1995, the prevalence rate reached its highest level at 15.78% (95% CI: 12.21%-19.75%), and showed a decreasing trend in the following years. A significant difference was found among groups with different blood donation types: plasma donors had a higher prevalence of HCV infection than whole blood donors (33.95% vs 7.9%).
Conclusions: The prevalence of HCV infection has rapidly decreased since 1998 and remained at a low level in recent years, but some provinces showed relatively higher prevalence than the general population. Efficient measures are urgently needed to prevent HCV secondary transmission and control chronic progress, and the key to reducing the HCV incidence among blood donors is to encourage true voluntary blood donors, strictly implement blood donation law, and avoid cross-infection.
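The methods mention infection rates with 95% confidence intervals under an approximate normal distribution model. As a minimal sketch for a single study (not the pooled meta-analytic estimate, whose weighting the abstract does not specify), the interval could be computed as p ± 1.96·sqrt(p(1−p)/n). The function name and the counts are hypothetical.

```python
import math


def prevalence_ci(positives, n, z=1.96):
    """Prevalence and normal-approximation confidence interval for one study.

    Returns (p, lower, upper); bounds are clipped to [0, 1].
    """
    if n <= 0 or not 0 <= positives <= n:
        raise ValueError("need n > 0 and 0 <= positives <= n")
    p = positives / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical study: 87 HCV-positive donors among 1,000 screened.
p, lower, upper = prevalence_ci(87, 1000)
print(f"{p:.2%} (95% CI: {lower:.2%}-{upper:.2%})")
```

The normal approximation becomes unreliable when the prevalence is very close to 0 or 1 or the sample is small, in which case an exact or Wilson interval is usually preferred.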