843 research outputs found

    Prediction of heterogeneous differential genes by detecting outliers to a Gaussian tight cluster

    © 2013 Yang and Yang; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Background: Heterogeneously and differentially expressed genes (hDEGs) are a common phenomenon due to biological diversity. An hDEG is often observed in gene expression experiments (with two experimental conditions) where it is highly expressed in a few experimental samples, or in drug trial experiments for cancer studies with drug resistance heterogeneity among the disease group. These highly expressed samples are called outliers. Accurate detection of outliers among hDEGs is therefore desirable for disease diagnosis and effective drug design. The standard approach for detecting hDEGs is to choose the appropriate subset of outliers to represent the experimental group. However, existing methods typically overlook hDEGs with very few outliers. Results: We present a simple algorithm for detecting hDEGs by sequentially testing for potential outliers with respect to a tight cluster of non-outliers, among an ordered subset of the experimental samples. This avoids making any restrictive assumptions about how the outliers are distributed. We use simulated and real data to illustrate that the proposed algorithm achieves a good separation between the tight cluster of low expressions and the outliers for hDEGs. Conclusions: The proposed algorithm assesses each potential outlier in relation to the cluster of potential outliers without making explicit assumptions about the outlier distribution. Simulated examples and breast cancer data sets are used to illustrate the suitability of the proposed algorithm for identifying hDEGs with small numbers of outliers.

    Progress in solid-state NMR studies of electrode materials for lithium ion batteries

    Solid-state NMR is an effective technique for studying local structural changes and the chemical environment around atoms, since it can monitor how atomic environments respond to variations in adjacent metal or carbon content. Based on changes in the Li-6 and Li-7 NMR spectra, the coordination of lithium with neighboring metal atoms and the structural changes of the materials during the charge/discharge cycle can be clearly identified. This article reviews developments in the microstructural analysis of electrode materials and mechanistic studies of Li+ intercalation into various materials using solid-state NMR techniques.

    Improved prediction of RNA secondary structure by integrating the free energy model with restraints derived from experimental probing data.

    Published. Publication types: Evaluation Studies; Journal Article; Research Support, Non-U.S. Gov't. Recently, several experimental techniques have emerged for probing RNA structures based on high-throughput sequencing. However, most secondary structure prediction tools that incorporate probing data are designed and optimized for particular types of experiments. For example, RNAstructure-Fold is optimized for SHAPE data, while SeqFold is optimized for PARS data. Here, we report a new RNA secondary structure prediction method, restrained MaxExpect (RME), which can incorporate multiple types of experimental probing data and is based on a free energy model and an MEA (maximizing expected accuracy) algorithm. We first demonstrated that RME substantially improved secondary structure prediction with perfect restraints (base pair information of known structures). Next, we collected structure-probing data from diverse experiments (e.g. SHAPE, PARS and DMS-seq) and transformed them into a unified set of pairing probabilities with a posterior probabilistic model. By using the probability scores as restraints in RME, we compared its secondary structure prediction performance with two other well-known tools, RNAstructure-Fold (based on a free energy minimization algorithm) and SeqFold (based on a sampling algorithm). For SHAPE data, RME and RNAstructure-Fold performed better than SeqFold, because they markedly altered the energy model with the experimental restraints. For high-throughput data (e.g. PARS and DMS-seq) with lower probing efficiency, the secondary structure prediction performances of the tested tools were comparable, with performance improvements for only a portion of the tested RNAs. However, when the effects of tertiary structure and protein interactions were removed, RME showed the highest prediction accuracy in the DMS-accessible regions by incorporating in vivo DMS-seq data. Funding: National Key Basic Research Program of China [2012CB316503]; National High-Tech Research and Development Program of China [2014AA021103]; National Natural Science Foundation of China [31271402]; Tsinghua University Initiative Scientific Research Program [2014z21045]; Hong Kong Research Grants Council Early Career Scheme [419612 to K.Y.]; National Science Foundation [1339282 to D.H.M.]; Computing Platform of the National Protein Facilities (Tsinghua University). Funding for open access charge: National Natural Science Foundation of China [31271402].

    Entropy Projection Curved Gabor with Random Forest and SVM for Face Recognition

    In this work, we propose a workflow for face recognition under occlusion that uses the entropy projection of the curved Gabor filter to create a representative and compact feature vector describing a face. Although the entropy projection already yields a reduced vector, there is still room for further dimensionality reduction. Therefore, we use a Random Forest classifier as an attribute selector, providing a 97% reduction of the original vector while keeping suitable accuracy. A set of experiments on three public image databases (AR Face, Extended Yale B with occlusion, and FERET) illustrates the proposed methodology, evaluated using an SVM classifier. The results are promising when compared with the approaches available in the literature: 98.05% accuracy for the complete AR Face, 97.26% for FERET, and 81.66% for Yale with 50% occlusion.

    How to find simple and accurate rules for viral protease cleavage specificities

    Background: Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases and some have been applied to extract cleavage rules from data. However, the hitherto proposed methods for extracting rules have been neither easy to understand nor very accurate. To be practically useful, cleavage rules should be accurate, compact, and expressed in an easily understandable way. Results: A new method is presented for producing cleavage rules for viral proteases with seemingly complex cleavage profiles. The method is based on orthogonal search-based rule extraction (OSRE) combined with spectral clustering. It is demonstrated on substrate data sets for human immunodeficiency virus type 1 (HIV-1) protease and hepatitis C (HCV) NS3/4A protease, showing excellent prediction performance for both HIV-1 cleavage and HCV NS3/4A cleavage, agreeing with observed HCV genotype differences. New cleavage rules (consensus sequences) are suggested for HIV-1 and HCV NS3/4A cleavages. The practical usability of the method is also demonstrated by using it to predict the location of an internal cleavage site in the HCV NS3 protease and to correct the location of a previously reported internal cleavage site in the HCV NS3 protease. The method is fast to converge and yields accurate rules, on par with previous results for HIV-1 protease and better than previous state-of-the-art for HCV NS3/4A protease. Moreover, the rules are fewer and simpler than previously obtained with rule extraction methods. Conclusion: A rule extraction methodology by searching for multivariate low-order predicates yields results that significantly outperform existing rule bases on out-of-sample data, but are more transparent to expert users. The approach yields rules that are easy to use and useful for interpreting experimental data.

    A Noise Trimming and Positional Significance of Transposon Insertion System to Identify Essential Genes in Yersinia pestis

    This is the final version of the article. Available from Springer Nature via the DOI in this record. Massively parallel sequencing technology coupled with saturation mutagenesis has provided new and global insights into gene functions and roles. At a simplistic level, the frequency of mutations within genes can indicate the degree of essentiality. However, this approach neglects to take account of the positional significance of mutations: the function of a gene is less likely to be disrupted by a mutation close to the distal ends. Therefore, a systematic bioinformatics approach to improve the reliability of essential gene identification is desirable. We report here a parametric model which introduces a novel mutation feature together with a noise trimming approach to predict the biological significance of Tn5 mutations. We show improved performance of essential gene prediction in the bacterium Yersinia pestis, the causative agent of plague. This method would have broad applicability to other organisms and to the identification of genes which are essential for competitiveness or survival under a broad range of stresses. This work was supported by the Defence Science and Technology Laboratory under contract DSTLX-1000060221 (WP1).

    Seminal plasma as a source of prostate cancer peptide biomarker candidates for detection of indolent and advanced disease

    Background: Extensive prostate-specific antigen screening for prostate cancer generates a high number of unnecessary biopsies and over-treatment due to insufficient differentiation between indolent and aggressive tumours. We hypothesized that seminal plasma is a robust source of novel prostate cancer (PCa) biomarkers with the potential to improve primary diagnosis of and to distinguish advanced from indolent disease. Methodology/Principal Findings: In an open-label case/control study 125 patients (70 PCa, 21 benign prostate hyperplasia, 25 chronic prostatitis, 9 healthy controls) were enrolled in 3 centres. Biomarker panels a) for PCa diagnosis (comparison of PCa patients versus benign controls) and b) for advanced disease (comparison of patients with post surgery Gleason score <7 versus Gleason score >7) were sought. Independent cohorts were used for proteomic biomarker discovery and testing the performance of the identified biomarker profiles. Seminal plasma was profiled using capillary electrophoresis mass spectrometry. Pre-analytical stability and analytical precision of the proteome analysis were determined. Support vector machine learning was used for classification. Stepwise application of two biomarker signatures with 21 and 5 biomarkers provided 83% sensitivity and 67% specificity for PCa detection in a test set of samples. A panel of 11 biomarkers for advanced disease discriminated between patients with Gleason score 7 and organ-confined (<pT3a) or advanced (≥pT3a) disease with 80% sensitivity and 82% specificity in a preliminary validation setting. Seminal profiles showed excellent pre-analytical stability. Eight biomarkers were identified as fragments of N-acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase, prostatic acid phosphatase, stabilin-2, GTPase IMAP family member 6, semenogelin-1 and -2. Restricted sample size was the major limitation of the study. Conclusions/Significance: Seminal plasma represents a robust source of potential peptide markers for primary PCa diagnosis. Our findings warrant further prospective validation to confirm the diagnostic potential of identified seminal biomarker candidates.

    Lower expression of inducible nitric oxide synthase and higher expression of arginase in rat alveolar macrophages are linked to their susceptibility to Toxoplasma gondii infection.

    Rats are naturally resistant to Toxoplasma gondii infection, particularly the RH strain, while mice are not. Previous studies have demonstrated that inducible nitric oxide synthase (iNOS) and arginase-1 of rodent peritoneal macrophages are linked to the mechanism of resistance. As an increasing number of studies on human and animal infections are showing that pulmonary toxoplasmosis is one of the most severe clinical signs of T. gondii infection, we sought to determine whether T. gondii infection in alveolar macrophages of rats is also linked to the levels of iNOS and arginase-1 activity. Our results demonstrate that T. gondii could grow and proliferate in rat alveolar macrophages, both in vitro and in vivo, at levels higher than in resistant rat peritoneal macrophages and at levels comparable to those in sensitive mouse peritoneal macrophages. Lower activity and expression levels of iNOS and higher activity and expression levels of arginase-1 in rat alveolar macrophages were found to be linked to the susceptibility of these cells to T. gondii infection. These novel findings could aid a better understanding of the pathogenesis of clinical pulmonary toxoplasmosis in humans and domestic animals.

    Global Analysis of Genes Essential for Francisella tularensis Schu S4 Growth In Vitro and for Fitness during Competitive Infection of Fischer 344 Rats

    This is the final version. Available from American Society for Microbiology via the DOI in this record. The highly virulent intracellular pathogen Francisella tularensis is a Gram-negative bacterium that has a wide host range, including humans, and is the causative agent of tularemia. To identify new therapeutic drug targets and vaccine candidates and investigate the genetic basis of Francisella virulence in the Fischer 344 rat, we have constructed an F. tularensis Schu S4 transposon library. This library consists of more than 300,000 unique transposon mutants and represents a transposon insertion for every 6 bp of the genome. A transposon-directed insertion site sequencing (TraDIS) approach was used to identify 453 genes essential for growth in vitro. Many of these essential genes were mapped to key metabolic pathways, including glycolysis/gluconeogenesis, peptidoglycan synthesis, fatty acid biosynthesis, and the tricarboxylic acid (TCA) cycle. Additionally, 163 genes were identified as required for fitness during colonization of the Fischer 344 rat spleen. This in vivo selection screen was validated through the generation of marked deletion mutants that were individually assessed within a competitive index study against the wild-type F. tularensis Schu S4 strain. IMPORTANCE: The intracellular bacterial pathogen Francisella tularensis causes a disease in humans characterized by the rapid onset of nonspecific symptoms such as swollen lymph glands, fever, and headaches. F. tularensis is one of the most infectious bacteria known and following pulmonary exposure can have a mortality rate exceeding 50% if left untreated. The low infectious dose of this organism and concerns surrounding its potential as a biological weapon have heightened the need for effective and safe therapies. To expand the repertoire of targets for therapeutic development, we initiated a genome-wide analysis. This study has identified genes that are important for F. tularensis under in vitro and in vivo conditions, providing candidates that can be evaluated for vaccine or antibacterial development. Funding: Biotechnology & Biological Sciences Research Council (BBSRC); Defence Science and Technology Laboratory (DSTL).