34 research outputs found

    A Combined Markov Chain Model and Generalized Projection Nonnegative Matrix Factorization Approach for Fault Diagnosis

    The presence of sets of incomplete measurements is a significant issue in the real-world application of multivariate statistical process monitoring models for industrial process fault detection. Since the missing data in the incomplete measurements are usually correlated with some of the available variables, these measurements can still be used if an efficient algorithm is available. To resolve the problem, a novel method combining a Markov chain model and generalized projection nonnegative matrix factorization (MCM-GPNMF) is proposed to detect and diagnose faults in industrial processes. The basic idea of the approach is to use MCM-GPNMF to extract the dominant variables from incomplete process data and to combine them with statistical process monitoring techniques. The T_G^2 and SPE_G statistics are defined as online monitoring quantities for fault detection, and the corresponding contribution plots are used for fault isolation. The proposed method is applied to a 1000 MW unit boiler process. The simulation results clearly illustrate the feasibility of the proposed method.

    Low-complexity full-field ultrafast nonlinear dynamics prediction by a convolutional feature separation modeling method

    The modeling and prediction of ultrafast nonlinear dynamics in optical fiber are essential for laser design, experimental optimization, and other fundamental applications. The traditional propagation modeling method based on the nonlinear Schrödinger equation (NLSE) has long been regarded as extremely time-consuming, especially for designing and optimizing experiments. The recurrent neural network (RNN) has been implemented as an accurate intensity prediction tool with reduced complexity and good generalization capability. However, the complexity of long grid input points and the flexibility of the neural network structure should be further optimized for broader applications. Here, we propose a convolutional feature separation modeling method to predict full-field ultrafast nonlinear dynamics with low complexity and high flexibility, where the linear effects are first modeled by NLSE-derived methods, and a convolutional deep learning method is then implemented for nonlinearity modeling. With this method, the temporal relevance of nonlinear effects is substantially shortened, and the parameters and scale of the neural networks can be greatly reduced. The running time achieves a 94% reduction versus NLSE and an 87% reduction versus RNN without accuracy deterioration. In addition, the input pulse conditions, including grid point numbers, durations, peak powers, and propagation distance, can be flexibly changed during the prediction process. The results represent a remarkable improvement in ultrafast nonlinear dynamics prediction, and this work also provides novel perspectives on the feature separation modeling method for quickly and flexibly studying nonlinear characteristics in other fields. Comment: 15 pages, 9 figures

    Analysis of the transcriptome of Panax notoginseng root uncovers putative triterpene saponin-biosynthetic genes and genetic markers

    Background: Panax notoginseng (Burk) F.H. Chen is an important medicinal plant of the Araliaceae family. Triterpene saponins are the bioactive constituents in P. notoginseng. However, available genomic information regarding this plant is limited. Moreover, details of triterpene saponin biosynthesis in the Panax species are largely unknown. Results: Using the 454 pyrosequencing technology, a one-quarter GS FLX titanium run resulted in 188,185 reads with an average length of 410 bases for P. notoginseng root. These reads were processed and assembled by 454 GS De Novo Assembler software into 30,852 unique sequences. A total of 70.2% of unique sequences were annotated by Basic Local Alignment Search Tool (BLAST) similarity searches against public sequence databases. The Kyoto Encyclopedia of Genes and Genomes (KEGG) assignment discovered 41 unique sequences representing 11 genes involved in triterpene saponin backbone biosynthesis in the 454-EST dataset. In particular, the transcript encoding dammarenediol synthase (DS), which is the first committed enzyme in the biosynthetic pathway of major triterpene saponins, is highly expressed in the root of four-year-old P. notoginseng. Notably, the candidate cytochrome P450 (Pn02132 and Pn00158) and UDP-glycosyltransferase (Pn00082) genes most likely to be involved in hydroxylation or glycosylation of aglycones for triterpene saponin biosynthesis were identified from 174 cytochrome P450s and 242 glycosyltransferases, respectively, by phylogenetic analysis. Putative transcription factors were detected in 906 unique sequences, including Myb, homeobox, WRKY, basic helix-loop-helix (bHLH), and other family proteins. Additionally, a total of 2,772 simple sequence repeats (SSRs) were identified from 2,361 unique sequences, of which di-nucleotide motifs were the most abundant. Conclusion: This study is the first to present a large-scale EST dataset for P. notoginseng root acquired by next-generation sequencing (NGS) technology. The candidate genes involved in triterpene saponin biosynthesis, including the putative CYP450s and UGTs, were obtained in this study. Additionally, the identification of SSRs provided plenty of genetic markers for molecular breeding and genetics applications in this species. These data will provide information on gene discovery, transcriptional regulation and marker-assisted selection for P. notoginseng. The dataset establishes an important foundation for studies aimed at ensuring adequate drug resources for this species.
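
    The SSR identification mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which tool or thresholds were used, so the function name `find_ssrs`, the regular-expression approach, and the default of six repeats are all assumptions made here for illustration only.

```python
import re

def find_ssrs(seq, unit_len=2, min_repeats=6):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Hypothetical helper: the threshold of 6 repeats is an assumed default,
    not a value taken from the study.
    """
    # A motif of unit_len bases followed by at least (min_repeats - 1) copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAA..., which are not
            # di-nucleotide (or longer) motifs.
            continue
        hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits

# Made-up sequence containing seven AG repeats.
print(find_ssrs("GGC" + "AG" * 7 + "TTC"))
```

    Only perfect repeats are reported; dedicated SSR tools typically also handle compound and interrupted repeats.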

    Sequencing and Genetic Variation of Multidrug Resistance Plasmids in Klebsiella pneumoniae

    BACKGROUND: The development of multidrug resistance is a major problem in the treatment of pathogenic microorganisms by distinct antimicrobial agents. Characterizing the genetic variation among plasmids from different bacterial species or strains is a key step towards understanding the mechanism of virulence and their evolution. RESULTS: We applied a deep sequencing approach to 206 clinical strains of Klebsiella pneumoniae collected from 2002 to 2008 to understand the genetic variation of multidrug resistance plasmids, and to reveal the dynamic change of drug resistance over time. First, we sequenced three plasmids (70 Kb, 94 Kb, and 147 Kb) from a clonal strain of K. pneumoniae using Sanger sequencing. Using the Illumina sequencing technology, we obtained more than 17 million short reads from two pooled plasmid samples. We mapped these short reads to the three reference plasmid sequences, and identified a large number of single nucleotide polymorphisms (SNPs) in these pooled plasmids. Many of these SNPs are present in drug-resistance genes. We also found that a significant fraction of short reads could not be mapped to the reference sequences, indicating a high degree of genetic variation among the collection of K. pneumoniae isolates. Moreover, we found that plasmid conjugative transfer genes and antibiotic resistance genes are more likely to be under positive selection, as indicated by the elevated rates of nonsynonymous substitution. CONCLUSION: These data represent the first large-scale study of genetic variation in multidrug resistance plasmids and provide insight into the mechanisms of plasmid diversification and the genetic basis of antibiotic resistance.

    Extensive pyrosequencing reveals frequent intra-genomic variations of internal transcribed spacer regions of nuclear ribosomal DNA

    BACKGROUND: Internal transcribed spacer of nuclear ribosomal DNA (nrDNA) is already one of the most popular phylogenetic and DNA barcoding markers. However, the existence of its multiple copies has complicated such usage, and a detailed characterization of intra-genomic variations is critical to address such concerns. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we used sequence-tagged pyrosequencing and genome-wide analyses to characterize intra-genomic variations of internal transcribed spacer 2 (ITS2) regions from 178 plant species. We discovered that mutation of ITS2 is frequent, with a mean of 35 variants per species. On average, the three most abundant variants make up 91% of all ITS2 copies. Moreover, we found that different congeneric species share identical variants in 13 genera. Interestingly, different species across different genera also share identical variants. In particular, one minor variant of ITS2 in Eleutherococcus giraldii was found to be identical to the ITS2 major variant of Panax ginseng, both from the Araliaceae family. In addition, DNA barcoding gap analysis showed that the intra-genomic distances were markedly smaller than those of the intra-specific or inter-specific variants. When each of the 5543 variants was examined for its species discrimination efficiency, a 97% success rate was obtained at the species level. CONCLUSIONS: Identification of identical ITS2 variants across intra-generic or inter-generic species revealed a complex species evolutionary history, possibly involving horizontal gene transfer and ancestral hybridization. Although intra-genomic multiple variants are frequently found within each genome, the usage of the major variants alone is sufficient for phylogeny construction and species determination in most cases. Furthermore, the inclusion of minor variants further improves the resolution of species identification.
    Jingyuan Song, Linchun Shi, Dezhu Li, Yongzhen Sun, Yunyun Niu, Zhiduan Chen, Hongmei Luo, Xiaohui Pang, Zhiying Sun, Chang Liu, Aiping Lv, Youping Deng, Zachary Larson-Rabin, Mike Wilkinson and Shilin Che
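
    The statement that the three most abundant variants account for most ITS2 copies corresponds to a simple share calculation, sketched below. The function name `top_variant_share` and the read counts are invented for illustration and are not data from the study.

```python
def top_variant_share(counts, k=3):
    """Fraction of all copies covered by the k most abundant variants.

    `counts` maps a variant identifier to its copy (read) count.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no copies counted")
    top = sorted(counts.values(), reverse=True)[:k]
    return sum(top) / total

# Hypothetical counts for one species (not from the paper).
example = {"v1": 700, "v2": 150, "v3": 60, "v4": 50, "v5": 40}
print(round(top_variant_share(example), 2))
```

    Averaging this per-species share over all species would give a figure comparable in kind to the one quoted in the abstract.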

    Pyrosequencing of the Camptotheca acuminata transcriptome reveals putative genes involved in camptothecin biosynthesis and transport

    Background: Camptotheca acuminata is a Nyssaceae plant, often called the "happy tree", which is indigenous to Southern China. C. acuminata produces the terpenoid indole alkaloid camptothecin (CPT), which exhibits clinical effects in various cancer treatments. Despite its importance, little is known about the transcriptome of C. acuminata and the mechanism of CPT biosynthesis, as only a few nucleotide sequences are included in the GenBank database. Results: From a constructed cDNA library of young C. acuminata leaves, a total of 30,358 unigenes, with an average length of 403 bp, were obtained after assembly of 74,858 high-quality reads using GS De Novo assembler software. Through functional annotation, a total of 21,213 unigenes were annotated at least once against the NCBI nucleotide (Nt), non-redundant protein (Nr), Uniprot/SwissProt, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Arabidopsis thaliana proteome (TAIR) databases. Further analysis identified 521 ESTs representing 20 enzyme genes that are involved in the backbone of the CPT biosynthetic pathway in the library. Three putative genes in the upstream pathway, including genes for geraniol-10-hydroxylase (CaPG10H), secologanin synthase (CaPSCS), and strictosidine synthase (CaPSTR), were cloned and analyzed. The expression levels of the three genes were also measured using qRT-PCR in C. acuminata. With respect to the branch pathway of CPT synthesis, six cytochrome P450 transcripts were selected as candidates by measuring transcript expression in different tissues using qRT-PCR. In addition, one glucosidase gene was identified that might participate in CPT biosynthesis. For CPT transport, three of 21 transcripts for multidrug resistance protein (MDR) transporters were also screened from the dataset by their annotation result and gene expression analysis. Conclusion: This study produced a large amount of transcriptome data from C. acuminata by 454 pyrosequencing. According to EST annotation, catalytic features prediction, and expression analysis, novel putative transcripts involved in CPT biosynthesis and transport were discovered in C. acuminata. This study will facilitate further identification of key enzymes and transporter genes in C. acuminata.

    Raw Data-Based Motion Compensation for High-Resolution Sliding Spotlight Synthetic Aperture Radar

    For accurate motion compensation (MOCO) in airborne synthetic aperture radar (SAR) imaging, a high-precision inertial navigation system (INS) is required. However, an INS is not always precise enough or is sometimes not even included in airborne SAR systems. In this paper, a new raw-data-based, range-invariant motion compensation approach, which can effectively extract the displacements in the line-of-sight (LOS) direction, is proposed for high-resolution sliding spotlight SAR mode. In this approach, the sub-aperture radial accelerations of the airborne platform are estimated via a well-developed weighted total least squares (WTLS) method that accounts for the time-varying beam direction. The effectiveness of the proposed approach is validated on two airborne sliding spotlight C-band SAR raw datasets containing different types of terrain, with a high spatial resolution of about 0.15 m in azimuth.

    Plastome evolution in the genus Sium (Apiaceae, Oenantheae) inferred from phylogenomic and comparative analyses

    Background: Sium L. (Apiaceae) is a small genus distributed primarily in Eurasia, with one species also occurring in North America. Recently, its circumscription has been revised to include 10 species; however, the phylogenetic relationships within its two inclusive clades were poorly supported or collapsed in previous studies based on nuclear ribosomal DNA ITS or cpDNA sequences. To identify molecular markers suitable for future intraspecific phylogeographic and population genetic studies, and to evaluate the efficacy of the plastome in resolving the phylogenetic relationships of the genus, the complete chloroplast (cp) genomes of six Sium species were sequenced. Results: The Sium plastomes exhibited the typical quadripartite structure of Apiaceae and most other higher plant plastid DNAs, and were relatively conserved in their size (153,029–155,006 bp), gene arrangement and content (with 114 unique genes). A total of 61–67 SSRs, along with 12 highly divergent regions (trnQ, trnG-atpA, trnE-trnT, rps4-trnT, accD-psbI, rpl16, ycf1-ndhF, ndhF-rpl32, rpl32-trnL, ndhE-ndhG, ycf1a and ycf1b), were discovered in the plastomes. No significant IR length variation was detected, showing that plastome evolution was conserved within this genus. Phylogenomic analysis based on whole chloroplast genome sequences produced a highly resolved phylogenetic tree, in which the monophyly of Sium, as well as the sister relationship of its two inclusive clades, was strongly supported. Conclusions: The plastome sequences could greatly improve phylogenetic resolution, and will provide genomic resources and potential markers useful for future studies of the genus.