30 research outputs found

    A novel neural response algorithm for protein function prediction

    BACKGROUND: Large amounts of data are being generated by high-throughput genome sequencing methods, but the rate of experimental functional characterization falls far behind. To fill the gap between the number of sequences and their annotations, fast and accurate automated annotation methods are required. Many methods, such as GOblet, GOFigure, and Gotcha, are based on BLAST search. Unfortunately, the sequence coverage of these methods is low because they cannot detect remote homologues. In addition, the lack of annotation specificity underlines the need to improve automated protein function prediction. RESULTS: We designed a novel automated protein functional assignment method based on the neural response algorithm, which simulates the neuronal behavior of the visual cortex in the human brain. First, we predict the most similar target protein for a given query protein and then assign its GO term to the query sequence. When assessed on the test set, our method ranked the actual leaf GO term among the top 5 probable GO terms with an accuracy of 86.93%. CONCLUSIONS: The proposed algorithm is the first instance of the neural response algorithm being used in the biological domain. The use of HMM profiles along with secondary structure information to define the neural response gives our method an edge over other available methods in annotation accuracy. Results of the 5-fold cross validation and the comparison with the PFP and FFPred servers indicate the strong performance of our method. The program, the dataset, and help files are available at http://www.jjwanglab.org/NRProF/.

    NRProF: Neural response based protein function prediction algorithm

    A large amount of proteomic data is being generated due to advancements in high-throughput genome sequencing, but the rate of functional annotation of these sequences falls far behind. To fill the gap between the number of sequences and their annotations, fast and accurate automated annotation methods are required. Many methods, such as GOblet, GOfigure, and Gotcha, are based on BLAST search. Unfortunately, the sequence coverage of these methods is low because they cannot detect remote homologues. The lack of annotation coverage of the existing methods calls for novel methods to improve protein function prediction. Here we present an automated protein functional assignment method based on the neural response algorithm, which simulates the neuronal behavior of the visual cortex in the human brain. The main idea of this algorithm is to define a distance metric that corresponds to the similarity of the subsequences and reflects how the human brain can distinguish different sequences. Given a query protein, we predict the most similar target protein using a two-layered neural response algorithm and assign the GO term of the target protein to the query. Our method predicted and ranked the actual leaf GO term among the top 5 probable GO terms with 87.66% accuracy. Results of the 5-fold cross validation and the comparison with the PFP and FFPred servers indicate the strong performance of our method. The NRProF program, the dataset, and help files are available at http://www.jjwanglab.org/NRProF/. © 2011 IEEE. The 2011 IEEE International Conference on Systems Biology (ISB), Zhuhai, China, 2-4 September 2011. In Conference Proceedings, 2011, p. 33-4
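
The nearest-target annotation transfer and top-5 evaluation described in this abstract can be illustrated with a minimal Python sketch. It is not the NRProF implementation: the k-mer cosine distance below is only a stand-in for the neural response distance metric, and the function names, toy sequences, and placeholder labels (`GO_A`, `GO_B`, `GO_C`) are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=2):
    """Count overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def distance(a, b, k=2):
    """Stand-in distance (assumption): 1 - cosine similarity of k-mer counts.

    The abstract's method uses a neural response metric instead.
    """
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    dot = sum(pa[m] * pb[m] for m in pa)
    norm = sqrt(sum(v * v for v in pa.values())) * sqrt(sum(v * v for v in pb.values()))
    return 1.0 - dot / norm if norm else 1.0


def rank_go_terms(query, targets, top_n=5):
    """Rank the labels of annotated targets by distance to the query.

    targets: list of (sequence, label) pairs.
    Returns up to top_n distinct labels, nearest target first.
    """
    ranked = sorted(targets, key=lambda t: distance(query, t[0]))
    terms = []
    for _, label in ranked:
        if label not in terms:
            terms.append(label)
        if len(terms) == top_n:
            break
    return terms


def top_n_accuracy(test_set, targets, top_n=5):
    """Fraction of queries whose true label appears among the top_n ranked labels."""
    hits = sum(true_label in rank_go_terms(q, targets, top_n) for q, true_label in test_set)
    return hits / len(test_set)


# Toy data: sequences and labels are placeholders, not real annotations.
targets = [("AAAAAA", "GO_A"), ("CCCCCC", "GO_B"), ("AAAACC", "GO_C")]
test_set = [("AAAAAC", "GO_A"), ("CCCCCA", "GO_C")]
print(rank_go_terms("AAAAAC", targets, top_n=2))
print(top_n_accuracy(test_set, targets, top_n=1))
```

In a 5-fold cross validation, the same `top_n_accuracy` call would be repeated with each fold held out as `test_set` and the remaining annotated proteins used as `targets`.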

    ProF: neural response based protein function prediction algorithm

    Poster Presentation: P-H001. A large amount of proteomic data is being generated due to advancements in high-throughput genome sequencing methods, but the rate of experimental functional characterization falls far behind. To fill the gap between the number of sequences and their annotations, fast and accurate automated annotation methods are required. Many methods, such as GOblet, GOfigure, and Gotcha, are based on BLAST search. Unfortunately, the sequence coverage of these methods is low because they cannot detect remote homologues. The lack of annotation specificity and the high complexity of the existing methods call for improved automated protein function prediction methods. Here we present a novel automated protein functional assignment method based on the neural response algorithm, which simulates the neuronal behavior of the visual cortex in the human brain. The main idea of this algorithm is to define a distance metric that corresponds to the similarity of the subsequences and reflects how the human brain can distinguish between different sequences. We predicted the most similar target protein for a given query protein using the two-layered neural response algorithm and assigned the GO term associated with the target sequence to the query sequence. Our method predicted and ranked the actual leaf GO term among the top 5 probable GO terms with 87.66% accuracy. Results of the 5-fold cross validation and the comparison with the PFP and FFPred servers indicate the strong performance of our method. The 2011 Hong Kong Inter-University Biochemistry Postgraduate Symposium, Hong Kong, 11 June 2011

    SpliceNet: recovering splicing isoform-specific differential gene networks from RNA-Seq data of normal and diseased samples

    Conventionally, overall gene expression levels from microarrays are used to infer gene networks, but it is challenging to account for splicing isoforms. High-throughput RNA sequencing has made splice variant profiling practical. However, its true merit in quantifying splicing isoforms and isoform-specific exon expression is not well explored in inferring gene networks. This study demonstrates SpliceNet, a method to infer isoform-specific co-expression networks from exon-level RNA-Seq data, using large-dimensional trace. It goes beyond differentially expressed genes and infers splicing isoform network changes between normal and diseased samples. It eases the sample size bottleneck; evaluations on simulated data and on lung cancer-specific ERBB2 and MAPK signaling pathways, with varying numbers of samples, demonstrate its merit in handling datasets with a high exon to sample size ratio. The inferred network rewiring of the well-established Bcl-x and EGFR centered networks from lung adenocarcinoma expression data is in good agreement with the literature. Gene-level evaluations demonstrate substantially better performance of SpliceNet than canonical correlation analysis, a method that is currently applied to exon-level RNA-Seq data. SpliceNet can also be applied to exon array data. SpliceNet is distributed as an R package available at http://www.jjwanglab.org/SpliceNet.

    C-Terminal Region of EBNA-2 Determines the Superior Transforming Ability of Type 1 Epstein-Barr Virus by Enhanced Gene Regulation of LMP-1 and CXCR7

    Type 1 Epstein-Barr virus (EBV) strains immortalize B lymphocytes in vitro much more efficiently than type 2 EBV, a difference previously mapped to the EBNA-2 locus. Here we demonstrate that the greater transforming activity of type 1 EBV correlates with a stronger and more rapid induction of the viral oncogene LMP-1 and the cell gene CXCR7 (which are both required for proliferation of EBV-LCLs) during infection of primary B cells with recombinant viruses. Surprisingly, although the major sequence differences between type 1 and type 2 EBNA-2 lie in the N-terminal parts of the protein, the superior ability of type 1 EBNA-2 to induce proliferation of EBV-infected lymphoblasts is mostly determined by the C-terminus of EBNA-2. Substitution of the C-terminus of type 1 EBNA-2 into the type 2 protein is sufficient to confer a type 1 growth phenotype and type 1 expression levels of LMP-1 and CXCR7 in an EREB2.5 cell growth assay. Within this region, the RG, CR7 and TAD domains are the minimum type 1 sequences required. Sequencing the C-terminus of EBNA-2 from additional EBV isolates showed high sequence identity within type 1 isolates and within type 2 isolates, indicating that the functional differences mapped are typical of EBV type sequences. The results indicate that the C-terminus of EBNA-2 accounts for the greater ability of type 1 EBV to promote B cell proliferation, through mechanisms that include higher induction of genes (LMP-1 and CXCR7) required for proliferation and survival of EBV-LCLs.

    NRProF: Neural Response Based Protein Function Prediction Algorithm

    A large amount of proteomic data is being generated due to advancements in high-throughput genome sequencing methods, but the rate of experimental functional characterization falls far behind. To fill the gap between the number of sequences and their annotations, fast and accurate automated annotation methods are required. Many methods, such as GOblet, GOfigure, and Gotcha, are based on BLAST search. Unfortunately, the sequence coverage of these methods is low because they cannot detect remote homologues. The lack of annotation specificity and the high complexity of the existing methods call for improved automated protein function prediction methods. Here we present a novel automated protein functional assignment method based on the neural response algorithm, which simulates the neuronal behavior of the visual cortex in the human brain. The main idea of this algorithm is to define a distance metric that corresponds to the similarity of the subsequences and reflects how the human brain can distinguish between different sequences. We predicted the most similar target protein for a given query protein using the two-layered neural response algorithm and assigned the GO term associated with the target sequence to the query sequence. Our method predicted and ranked the actual leaf GO term among the top 5 probable GO terms with 87.66% accuracy. Results of the 5-fold cross validation and the comparison with the PFP and FFPred servers indicate the strong performance of our method.

    Gene regulatory network inference from high frequency time series data

    Poster Presentations: Theme 3 - Reproduction & Development, Cell Biology, and Musculoskeletal System: no. 3.2

    NRProF: neural response based protein function prediction algorithm

    Poster Presentation: Theme 1 - Cell Biology, Musculoskeletal System, Reproduction & Development: no. 1.27. The 16th Research Postgraduate Symposium (RPS 2011), the University of Hong Kong, Hong Kong, 7-8 December 2011