13 research outputs found

    Automatic, context-specific generation of Gene Ontology slims

    Background: The use of ontologies to control vocabulary and structure annotation has added value to genome-scale data, and contributed to the capture and re-use of knowledge across research domains. Gene Ontology (GO) is widely used to capture detailed expert knowledge in genomic-scale datasets and as a consequence has grown to contain many terms, making it unwieldy for many applications. To increase its ease of manipulation and efficiency of use, subsets called GO slims are often created by collapsing terms upward into more general, high-level terms relevant to a particular context. Creation of a GO slim currently requires manipulation and editing of GO by an expert (or community) familiar with both the ontology and the biological context. Decisions about which terms to include are necessarily subjective, and the creation process itself and subsequent curation are time-consuming and largely manual.
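The "collapsing terms upward" step mentioned in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of the paper: the toy ontology, its term labels, the chosen slim, and the function name `map_to_slim` are all hypothetical, and real GO data would come from an ontology file rather than a hand-written dictionary.

```python
from collections import deque

def map_to_slim(term, parents, slim_terms):
    """Climb from `term` toward the root and return the first slim term
    met on each upward path (the term itself if it is already in the slim)."""
    found = set()
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current in slim_terms:
            found.add(current)  # stop climbing along this path
            continue
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return found

# Hypothetical toy ontology: child -> list of parents (labels, not real GO IDs).
parents = {
    "glycolysis": ["carbohydrate metabolism"],
    "carbohydrate metabolism": ["metabolism"],
    "dna repair": ["dna metabolism", "stress response"],
    "dna metabolism": ["metabolism"],
    "stress response": ["response to stimulus"],
    "metabolism": ["biological process"],
    "response to stimulus": ["biological process"],
}
# Hypothetical slim: the high-level terms kept for this context.
slim = {"metabolism", "response to stimulus"}

glycolysis_slim = map_to_slim("glycolysis", parents, slim)
dna_repair_slim = map_to_slim("dna repair", parents, slim)
root_slim = map_to_slim("biological process", parents, slim)
```

A term with several parents, such as the toy "dna repair" entry, can map to more than one slim term, while a term above the slim maps to nothing. Choosing which terms belong in `slim` is the subjective, manual step the abstract refers to.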

    How to Improve Postgenomic Knowledge Discovery Using Imputation

    While microarrays make it feasible to rapidly investigate many complex biological problems, their multistep fabrication is prone to error at every stage. The standard tactic has been to either ignore erroneous gene readings or regard them as missing values, though this assumption can exert a major influence upon postgenomic knowledge discovery methods like gene selection and gene regulatory network (GRN) reconstruction. This has been the catalyst for a raft of new flexible imputation algorithms, including local least square impute and the recent heuristic collateral missing value imputation, which exploit the biological transactional behaviour of functionally correlated genes to afford accurate missing value estimation. This paper examines the influence of missing value imputation techniques upon postgenomic knowledge inference methods, with results for various algorithms consistently corroborating that, instead of ignoring missing values, recycling microarray data by flexible and robust imputation can provide substantial performance benefits for subsequent downstream procedures.
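To make the idea of estimating a missing reading from similar genes concrete, here is a minimal sketch of plain k-nearest-neighbour imputation. It is a deliberately simpler technique than the local least squares or collateral missing value methods named in the abstract, and everything in it is an assumption for illustration: the function name `knn_impute`, the use of `None` for missing readings, the root-mean-square distance over shared samples, and the toy expression values.

```python
import math

def knn_impute(matrix, k=2):
    """Fill None entries in a genes x samples matrix with the mean of the
    same sample's value in the k most similar genes. Returns a new matrix."""
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for h, other in enumerate(matrix):
                if h == i or other[j] is None:
                    continue  # a neighbour must have a reading for sample j
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                # Root-mean-square difference over samples observed in both genes.
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            if candidates:
                candidates.sort(key=lambda c: c[0])
                nearest = candidates[:k]
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Hypothetical toy data: 4 genes x 3 samples, one missing reading.
expr = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 7.0],
    [0.9, 1.9, 2.8],
]
completed = knn_impute(expr, k=2)
```

In this toy matrix the missing reading is estimated from the two genes with the closest profiles (rows 0 and 3), while the dissimilar row 2 is ignored. An entry with no usable neighbour is left as `None` rather than guessed.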

    DLocalMotif: a discriminative approach for discovering local motifs in protein sequences

    Motivation: Local motifs are patterns of DNA or protein sequences that occur within a sequence interval relative to a biologically defined anchor or landmark. Current protein motif discovery methods do not adequately consider such constraints to identify biologically significant motifs that are only weakly over-represented but spatially confined. Using negatives, i.e. sequences known to not contain a local motif, can further increase the specificity of their discovery

    Performance of the gas gain monitoring system of the CMS RPC muon detector

    The RPC muon detector of the CMS experiment at the LHC (CERN, Geneva, Switzerland) is equipped with a Gas Gain Monitoring (GGM) system. A report on the stability of the system during the 2011-2012 data-taking run is given, as well as the observation of an effect which suggests a novel method for the monitoring of gas mixture composition. Comment: Presented at RPC2014, Beijing, China. Accepted for publication on JINS

    CF-GeNe: Fuzzy Framework for Robust Gene Regulatory Network Inference

    Most Gene Regulatory Network (GRN) studies ignore the impact of the noisy nature of gene expression data despite its significant influence upon inferred results. This paper presents an innovative Collateral-Fuzzy Gene Regulatory Network Reconstruction (CF-GeNe) framework for GRN inference. The approach uses the Collateral Missing Value Estimation (CMVE) algorithm as its core to estimate missing values in microarray gene expression data. CF-GeNe also mimics the inherent fuzzy nature of gene co-regulation by applying fuzzy clustering principles using the well-established fuzzy c-means algorithm, with the model adapting to the data distribution by automatically determining key parameters, like the number of clusters. Empirical results confirm that the CMVE-based CF-GeNe paradigm infers the majority of co-regulated links even in the presence of large numbers of missing values, compared to other data imputation methods including Least Square Impute (LSImpute), K-Nearest Neighbour Impute (KNN), Bayesian Principal Component Analysis Impute (BPCA) and ZeroImpute. The statistical significance of this improved performance has been underscored by gene selection and also by applying the Wilcoxon rank-sum significance test, with results corroborating the ability of CF-GeNe to successfully infer GRN interactions in noisy gene expression data.

    A clustering based hybrid system for mass spectrometry data analysis

    Recently, much attention has been given to mass spectrometry (MS)-based disease classification, diagnosis, and protein-based biomarker identification. As in microarray-based investigation, proteomic data generated by such high-throughput experiments often have a high feature-to-sample ratio. Moreover, biological information and patterns are compounded with data noise, redundancy and outliers. Thus, the development of algorithms and procedures for the analysis and interpretation of such data is of paramount importance. In this paper, we propose a hybrid system for analyzing such high-dimensional data. The proposed method uses a feature extraction and selection procedure based on the k-means clustering algorithm to bridge the filter selection and wrapper selection methods. The potentially informative mass/charge (m/z) markers selected by filters are subjected to the k-means clustering algorithm for correlation and redundancy reduction, and a multi-objective Genetic Algorithm selector is then employed to identify discriminative m/z markers among those generated by the k-means clustering algorithm. Experimental results obtained using the proposed method indicate that it is suitable for m/z biomarker selection and MS-based sample classification.
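The redundancy-reduction step in the middle of this pipeline can be illustrated with a small sketch: cluster the intensity profiles of candidate m/z features with k-means and keep one representative per cluster. This is only an assumption-laden illustration, not the authors' system. The function names, the evenly spaced initial centroids, the Euclidean distance on raw profiles, and the toy feature names and values are all hypothetical, and the filter stage and the Genetic Algorithm selector are not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means with a deterministic start (evenly spaced points)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    step = len(points) // k
    centroids = [list(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def select_representatives(features, k):
    """Keep, per cluster, the feature whose profile is closest to the centroid."""
    names = list(features)
    points = [features[n] for n in names]
    labels, centroids = kmeans(points, k)
    chosen = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            best = min(members, key=lambda i: sq_dist(points[i], centroids[c]))
            chosen.append(names[best])
    return chosen

# Hypothetical toy data: feature name -> intensity profile across 4 samples.
features = {
    "mz_1000": [1.0, 2.0, 3.0, 4.0],
    "mz_1001": [1.1, 2.1, 2.9, 4.2],
    "mz_2000": [4.0, 3.0, 2.0, 1.0],
    "mz_2001": [3.9, 3.1, 2.2, 0.9],
}
selected = select_representatives(features, k=2)
```

With the toy data, the two nearly duplicate pairs collapse to one feature each, so a downstream wrapper would search over two candidates instead of four. A real pipeline would need a principled choice of `k` and of the similarity measure, neither of which is specified in the abstract.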