
    A novel method to identify cooperative functional modules: study of module coordination in the Saccharomyces cerevisiae cell cycle

    Background: Identifying key components in biological processes and their associations is critical for deciphering cellular functions. Recently, numerous gene expression and molecular interaction experiments have been reported in Saccharomyces cerevisiae, and these have enabled systematic studies. Although a number of approaches have been used to predict gene functions and interactions, tools that analyze the essential coordination of functional components in cellular processes still need to be developed.

    Results: In this work, we present a new approach to study the cooperation of functional modules (sets of functionally related genes) in a specific cellular process. A cooperative module pair is defined as two modules that significantly cooperate with certain functional genes in a cellular process. This method identifies cooperative module pairs that significantly influence a cellular process and the correlated genes and interactions that are essential to that process. Using the yeast cell cycle as an example, we identified 101 cooperative module associations among 82 modules, and importantly, we established a cell cycle-specific cooperative module network. Most of the identified module pairs cover cooperative pathways and components essential to the cell cycle. We found that 14, 36, 18, 15, and 20 cooperative module pairs significantly cooperate with genes regulated in early G1, late G1, S, G2, and M phase, respectively. Fifty-nine module pairs that correlate with Cdc28 and other essential regulators were also identified. These results are consistent with previous studies and demonstrate that our methodology is effective for studying cooperative mechanisms in the cell cycle.

    Conclusions: In this work, we propose a new approach to identifying condition-related cooperative interactions, and importantly, we establish a cell cycle-specific cooperation module network. These results provide a global view of the cell cycle and the method can be used to discover the dynamic coordination properties of functional components in other cellular processes.

    GeneAlign: a coding exon prediction tool based on phylogenetical comparisons

    GeneAlign is a coding exon prediction tool for predicting protein coding genes by measuring the homologies between a sequence of a genome and related, already annotated sequences of other genomes. Identifying protein coding genes is one of the most important tasks in newly sequenced genomes. With increasing numbers of gene annotations verified by experiments, it is feasible to identify genes in newly sequenced genomes by comparing them to annotated genes of phylogenetically close organisms. GeneAlign applies CORAL, a heuristic linear time alignment tool, to determine if regions flanked by the candidate signals (initiation codon-GT, AG-GT and AG-STOP codon) are similar to annotated coding exons. Employing the conservation of gene structures and sequence homologies between protein coding regions increases the prediction accuracy. GeneAlign was tested on the Projector dataset of 491 human–mouse homologous sequence pairs. At the gene level, both the average sensitivity and the average specificity of GeneAlign are 81%, and they are larger than 96% at the exon level. The rates of missing exons and wrong exons are smaller than 1%. GeneAlign is a free tool available at
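To make the exon-level figures above concrete, here is a minimal Python sketch of how such metrics are often computed. It is not GeneAlign's evaluation code. It assumes the convention common in gene-prediction benchmarks, where sensitivity is the share of annotated exons predicted exactly and specificity is the share of predicted exons that are exactly correct; the abstract does not confirm which definitions were used. The function name and the coordinates are hypothetical.

```python
def exon_level_metrics(annotated, predicted):
    """Compute exon-level accuracy from (start, end) exon coordinates.

    Assumes both collections are non-empty. An exon counts as correct
    only when both boundaries match an annotated exon exactly.
    """
    annotated, predicted = set(annotated), set(predicted)
    correct = annotated & predicted

    def overlaps(exon, others):
        # True if the closed interval `exon` shares a position with any other exon.
        return any(exon[0] <= o[1] and o[0] <= exon[1] for o in others)

    # Missing exons: annotated exons with no overlapping prediction.
    missing = [e for e in annotated if not overlaps(e, predicted)]
    # Wrong exons: predicted exons that overlap no annotated exon.
    wrong = [e for e in predicted if not overlaps(e, annotated)]

    return {
        "sensitivity": len(correct) / len(annotated),
        "specificity": len(correct) / len(predicted),
        "missing_rate": len(missing) / len(annotated),
        "wrong_rate": len(wrong) / len(predicted),
    }


# Hypothetical coordinates, for illustration only.
annotated = [(100, 200), (300, 400), (500, 600), (700, 800)]
predicted = [(100, 200), (300, 400), (500, 590), (900, 950)]
metrics = exon_level_metrics(annotated, predicted)
print(metrics)
```

In this toy input, the prediction (500, 590) overlaps an annotated exon but misses one boundary, so it counts as neither correct nor wrong under these definitions.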

    A pre-S gene chip to detect pre-S deletions in hepatitis B virus large surface antigen as a predictive marker for hepatoma risk in chronic hepatitis B virus carriers

    Background: Chronic hepatitis B virus (HBV) infection is an important cause of hepatocellular carcinoma (HCC) worldwide. The pre-S1 and -S2 mutant large HBV surface antigen (LHBS), in which the pre-S1 and -S2 regions of the LHBS gene are partially deleted, are highly associated with HBV-related HCC.

    Methods: The pre-S region of the LHBS gene in two hundred and one HBV-positive serum samples was PCR-amplified and sequenced. A pre-S oligonucleotide gene chip was developed to efficiently detect pre-S deletions in chronic HBV carriers. Twenty serum samples from chronic HBV carriers were analyzed using the chip.

    Results: The pre-S deletion rates were relatively low (7%) in the sera of patients with acute HBV infection. They gradually increased in periods of persistent HBV infection: pre-S mutation rates were 37% in chronic HBV carriers, and as high as 60% in HCC patients. The Pre-S Gene Chip offers a highly sensitive and specific method for pre-S deletion detection and is less expensive and more efficient (turnaround time 3 days) than DNA sequencing analysis.

    Conclusion: The pre-S1/2 mutants may emerge during the long-term persistence of the HBV genome in carriers and facilitate HCC development. Combined detection of pre-S mutations, other markers of HBV replication, and viral titers, offers a reliable predictive method for HCC risks in chronic HBV carriers.

    Estimating quality weights for EQ-5D (EuroQol-5 dimensions) health states with the time trade-off method in Taiwan

    Background/Purpose: EQ-5D (EuroQol-5 dimensions) is a preference-based measure of health, which is widely used in cost–utility analyses. It has been suggested that each country should develop its own value set. We therefore sought to develop the quality weights of the EQ-5D health states with the time trade-off (TTO) method in Taiwan.

    Methods: A total of 745 respondents consisting of employees and volunteers in 17 different hospitals were recruited and interviewed. Each of them valued 13 of 73 EQ-5D health states using the TTO method. Based on the three exclusion criteria for valuation data, only 456 (61.21%) respondents were considered eligible for data analysis. The quality weights for all EQ-5D health states were modeled by generalized estimating equations (GEEs).

    Results: Over half of the responses were given negative values, and the medical personnel seemed to have a significantly higher TTO value (+0.1) than others after controlling for other predictors. The N3 model (level 3 occurred within at least 1 dimension) yielded an acceptable fit for the observed TTO data [mean absolute error (MAE) = 0.056, R² = 0.35]. The magnitude of mean absolute differences (MADs) between Taiwan data and those from the UK, Japan, and South Korea ranged from 0.146 to 0.592, but the rank correlation coefficients were all above 0.811.

    Conclusion: This study reaffirms the differences in health-related preference values across countries. The high proportion of negative values might indicate that we have also partially measured the intensity of fear in addition to the utility of different health states.
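The N3 term and the MAE statistic mentioned in this abstract can be illustrated with a short Python sketch. It assumes the standard EQ-5D notation in which a health state is a five-digit string with one level (1 to 3) per dimension, which the abstract does not spell out. The function names and all numeric values are hypothetical and are not taken from the study.

```python
def n3_indicator(state):
    """Return 1 if any of the five dimensions is at level 3, else 0."""
    if len(state) != 5 or any(ch not in "123" for ch in state):
        raise ValueError(f"not a valid five-digit health state: {state!r}")
    return int("3" in state)


def mean_absolute_error(observed, predicted):
    """Average absolute gap between observed and model-predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical TTO values for three states, for illustration only.
observed = [0.80, 0.40, -0.20]
predicted = [0.75, 0.50, -0.25]

print(n3_indicator("11321"))  # state with one dimension at level 3
print(n3_indicator("11221"))  # state with no dimension at level 3
print(round(mean_absolute_error(observed, predicted), 3))
```

The same `mean_absolute_error` function could compare two countries' value sets over a shared list of states, which is roughly what a mean absolute difference (MAD) between value sets measures.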

    NERBio: using selected word conjunctions, term normalization, and global patterns to improve biomedical named entity recognition

    BACKGROUND: Biomedical named entity recognition (Bio-NER) is a challenging problem because, in general, biomedical named entities of the same category (e.g., proteins and genes) do not follow one standard nomenclature. They have many irregularities and sometimes appear in ambiguous contexts. In recent years, machine-learning (ML) approaches have become increasingly common and now represent the cutting edge of Bio-NER technology. This paper addresses three problems faced by ML-based Bio-NER systems. First, most ML approaches usually employ singleton features that comprise one linguistic property (e.g., the current word is capitalized) and at least one class tag (e.g., B-protein, the beginning of a protein name). However, such features may be insufficient in cases where multiple properties must be considered. Adding conjunction features that contain multiple properties can be beneficial, but it would be infeasible to include all conjunction features in an NER model since memory resources are limited and some features are ineffective. To resolve the problem, we use a sequential forward search algorithm to select an effective set of features. Second, variations in the numerical parts of biomedical terms (e.g., "2" in the biomedical term IL2) cause data sparseness and generate many redundant features. In this case, we apply numerical normalization, which solves the problem by replacing all numerals in a term with one representative numeral to help classify named entities. Third, the assignment of NE tags does not depend solely on the target word's closest neighbors, but may depend on words outside the context window (e.g., a context window of five consists of the current word plus two preceding and two subsequent words). We use global patterns generated by the Smith-Waterman local alignment algorithm to identify such structures and modify the results of our ML-based tagger. This is called pattern-based post-processing. 
    RESULTS: To develop our ML-based Bio-NER system, we employ conditional random fields, which have performed effectively in several well-known tasks, as our underlying ML model. Adding selected conjunction features, applying numerical normalization, and employing pattern-based post-processing improve the F-scores by 1.67%, 1.04%, and 0.57%, respectively. The combined increase of 3.28% yields a total score of 72.98%, which is better than the baseline system that only uses singleton features.
    CONCLUSION: We demonstrate the benefits of using the sequential forward search algorithm to select effective conjunction feature groups. In addition, we show that numerical normalization can effectively reduce the number of redundant and unseen features. Furthermore, the Smith-Waterman local alignment algorithm can help ML-based Bio-NER deal with difficult cases that need longer context windows.
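Two ideas from this abstract, numerical normalization and the context window, are easy to sketch in Python. This is an illustration under assumptions, not NERBio's implementation: it replaces each digit with the representative digit "1" (the abstract does not say which numeral is used or whether multi-digit runs collapse to one digit). The function names and example tokens are hypothetical.

```python
import re


def normalize_numerals(term, representative="1"):
    """Replace every digit in a term with one representative digit.

    Terms that differ only in their numerals (e.g. IL2 and IL4) then map
    to the same feature string, which reduces data sparseness.
    """
    return re.sub(r"\d", representative, term)


def context_window(tokens, index, size=5):
    """Return the tokens in a window of `size` centred on tokens[index].

    With size=5 this is the current word plus up to two preceding and
    two subsequent words; the window is clipped at sentence boundaries.
    """
    half = size // 2
    start = max(0, index - half)
    return tokens[start:index + half + 1]


tokens = ["the", "IL2", "gene", "is", "expressed", "in", "T", "cells"]
print(normalize_numerals("IL2"))   # IL1
print(normalize_numerals("IL12"))  # IL11
print(context_window(tokens, 3))   # ['IL2', 'gene', 'is', 'expressed', 'in']
```

Words outside the returned window are invisible to a purely window-based tagger, which is the gap the abstract's pattern-based post-processing is meant to cover.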