
    Gene Characterization Index: Assessing the Depth of Gene Annotation

    We introduce the Gene Characterization Index, a bioinformatics method for scoring the extent to which a protein-encoding gene is functionally described. Inherently a reflection of human perception, the Gene Characterization Index is applied for assessing the characterization status of individual genes, thus serving the advancement of both genome annotation and applied genomics research by rapid and unbiased identification of groups of uncharacterized genes for diverse applications such as directed functional studies and delineation of novel drug targets. The scoring procedure is based on a global survey of researchers, who assigned characterization scores from 1 (poor) to 10 (extensive) for a sample of genes based on major online resources. By evaluating the survey as training data, we developed a bioinformatics procedure to assign gene characterization scores to all genes in the human genome. We analyzed snapshots of functional genome annotation over a period of 6 years to assess temporal changes reflected by the increase of the average Gene Characterization Index. Applying the Gene Characterization Index to genes within pharmaceutically relevant classes, we confirmed known drug targets as high-scoring genes and revealed potentially interesting novel targets with low characterization indexes. Removing known drug targets and genes linked to sequence-related patent filings from the entirety of indexed genes, we identified sets of low-scoring genes particularly suited for further experimental investigation. The Gene Characterization Index is intended to serve as a tool to the scientific community and granting agencies for focusing resources and efforts on unexplored areas of the genome. The Gene Characterization Index is available from http://cisreg.ca/gci/

    Mechanistic Multidimensional Modeling of Forced Convection Boiling Heat Transfer

    Due to the importance of boiling heat transfer in general, and boiling crisis in particular, for the analysis of operation and safety of both nuclear reactors and conventional thermal power systems, extensive efforts have been made in the past to develop a variety of methods and tools to evaluate the boiling heat transfer coefficient and to assess the onset of temperature excursion and critical heat flux (CHF) at various operating conditions of boiling channels. The objective of this paper is to present mathematical modeling concepts behind the development of mechanistic multidimensional models of low-quality forced convection boiling, including the mechanisms leading to temperature excursion and the onset of CHF

    NovelFam3000 - Uncharacterized human protein domains conserved across model organisms

    Background: Despite significant efforts from the research community, an extensive portion of the proteins encoded by human genes lack an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, of which many appear in multiple proteins. Once a domain is characterized in one protein, the presence of a similar sequence in an uncharacterized protein serves as a basis for inference of function. Thus knowledge of a domain's function, or the protein within which it arises, can facilitate the analysis of an entire set of proteins. Description: From the Pfam domain database, we extracted uncharacterized protein domains represented in proteins from humans, worms, and flies. A data centre was created to facilitate the analysis of the uncharacterized domain-containing proteins. The centre both provides researchers with links to dispersed internet resources containing gene-specific experimental data and enables them to post relevant experimental results or comments. For each human gene in the system, a characterization score is posted, allowing users to track the progress of characterization over time or to identify for study uncharacterized domains in well-characterized genes. As a test of the system, a subset of 39 domains was selected for analysis and the experimental results posted to the NovelFam3000 system. For 25 human protein members of these 39 domain families, detailed sub-cellular localizations were determined. Specific observations are presented based on the analysis of the integrated information provided through the online NovelFam3000 system. Conclusion: Consistent experimental results between multiple members of a domain family allow for inferences of the domain's functional role. 
We unite bioinformatics resources and experimental data in order to accelerate the functional characterization of scarcely annotated domain families.

    A new approach to genome mapping and sequencing: slalom libraries

    We describe here an efficient strategy for simultaneous genome mapping and sequencing. The approach is based on physically oriented, overlapping restriction fragment libraries called slalom libraries. Slalom libraries combine features of general genomic, jumping and linking libraries. Slalom libraries can be adapted to different applications and two main types of slalom libraries are described in detail. This approach was used to map and sequence (with ∼46% coverage) two human P1-derived artificial chromosome (PAC) clones, each of ∼100 kb. This model experiment demonstrates the feasibility of the approach and shows that the efficiency (cost-effectiveness and speed) of existing mapping/sequencing methods could be improved at least 5–10-fold. Furthermore, since the efficiency of contig assembly in the slalom approach is virtually independent of length of sequence reads, even short sequences produced by rapid, high throughput sequencing techniques would suffice to complete a physical map and a sequence scan of a small genome

    NotI clones in the analysis of the human genome

    NotI linking clones contain sequences flanking NotI recognition sites and were previously shown to be tightly associated with CpG islands and genes. To directly assess the value of NotI clones in genome research, high density grids with 50 000 NotI linking clones originating from six representative NotI linking libraries were constructed. Altogether, these libraries contained nearly 100 times the total number of NotI sites in the human genome. A total of 3437 sequences flanking NotI sites were generated. Analysis of 3265 unique sequences demonstrated that 51% of the clones displayed significant protein similarity to SWISSPROT and TREMBL database proteins based on MSPcrunch filtering with stringent parameters. Of the 3265 sequences, 1868 (57.2%) were new sequences, not present in the EMBL and EST databases (similarity ≤ 90%). Among these new sequences, 795 (24.3%) showed similarity to known proteins and 712 (21.8%) displayed an identity of >75% at the nucleotide level to sequences from EMBL or EST databases. The remaining 361 (11.1%) sequences were completely new, i.e. <75% identical. The work also showed tight, specific association of NotI sites with the first exon and suggests that the so-called 3′ ESTs can actually be generated from 5′-ends of genes that contain NotI sites in their first exon