136 research outputs found

    The caCORE Software Development Kit: Streamlining construction of interoperable biomedical information services

    BACKGROUND: Robust, programmatically accessible biomedical information services that syntactically and semantically interoperate with other resources are challenging to construct. Such systems require the adoption of common information models, data representations and terminology standards as well as documented application programming interfaces (APIs). The National Cancer Institute (NCI) developed the cancer common ontologic representation environment (caCORE) to provide the infrastructure necessary to achieve interoperability across the systems it develops or sponsors. The caCORE Software Development Kit (SDK) was designed to provide developers both within and outside the NCI with the tools needed to construct such interoperable software systems. RESULTS: The caCORE SDK requires a Unified Modeling Language (UML) tool to begin the development workflow with the construction of a domain information model in the form of a UML Class Diagram. Models are annotated with concepts and definitions from a description logic terminology source using the Semantic Connector component. The annotated model is registered in the Cancer Data Standards Repository (caDSR) using the UML Loader component. System software is automatically generated using the Codegen component, which produces middleware that runs on an application server. The caCORE SDK was initially tested and validated using a seven-class UML model, and has been used to generate the caCORE production system, which includes models with dozens of classes. The deployed system supports access through object-oriented APIs with consistent syntax for retrieval of any type of data object across all classes in the original UML model. The caCORE SDK is currently being used by several development teams, including by participants in the cancer biomedical informatics grid (caBIG) program, to create compatible data services. caBIG compatibility standards are based upon caCORE resources, and thus the caCORE SDK has emerged as a key enabling technology for caBIG. CONCLUSION: The caCORE SDK substantially lowers the barrier to implementing systems that are syntactically and semantically interoperable by providing workflow and automation tools that standardize and expedite modeling, development, and deployment. It has gained acceptance among developers in the caBIG program, and is expected to provide a common mechanism for creating data service nodes on the data grid that is under development.

    Structural analysis of MDM2 RING separates degradation from regulation of p53 transcription activity

    MDM2–MDMX complexes bind the p53 tumor-suppressor protein, inhibiting p53's transcriptional activity and targeting p53 for proteasomal degradation. Inhibitors that disrupt binding between p53 and MDM2 efficiently activate a p53 response, but their use in the treatment of cancers that retain wild-type p53 may be limited by on-target toxicities due to p53 activation in normal tissue. Guided by a novel crystal structure of the MDM2–MDMX–E2(UbcH5B)–ubiquitin complex, we designed MDM2 mutants that prevent E2–ubiquitin binding without altering the RING-domain structure. These mutants lack MDM2's E3 activity but retain the ability to limit p53's transcriptional activity and allow cell proliferation. Cells expressing these mutants respond more quickly to cellular stress than cells expressing wild-type MDM2, but basal p53 control is maintained. Targeting the MDM2 E3-ligase activity could therefore widen the therapeutic window of p53 activation in tumors.

    Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach

    The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow-up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole genome runs, but it still requires follow-up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post-alignment variant detection algorithms. However, this approach requires multiple sequencing chemistries, alignment tools and variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives, and to make the choice of the right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post-alignment stages, suggesting the use of the most suitable combination determined by a simple framework of pre-existing metrics to create significant datasets.
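
The consensus approach mentioned in this abstract can be illustrated with a minimal sketch. It assumes each variant caller's output has already been reduced to a set of (chromosome, position, reference allele, alternate allele) tuples; the function name `consensus_variants`, the `min_support` threshold and the caller labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Assumption: a variant is identified by (chromosome, 1-based position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def consensus_variants(call_sets: dict[str, set[Variant]], min_support: int = 2) -> list[Variant]:
    """Return, in sorted order, the variants reported by at least `min_support` call sets."""
    if not 1 <= min_support <= len(call_sets):
        raise ValueError("min_support must be between 1 and the number of call sets")
    # Count how many call sets report each variant (each set contributes at most once).
    support = Counter(v for calls in call_sets.values() for v in calls)
    return sorted(v for v, n in support.items() if n >= min_support)


# Hypothetical calls from three callers on the same sample.
calls = {
    "caller_a": {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T")},
    "caller_b": {("chr1", 1000, "A", "G"), ("chr2", 500, "G", "A")},
    "caller_c": {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T")},
}
consensus = consensus_variants(calls, min_support=2)
print(consensus)
```

Raising `min_support` trades sensitivity for fewer false positives, which is the cost the abstract attributes to running several tools.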

    Chromosome alterations in human hepatocellular carcinomas correlate with aetiology and histological grade – results of an explorative CGH meta-analysis

    All available comparative genomic hybridisation (CGH) analyses (n=31, until 12/2003) of human hepatocellular carcinomas (HCCs; n=785) and premalignant dysplastic nodules (DNs; n=30) were compiled and correlated with clinical and histological parameters. The most prominent amplifications of genomic material were present in 1q (57.1%), 8q (46.6%), 6p (22.3%), and 17q (22.2%), while losses were most prevalent in 8p (38%), 16q (35.9%), 4q (34.3%), 17p (32.1%), and 13q (26.2%). Deletions of 4q, 16q, 13q, and 8p positively correlated with hepatitis B virus aetiology, while losses of 8p were more frequently found in hepatitis C virus-negative cases. In poorly differentiated HCCs, 13q and 4q were significantly under-represented. Moreover, gains of 1q were positively correlated with the occurrence of all other high-frequency alterations in HCCs. In DNs, amplifications were most frequently present in 1q and 8q, while deletions occurred in 8p, 17p, 5p, 13q, 14q, and 16q. In conclusion, aetiology and dedifferentiation correlate with specific genomic alterations in human HCCs. Gains of 1q appear to be rather early events that may predispose to further chromosomal abnormalities. Thus, explorative CGH meta-analysis generates novel and testable hypotheses regarding the cause and functional significance of genomic alterations in human HCCs.

    A Comparative Survey of the Frequency and Distribution of Polymorphism in the Genome of Xenopus tropicalis

    Naturally occurring DNA sequence variation within a species underlies evolutionary adaptation and can give rise to phenotypic changes that provide novel insight into biological questions. This variation exists in laboratory populations just as in wild populations and, in addition to being a source of useful alleles for genetic studies, can impact efforts to identify induced mutations in sequence-based genetic screens. The Western clawed frog Xenopus tropicalis (X. tropicalis) has been adopted as a model system for studying the genetic control of embryonic development and a variety of other areas of research. Its diploid genome has been extensively sequenced and efforts are underway to isolate mutants by phenotype- and genotype-based approaches. Here, we describe a study of genetic polymorphism in laboratory strains of X. tropicalis. Polymorphism was detected in the coding and non-coding regions of developmental genes distributed widely across the genome. Laboratory strains exhibit unexpectedly high frequencies of genetic polymorphism, with alleles carrying a variety of synonymous and non-synonymous codon substitutions and nucleotide insertions/deletions. Inter-strain comparisons of polymorphism uncover a high proportion of shared alleles between Nigerian and Ivory Coast strains, in spite of their distinct geographical origins. These observations will likely influence the design of future sequence-based mutation screens, particularly those using DNA mismatch-based detection methods which can be disrupted by the presence of naturally occurring sequence variants. The existence of a significant reservoir of alleles also suggests that existing laboratory stocks may be a useful source of novel alleles for mapping and functional studies

    Sharing Detailed Research Data Is Associated with Increased Citation Rate

    BACKGROUND: Sharing research data provides benefit to the general scientific community, but the benefit is less obvious for the investigator who makes his or her data available. PRINCIPAL FINDINGS: We examined the citation history of 85 cancer microarray clinical trial publications with respect to the availability of their data. The 48% of trials with publicly available microarray data received 85% of the aggregate citations. Publicly available data was significantly (p = 0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication, and author country of origin using linear regression. SIGNIFICANCE: This correlation between publicly available data and increased literature impact may further motivate investigators to share their detailed research data.

    ConservedPrimers 2.0: A high-throughput pipeline for comparative genome referenced intron-flanking PCR primer design and its application in wheat SNP discovery

    BACKGROUND: In some genomic applications it is necessary to design large numbers of PCR primers in exons flanking one or several introns on the basis of orthologous gene sequences in related species. The primer pairs designed by this target gene approach are called "intron-flanking primers" or, because they are located in exonic sequences which are usually conserved between related species, "conserved primers". They are useful for large-scale single nucleotide polymorphism (SNP) discovery and marker development, especially in species, such as wheat, for which a large number of ESTs are available but for which genome sequences and intron/exon boundaries are not available. To date, no suitable high-throughput tool is available for this purpose. RESULTS: We have developed the ConservedPrimers 2.0 pipeline for designing intron-flanking primers for large-scale SNP discovery and marker development, and demonstrated its utility in wheat. This tool uses non-redundant wheat EST sequences, such as wheat contigs and singleton ESTs, and related genomic sequences, such as those of rice, as inputs. It aligns the ESTs to the genomic sequences to identify unique colinear exon blocks and predicts intron lengths. Intron-flanking primers are then designed based on the intron/exon information using the Primer3 core program or BatchPrimer3. Finally, a tab-delimited file containing intron-flanking primer pair sequences and their primer properties is generated for primer ordering and PCR applications. Using this tool, 1,922 bin-mapped wheat ESTs (31.8% of the 6,045 in total) were found to have unique colinear exon blocks suitable for primer design, and 1,821 primer pairs were designed from these single- or low-copy genes for PCR amplification and SNP discovery. With these primers and subsequently designed genome-specific primers, a total of 1,527 loci were found to contain one or more genome-specific SNPs. CONCLUSION: The ConservedPrimers 2.0 pipeline for designing intron-flanking primers was developed and its utility demonstrated. The tool can be used for SNP discovery, genetic variation assays and marker development for any target genome that has abundant ESTs and a related reference genome that has been fully sequenced. The ConservedPrimers 2.0 pipeline has been implemented as a command-line tool as well as a web application. Both versions are freely available at http://wheat.pw.usda.gov/demos/ConservedPrimers/.
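
The exon-block step described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the ConservedPrimers 2.0 implementation: it assumes EST-to-genome alignments are already available as coordinate blocks, and the names `ExonBlock`, `is_colinear` and `intron_flanking_targets`, as well as the length thresholds, are hypothetical. Primer design itself (done with Primer3 or BatchPrimer3 in the paper) is not attempted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExonBlock:
    """One aligned block: EST and genomic coordinates (0-based, end-exclusive)."""
    est_start: int
    est_end: int
    gen_start: int
    gen_end: int

    @property
    def length(self) -> int:
        return self.est_end - self.est_start


def is_colinear(blocks: list[ExonBlock]) -> bool:
    """True if the blocks keep the same order, without overlap, in EST and genome."""
    ordered = sorted(blocks, key=lambda b: b.est_start)
    return all(
        a.est_end <= b.est_start and a.gen_end <= b.gen_start
        for a, b in zip(ordered, ordered[1:])
    )


def intron_flanking_targets(blocks, min_exon=40, min_intron=50, max_intron=1000):
    """List (left_exon, right_exon, predicted_intron_length) suitable for primer design.

    The thresholds are illustrative defaults, not values from the paper.
    """
    if not is_colinear(blocks):
        return []
    ordered = sorted(blocks, key=lambda b: b.est_start)
    targets = []
    for left, right in zip(ordered, ordered[1:]):
        # The genomic gap between adjacent exon blocks is the predicted intron length.
        intron_len = right.gen_start - left.gen_end
        if (min_intron <= intron_len <= max_intron
                and left.length >= min_exon and right.length >= min_exon):
            targets.append((left, right, intron_len))
    return targets


# Hypothetical alignment of one EST against a reference genome: three exon blocks.
blocks = [
    ExonBlock(0, 120, 1000, 1120),
    ExonBlock(120, 260, 1500, 1640),
    ExonBlock(260, 400, 5000, 5140),
]
targets = intron_flanking_targets(blocks)
for left, right, intron_len in targets:
    print(f"intron after EST position {left.est_end}: predicted length {intron_len} bp")
```

In this example only the first intron falls inside the assumed size window, so a single exon pair would be passed on to primer design.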