34 research outputs found

    National Mesothelioma Virtual Bank: A standard based biospecimen and clinical data resource to enhance translational research

    Background: Advances in translational research have led to the need for well characterized biospecimens for research. The National Mesothelioma Virtual Bank is an initiative which collects annotated datasets relevant to human mesothelioma to develop an enterprising biospecimen resource to fulfill researchers' need. Methods: The National Mesothelioma Virtual Bank architecture is based on three major components: (a) common data elements (based on College of American Pathologists protocol and National North American Association of Central Cancer Registries standards), (b) clinical and epidemiologic data annotation, and (c) data query tools. These tools work interoperably to standardize the entire process of annotation. The National Mesothelioma Virtual Bank tool is based upon the caTISSUE Clinical Annotation Engine, developed by the University of Pittsburgh in cooperation with the Cancer Biomedical Informatics Grid™ (caBIG™, see http://cabig.nci.nih.gov). This application provides a web-based system for annotating, importing and searching mesothelioma cases. The underlying information model is constructed utilizing Unified Modeling Language class diagrams, hierarchical relationships and Enterprise Architect software. Result: The database provides researchers real-time access to richly annotated specimens and integral information related to mesothelioma. The data disclosed is tightly regulated depending upon users' authorization and depending on the participating institute that is amenable to the local Institutional Review Board and regulation committee reviews. Conclusion: The National Mesothelioma Virtual Bank currently has over 600 annotated cases available for researchers that include paraffin embedded tissues, tissue microarrays, serum and genomic DNA. The National Mesothelioma Virtual Bank is a virtual biospecimen registry with robust translational biomedical informatics support to facilitate basic science, clinical, and translational research. 
Furthermore, it protects patient privacy by disclosing only de-identified datasets to assure that biospecimens can be made accessible to researchers. © 2008 Amin et al; licensee BioMed Central Ltd

    Creating a medical dictionary using word alignment: The influence of sources and resources

    Background: Automatic word alignment of parallel texts with the same content in different languages is among other things used to generate dictionaries for new translations. The quality of the generated word alignment depends on the quality of the input resources. In this paper we report on automatic word alignment of the English and Swedish versions of the medical terminology systems ICD-10, ICF, NCSP, KSH97-P and parts of MeSH and how the terminology systems and type of resources influence the quality. Methods: We automatically word aligned the terminology systems using static resources, like dictionaries, statistical resources, like statistically derived dictionaries, and training resources, which were generated from manual word alignment. We varied which part of the terminology systems that we used to generate the resources, which parts that we word aligned and which types of resources we used in the alignment process to explore the influence the different terminology systems and resources have on the recall and precision. After the analysis, we used the best configuration of the automatic word alignment for generation of candidate term pairs. We then manually verified the candidate term pairs and included the correct pairs in an English-Swedish dictionary. Results: The results indicate that more resources and resource types give better results but the size of the parts used to generate the resources only partly affects the quality. The most generally useful resources were generated from ICD-10 and resources generated from MeSH were not as general as other resources. Systematic inter-language differences in the structure of the terminology system rubrics make the rubrics harder to align. Manually created training resources give nearly as good results as a union of static resources, statistical resources and training resources and noticeably better results than a union of static resources and statistical resources. The verified English-Swedish dictionary contains 24,000 term pairs in base forms. Conclusion: More resources give better results in the automatic word alignment, but some resources only give small improvements. The most important type of resource is training and the most general resources were generated from ICD-10.
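The abstract above evaluates alignment quality by recall and precision. As a rough illustration only, the sketch below shows how those two measures are commonly computed for a set of candidate term pairs against a manually verified set. The function name and the English-Swedish pairs are hypothetical, invented for this example, and are not taken from the paper or its data.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted term pairs against a gold set."""
    predicted, gold = set(predicted), set(gold)
    correct = predicted & gold  # pairs that the aligner proposed and that are right
    precision = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical manually verified pairs (assumption, for illustration only).
gold = {
    ("fracture", "fraktur"),
    ("kidney", "njure"),
    ("heart", "hjärta"),
    ("liver", "lever"),
}

# Hypothetical aligner output: two correct pairs and one wrong pair.
predicted = {
    ("fracture", "fraktur"),
    ("kidney", "njure"),
    ("heart", "lever"),
}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision falls when the aligner proposes wrong pairs, and recall falls when it misses verified pairs, which is why the two are reported together.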

    An informatics model for tissue banks – Lessons learned from the Cooperative Prostate Cancer Tissue Resource

    BACKGROUND: Advances in molecular biology and growing requirements from biomarker validation studies have generated a need for tissue banks to provide quality-controlled tissue samples with standardized clinical annotation. The NCI Cooperative Prostate Cancer Tissue Resource (CPCTR) is a distributed tissue bank that comprises four academic centers and provides thousands of clinically annotated prostate cancer specimens to researchers. Here we describe the CPCTR information management system architecture, common data element (CDE) development, query interfaces, data curation, and quality control. METHODS: Data managers review the medical records to collect and continuously update information for the 145 clinical, pathological and inventorial CDEs that the Resource maintains for each case. An Access-based data entry tool provides de-identification and a standard communication mechanism between each group and a central CPCTR database. Standardized automated quality control audits have been implemented. Centrally, an Oracle database has web interfaces allowing multiple user-types, including the general public, to mine de-identified information from all of the sites with three levels of specificity and granularity as well as to request tissues through a formal letter of intent. RESULTS: Since July 2003, CPCTR has offered over 6,000 cases (38,000 blocks) of highly characterized prostate cancer biospecimens, including several tissue microarrays (TMA). The Resource developed a website with interfaces for the general public as well as researchers and internal members. These user groups have utilized the web-tools for public query of summary data on the cases that were available, to prepare requests, and to receive tissues. As of December 2005, the Resource received over 130 tissue requests, of which 45 have been reviewed, approved and filled. 
Additionally, the Resource implemented the TMA Data Exchange Specification in its TMA program and created a computer program for calculating PSA recurrence. CONCLUSION: Building a biorepository infrastructure that meets today's research needs involves time and input of many individuals from diverse disciplines. The CPCTR can provide large volumes of carefully annotated prostate tissue for research initiatives such as Specialized Programs of Research Excellence (SPOREs) and for biomarker validation studies, and its experience can help the development of collaborative, large scale, virtual tissue banks in other organ systems.

    The development of common data elements for a multi-institute prostate cancer tissue bank: The Cooperative Prostate Cancer Tissue Resource (CPCTR) experience

    BACKGROUND: The Cooperative Prostate Cancer Tissue Resource (CPCTR) is a consortium of four geographically dispersed institutions that are funded by the U.S. National Cancer Institute (NCI) to provide clinically annotated prostate cancer tissue samples to researchers. To facilitate this effort, it was critical to arrive at agreed upon common data elements (CDEs) that could be used to collect demographic, pathologic, treatment and clinical outcome data. METHODS: The CPCTR investigators convened a CDE curation subcommittee to develop and implement CDEs for the annotation of collected prostate tissues. The draft CDEs were refined and progressively annotated to make them ISO 11179 compliant. The CDEs were implemented in the CPCTR database and tested using software query tools developed by the investigators. RESULTS: By collaborative consensus the CPCTR CDE subcommittee developed 145 data elements to annotate the tissue samples collected. These included for each case: 1) demographic data, 2) clinical history, 3) pathology specimen level elements to describe the staging, grading and other characteristics of individual surgical pathology cases, 4) tissue block level annotation critical to managing a virtual inventory of cases and facilitating case selection, and 5) clinical outcome data including treatment, recurrence and vital status. These elements have been used successfully to respond to over 60 requests by end-users for tissue, including paraffin blocks from cases with 5 to 10 years of follow up, tissue microarrays (TMAs), as well as frozen tissue collected prospectively for genomic profiling and genetic studies. The CPCTR CDEs have been fully implemented in two major tissue banks and have been shared with dozens of other tissue banking efforts. CONCLUSION: The freely available CDEs developed by the CPCTR are robust, based on "best practices" for tissue resources, and are ISO 11179 compliant. 
The process for CDE development described in this manuscript provides a framework model for other organ sites and has been used as a model for breast and melanoma tissue banking efforts.

    Comparative proteome and peptidome analysis of the cephalic fluid secreted by Arapaima gigas (Teleostei: Osteoglossidae) during and outside parental care

    Parental investment in Arapaima gigas includes nest building and guarding, followed by a care provision when a cephalic fluid is released from the parents’ head to the offspring. This fluid has presumably important functions for the offspring but so far its composition has not been characterised. In this study the proteome and peptidome of the cephalic secretion was studied in parental and non-parental fish using capillary electrophoresis coupled to mass spectrometry (CE-MS) and GeLC-MS/MS analyses. Multiple comparisons revealed 28 peptides were significantly different between males and parental males (PC-males), 126 between females and parental females (PC-females), 51 between males and females and 9 between PC-males and PC-females. Identification revealed peptides were produced in the inner ear (pcdh15b), eyes (tetraspanin and ppp2r3a), central nervous system (otud4, ribeye a, tjp1b and syn1) among others. A total of 422 proteins were also identified and gene ontology analysis revealed 28 secreted extracellular proteins. From these, 2 hormones (prolactin and stanniocalcin) and 12 proteins associated to immunological processes (serotransferrin, α-1-antitrypsin homolog, apolipoprotein A-I, and others) were identified. This study provides novel biochemical data on the lateral line fluid which will enable future hypotheses-driven experiments to better understand the physiological roles of the lateral line in chemical communication.

    Twelve-month observational study of children with cancer in 41 countries during the COVID-19 pandemic

    Introduction Childhood cancer is a leading cause of death. It is unclear whether the COVID-19 pandemic has impacted childhood cancer mortality. In this study, we aimed to establish all-cause mortality rates for childhood cancers during the COVID-19 pandemic and determine the factors associated with mortality. Methods Prospective cohort study in 109 institutions in 41 countries. Inclusion criteria: children <18 years who were newly diagnosed with or undergoing active treatment for acute lymphoblastic leukaemia, non-Hodgkin's lymphoma, Hodgkin lymphoma, retinoblastoma, Wilms tumour, glioma, osteosarcoma, Ewing sarcoma, rhabdomyosarcoma, medulloblastoma and neuroblastoma. Of 2327 cases, 2118 patients were included in the study. The primary outcome measure was all-cause mortality at 30 days, 90 days and 12 months. Results All-cause mortality was 3.4% (n=71/2084) at 30-day follow-up, 5.7% (n=113/1969) at 90-day follow-up and 13.0% (n=206/1581) at 12-month follow-up. The median time from diagnosis to multidisciplinary team (MDT) plan was longest in low-income countries (7 days, IQR 3-11). Multivariable analysis revealed several factors associated with 12-month mortality, including low-income (OR 6.99 (95% CI 2.49 to 19.68); p<0.001), lower middle income (OR 3.32 (95% CI 1.96 to 5.61); p<0.001) and upper middle income (OR 3.49 (95% CI 2.02 to 6.03); p<0.001) country status and chemotherapy (OR 0.55 (95% CI 0.36 to 0.86); p=0.008) and immunotherapy (OR 0.27 (95% CI 0.08 to 0.91); p=0.035) within 30 days from MDT plan. Multivariable analysis revealed laboratory-confirmed SARS-CoV-2 infection (OR 5.33 (95% CI 1.19 to 23.84); p=0.029) was associated with 30-day mortality. Conclusions Children with cancer are more likely to die within 30 days if infected with SARS-CoV-2. However, timely treatment reduced odds of death. 
This report provides crucial information to balance the benefits of providing anticancer therapy against the risks of SARS-CoV-2 infection in children with cancer.
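The all-cause mortality percentages quoted in the abstract above follow directly from the reported counts (deaths divided by patients with follow-up at each time point). A minimal sketch of that arithmetic is given below. The variable and function names are assumptions made for illustration, and only the counts come from the abstract.

```python
# Counts as reported in the abstract: (deaths, patients with follow-up).
follow_up_counts = {
    "30 days": (71, 2084),
    "90 days": (113, 1969),
    "12 months": (206, 1581),
}


def mortality_percent(deaths, followed):
    """All-cause mortality as a percentage, rounded to one decimal place."""
    return round(100 * deaths / followed, 1)


rates = {
    period: mortality_percent(deaths, followed)
    for period, (deaths, followed) in follow_up_counts.items()
}

for period, rate in rates.items():
    print(f"{period}: {rate}%")
```

The denominators shrink over time because fewer patients had follow-up data at the later time points, so each percentage is relative to its own denominator rather than to the 2118 included patients.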