30 research outputs found

    SAIL—a software system for sample and phenotype availability across biobanks and cohorts

    Summary: The Sample avAILability system—SAIL—is a web based application for searching, browsing and annotating biological sample collections or biobank entries. By providing individual-level information on the availability of specific data types (phenotypes, genetic or genomic data) and samples within a collection, rather than the actual measurement data, resource integration can be facilitated. A flexible data structure enables the collection owners to provide descriptive information on their samples using existing or custom vocabularies. Users can query for the available samples by various parameters combining them via logical expressions. The system can be scaled to hold data from millions of samples with thousands of variables
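    The availability-only idea described above can be illustrated with a minimal sketch. It assumes each sample record carries just the set of variables for which data exist, and that queries are built by combining simple predicates with logical operators. The class, function and variable names here are hypothetical and are not taken from the SAIL software itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sample:
    sample_id: str
    collection: str
    available: frozenset  # names of variables for which data exist (no measured values)

# Predicate builders (hypothetical helpers, not SAIL's API).
def has(variable):
    return lambda s: variable in s.available

def all_of(*preds):
    return lambda s: all(p(s) for p in preds)

def any_of(*preds):
    return lambda s: any(p(s) for p in preds)

def not_(pred):
    return lambda s: not pred(s)

def query(samples, pred):
    """Return the IDs of samples whose availability record satisfies pred."""
    return [s.sample_id for s in samples if pred(s)]

# Made-up availability records for illustration only.
SAMPLES = [
    Sample("S1", "cohort_a", frozenset({"bmi", "gwas"})),
    Sample("S2", "cohort_a", frozenset({"bmi"})),
    Sample("S3", "cohort_b", frozenset({"gwas", "serum"})),
]

# "GWAS data available AND (BMI OR serum available)"
expr = all_of(has("gwas"), any_of(has("bmi"), has("serum")))
print(query(SAMPLES, expr))  # ['S1', 'S3']
```

    Because only the presence or absence of a variable is stored, a query of this kind never touches the underlying measurements.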

    Harmonising and linking biomedical and clinical data across disparate data archives to enable integrative cross-biobank research

    A wealth of biospecimen samples are stored in modern globally distributed biobanks. Biomedical researchers worldwide need to be able to combine the available resources to improve the power of large-scale studies. A prerequisite for this effort is to be able to search and access phenotypic, clinical and other information about samples that are currently stored at biobanks in an integrated manner. However, privacy issues together with heterogeneous information systems and the lack of agreed-upon vocabularies have made specimen searching across multiple biobanks extremely challenging. We describe three case studies where we have linked samples and sample descriptions in order to facilitate global searching of available samples for research. The use cases include the ENGAGE (European Network for Genetic and Genomic Epidemiology) consortium comprising at least 39 cohorts, the SUMMIT (surrogate markers for micro- and macro-vascular hard endpoints for innovative diabetes tools) consortium and a pilot for data integration between a Swedish clinical health registry and a biobank. We used the Sample avAILability (SAIL) method for data linking: first, created harmonised variables and then annotated and made searchable information on the number of specimens available in individual biobanks for various phenotypic categories. By operating on this categorised availability data we sidestep many obstacles related to privacy that arise when handling real values and show that harmonised and annotated records about data availability across disparate biomedical archives provide a key methodological advance in pre-analysis exchange of information between biobanks, that is, during the project planning phase

    Exploring the Use of Genomic and Routinely Collected Data: Narrative Literature Review and Interview Study

    Background: Advancing the use of genomic data with routinely collected health data holds great promise for health care and research. Increasing the use of these data is a high priority to understand and address the causes of disease. Objective: This study aims to provide an outline of the use of genomic data alongside routinely collected data in health research to date. As this field prepares to move forward, it is important to take stock of the current state of play in order to highlight new avenues for development, identify challenges, and ensure that adequate data governance models are in place for safe and socially acceptable progress. Methods: We conducted a literature review to draw information from past studies that have used genomic and routinely collected data and conducted interviews with individuals who use these data for health research. We collected data on the following: the rationale of using genomic data in conjunction with routinely collected data, types of genomic and routinely collected data used, data sources, project approvals, governance and access models, and challenges encountered. Results: The main purpose of using genomic and routinely collected data was to conduct genome-wide and phenome-wide association studies. Routine data sources included electronic health records, disease and death registries, health insurance systems, and deprivation indices. The types of genomic data included polygenic risk scores, single nucleotide polymorphisms, and measures of genetic activity, and biobanks generally provided these data. Although the literature search showed that biobanks released data to researchers, the case studies revealed a growing tendency for use within a data safe haven. Challenges of working with these data revolved around data collection, data storage, technical, and data privacy issues. Conclusions: Using genomic and routinely collected data holds great promise for progressing health research. Several challenges are involved, particularly in terms of privacy. Overcoming these barriers will ensure that the use of these data to progress health research can be exploited to its full potential

    Data management, an essential process for the development of Biobanks

    The collection, processing and storage of biological samples require a system that not only organizes and manages the patient samples but also integrates data records from different sources related to these patients. With advances in computing, this integration will make it possible to identify relationships between sociodemographic, genetic and environmental factors and specific diseases. Data structures, coding systems and metadata systems have therefore become essential elements for controlling biobanks. Management, integration, security, privacy and data analysis are current challenges for scientists and computer administrators. The standardization of data and the harmonization and interoperability of biobank computer systems will help to make optimum use of biological material. As a result, these advances will become a great resource for large-scale epidemiological and clinical studies as well as the basis for new diagnostic tests and personalized therapies. Keywords: Biobank, data systems, biological samples, data management

    Population biobanking in selected European countries and proposed model for a Polish national DNA bank

    Population biobanks offer new opportunities for public health, are fundamental to the development of its new branch called Public Health Genomics, and are important for translational research. This article presents organizational models of population biobanks in selected European countries, based on a review of the literature and of the websites of European population biobanks (UK, Spain, Estonia). Some countries establish national genomic biobanks (DNA banks) in order to conduct research on new methods of prevention, diagnosis and treatment of genetic and lifestyle diseases, and pharmacogenetic research. Individual countries have developed different organizational models of these institutions and specific legal regulations regarding various ways of obtaining genetic data from the inhabitants, donors' rights, and organizational and legal aspects. Population biobanks in European countries were funded in different manners. In light of these solutions, the authors discuss prospects of establishing a Polish national genomic biobank for research purposes. They propose the creation of such an institution based on the existing network of blood-donation centres and clinical biobanks in Poland

    Identification of biomarkers for quality control in human biological serum samples

    Easy access to human biological samples and the associated detailed clinical information is pivotal to the progress of medical knowledge, helping to minimize direct and indirect health costs and to improve quality of life. The concept of the biobank was born in this context, as an infrastructure composed of collections of biological samples and the associated clinical information. The impact of biobanks is measured by the presence of good-quality samples associated with clinical information. Biobanco-IMM so far holds more than 80,000 samples from more than 10,000 donors, with serum the most frequent sample type. Quality control of serum samples has been the subject of intense debate in European meetings of biobanks. Therefore, this study aims to find potential biomarkers for quality control of the serum samples stored at Biobanco-IMM. 200 serum samples were grouped according to the time after sampling, time to freezing, duration of storage or number of freeze-thaw cycles and were compared with a control group of samples that were processed according to the Biobanco-IMM standards. CD40L, GM-CSF, IL-1a, G-CSF and VEGF were measured in the serum by ELISA. The results show that IL-1a and G-CSF differ significantly across the study groups; analysis of the ROC curves found a cutoff of 33.9 pg/mL with 100% sensitivity for IL-1a, and a cutoff of 78.4 pg/mL with 65% specificity for G-CSF. Thus, these can act as biomarkers for serum quality control.
    Extended abstract (translated from Portuguese): Rapid access to high-quality human biological samples and the associated clinical information is fundamental to scientific and medical progress, helping to reduce direct and indirect health costs and to improve patients' quality of life. The concept of the biobank arose in this context: an infrastructure composed of collections of biological samples and the associated clinical information. The impact of biobanks is measured by the presence of good-quality samples with detailed clinical information. Biobanco-IMM so far holds more than 80,000 samples from more than 10,000 donors, with serum the most frequent sample type. Quality control of serum samples has been the subject of much debate. This study therefore aims to find possible quality-control biomarkers for the serum samples stored at Biobanco-IMM. The quality of serum samples can be affected by numerous factors, for example a longer time to centrifugation and a longer time to freezing, which can influence the coagulation cascade. In turn, centrifugation time and exposure to high temperatures influence the efficiency of cell separation. The polymeric components that can be released by the various types of collection tubes may themselves affect some immunoassays. It is therefore crucial to identify biological markers that are sensitive enough to assess sample quality when processing deviates from protocol, such as an increased time to processing and freezing, the duration of storage, and the number of freeze-thaw cycles. A good serum biomarker for assessing serum quality should apply across all samples, whether from donors with disease or from healthy donors, and should show a significant loss of activity under inadequate storage conditions. Several molecules have been proposed for quality control of serum samples, for example: transferrin receptor, ascorbic acid, potassium, GM-CSF, IL-1a, G-CSF, C3F peptides, fibrinopeptide A, ACTH, CD40L, VEGF and MMP-7. In this study we focus on five proteins (CD40L, GM-CSF, IL-1a, G-CSF and VEGF) that are present in serum and that, according to the literature, can be used as biomarkers for its quality control. These molecules currently have the largest number of references in the literature and are reported as suitable for checking sample quality under the following deviations from procedure: increased pre-centrifugation time, increased time to freezing, number of freeze-thaw cycles, and storage time. In this work, 200 samples were divided into five study groups: controls, time to centrifugation, time to freezing, freeze-thaw cycles, and storage time. Serum levels of CD40L, GM-CSF, IL-1a, G-CSF and VEGF were quantified by ELISA. The results show that IL-1a and G-CSF differ significantly across the study groups; analysis of the ROC curves established a cutoff of 33.9 pg/mL with 100% sensitivity for IL-1a and a cutoff of 78.4 pg/mL with 65% specificity for G-CSF. These may therefore be proposed as biomarkers for quality control of serum samples
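    For readers unfamiliar with how a ROC cutoff maps to sensitivity and specificity, the following is a minimal sketch. It assumes that values at or above the cutoff flag a sample as degraded (the abstract does not state the direction), and the measurements and labels are made up for illustration; they are not the study's data. Only the 33.9 pg/mL cutoff is taken from the abstract.

```python
def sensitivity_specificity(values, is_degraded, cutoff):
    """Sensitivity and specificity of the rule 'value >= cutoff means degraded'.

    The direction of the rule is an assumption made for this sketch.
    """
    pairs = list(zip(values, is_degraded))
    tp = sum(1 for v, d in pairs if d and v >= cutoff)      # degraded, flagged
    fn = sum(1 for v, d in pairs if d and v < cutoff)       # degraded, missed
    tn = sum(1 for v, d in pairs if not d and v < cutoff)   # control, not flagged
    fp = sum(1 for v, d in pairs if not d and v >= cutoff)  # control, flagged
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical concentrations (pg/mL) and labels, NOT data from the study.
values = [12.0, 25.0, 36.0, 34.0, 48.0, 61.0]
is_degraded = [False, False, False, True, True, True]

sens, spec = sensitivity_specificity(values, is_degraded, cutoff=33.9)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    A ROC curve is obtained by repeating this calculation over a range of candidate cutoffs and plotting sensitivity against 1 - specificity.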

    Repeatable and reusable research - Exploring the needs of users for a Data Portal for Disease Phenotyping

    Background: Big data research in the field of health sciences is hindered by a lack of agreement on how to identify and define different conditions and their medications. This means that researchers and health professionals often have different phenotype definitions for the same condition. This lack of agreement makes it hard to compare different study findings and hinders the ability to conduct repeatable and reusable research. Objective: This thesis aims to examine the requirements of various users, such as researchers, clinicians, machine learning experts, and managers, for both new and existing data portals for phenotypes (concept libraries). Methods: Exploratory sequential mixed methods were used in this thesis to look at which concept libraries are available, how they are used, what their characteristics are, where there are gaps, and what needs to be done in the future from the point of view of the people who use them. This thesis consists of three phases: 1) two qualitative studies, including one-to-one interviews with researchers, clinicians, machine learning experts, and senior research managers in health data science, as well as focus group discussions with researchers working with the Secured Anonymized Information Linkage databank, 2) the creation of an email survey (i.e., the Concept Library Usability Scale), and 3) a quantitative study with researchers, health professionals, and clinicians. Results: Most of the participants thought that the prototype concept library would be a very helpful resource for conducting repeatable research, but they specified that many requirements would need to be met before its development. Although all the participants stated that they were aware of some existing concept libraries, most of them expressed negative perceptions about them. The participants mentioned several facilitators that would encourage them to: 1) share their work, such as receiving citations from other researchers; and 2) reuse the work of others, such as saving the considerable time and effort that they frequently spend on creating new code lists from scratch. They also pointed out several barriers that could inhibit them from: 1) sharing their work, such as concerns about intellectual property (e.g., if they shared their methods before publication, other researchers would use them as their own); and 2) reusing others' work, such as a lack of confidence in the quality and validity of their code lists. Participants suggested some developments that they would like to see in order to make research done with routine data more reproducible, such as a drive for more transparency in the documentation of research methods, including publishing complete phenotype definitions and clear code lists. Conclusions: The findings of this thesis indicated that most participants valued a concept library for phenotypes. However, only half of the participants felt that they would contribute by providing definitions for the concept library, and they reported many barriers regarding sharing their work on a publicly accessible platform such as the CALIBER research platform. Analysis of interviews, focus group discussions, and qualitative studies revealed that different users have different requirements, facilitators, barriers, and concerns about concept libraries. This work set out to investigate whether concept libraries should be developed in Kuwait to facilitate improved data sharing. However, the recommendation at the end of this thesis is that this would be unlikely to be cost effective or highly valued by users, and that investment in open access research publications may be of more value to the Kuwait research/academic community