88 research outputs found

    CLINICAL GENOMIC RESEARCH MANAGEMENT

    Technological advances in genomics have propelled research into a new era in which experimental methods have been completely transformed. Riding the wave of information technology and equipped with statistical tools, genomics provides a perspective that was unthinkable in the past. With the completion of the Human Genome Project, we have a common reference for analysis at the level of the complete genome. High-throughput technologies for gene expression, genotyping and sequencing are propelling present research. Attempts are now being made to incorporate these methods into health care in a structured format. Clinicians value genomics for assessing disease predisposition and realizing personalized medical care. As genome sequencing becomes faster and cheaper, the volume of public genomic data has increased many fold. Data from other high-throughput technologies and annotations further increase the storage requirements. Laboratory management software (LIMS) is now becoming the limiting factor as automation and integration increase. Genomics thus faces the challenge of managing this enormous volume of data while catering to varied needs, not only those of research laboratories but also those of health care institutions and individual clinicians. Further, there is a growing need for analysis and visualization of the generated data to be integrated into the same platform, for a continuous research experience and systematic supervision. Data security is of prime concern, especially in health care involving human subjects. The interest of clinicians adds another management requirement: a delivery system for the concerned subject. Hypertension is a complex disorder with worldwide prevalence. The HYPERGENES project was centered on the objective of integrating biological data and processes with hypertension as the disease model. 
The HYPERGENES project focuses on defining a comprehensive genetic epidemiological model of complex traits such as Essential Hypertension (EH) and intermediate phenotypes of hypertension such as Target Organ Damage (TOD). During the HYPERGENES project, the challenges described above were identified and evaluated, leading to the present work, an endeavor to provide a generalized, integrated solution for managing genomic and clinical data in clinical genomic research. This PhD thesis describes AD2BioDB, a biological data management platform, and SeqPipe, dynamic pipeline management software, as steps toward meeting the challenges posed in clinical genomics. AD2BioDB provides a platform where data generated using different technologies can be managed and analyzed, with reporting and visualization modules that improve understanding of the results among all research collaborators. AD2BioDB is the management software environment in which in-silico data can be shared and analyzed. Analysis software is connected to AD2BioDB through a plug-in system. SeqPipe makes it possible to dynamically create pipeline workflows for multi-step data analysis. Its interactive graphical user interface allows coding-free pipeline creation and analysis. This tool is especially useful in dynamic NGS analysis, where multiple tools with different versions are in use. SeqPipe can be used as independent software or as a plug-in analysis tool within an application such as AD2BioDB. The key features of AD2BioDB can be summarized as:
• Clinical genomics data management
• Project management
• Data security
• Dynamic creation of graphical representations
• Distributed workflow analysis
• Reporting and alert features
• Dynamic integration of high-throughput technologies
We developed AD2BioDB as a prototype in our laboratory to support the growing volume of genomic data and the increasing complexity of analysis. The software aims to provide a continuous research experience through a versatile platform that supports data management, analysis and public knowledge integration. Through the integration of SeqPipe into AD2BioDB, the management system becomes robust in providing a distributed analysis environment.

    Information management applied to bioinformatics

    Bioinformatics, the discipline concerned with biological information management, is essential in the post-genome era, where the complexity of data processing allows for contemporaneous multi-level research, including research at the genome, transcriptome, proteome and metabolome levels, and the integration of these -omic studies towards an understanding of biology at the systems level. This research is also having a major impact on disease research and drug discovery, particularly through pharmacogenomics studies. In this study innovative resources have been generated via the use of two case studies. One was of the Research & Development Genetics (RDG) department at AstraZeneca, Alderley Park, and the other was of the Pharmacogenomics Group at the Sanger Institute in Cambridge, UK. In the AstraZeneca case study, senior scientists were interviewed using semi-structured interviews to determine information behaviour through the study of scientific workflows. Document analysis was used to generate an understanding of the underpinning concepts and formed one of the sources of context-dependent information on which the interview questions were based. The objectives of the Sanger Institute case study were slightly different: interviews were carried out with eight scientists, together with participant observation, to collect data for developing a database standard for one process of their pharmacogenomics workflow. The results indicated that AstraZeneca would benefit from upgrading its laboratory data management solutions and from developing resources for storing data from larger-scale projects such as whole genome scans. These studies will also generate very large amounts of data, and the analysis of these will require more sophisticated statistical methods. 
At the Sanger Institute a minimum information standard was reported for the manual design of primers and included in a decision-making tree developed for Polymerase Chain Reactions (PCRs). This tree also illustrates problems that can be encountered when designing primers, along with procedures that can be taken to address such issues. Source: EThOS (Electronic Theses Online Service), United Kingdom.

    Undisclosed, unmet and neglected challenges in multi-omics studies

    Multi-omics approaches have become a reality in both large genomics projects and small laboratories. However, the multi-omics research community still faces a number of issues that have either not been sufficiently discussed or for which current solutions are still limited. In this Perspective, we elaborate on these limitations and suggest points of attention for future research. We finally discuss new opportunities and challenges brought to the field by the rapid development of single-cell high-throughput molecular technologies. This work has been funded by the Spanish Ministry of Science and Innovation with grant number BES-2016-076994 to A.A.-L. Tarazona, S.; Arzalluz-Luque, Á.; Conesa, A. (2021). Undisclosed, unmet and neglected challenges in multi-omics studies. Nature Computational Science. 1(6):395-402. https://doi.org/10.1038/s43588-021-00086-z

    Systems Biology Knowledgebase for a New Era in Biology A Genomics:GTL Report from the May 2008 Workshop


    CoryneCenter – An online resource for the integrated analysis of corynebacterial genome and transcriptome data

    Neuweger H, Baumbach J, Albaum S, et al. CoryneCenter: an online resource for the integrated analysis of corynebacterial genome and transcriptome data. BMC Systems Biology. 2007;1(1): 55. Background: The introduction of high-throughput genome sequencing and post-genome analysis technologies, e.g. DNA microarray approaches, has created the potential to unravel and scrutinize complex gene-regulatory networks on a large scale. The discovery of transcriptional regulatory interactions has become a major topic in modern functional genomics. Results: To facilitate the analysis of gene-regulatory networks, we have developed CoryneCenter, a web-based resource for the systematic integration and analysis of genome, transcriptome, and gene regulatory information for prokaryotes, especially corynebacteria. For this purpose, we extended and combined the following systems into a common platform: (1) GenDB, an open source genome annotation system, (2) EMMA, a MAGE-compliant application for high-throughput transcriptome data storage and analysis, and (3) CoryneRegNet, an ontology-based data warehouse designed to facilitate the reconstruction and analysis of gene regulatory interactions. We demonstrate the potential of CoryneCenter by means of an application example. Using microarray hybridization data, we compare the gene expression of Corynebacterium glutamicum under acetate and glucose feeding conditions: known regulatory networks are confirmed, and CoryneCenter also points out additional regulatory interactions. Conclusion: CoryneCenter provides more than the sum of its parts. Its novel analysis and visualization features significantly simplify the process of obtaining new biological insights into complex regulatory systems. Although the platform currently focusses on corynebacteria, the integrated tools are by no means restricted to these species, and the presented approach offers a general strategy for the analysis and verification of gene regulatory networks. 
CoryneCenter provides freely accessible projects with the underlying genome annotation, gene expression, and gene regulation data. The system is publicly available at http://www.CoryneCenter.d

    Structure determination and functional analysis of isochorismate synthase DhbC from Bacillus anthracis using the state of the art SG data management system

    Faculty of Biology. The main objectives of this study were to determine the three-dimensional structure of the isochorismate synthase DhbC from B. anthracis, to characterize the enzyme biochemically, and to develop components of the innovative data management system UniTrack for the Center for Structural Genomics of Infectious Diseases (CSGID). The structure of the apo form of DhbC was solved using single-crystal X-ray diffraction at 2.4 Å resolution. The catalytic function of the enzyme was confirmed and the kinetic constants of the enzymatic reaction were determined using spectrophotometric assays. 
The UniTrack system components developed as part of this work are the following: the protein target tracking database CSGID-DB, an associated publicly accessible web portal for knowledge dissemination, a target validation tool, and communication protocols with external databases. UniTrack monitors all the experimental work on particular protein targets and provides an intuitive workflow between the research groups involved in the project. The system also reports the general progress of CSGID's experimental work by generating real-time internal reports and statistics as well as XML files, which are used for data submission to external repositories.

    Bioinformatic and Experimental Approaches for Deeper Metaproteomic Characterization of Complex Environmental Samples

    The coupling of high-performance multi-dimensional liquid chromatography and tandem mass spectrometry for the characterization of microbial proteins from complex environmental samples has paved the way for a new era in scientific discovery. The field of metaproteomics, which is the study of the protein suite of all the organisms in a biological system, has taken a tremendous leap with the introduction of high-throughput proteomics. However, with the corresponding increase in sample complexity, novel challenges have been raised with respect to efficient peptide separation via chromatography and bioinformatic analysis of the resulting high-throughput data. In this dissertation, various aspects of metaproteomic characterization, including experimental and computational approaches, have been systematically evaluated. First, robust separation protocols employing strong cation exchange and reverse phase have been designed for efficient peptide separation, offering excellent orthogonality and ease of automation. These findings will be useful to the proteomics community for obtaining deeper non-redundant peptide identifications, which in turn will improve the overall depth of semi-quantitative proteomics. Secondly, computational bottlenecks associated with screening the vast amount of raw mass spectra generated in these proteomic measurements have been addressed. Computational matching of tandem mass spectra via conventional database search strategies leads to modest peptide/protein identifications. This seriously restricts the amount of information retrieved from these complex samples, mainly because of the high complexity and heterogeneity of samples containing hundreds of proteins shared between different microbial species, often with a high level of homology. 
Hence, the challenges associated with metaproteomic data analysis have been addressed by utilizing multiple iterative search engines coupled with de novo sequencing algorithms for a comprehensive and in-depth characterization of complex environmental samples. The work presented here utilizes various sample types, ranging from isolates and mock microbial mixtures prepared in the laboratory to complex community samples extracted from industrial waste water, acid-mine drainage and methane seep sediments. In a broad perspective, this dissertation aims to provide tools for gaining deeper insights into proteome characterization in complex environmental ecosystems.