FAIRness and Usability for Open-access Omics Data Systems
Omics data sharing is crucial to the biological research community, and the last decade or two has seen a huge rise in collaborative analysis systems, databases, and knowledge bases for omics and other systems biology data. We assessed the FAIRness of NASA's GeneLab Data Systems (GLDS) along with four similar kinds of systems in the research omics data domain, using 14 FAIRness metrics. The range of overall FAIRness scores was 6-12 (out of 14), with an average of 10.1 and a standard deviation of 2.4. The range of Pass ratings for the metrics was 29-79%, Partial Pass 0-21%, and Fail 7-50%. The systems we evaluated performed best in the areas of data findability and accessibility, and worst in the area of data interoperability. Reusability of metadata, in particular, was frequently not well supported. We relate our experiences implementing semantic integration of omics data from some of the assessed systems for federated querying and retrieval functions, given their shortcomings in data interoperability. Finally, we propose two new principles that Big Data system developers, in particular, should consider for maximizing data accessibility.
Semantic Analysis of Email Using Domain Ontologies and WordNet
The problem of capturing and accessing knowledge in paper form has been supplanted by a problem of providing structure to vast amounts of electronic information. Systems that can automatically construct semantic links for natural language documents like email messages will be a crucial element of semantic email tools. We have designed an information extraction process that can leverage the knowledge already contained in an existing semantic web, recognizing references in email to existing nodes in a network of ontology instances by using linguistic knowledge and knowledge of the structure of the semantic web. We developed a heuristic score that uses several forms of evidence to detect references in email to existing nodes in the SemanticOrganizer repository's network. While these scores cannot directly support automated probabilistic inference, they can be used to rank nodes by relevance and link those deemed most relevant to email messages.
FAIRness and Usability for Open-Access Omics Data Systems
Omics data sharing is especially crucial to the biological research community, and the last decade or two has seen a huge rise in collaborative analysis systems, databases, and knowledge bases for omics and other systems biology data. We assessed the "FAIRness" of NASA's GeneLab Data Systems (GLDS) along with four similar kinds of systems in the research omics data domain, using 14 FAIRness metrics. The range of Pass ratings was 29-79% of the 14 metrics, Partial Pass 0-21%, and Fail 7-50%. The range of overall FAIRness scores was 5-12 (out of 14). The systems we evaluated performed best in the areas of data findability and accessibility, and worst in the area of data interoperability. We propose two new principles that Big Data systems, in particular, should consider for increasing data accessibility. We relate our experiences implementing semantic integration of omics data from several systems for the federated querying and retrieval functions of the GLDS, given the shortcomings in data interoperability of these systems.
NASA's GeneLab Phase II: Federated Search and Data Discovery
GeneLab is currently being developed by NASA to accelerate 'open science' biomedical research in support of the human exploration of space and the improvement of life on Earth. Phase I of the four-phase GeneLab Data Systems (GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics, and proteomics ('omics') data from biomedical research of space environments. The focus of GLDS development for Phase II has been federated search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics.
NASA's GeneLab: An Integrated Omics Data Commons and Workbench
GeneLab (http://genelab.nasa.gov) is a NASA initiative designed to accelerate open science biomedical research in support of the human exploration of space and the improvement of life on Earth. The GeneLab Data Systems (GLDS) were developed to help investigators corroborate findings from omics (genomics, transcriptomics, proteomics, and metabolomics) assays and translate them into systems biology knowledge and, eventually, therapeutics, including countermeasures to support life in space. Phase I of the project (completed) emphasized developing key capabilities for submission, curation, storage, search, and retrieval of omics data from biomedical research in and of space environments. The development focus for Phase II (completed) was federated data search and retrieval of these kinds of data from other open-access repositories. The last phase of the project (in work) entails developing an omics analysis tool set and a portal to visualize processed omics data, emphasizing integration with the data repository and search functions developed during the prior phases. The final product will be an open-access system where users can individually or collaboratively publish, search, integrate, analyze, and visualize omics data.
Integrating Engineering Data Systems for NASA Spaceflight Projects
NASA has a large range of custom-built and commercial data systems to support spaceflight programs. Some of the systems are re-used by many programs and projects over time. Management and systems engineering processes require integration of data across many of these systems, a difficult problem given the widely diverse nature of system interfaces and data models. This paper describes an ongoing project to use a central data model with a web services architecture to support the integration and access of linked data across engineering functions for multiple NASA programs. The work involves the implementation of a web service-based middleware system called Data Aggregator to bring together data from a variety of systems to support space exploration. Data Aggregator includes a central data model registry for storing and managing links between the data in disparate systems. Initially developed for NASA's Constellation Program needs, Data Aggregator is currently being repurposed to support the International Space Station Program and new NASA projects with processes that involve significant aggregating and linking of data. This change in user needs led to the development of a more streamlined data model registry for Data Aggregator, in order to simplify adding new project application data, as well as to the standardization of the Data Aggregator query syntax to facilitate cross-application querying by client applications. This paper documents the transition from a set of stand-alone engineering systems, from which data are manually retrieved and integrated, to a web of engineering data systems, from which the latest data are automatically retrieved and more quickly and accurately integrated. This paper includes the lessons learned through these efforts, including the design and development of a service-oriented architecture and the evolution of the data model registry approaches as the effort continues to evolve and adapt to support multiple NASA programs and priorities.
Systemic Response to Microgravity: Utilizing GeneLab Datasets to Identify Molecular Targets for Future Hypotheses-Driven Spaceflight Studies
Biological risks associated with microgravity are a major concern for long-term space travel. Although determination of risk has been a focus for NASA research, data examining systemic (i.e., multi- or pan-tissue) responses to space flight are sparse. To perform our analysis, we utilized the NASA GeneLab database, which is a publicly available repository containing a wide array of omics results from experiments conducted with: i) different flight conditions (Space Shuttle (STS) missions vs. the International Space Station (ISS)); ii) a variety of tissues; and iii) assays that measure epigenetic, transcriptional, and protein expression changes. Meta-analysis of the transcriptomic data from 7 different murine and rat data sets, examining tissues such as liver, kidney, adrenal gland, thymus, mammary gland, skin, and skeletal muscle (soleus, extensor digitorum longus, tibialis anterior, quadriceps, and gastrocnemius), revealed, for the first time, the existence of potential master regulators coordinating systemic responses to microgravity in rodents. We identified p53, TGF-β1, and immune-related pathways as the highly prevalent pan-tissue signaling pathways that are affected by microgravity. Some variability in the degree of change in their expression across species, strain, and time of flight was also observed. Interestingly, while certain skeletal muscles (gastrocnemius and soleus) exhibited an overall down-regulation of these genes, some other muscle types, such as the extensor digitorum longus, tibialis anterior, and quadriceps, showed up-regulated expression, indicative of potential compensatory mechanisms to prevent microgravity-induced atrophy. Key genes isolated by unbiased systems analyses displayed a major overlap between tissue types and flight conditions and established TGF-β1 to be the most connected gene across all data sets.
Finally, a microgravity-responsive miRNA signature was identified, and a theoretical health risk score was calculated based on the predicted functional state of these miRNAs and their subsequent impact on health. The genes and miRNAs identified from our analyses can be targeted for future research involving efficient countermeasure design. Our study thus exemplifies the utility of the GeneLab data repository in aiding novel hypothesis-based spaceflight research aimed at elucidating the global impact of environmental stressors at multiple biological scales.
GeneLab: NASA's Open Access, Collaborative Platform for Systems Biology and Space Medicine
NASA is investing in GeneLab (http://genelab.nasa.gov), a multi-year effort to maximize utilization of the limited resources to conduct biological and medical research in space, principally aboard the International Space Station (ISS). High-throughput genomic, transcriptomic, proteomic, or other omics analyses from experiments conducted on the ISS will be stored in the GeneLab Data Systems (GLDS), an open-science information system that will also include a biocomputation platform with collaborative science capabilities, to enable the discovery and validation of molecular networks.
GeneLab
NASA GeneLab is expected to capture and distribute omics data, along with the experimental and process conditions most relevant to the research community in its statistical and theoretical analysis of NASA's omics data.
- …