A Learning Health System for Radiation Oncology
The proposed research addresses the challenges that clinical data science researchers in radiation oncology face when accessing, integrating, and analyzing heterogeneous data from various sources. The research presents a scalable intelligent infrastructure, called the Health Information Gateway and Exchange (HINGE), which captures and structures data from multiple sources into a knowledge base with semantically interlinked entities. This infrastructure enables researchers to mine novel associations and gather relevant knowledge for personalized clinical outcomes.
The dissertation discusses the design framework and implementation of HINGE, which abstracts structured data from treatment planning systems, treatment management systems, and electronic health records. It utilizes disease-specific smart templates for capturing clinical information in a discrete manner. HINGE performs data extraction, aggregation, and quality and outcome assessment functions automatically, connecting seamlessly with local IT/medical infrastructure.
Furthermore, the research presents a knowledge graph-based approach to map radiotherapy data to an ontology-based data repository using FAIR (Findable, Accessible, Interoperable, Reusable) concepts. This approach ensures that the data is easily discoverable and accessible for clinical decision support systems. The dissertation explores the ETL (Extract, Transform, Load) process, data model frameworks, and ontologies, and provides a real-world clinical use case for this data mapping.
To improve the efficiency of retrieving information from large clinical datasets, a search engine was developed that combines ontology-based keyword searching with a synonym-based term matching tool. The hierarchical nature of ontologies is leveraged to retrieve patient records based on parent and child classes. Additionally, patient similarity analysis is conducted using vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) to identify similar patients based on text corpus creation methods. Results from the analysis using these models are presented.
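The hierarchical retrieval described above can be illustrated with a minimal sketch. It is not the dissertation's search engine: the toy ontology, the synonym table, the patient records, and the function names (`build_children`, `descendants`, `search`) are all hypothetical and chosen only to show how a query for a parent class can also return records coded with its child classes.

```python
from collections import defaultdict

# Hypothetical toy ontology (child class -> parent class); not taken from the source.
PARENT_OF = {
    "adenocarcinoma": "non-small cell lung cancer",
    "squamous cell carcinoma": "non-small cell lung cancer",
    "non-small cell lung cancer": "lung cancer",
    "small cell lung cancer": "lung cancer",
}

# Hypothetical synonym table mapping alternative terms to ontology class labels.
SYNONYMS = {
    "nsclc": "non-small cell lung cancer",
    "sclc": "small cell lung cancer",
}

# Hypothetical patient records: patient id -> diagnosis class.
RECORDS = {
    "P1": "adenocarcinoma",
    "P2": "small cell lung cancer",
    "P3": "squamous cell carcinoma",
    "P4": "lung cancer",
}


def build_children(parent_of):
    """Invert the child -> parent map into parent -> set of children."""
    children = defaultdict(set)
    for child, parent in parent_of.items():
        children[parent].add(child)
    return children


def descendants(term, children):
    """Return the term together with all of its descendant classes."""
    seen = {term}
    stack = [term]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def search(query, records, parent_of, synonyms):
    """Resolve synonyms, expand the query to its subclasses, and match records."""
    term = query.strip().lower()
    term = synonyms.get(term, term)
    matches = descendants(term, build_children(parent_of))
    return sorted(pid for pid, dx in records.items() if dx in matches)


nsclc_patients = search("NSCLC", RECORDS, PARENT_OF, SYNONYMS)
all_lung_patients = search("lung cancer", RECORDS, PARENT_OF, SYNONYMS)
```

In this sketch a query for the synonym "NSCLC" resolves to its class label and then expands to both subtypes, so records coded at a finer level are still retrieved.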
The implementation of a learning health system for predicting radiation pneumonitis following stereotactic body radiotherapy is also discussed. 3D convolutional neural networks (CNNs) are utilized with radiographic and dosimetric datasets to predict the likelihood of radiation pneumonitis. DenseNet-121 and ResNet-50 models are employed for this study, along with integrated gradient techniques to identify salient regions within the input 3D image dataset. The predictive performance of the 3D CNN models is evaluated based on clinical outcomes.
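To make the integrated gradients idea concrete, the sketch below applies it to a small logistic model instead of the 3D CNNs used in the study. The model, its weights, the all-zero baseline, and the function names are assumptions made purely for illustration. Attributions are computed as the input difference times the average gradient along a straight path from the baseline to the input, approximated with a midpoint Riemann sum.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy stand-in for a network: logistic regression output in (0, 1)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [bi + alpha * d for bi, d in zip(baseline, diff)]
        for i, g in enumerate(model_grad(point, w, b)):
            avg_grad[i] += g / steps
    return [d * g for d, g in zip(diff, avg_grad)]


# Hypothetical weights and input, chosen only for this example.
w = [1.0, -2.0, 0.5]
b = 0.1
x = [0.8, 0.3, 1.5]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w, b)
# Completeness property: attributions should sum to model(x) - model(baseline).
gap = abs(sum(attributions) - (model(x, w, b) - model(baseline, w, b)))
```

For an image model the same computation runs per voxel, with gradients obtained by automatic differentiation, and the resulting attribution map highlights the regions that most influence the prediction.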
Overall, the proposed Learning Health System provides a comprehensive solution for capturing, integrating, and analyzing heterogeneous data in a knowledge base. It offers researchers the ability to extract valuable insights and associations from diverse sources, ultimately leading to improved clinical outcomes. This work can serve as a model for implementing learning health systems (LHS) in other medical specialties, advancing personalized and data-driven medicine.
Biodiversity informatics: the challenge of linking data and the role of shared identifiers
A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that much of the information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers (such as DOIs and LSIDs), and the implementation of services that link those identifiers.
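As a rough illustration of ranking by link structure, the sketch below runs a plain power-iteration PageRank over a small graph of identifiers. The graph, the placeholder identifiers (such as `specimen:V1`), the damping factor of 0.85, and the iteration count are assumptions for this example and do not come from the review.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict of node -> list of linked nodes."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: records pointing at the identifiers they cite.
LINKS = {
    "sequence:S1": ["specimen:V1"],
    "sequence:S2": ["specimen:V1"],
    "publication:D1": ["specimen:V1", "sequence:S1"],
    "specimen:V1": [],
}

ranks = pagerank(LINKS)
ordered = sorted(ranks, key=ranks.get, reverse=True)
```

Here the specimen identifier is cited by every other record, so it ends up ranked first; a search interface could use such scores to order otherwise equally matching results.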
PQL: A Declarative Query Language over Dynamic Biological Schemata
We introduce PQL, the query language used in the GeneSeek genetic data integration project. PQL incorporates many features of query languages for semi-structured data. To these we add the ability to express metadata constraints such as intended semantics and database curation approach. These constraints guide the dynamic generation of potential query plans. This allows a single query to remain relevant even when source and mediated schemas are continually evolving, as is often the case in data integration.
trackr: A Framework for Enhancing Discoverability and Reproducibility of Data Visualizations and Other Artifacts in R
Research is an incremental, iterative process, with new results relying and building upon previous ones. Scientists need to find, retrieve, understand, and verify results in order to confidently extend them, even when the results are their own. We present the trackr framework for organizing, automatically annotating, discovering, and retrieving results. We identify sources of automatically extractable metadata for computational results, and we define an extensible system for organizing, annotating, and searching for results based on these and other metadata. We present an open-source implementation of these concepts for plots, computational artifacts, and woven dynamic reports generated in the R statistical computing language.
Key factor for hastening the strategic issue diagnosis process: a within organisational model
Previous research on Strategic Issue Diagnosis (SID) has focused on the complexity and novelty associated with the decision-making process in a turbulent environment. What has not been addressed in the extant literature is the requirement for speed inherent within the SID process, especially as it relates to the gathering of information and facts through an organisation’s environmental scanning procedures. Since proactive management techniques, nimble processes, and systems that allow an organisation to be responsive and build rapid decision-making capabilities are important determinants of success in a turbulent environment, the element of speed associated with SID is an important factor. Our paper identifies a series of propositions focusing attention on elements of the environmental scanning processes and management hierarchies that are intended to counteract the recursiveness and redundancy inherent in SID systems and ultimately hasten the strategic decision-making process.