Master of Science
thesis
Data quality has become a significant issue in healthcare as large preexisting databases are integrated to provide greater depth for research and process improvement. Large-scale data integration exposes and compounds data quality issues latent in source systems. Although the problems related to data quality in transactional databases have been identified and well addressed, the application of data quality constraints to large-scale data repositories has not, and it requires novel applications of traditional concepts and methodologies. Despite an abundance of data quality theory, tools, and software, there is no consensus technique available to guide developers in the identification of data integrity issues and the application of data quality rules in warehouse-type applications. Data quality measures are frequently developed on an ad hoc basis, or methods designed to assure data quality in transactional systems are loosely applied to analytic data stores. These measures are inadequate to address the complex data quality issues in large, integrated data repositories, particularly in the healthcare domain with its heterogeneous source systems. This study derives a taxonomy of data quality rules from relational database theory. It describes the development and implementation of data quality rules in the Analytic Health Repository at Intermountain Healthcare and situates the data quality rules in the taxonomy. Further, it identifies areas in which more rigorous data quality should be explored. This comparison demonstrates the superiority of a structured approach to data quality rule identification.
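As a rough illustration of what data quality rules derived from relational database theory can look like, the sketch below checks three classic integrity constraints (entity, referential, and domain integrity) over in-memory rows. It is not the taxonomy or the implementation described in the thesis; the table names, column names, and function names are all hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical toy tables; every name and value here is illustrative only.
patients = [
    {"patient_id": 1, "sex": "F"},
    {"patient_id": 2, "sex": "M"},
    {"patient_id": 2, "sex": "X"},     # duplicate key and out-of-domain value
    {"patient_id": None, "sex": "F"},  # missing key
]
encounters = [
    {"encounter_id": 10, "patient_id": 1},
    {"encounter_id": 11, "patient_id": 9},  # refers to a patient that does not exist
]


def entity_integrity(rows, key):
    """Entity integrity: the key must be non-null and unique."""
    counts = Counter(row[key] for row in rows)
    nulls = counts.pop(None, 0)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    return {"null_keys": nulls, "duplicate_keys": duplicates}


def referential_integrity(child_rows, fk, parent_rows, pk):
    """Referential integrity: every foreign key must match a parent key."""
    parent_keys = {row[pk] for row in parent_rows if row[pk] is not None}
    return [row for row in child_rows if row[fk] not in parent_keys]


def domain_integrity(rows, column, allowed):
    """Domain integrity: values must come from the allowed set."""
    return [row for row in rows if row[column] not in allowed]


if __name__ == "__main__":
    print(entity_integrity(patients, "patient_id"))
    print(referential_integrity(encounters, "patient_id", patients, "patient_id"))
    print(domain_integrity(patients, "sex", {"F", "M"}))
```

In a warehouse setting, checks of this kind would more plausibly run as queries against the repository itself; the in-memory form is used here only to keep the sketch self-contained.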
Handling Unstructured Data Type in DB2 and Oracle
The objective of our work is to determine which mainstream object-relational database management systems (ORDMS) provide convenient facilities for the storage and manipulation of unstructured data objects. These objects, which consist of video, audio, photographs, and even executable code such as Java applets, are increasingly used by desktop, network, and Internet applications. Typically, these ORDMSs must store the objects in a manner that makes them easy to access and, more importantly, easy to process during either storage or retrieval. Our focus is on two of the ORDMS market leaders: IBM’s DB2 and Oracle. The salient facilities of DB2 and Oracle in handling object types are analyzed, considering their advantages and disadvantages.
Spatial ontologies for architectural heritage
Informatics and artificial intelligence have generated new requirements for digital archiving, information, and documentation. Semantic interoperability has become fundamental for the management and sharing of information. Constraints on data interpretation enable both database interoperability, for the sharing and reuse of data and schemas, and information retrieval in large datasets. Another challenging issue is the exploitation of automated reasoning. The solution is the use of domain ontologies as a reference for data modelling in information systems. The architectural heritage (AH) domain is considered in this thesis. Documentation in this field, which is particularly complex and multifaceted, is well known to be critical for the preservation, knowledge, and promotion of monuments. For these reasons, digital inventories that also exploit standards and new semantic technologies are being developed by international organisations (Getty Institute, UN, European Union). Geometric and geographic information is an essential part of a monument's documentation. It comprises a number of aspects (spatial, topological, and mereological relations; accuracy; multi-scale representation; time; etc.). Currently, geomatics makes it possible to obtain very accurate and dense 3D models (possibly enriched with textures) and derived products, in both raster and vector format. Many standards have been published for the geographic field and for the cultural heritage domain. However, the former are limited in the representation scales they foresee (the maximum is achieved by OGC CityGML), and their semantic values do not capture the full semantic richness of AH. The latter (especially the core ontology CIDOC CRM, the Conceptual Reference Model of the Documentation Committee of the International Council of Museums) were employed to document museum objects.
Although CIDOC CRM was recently extended to standing buildings and a spatial extension was included, the integration of complex 3D models has not yet been achieved. In this thesis, the aspects (especially spatial issues) to consider in the documentation of monuments are analysed. In light of them, OGC CityGML is extended to manage the complexity of AH. An approach ‘from the landscape to the detail’ is used to consider the monument within a wider system, which is essential for analysis and reasoning about such complex objects. An implementation test is conducted on a case study, with a preference for open source applications.
On the Foundations of Data Interoperability and Semantic Search on the Web
This dissertation studies the problem of facilitating semantic search across disparate ontologies that are developed by different organizations. There is tremendous potential in enabling users to search independent ontologies and discover knowledge in a serendipitous fashion, i.e., often completely unintended by the developers of the ontologies. The main difficulty with such search is that users generally do not have any control over the naming conventions and content of the ontologies. Thus terms must be appropriately mapped across ontologies based on their meaning. The meaning-based search of data is referred to as semantic search, and its facilitation (aka semantic interoperability) then requires mapping between ontologies.
In relational databases, searching across organizational boundaries currently involves the difficult task of setting up a rigid information integration system. Linked Data representations more flexibly tackle the problem of searching across organizational boundaries on the Web. However, there exists no consensus on how ontology mapping should be performed for this scenario, and the problem is open. We lay out the foundations of semantic search on the Web of Data by comparing it to keyword search in the relational model and by providing effective mechanisms to facilitate data interoperability across organizational boundaries.
We identify two sharply distinct goals for ontology mapping based on real-world use cases. These goals are: (i) ontology development, and (ii) facilitating interoperability. We systematically analyze these goals, side-by-side, and contrast them. Our analysis demonstrates the implications of the goals on how to perform ontology mapping and how to represent the mappings.
We rigorously compare facilitating interoperability between ontologies to information integration in databases. Based on the comparison, class matching is emphasized as a critical part of facilitating interoperability. For class matching, various class similarity metrics are formalized and an algorithm that utilizes these metrics is designed. We also experimentally evaluate the effectiveness of the class similarity metrics on real-world ontologies. In order to encode the correspondences between ontologies for interoperability, we develop a novel W3C-compliant representation, named skeleton.
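The abstract does not say which class similarity metrics the dissertation formalizes, so the following is only a generic sketch of the idea of class matching. It assumes two simple metrics (Jaccard overlap of label tokens and of shared instances), combines them with an arbitrary weight, and picks the best match above a threshold. The ontologies, class names, weight, and threshold are hypothetical.

```python
import re


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def label_tokens(label):
    """Split a CamelCase or snake_case label into lowercase tokens."""
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z]*", label)}


def class_similarity(label_a, inst_a, label_b, inst_b, label_weight=0.5):
    """Weighted mix of label-token overlap and instance overlap (assumed metrics)."""
    by_label = jaccard(label_tokens(label_a), label_tokens(label_b))
    by_instances = jaccard(inst_a, inst_b)
    return label_weight * by_label + (1 - label_weight) * by_instances


def match_classes(onto_a, onto_b, threshold):
    """For each class in onto_a, keep the best-scoring class in onto_b above threshold."""
    matches = {}
    for label_a, inst_a in onto_a.items():
        best_label, best_score = None, threshold
        for label_b, inst_b in onto_b.items():
            score = class_similarity(label_a, inst_a, label_b, inst_b)
            if score > best_score:
                best_label, best_score = label_b, score
        if best_label is not None:
            matches[label_a] = best_label
    return matches


# Hypothetical ontologies: class label -> set of instance identifiers.
onto_a = {
    "MusicalArtist": {"Miles_Davis", "Nina_Simone"},
    "City": {"Paris", "Oslo"},
}
onto_b = {
    "music_artist": {"Miles_Davis", "Nina_Simone", "Bjork"},
    "Town": {"Oslo"},
    "Album": {"Kind_of_Blue"},
}

if __name__ == "__main__":
    print(match_classes(onto_a, onto_b, threshold=0.3))
```

Instance overlap only helps when the two ontologies describe some of the same entities under shared identifiers; where they do not, a matcher would have to rely on labels or structure alone.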
Digital Twins for Patient Care via Knowledge Graphs and Closed-Form Continuous-Time Liquid Neural Networks
Digital twin technology is anticipated to transform healthcare, enabling
personalized medicines and support, earlier diagnoses, simulated treatment
outcomes, and optimized surgical plans. Digital twins are readily gaining
traction in industries like manufacturing, supply chain logistics, and civil
infrastructure. Not in patient care, however. The challenge of modeling complex
diseases with multimodal patient data and the computational complexities of
analyzing it have stifled digital twin adoption in the biomedical vertical.
Yet, these major obstacles can potentially be handled by approaching these
models in a different way. This paper proposes a novel framework for addressing
the barriers to clinical twin modeling created by computational costs and
modeling complexities. We propose structuring patient health data as a
knowledge graph and using closed-form continuous-time liquid neural networks
for real-time analytics. By synthesizing multimodal patient data and leveraging
the flexibility and efficiency of closed-form continuous-time networks and
knowledge graph ontologies, our approach enables real-time insights,
personalized medicine, early diagnosis and intervention, and optimal surgical
planning. This novel approach provides a comprehensive and adaptable view of
patient health along with real-time analytics, paving the way for digital twin
simulations and other anticipated benefits in healthcare.
Comment: 6 page
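To make the idea of structuring patient health data as a knowledge graph more concrete, here is a minimal sketch that stores facts as subject-predicate-object triples and answers a two-hop query. It does not reproduce the paper's framework or its neural network component; the class, the identifiers, and the predicate names are hypothetical.

```python
from collections import defaultdict


class PatientGraph:
    """A tiny triple store: subject -> list of (predicate, object) edges."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, subject, predicate, obj):
        self._edges[subject].append((predicate, obj))

    def objects(self, subject, predicate):
        """All objects linked from subject by the given predicate."""
        return [o for p, o in self._edges.get(subject, []) if p == predicate]


def observation_values(graph, patient, kind):
    """Two-hop query: patient -> observations of a given kind -> their values."""
    values = []
    for obs in graph.objects(patient, "hasObservation"):
        if kind in graph.objects(obs, "ofType"):
            values.extend(graph.objects(obs, "value"))
    return values


# Hypothetical multimodal facts about one patient.
g = PatientGraph()
g.add("patient:001", "hasCondition", "condition:type2_diabetes")
g.add("patient:001", "hasObservation", "obs:1")
g.add("patient:001", "hasObservation", "obs:2")
g.add("obs:1", "ofType", "lab:hba1c")
g.add("obs:1", "value", 7.9)
g.add("obs:2", "ofType", "vital:heart_rate")
g.add("obs:2", "value", 72)

if __name__ == "__main__":
    print(g.objects("patient:001", "hasCondition"))
    print(observation_values(g, "patient:001", "lab:hba1c"))
```

A graph laid out this way can hold lab results, vitals, and diagnoses under one schema, which is the property the abstract relies on when it speaks of synthesizing multimodal patient data.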