    Computing Healthcare Quality Indicators Automatically: Secondary Use of Patient Data and Semantic Interoperability

    Semantic technologies: from niche to the mainstream of Web 3? A comprehensive framework for web Information modelling and semantic annotation

    Context: Web information technologies developed and applied in the last decade have considerably changed the way web applications operate and have revolutionised information management and knowledge discovery. Social technologies, user-generated classification schemes and formal semantics have a far-reaching sphere of influence. They promote collective intelligence, support interoperability, enhance sustainability and instigate innovation. Contribution: The research carried out and consequent publications follow the various paradigms of semantic technologies, assess each approach, evaluate its efficiency, identify the challenges involved and propose a comprehensive framework for web information modelling and semantic annotation, which is the thesis’ original contribution to knowledge. The proposed framework assists web information modelling, facilitates semantic annotation and information retrieval, enables system interoperability and enhances information quality. Implications: Semantic technologies coupled with social media and end-user involvement can instigate innovative influence with wide organisational implications that can benefit a considerable range of industries. The scalable and sustainable business models of social computing and the collective intelligence of organisational social media can be resourcefully paired with internal research and knowledge from interoperable information repositories, back-end databases and legacy systems. Semantified information assets can free human resources so that they can be used to better serve business development, support innovation and increase productivity

    Analysis of financial and technical feasibility of a clinician-generated data platform of fibromyalgia syndrome patients

    This master's thesis analyzes the technical and economic feasibility of a medical database built on clinically generated data from patients with fibromyalgia syndrome. The main idea is to collect patient data regularly during standard visits to the doctor. It is therefore essential to provide a data collection platform that both patient and doctor can use easily. The collected information (no personal data) is to be shared among researchers to enhance collaborative studies, make studies of rare diseases possible, and reduce the cost and effort of assembling a sufficiently large study cohort. Several medical databases already collect and share patient information for research. Yet, despite the significant socioeconomic impact of fibromyalgia, no large database about this disease exists. The thesis introduces fibromyalgia syndrome and its impact on society, and describes medical database technologies and medical database projects for other diseases. The presented technologies are then analyzed for their usefulness in creating a database that collects information about fibromyalgia syndrome patients and supports research on the disease. The legal requirements for maintaining such a platform and its potential cost are also examined, and two possible business models for funding the platform are presented. Finally, a possible use case is developed, covering the collection of patient data via a survey created with REDCap and its integration into i2b2, together with suggestions for future improvements to bring the platform to a release-ready state.

    An interoperable electronic medical record-based platform for personalized predictive analytics

    Indiana University-Purdue University Indianapolis (IUPUI). Precision medicine refers to the delivery of customized treatment to patients based on their individual characteristics, and aims to reduce adverse events, improve diagnostic methods, and enhance the efficacy of therapies. Among efforts to achieve the goals of precision medicine, researchers have used observational data to develop predictive models that predict health outcomes from patients’ variables. Although numerous predictive models have been reported in the literature, not all models present high prediction power, and as a result, not all models reach clinical settings to help healthcare professionals make clinical decisions at the point of care. The lack of generalizability stems from the fact that no comprehensive medical data repository exists that has the information of all patients in the target population. Even if the patients’ records were available from other sources, the datasets may need further processing prior to data analysis due to differences in the structure of databases and the coding systems used to record concepts. This project intends to fill the gap by introducing an interoperable solution that receives patient electronic health records via the Health Level Seven (HL7) messaging standard from other data sources, transforms the records to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) for population health research, and applies predictive models to patient data to make predictions about health outcomes. This project comprises three studies. The first study introduces the CCD-TOOMOP parser and evaluates the ability of the OMOP CDM to accommodate patient data transferred by HL7 consolidated continuity of care documents (CCDs). The second study explores how to adopt the Predictive Model Markup Language (PMML) for standardizing the dissemination of OMOP-based predictive models.
Finally, the third study introduces the Personalized Health Risk Scoring Tool (PHRST), a pilot, interoperable, OMOP-based model scoring tool that processes the embedded models and generates risk scores in real time. The final product addresses the objectives of precision medicine and has the potential not only to be employed at the point of care to deliver individualized treatment to patients, but also to contribute to health outcomes research by easing the collection of clinical outcomes across diverse medical centers, independent of system specifications.
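
    As a rough illustration of the record transformation this abstract describes, the sketch below maps a simplified, already-parsed problem entry into a row shaped like the OMOP CDM condition_occurrence table. It is not the thesis' parser: the input dictionary layout, the to_condition_occurrence helper, the source codes and the concept IDs in CONCEPT_MAP are all assumptions made for illustration. Only the four output column names and the convention of using concept ID 0 for unmapped codes are taken from the OMOP CDM.

```python
from datetime import date

# Hypothetical lookup from source code to OMOP concept_id. Real mappings come
# from the OMOP standardized vocabularies; these codes and IDs are placeholders.
CONCEPT_MAP = {"SNOMED:EXAMPLE-1": 1001, "SNOMED:EXAMPLE-2": 1002}


def to_condition_occurrence(person_id, problem):
    """Map one simplified problem entry to a condition_occurrence-like row."""
    source_value = f"{problem['code_system']}:{problem['code']}"
    return {
        "person_id": person_id,
        # OMOP uses concept_id 0 when no standard concept matches.
        "condition_concept_id": CONCEPT_MAP.get(source_value, 0),
        "condition_start_date": date.fromisoformat(problem["onset"]),
        # Keep the original code so unmapped entries can be reviewed later.
        "condition_source_value": source_value,
    }


# Assumed shape of entries after the clinical document has been parsed.
problems = [
    {"code_system": "SNOMED", "code": "EXAMPLE-1", "onset": "2020-01-15"},
    {"code_system": "LOCAL", "code": "UNKNOWN-9", "onset": "2021-06-01"},
]
rows = [to_condition_occurrence(42, p) for p in problems]
for row in rows:
    print(row)
```

    Retaining the source value next to the mapped concept lets entries that fall back to concept ID 0 be inspected and mapped later without going back to the original document.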

    Interoperability of Enterprise Software and Applications

    Data Management for Dynamic Multimedia Analytics and Retrieval

    Multimedia data in its various manifestations poses a unique challenge from a data storage and data management perspective, especially if search, analysis and analytics in large data corpora are considered. The inherently unstructured nature of the data itself and the curse of dimensionality that afflicts the representations we typically work with in its stead cause a broad range of issues that require sophisticated solutions at different levels. This has given rise to a huge corpus of research that focuses on techniques for effective and efficient multimedia search and exploration. Many of these contributions have led to an array of purpose-built multimedia search systems. However, recent progress in multimedia analytics and interactive multimedia retrieval has demonstrated that several of the assumptions usually made for such multimedia search workloads do not hold once a session has a human user in the loop. Firstly, many of the required query operations cannot be expressed by mere similarity search, and since the concrete requirement cannot always be anticipated, one needs a flexible and adaptable data management and query framework. Secondly, the widespread notion of staticity of data collections does not hold if one considers analytics workloads, whose purpose is to produce and store new insights and information. And finally, it is impossible even for an expert user to specify exactly how a data management system should produce and arrive at the desired outcomes of the potentially many different queries. Guided by these shortcomings and motivated by the fact that similar questions have once been answered for structured data in classical database research, this thesis presents three contributions that seek to mitigate the aforementioned issues. We present a query model that generalises the notion of proximity-based query operations and formalises the connection between those queries and high-dimensional indexing.
We complement this with a cost model that makes the often implicit trade-off between query execution speed and result quality transparent to the system and the user. We also describe a model for the transactional and durable maintenance of high-dimensional index structures. All contributions are implemented in the open-source multimedia database system Cottontail DB, on top of which we present an evaluation that demonstrates the effectiveness of the proposed models. We conclude by discussing avenues for future research in the quest to converge the fields of databases on the one hand and (interactive) multimedia retrieval and analytics on the other.
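
    The trade-off between execution speed and result quality mentioned above can be made concrete with a small sketch. It compares an exact nearest-neighbour scan with a deliberately crude approximation that inspects only a random fraction of the collection, and reports the recall of the approximation. This is neither the query model nor the cost model of the thesis and does not use Cottontail DB; the function names, the sampling strategy and the synthetic data are assumptions chosen only to show the effect.

```python
import math
import random


def knn_exact(vectors, query, k):
    """Return indices of the k vectors closest to query (Euclidean, full scan)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return order[:k]


def knn_sampled(vectors, query, k, fraction, rng):
    """Approximate kNN: rank only a random fraction of the collection."""
    n = min(len(vectors), max(k, int(len(vectors) * fraction)))
    candidates = rng.sample(range(len(vectors)), n)
    candidates.sort(key=lambda i: math.dist(vectors[i], query))
    return candidates[:k]


def recall(approx, exact):
    """Share of the true nearest neighbours that the approximation found."""
    return len(set(approx) & set(exact)) / len(exact)


rng = random.Random(0)  # fixed seed so runs are repeatable
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
query = [0.5] * 8
exact = knn_exact(vectors, query, 10)

# Scanning less data is cheaper but may miss true neighbours.
for fraction in (0.1, 0.5, 1.0):
    approx = knn_sampled(vectors, query, 10, fraction, rng)
    print(f"scanned {fraction:.0%} of vectors, recall {recall(approx, exact):.2f}")
```

    A cost model in the sense of the abstract would let the system choose such a setting itself, given how much quality loss the user is willing to accept; here the fraction is simply set by hand.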

    Specification of application logic in web information systems

    Business Intelligence on Non-Conventional Data

    The revolution in digital communications witnessed over the last decade has had a significant impact on the world of Business Intelligence (BI). In the big data era, the amount and diversity of data that can be collected and analyzed for the decision-making process transcends the restricted and structured set of internal data that BI systems are conventionally limited to. This thesis investigates the unique challenges imposed by three specific categories of non-conventional data: social data, linked data and schemaless data. Social data comprises the user-generated content published through websites and social media, which can provide a fresh and timely perception of people’s tastes and opinions. In Social BI (SBI), the analysis focuses on topics, understood as specific concepts of interest within the subject area. In this context, this thesis proposes meta-star, an alternative strategy to the traditional star schema for modeling hierarchies of topics to enable OLAP analyses. The thesis also presents an architectural framework of a real SBI project and a cross-disciplinary benchmark for SBI. Linked data employ the Resource Description Framework (RDF) to provide a public network of interlinked, structured, cross-domain knowledge. In this context, this thesis proposes an interactive and collaborative approach to build aggregation hierarchies from linked data. Schemaless data refers to the storage of data in NoSQL databases that do not force a predefined schema, but let database instances embed their own local schemata. In this context, this thesis proposes an approach to determine the schema profile of a document-based database; the goal is to assist users in a schema-on-read analysis process by helping them understand the rules that drove the usage of the different schemata.
A final and complementary contribution of this thesis is an innovative technique in the field of recommendation systems to overcome user disorientation in the analysis of a large and heterogeneous wealth of data.
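
    To make the notion of local schemata in a document-based database more tangible, the following sketch extracts the set of field paths used by each JSON-like document and counts how often each distinct schema occurs. It is only a first step towards a schema profile and does not reproduce the approach of the thesis, which also explains the rules behind the use of each schema; the helper names and sample documents are assumptions.

```python
from collections import Counter


def schema_of(doc, prefix=""):
    """Return the set of dotted field paths present in one JSON-like document."""
    paths = set()
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Descend into nested objects and record their leaf fields.
            paths |= schema_of(value, prefix=path + ".")
        else:
            paths.add(path)
    return frozenset(paths)


def schema_counts(docs):
    """Count how many documents share each distinct local schema."""
    return Counter(schema_of(doc) for doc in docs)


# Hypothetical collection in which one document carries an extra field.
docs = [
    {"id": 1, "user": {"name": "a"}},
    {"id": 2, "user": {"name": "b"}},
    {"id": 3, "user": {"name": "c", "email": "c@example.org"}},
]
counts = schema_counts(docs)
for schema, n in counts.most_common():
    print(n, sorted(schema))
```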

    Modeling Faceted Browsing with Category Theory for Reuse and Interoperability

    Faceted browsing (also called faceted search or faceted navigation) is an exploratory search model where facets assist in the interactive navigation of search results. Facets are attributes that have been assigned to describe resources being explored; a faceted taxonomy is a collection of facets provided by the interface and is often organized as sets, hierarchies, or graphs. Faceted browsing has become ubiquitous with modern digital libraries and online search engines, yet the process is still difficult to abstractly model in a manner that supports the development of interoperable and reusable interfaces. We propose category theory as a theoretical foundation for faceted browsing and demonstrate how the interactive process can be mathematically abstracted in order to support the development of reusable and interoperable faceted systems. Existing efforts in facet modeling are based upon set theory, formal concept analysis, and light-weight ontologies, but in many regards they are implementations of faceted browsing rather than a specification of the basic, underlying structures and interactions. We will demonstrate that category theory allows us to specify faceted objects and study the relationships and interactions within a faceted browsing system. Resulting implementations can then be constructed through a category-theoretic lens using these models, allowing abstract comparison and communication that naturally support interoperability and reuse. In this context, reuse and interoperability are at two levels: between discrete systems and within a single system. Our model works at both levels by leveraging category theory as a common language for representation and computation. We will establish facets and faceted taxonomies as categories and will demonstrate how the computational elements of category theory, including products, merges, pushouts, and pullbacks, extend the usefulness of our model. 
More specifically, we demonstrate that categorical constructions such as the pullback and pushout operations can help organize and reorganize facets; these operations in particular can produce faceted views containing relationships not found in the original source taxonomy. We show how our category-theoretic model of facets relates to database schemas and discuss how this relationship assists in implementing the abstractions presented. We give examples of interactive interfaces from the biomedical domain to help illustrate how our abstractions relate to real-world requirements while enabling systematic reuse and interoperability. We introduce DELVE (Document ExpLoration and Visualization Engine), our framework for developing interactive visualizations as modular Web-applications in order to assist researchers with exploratory literature search. We show how facets relate to and control visualizations; we give three examples of text visualizations that either contain or interact with facets. We show how each of these visualizations can be represented with our model and demonstrate how our model directly informs implementation. With our general framework for communicating consistently about facets at a high level of abstraction, we enable the construction of interoperable interfaces and enable the intelligent reuse of both existing and future efforts
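
    The pullback mentioned in this abstract has a simple reading in the category of sets: given two functions f: A -> C and g: B -> C, it is the set of pairs (a, b) with f(a) = g(b). The sketch below computes it for two hypothetical facets that both point at documents, which yields value pairs that co-occur on the same document, a relationship stated in neither facet alone. This is an illustration under assumed data, not the authors' model or the DELVE implementation; the pullback function and the tag names are placeholders.

```python
def pullback(a_items, b_items, f, g):
    """Pullback of f: A -> C and g: B -> C in the category of sets.

    Returns every pair (a, b) for which f(a) == g(b).
    """
    by_target = {}
    for b in b_items:
        by_target.setdefault(g(b), []).append(b)
    return [(a, b) for a in a_items for b in by_target.get(f(a), [])]


# Hypothetical annotations, each a (document id, facet value) pair.
disease_tags = [("doc1", "disease-x"), ("doc2", "disease-y")]
gene_tags = [("doc1", "gene-a"), ("doc2", "gene-b"), ("doc3", "gene-c")]


def doc_of(tag):
    """Both facets map an annotation to the document it belongs to."""
    return tag[0]


pairs = pullback(disease_tags, gene_tags, doc_of, doc_of)

# Project the pairs onto facet values to obtain a derived disease-gene view.
cooccurrence = sorted((dis[1], gene[1]) for dis, gene in pairs)
print(cooccurrence)
```

    Indexing one side by its target first keeps the computation close to a hash join, which is one way to see the link to database schemas that the abstract points out.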