A Learning Health System for Radiation Oncology
The proposed research addresses the challenges that clinical data science researchers in radiation oncology face in accessing, integrating, and analyzing heterogeneous data from various sources. The research presents a scalable intelligent infrastructure, called the Health Information Gateway and Exchange (HINGE), which captures and structures data from multiple sources into a knowledge base with semantically interlinked entities. This infrastructure enables researchers to mine novel associations and gather relevant knowledge for personalized clinical outcomes.
The dissertation discusses the design framework and implementation of HINGE, which abstracts structured data from treatment planning systems, treatment management systems, and electronic health records. It utilizes disease-specific smart templates for capturing clinical information in a discrete manner. HINGE performs data extraction, aggregation, and quality and outcome assessment functions automatically, connecting seamlessly with local IT/medical infrastructure.
Furthermore, the research presents a knowledge graph-based approach to map radiotherapy data to an ontology-based data repository using FAIR (Findable, Accessible, Interoperable, Reusable) concepts. This approach ensures that the data is easily discoverable and accessible for clinical decision support systems. The dissertation explores the ETL (Extract, Transform, Load) process, data model frameworks, ontologies, and provides a real-world clinical use case for this data mapping.
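The abstract does not fix a concrete ontology or data model, but the general idea of mapping a radiotherapy record into an ontology-backed, FAIR-style repository can be sketched as RDF triples. In the sketch below, the namespace, class names, and property names are hypothetical placeholders, not the ones used by HINGE.

# Minimal sketch of mapping one radiotherapy course to ontology-linked
# RDF triples; the "radonc" namespace and its terms are invented here.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

RO = Namespace("http://example.org/radonc#")   # hypothetical ontology namespace
g = Graph()
g.bind("ro", RO)

patient = URIRef("http://example.org/patient/123")
course = URIRef("http://example.org/course/123-1")

g.add((patient, RDF.type, RO.Patient))
g.add((course, RDF.type, RO.RadiotherapyCourse))
g.add((course, RO.forPatient, patient))
g.add((course, RO.prescribedDoseGy, Literal(60.0, datatype=XSD.decimal)))
g.add((course, RO.fractions, Literal(30, datatype=XSD.integer)))

print(g.serialize(format="turtle"))            # FAIR-friendly, queryable output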
To improve the efficiency of retrieving information from large clinical datasets, a search engine based on ontology-based keyword searching and a synonym-based term matching tool was developed. The hierarchical nature of ontologies is leveraged to retrieve patient records based on parent and child classes. Additionally, patient similarity analysis is conducted using vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) to identify similar patients based on different text corpus creation methods. Results from the analysis using these models are presented.
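As a rough, self-contained illustration of the hierarchy-driven retrieval idea, the sketch below expands a query term through a toy parent/child ontology and a synonym map before matching patient records; the class names, synonyms, and records are invented and not taken from HINGE.

# Toy ontology: parent class -> child classes
ONTOLOGY = {
    "thoracic_cancer": ["lung_cancer", "esophageal_cancer"],
    "lung_cancer": ["nsclc", "sclc"],
}

SYNONYMS = {  # synonym -> canonical ontology term
    "non-small cell lung cancer": "nsclc",
    "small cell lung cancer": "sclc",
}

def expand(term: str) -> set[str]:
    """Return the query term plus all of its descendant classes."""
    terms, stack = set(), [SYNONYMS.get(term, term)]
    while stack:
        t = stack.pop()
        if t not in terms:
            terms.add(t)
            stack.extend(ONTOLOGY.get(t, []))
    return terms

# Hypothetical patient records tagged with ontology classes
PATIENTS = [
    {"id": "P001", "diagnosis": "nsclc"},
    {"id": "P002", "diagnosis": "sclc"},
    {"id": "P003", "diagnosis": "esophageal_cancer"},
]

def search(term: str) -> list[str]:
    classes = expand(term)
    return [p["id"] for p in PATIENTS if p["diagnosis"] in classes]

print(search("lung_cancer"))  # ['P001', 'P002'] retrieved via child classes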
The implementation of a learning health system for predicting radiation pneumonitis following stereotactic body radiotherapy is also discussed. 3D convolutional neural networks (CNNs) are utilized with radiographic and dosimetric datasets to predict the likelihood of radiation pneumonitis. DenseNet-121 and ResNet-50 models are employed for this study, along with integrated gradient techniques to identify salient regions within the input 3D image dataset. The predictive performance of the 3D CNN models is evaluated based on clinical outcomes.
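The study's actual backbones are DenseNet-121 and ResNet-50 variants, which are not reproduced here; the following minimal PyTorch sketch only illustrates the general shape of a 3D CNN that maps a single-channel volume to a pneumonitis risk logit, with all layer sizes chosen arbitrarily.

# Minimal sketch of a 3D CNN classifier for a CT/dose volume, assuming a
# single-channel 64x64x64 input; not the architectures used in the study.
import torch
import torch.nn as nn

class TinyPneumonitisNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm3d(16), nn.ReLU(inplace=True), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm3d(32), nn.ReLU(inplace=True), nn.MaxPool3d(2),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, 1)      # logit for pneumonitis risk

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

model = TinyPneumonitisNet()
volume = torch.randn(2, 1, 64, 64, 64)          # batch of 2 dummy volumes
probs = torch.sigmoid(model(volume))            # predicted risk per patient
print(probs.shape)                              # torch.Size([2, 1])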
Overall, the proposed Learning Health System provides a comprehensive solution for capturing, integrating, and analyzing heterogeneous data in a knowledge base. It offers researchers the ability to extract valuable insights and associations from diverse sources, ultimately leading to improved clinical outcomes. This work can serve as a model for implementing learning health systems (LHS) in other medical specialties, advancing personalized and data-driven medicine.
GOGGLES: Automatic Image Labeling with Affinity Coding
Generating large labeled training data is becoming the biggest bottleneck in building and deploying supervised machine learning models. Recently, the data programming paradigm has been proposed to reduce the human cost in labeling training data. However, data programming relies on designing labeling functions, which still requires significant domain expertise. Also, it is prohibitively difficult to write labeling functions for image datasets, as it is hard to express domain knowledge using raw features for images (pixels).

We propose affinity coding, a new domain-agnostic paradigm for automated training data labeling. The core premise of affinity coding is that the affinity scores of instance pairs belonging to the same class should on average be higher than those of pairs belonging to different classes, according to some affinity functions. We build the GOGGLES system, which implements affinity coding for labeling image datasets by designing a novel set of reusable affinity functions for images, and propose a novel hierarchical generative model for class inference using a small development set.
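The premise can be illustrated with a toy NumPy example: given a single cosine affinity function and synthetic two-class data, the average same-class affinity should exceed the average cross-class affinity. GOGGLES itself combines a library of affinity functions and a generative model, which this sketch does not attempt to reproduce.

# Toy illustration of the affinity-coding premise; features and labels
# are synthetic, and only one affinity function (cosine) is used.
import numpy as np

rng = np.random.default_rng(0)
# two synthetic classes, 20 instances each, clustered around different means
X = np.vstack([rng.normal(0.0, 1.0, (20, 8)), rng.normal(3.0, 1.0, (20, 8))])
y = np.array([0] * 20 + [1] * 20)

Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
affinity = Xn @ Xn.T                           # pairwise cosine affinity scores

same = y[:, None] == y[None, :]
off_diag = ~np.eye(len(y), dtype=bool)
print("same-class mean affinity:", affinity[same & off_diag].mean())
print("cross-class mean affinity:", affinity[~same].mean())
# The same-class average comes out higher, which is the signal that
# affinity coding exploits for class inference.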
We compare GOGGLES with existing data programming systems on 5 image labeling tasks from diverse domains. GOGGLES achieves labeling accuracies ranging from a minimum of 71% to a maximum of 98% without requiring any extensive human annotation. In terms of end-to-end performance, GOGGLES outperforms the state-of-the-art data programming system Snuba by 21% and a state-of-the-art few-shot learning technique by 5%, and is only 7% away from the fully supervised upper bound.

Comment: Published at the 2020 ACM SIGMOD International Conference on Management of Data.
On Type-Aware Entity Retrieval
Today, the practice of returning entities from a knowledge base in response to search queries has become widespread. One of the distinctive characteristics of entities is that they are typed, i.e., assigned to some hierarchically organized type system (type taxonomy). The primary objective of this paper is to gain a better understanding of how entity type information can be utilized in entity retrieval. We perform this investigation in an idealized "oracle" setting, assuming that we know the distribution of target types of the relevant entities for a given query. We perform a thorough analysis of three main aspects: (i) the choice of type taxonomy, (ii) the representation of hierarchical type information, and (iii) the combination of type-based and term-based similarity in the retrieval model. Using a standard entity search test collection based on DBpedia, we find that type information proves most useful when using large type taxonomies that provide very specific types. We provide further insights on the extensional coverage of entities and on the utility of target types.

Comment: Proceedings of the 3rd ACM International Conference on the Theory of Information Retrieval (ICTIR '17), 2017.
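As a purely hypothetical illustration of the combination of type-based and term-based similarity described in the abstract above, the sketch below interpolates the two scores with a single weight; the entities, scores, and weight are invented and the paper's actual retrieval models are not reproduced.

# Toy sketch: interpolate term-based and type-based scores for ranking.
def combined_score(term_score: float, type_score: float, lam: float = 0.5) -> float:
    """Linear interpolation of term-based and type-based similarity."""
    return (1 - lam) * term_score + lam * type_score

candidates = {
    # entity: (term-based score, type-based score vs. the query's target types)
    "dbpedia:Albert_Einstein": (0.62, 0.90),
    "dbpedia:Einstein_(crater)": (0.71, 0.10),
}

ranked = sorted(candidates,
                key=lambda e: combined_score(*candidates[e]),
                reverse=True)
print(ranked)  # the person outranks the crater once type evidence is added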
Extending TRANSIMS Technology to an Integrated Multilevel Representation
The TRANSIMS system developed at Los Alamos in the USA over the past decade is a world leader in providing an integrated land-use transportation dynamical model for large areas with a million or more inhabitants. TRANSIMS uses standard survey data to create synthetic micropopulations, including family structure, to simulate trip making and emergent traffic dynamics. We propose to extend TRANSIMS by adapting it to a new multi-level representation, allowing dynamics to be algebraically integrated at the micro-, meso- and macro-levels. The new representation builds a lattice hierarchy in a way that integrates non-partitional hierarchies of links and routes based on the usual hierarchy of geographical zones, e.g. neighbourhoods, districts, cities, counties and countries.

Applying the representation to a big city starts by defining sets of zones at different levels. The first level, N, is the street. This can be subdivided into building plots at level N-1, buildings at level N-2, and even rooms at level N-3. At level N+1 are the neighbourhoods, at level N+2 is the set of district zones (each of them containing the different neighbourhoods in the previous level), and at the top level, N+3 (in this case), is just one zone, the city itself. If a larger study area is to be considered, we would have a whole set of N+3 zones defining N+4-level areas, and so on, extending to the level of counties, countries or even continents.

This paper will explain the fundamentals of TRANSIMS technology and compare it to other systems. We will show how TRANSIMS and the new multi-level representation can be brought together to give new insights into the macro-dynamics of very large road systems such as London, England, and even the whole of Europe.
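A minimal sketch of such a containment hierarchy, assuming a simple parent/child zone tree with integer levels (street = 0 here), might look as follows; the zone names and levels are illustrative only and do not come from TRANSIMS.

# Toy zone hierarchy with integer levels; names and levels are invented.
from dataclasses import dataclass, field

@dataclass
class Zone:
    name: str
    level: int                      # e.g. 0 = street, 1 = neighbourhood, ...
    children: list["Zone"] = field(default_factory=list)

    def add(self, child: "Zone") -> "Zone":
        self.children.append(child)
        return child

    def zones_at(self, level: int) -> list["Zone"]:
        """Collect every zone at a given level at or below this one."""
        found = [self] if self.level == level else []
        for c in self.children:
            found.extend(c.zones_at(level))
        return found

# level 3: city, level 2: district, level 1: neighbourhood, level 0: street
city = Zone("London", level=3)
district = city.add(Zone("Camden", level=2))
neighbourhood = district.add(Zone("Bloomsbury", level=1))
neighbourhood.add(Zone("Gower Street", level=0))
neighbourhood.add(Zone("Malet Street", level=0))

print([z.name for z in city.zones_at(0)])  # ['Gower Street', 'Malet Street']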
Knowledge discovery for friction stir welding via data driven approaches: Part 2 – multiobjective modelling using fuzzy rule based systems
In this final part of this extensive study, a new systematic data-driven fuzzy modelling approach has been developed, taking into account both modelling accuracy and interpretability (transparency) as attributes. For the first time, a data-driven modelling framework has been proposed, designed, and implemented to model the intricate FSW behaviours of the AA5083 aluminium alloy, covering grain size, mechanical properties, and internal process properties. As a result, ‘Pareto-optimal’ predictive models have been successfully elicited which, through validation on real data for the aluminium alloy AA5083, have been shown to be accurate, transparent and generic despite the modest number of data points used for model training and testing. Compared with analytically based methods, the proposed data-driven modelling approach provides a more effective way to construct prediction models for FSW when there is an apparent lack of fundamental process knowledge.
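The notion of ‘Pareto-optimal’ models trading accuracy against interpretability can be illustrated with a small non-dominated filtering sketch; the candidate models and numbers below are invented and the paper's fuzzy rule-based framework is not reproduced.

# Toy Pareto filtering over (prediction error, rule-base size), both minimised.
candidates = [
    # (model id, prediction error, number of fuzzy rules)
    ("m1", 0.08, 25),
    ("m2", 0.10, 12),
    ("m3", 0.09, 30),   # dominated by m1 (worse error, more rules)
    ("m4", 0.15, 6),
]

def dominates(a, b):
    """a dominates b if it is no worse on both objectives and better on one."""
    return (a[1] <= b[1] and a[2] <= b[2]) and (a[1] < b[1] or a[2] < b[2])

pareto = [c for c in candidates
          if not any(dominates(other, c) for other in candidates)]
print([m[0] for m in pareto])   # ['m1', 'm2', 'm4']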
Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis (Extended)
This extended paper presents 1) a novel hierarchy and recursion extension to the process tree model; and 2) the first recursion-aware process model discovery technique that leverages hierarchical information in event logs, typically available for software systems. This technique allows us to analyze the operational processes of software systems under real-life conditions at multiple levels of granularity. The work can be positioned in between reverse engineering and process mining. An implementation of the proposed approach is available as a ProM plugin. Experimental results based on real-life (software) event logs demonstrate the feasibility and usefulness of the approach and show the huge potential to speed up discovery by exploiting the available hierarchy.

Comment: Extended version (14 pages total) of the paper Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis. This Technical Report version includes the guarantee proofs for the proposed discovery algorithm.
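A rough sketch of what a process tree extended with sub-process (hierarchy) and recursive-call nodes could look like is given below; the node types and example process are illustrative only and do not correspond to the operators defined in the paper or implemented in the ProM plugin.

# Toy process tree with hierarchy (SubProcess) and recursion (RecursiveCall);
# node types and the example process are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                    # operator ('seq', 'xor', ...) or activity name
    children: list["Node"] = field(default_factory=list)

@dataclass
class SubProcess(Node):
    pass                          # hierarchy: the subtree models a nested process

@dataclass
class RecursiveCall(Node):
    pass                          # recursion: re-invokes an enclosing sub-process

# main = seq(start, handle_request), where handle_request may call itself
handle = SubProcess("handle_request", [
    Node("seq", [
        Node("parse"),
        Node("xor", [
            Node("reply"),
            RecursiveCall("handle_request"),   # nested call to the same sub-process
        ]),
    ]),
])
main = Node("seq", [Node("start"), handle])

def activities(node: Node) -> list[str]:
    if not node.children:
        return [node.label]
    return [a for c in node.children for a in activities(c)]

print(activities(main))  # ['start', 'parse', 'reply', 'handle_request']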