1,824 research outputs found
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.
Genomic insights for safety assessment of foodborne bacteria.
Safe food, and access to it, is key to sustaining life and promoting good health. Unsafe food containing harmful microorganisms or chemical substances causes more than 200 diseases, ranging from diarrhoea to cancers, and particularly affects infants, young children, the elderly and immunocompromised individuals. The global burden of foodborne disease affects public health, society and the economy; good collaboration between governments, producers and consumers is therefore needed to help ensure food safety and stronger food systems. The most recent survey conducted by WHO (2015) estimated 600 million ill individuals and 420 000 deaths per year associated with unsafe food. The economic impact falls mainly on low- and middle-income countries, where the lack of safe food costs US$ 110 billion each year in lost productivity and medical expenses. The main challenges in assuring food safety remain tied to our food production and supply chain, where factors such as environmental contamination, consumer preferences, and timely detection and surveillance of outbreaks play a crucial role. Recently, DNA-based methodologies for microbial detection and investigation have sparked special interest, mainly owing to the development of sequencing technologies. In contrast to traditional culture-dependent methods, DNA-based techniques such as Whole Genome Sequencing (WGS) target fast and sensitive results at a relatively low price and with short processing times. Moreover, WGS confers high discriminatory power that allows important genomic characteristics linked to food safety to be determined, such as taxonomy, pathogenic potential, virulence, antimicrobial resistance and its genetic transfer. Understanding these characteristics is fundamental for designing detection and mitigation strategies to apply along the entire food chain following a ‘One Health’ perspective, gaining knowledge about the microbiota that affect humans, animals and the environment.
The aim of the thesis is to gain insight into the genomics of foodborne microbes for their characterization and to create or improve strategies and methods for their detection and mitigation. In particular, this thesis focuses on the assessment of pathogenic potential based on genomic analyses, including taxonomy, virulence, antibiotic resistance and mobilome studies. The second focus is to leverage these genomic insights to design rapid, time-effective detection devices and reliable mitigation methods to tackle foodborne pathogens.
In more detail, the following topics will be addressed:
The presence of multi-drug-resistant strains in ready-to-eat fermented food represents a public health risk through the spread of AMR determinants in the food chain and in the gut microbiota of consumers. Genomic analyses made it possible to accurately assess the safety of E. faecium strain UC7251 with respect to its virulence and to the co-location of antibiotic and heavy-metal resistance genes on mobile elements with conjugation capacity in different matrices. This work emphasizes the importance of surveillance for AMR bacteria in food and motivates the development of innovative strategies to mitigate the risk of antimicrobial resistance spreading through food.
The accuracy of taxonomic identification drives the subsequent analyses and, for this reason, a suitable method to identify species is crucial. The re-classification of Enterococcus faecium clade B was investigated using a combined approach of phylogenomics, multilocus sequence typing, average nucleotide identity and digital DNA–DNA hybridization. The goal is to show that genome analysis is more effective and gives more detailed results for species definition than analysis of the 16S rRNA sequence. This led to the proposal to reclassify the whole of E. faecium clade B as E. lactis, recognizing that the two groups are phylogenetically separate, so that a specific safety assessment procedure can be designed before their use in food or as probiotics, including consideration for inclusion in the European QPS list.
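For context, genome-based species delimitation is commonly anchored to threshold values of these metrics (roughly 95–96% ANI and 70% dDDH). The minimal Python sketch below, with an illustrative function name and made-up values rather than anything from the thesis, shows how such a threshold decision might be expressed:

```python
# Minimal sketch: deciding whether two genomes belong to the same species from
# pre-computed average nucleotide identity (ANI) and digital DNA-DNA
# hybridization (dDDH) values. Thresholds follow commonly used cut-offs
# (ANI ~95-96%, dDDH 70%); the function name and inputs are illustrative.

def same_species(ani_percent: float, dddh_percent: float,
                 ani_cutoff: float = 95.0, dddh_cutoff: float = 70.0) -> bool:
    """Return True if both metrics exceed their species-level cut-offs."""
    return ani_percent >= ani_cutoff and dddh_percent >= dddh_cutoff

# Example with invented values for a clade B genome versus a reference genome.
print(same_species(ani_percent=96.2, dddh_percent=71.5))  # True -> same species
```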
Building on this taxonomic re-classification, we developed a PCR-based method for rapid detection and differentiation of these two species and discuss the main phenotypic and genotypic differences from a clinical perspective. To this aim, a core-genome alignment based on pangenome analysis was used. Allelic differences between certain core genes allowed primer design and species identification through PCR with 100% specificity and no cross-reactivity. Moreover, clinical E. lactis genomes were categorised as a potential risk due to their capacity for enhanced bacterial translocation.
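As a rough illustration of the kind of signal such a method relies on (not the authors' actual pipeline), the sketch below scans a toy core-gene alignment for columns where the two species groups carry fixed, different alleles; such columns are natural anchors for species-specific primers. All sequence names and sequences are invented:

```python
# Illustrative sketch: find alignment columns that are fixed within each species
# group but differ between groups in a core-gene alignment. Data are made up.

from typing import Dict, List

def discriminating_sites(aln: Dict[str, str], group_a: List[str],
                         group_b: List[str]) -> List[int]:
    """Return 0-based columns fixed within each group but different between groups."""
    length = len(next(iter(aln.values())))
    sites = []
    for i in range(length):
        alleles_a = {aln[s][i] for s in group_a}
        alleles_b = {aln[s][i] for s in group_b}
        if len(alleles_a) == 1 and len(alleles_b) == 1 and alleles_a != alleles_b:
            sites.append(i)
    return sites

# Toy alignment of one core gene: columns 2 and 5 discriminate the two groups.
alignment = {
    "faecium_1": "ATGCAA",
    "faecium_2": "ATGCAA",
    "lactis_1":  "ATACAG",
    "lactis_2":  "ATACAG",
}
print(discriminating_sites(alignment, ["faecium_1", "faecium_2"],
                           ["lactis_1", "lactis_2"]))  # [2, 5]
```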
Antimicrobial agents alternative to antibiotics are one of the main areas of development and improvement in the current food chain. Metallic nanoparticles such as platinum nanoparticles (PtNPs) have attracted interest because of their potent oxidase- and peroxidase-like catalytic activities, which grant strong antimicrobial effects, and they have been proposed as potential candidates to overcome drawbacks of antibiotics such as drug resistance. The goal is to study the mode of action of PtNPs in relation to biofilm formation capacity, reactive oxygen species (ROS) coping mechanisms and quorum sensing, using foodborne bacteria such as Enterococcus faecium and Salmonella Typhimurium.
Analytical validation of innovative magneto-inertial outcomes: a controlled environment study.
Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions from researchers working in different fields of application and in mathematics, and is available in open access. The contributions collected in this volume have either been published or presented in international conferences, seminars, workshops and journals since the fourth volume was disseminated in 2015, or are new. The contributions in each part of this volume are ordered chronologically.
The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution (PCR) rules of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignments in the fusion of sources of evidence, together with their Matlab codes.
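For readers unfamiliar with PCR, the sketch below implements the classical two-source PCR5 rule on a toy frame: the conjunctive consensus is kept, and each partial conflict m1(A)·m2(B) with A∩B = ∅ is redistributed to A and B proportionally to the masses involved. The frame and input masses are illustrative only:

```python
# Hedged sketch of the two-source PCR5 combination rule. Focal elements are
# frozensets over a toy frame {a, b}; the example masses are made up.

from collections import defaultdict
from itertools import product

def pcr5(m1: dict, m2: dict) -> dict:
    """Combine two basic belief assignments (dict: frozenset -> mass) with PCR5."""
    out = defaultdict(float)
    for (A, mA), (B, mB) in product(m1.items(), m2.items()):
        inter = A & B
        if inter:                      # consensus part (conjunctive rule)
            out[inter] += mA * mB
        else:                          # proportional conflict redistribution
            total = mA + mB
            if total > 0:
                out[A] += mA * mA * mB / total
                out[B] += mB * mB * mA / total
    return dict(out)

a, b = frozenset({"a"}), frozenset({"b"})
combined = pcr5({a: 0.6, b: 0.4}, {a: 0.3, b: 0.7})
print({tuple(k): round(v, 3) for k, v in combined.items()})
print(round(sum(combined.values()), 6))  # masses still sum to 1
```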
Because more applications of DSmT have emerged in the years since the fourth DSmT book appeared in 2015, the second part of this volume covers selected applications of DSmT, mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender systems, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), the Silx-Furtif RUST code library for information fusion including PCR rules, and networks for ship classification.
Finally, the third part presents interesting contributions related to belief functions in general, published or presented over the years since 2015. These contributions concern decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes' theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, the negator of a belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions.
Semi-automated learning strategies for large-scale segmentation of histology and other big bioimaging stacks and volumes
Labelled high-resolution datasets are becoming increasingly common and necessary in different areas of biomedical imaging. Examples include serial histology and ex-vivo MRI for atlas building, OCT for studying the human brain, and micro X-ray for tissue engineering. Labelling such datasets typically requires manual delineation of a very detailed set of regions of interest on a large number of sections or slices. This process is tedious, time-consuming, not reproducible and rather inefficient due to the high similarity of adjacent sections.
In this thesis, I explore the potential of a semi-automated slice-level segmentation framework and a suggestive region-level framework which aim to speed up the segmentation process of big bioimaging datasets. The thesis includes two well-validated, published, and widely used novel methods, and one algorithm that did not yield an improvement over the current state of the art.
The slice-wise method, SmartInterpol, consists of a probabilistic model for semi-automated segmentation of stacks of 2D images, in which the user manually labels a sparse set of sections (e.g., one every n sections), and lets the algorithm complete the segmentation for other sections automatically. The proposed model integrates in a principled manner two families of segmentation techniques that have been very successful in brain imaging: multi-atlas segmentation and convolutional neural networks.
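As a toy illustration of such an integration (an assumption for exposition, not the published SmartInterpol model), the sketch below fuses per-pixel label probabilities propagated from neighbouring labelled sections with probabilities predicted by a CNN, and takes the most probable label:

```python
# Toy sketch (not SmartInterpol itself): fuse per-pixel label probabilities from
# a registration-propagated (multi-atlas-style) term and a CNN prediction for an
# unlabelled section. Weight and inputs are illustrative.

import numpy as np

def fuse_labels(p_atlas: np.ndarray, p_cnn: np.ndarray, w: float = 0.5) -> np.ndarray:
    """p_* have shape (H, W, n_labels); return the fused hard segmentation (H, W)."""
    fused = w * p_atlas + (1.0 - w) * p_cnn
    return fused.argmax(axis=-1)

# Toy 2x2 section with two labels; the two terms disagree on one pixel.
p_atlas = np.array([[[0.9, 0.1], [0.2, 0.8]],
                    [[0.6, 0.4], [0.3, 0.7]]])
p_cnn   = np.array([[[0.8, 0.2], [0.1, 0.9]],
                    [[0.2, 0.8], [0.4, 0.6]]])
print(fuse_labels(p_atlas, p_cnn, w=0.5))
```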
Labelling every structure on a sparse set of slices is not necessarily optimal, therefore I also introduce a region-level active learning framework which requires the labeller to annotate one region of interest on one slice at a time. The framework exploits partial annotations, weak supervision, and realistic estimates of class- and section-specific annotation effort in order to greatly reduce the time it takes to produce accurate segmentations for large histological datasets.
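A minimal sketch of what a cost-aware suggestion step could look like (the scoring rule and fields are assumptions, not the thesis implementation): among candidate (slice, region) pairs, the one with the largest expected gain per unit of estimated annotation effort is proposed next.

```python
# Illustrative sketch of an effort-aware active learning suggestion step.
# 'expected_gain' could come from model uncertainty; 'effort' is an estimate of
# annotation time. All values below are made up.

def suggest_next(candidates):
    """candidates: dicts with 'slice', 'region', 'expected_gain', 'effort' (minutes)."""
    return max(candidates, key=lambda c: c["expected_gain"] / c["effort"])

candidates = [
    {"slice": 12, "region": "cortex",      "expected_gain": 0.40, "effort": 20.0},
    {"slice": 30, "region": "hippocampus", "expected_gain": 0.25, "effort": 5.0},
    {"slice": 45, "region": "thalamus",    "expected_gain": 0.10, "effort": 4.0},
]
print(suggest_next(candidates))  # hippocampus on slice 30 (highest gain per minute)
```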
Although both frameworks have been created targeting histological datasets, they have been successfully applied to other big bioimaging datasets, reducing labelling effort by up to 60–70% without compromising accuracy.
Gut-brain interactions affecting metabolic health and central appetite regulation in diabetes, obesity and aging
The central aim of this thesis was to study the effects of gut microbiota on host energy metabolism and the central regulation of appetite. We specifically studied the interaction between gut microbiota-derived short-chain fatty acids (SCFAs), postprandial glucose metabolism and central regulation of appetite. In addition, we studied possible determinants that affect this interaction, specifically host genetics, bariatric surgery, dietary intake and hypoglycemic medication. First, we studied the involvement of microbiota-derived short-chain fatty acids in glucose tolerance. In an observational study we found an association of intestinal availability of the SCFAs acetate and butyrate with postprandial insulin and glucose responses. Thereafter, we performed a clinical trial, administering acetate intravenously at a constant rate, and studied the effects on glucose tolerance and central regulation of appetite. The acetate intervention did not have a significant effect on these outcome measures, suggesting that the association between increased gastrointestinal SCFAs and metabolic health observed in the observational study is not paralleled when inducing acute plasma elevations. Second, we looked at other determinants affecting gut-brain interactions in metabolic health and central appetite signaling. We studied the relation between the microbiota and central appetite regulation in identical twin pairs discordant for BMI, and the relation between microbial composition and post-surgery gastrointestinal symptoms after bariatric surgery. We also report the effects of increased protein intake on host microbiota composition and central regulation of appetite. Finally, we explored the effects of combination therapy with the GLP-1 agonist exenatide and the SGLT2 inhibitor dapagliflozin on brain responses to food stimuli.
Residential green and blue spaces and working memory in children aged 6–12 years old. Results from the INMA cohort
Availability of green and blue spaces in the area of residence has been related to various health outcomes during childhood, including neurodevelopment. Some studies have shown that children living in greener and/or bluer areas score better on cognitive tasks, although the evidence is inconsistent. These protective effects are hypothesized to occur in part through reductions in air pollution exposure and in the odds of attention-deficit/hyperactivity disorder (ADHD). This study analysed the effects of residential green and blue spaces on working memory of children in the Spanish INfancia y Medio Ambiente (INMA) birth cohort and the potential joint mediating role of air pollution and ADHD. The study samples were composed of 1738 six- to eight-year-olds (M = 7.53, SD = 0.68, 49% female) and 1449 ten- to twelve-year-olds (M = 11.18, SD = 0.69, 50% female) living in Asturias, Gipuzkoa, Sabadell or Valencia, Spain. Individual Normalized Difference Vegetation Index (NDVI) values in 100-, 300- and 500-m buffers and the availability of green and blue spaces >5000 m² within 300-m buffers were calculated using Geographic Information Systems software. Individual NO2 values for the home environment were estimated using ESCAPE's land use regression models. ADHD diagnosis was reported by participants' parents via a questionnaire. Working memory was measured with numbers and colours (in the younger group only) N-back tests (2- and 3-back d’). Mixed-effects models indicated beneficial effects of NDVI in a 300-m buffer on numerical working memory in the younger sample, although the results were not consistent for all d’ scores considered and no significant effects were detected through the candidate mediators. Availability of major blue spaces did not predict working memory performance. Provision of green spaces may play a role in children's working memory, but further research is required.
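For reference, the d’ sensitivity index used to score the N-back tests is the difference between the z-transformed hit and false-alarm rates; the short sketch below computes it with one common correction for extreme rates, using made-up counts:

```python
# Minimal sketch of the d-prime (d') index for n-back performance:
# d' = z(hit rate) - z(false-alarm rate). The log-linear correction used here is
# one common convention for handling rates of 0 or 1; counts are illustrative.

from statistics import NormalDist

def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1)            # keeps rate in (0, 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return z(hit_rate) - z(fa_rate)

print(round(d_prime(hits=20, misses=5, false_alarms=4, correct_rejections=46), 2))  # ~2.15
```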
If interpretability is the answer, what is the question?
Due to the ability to model even complex dependencies, machine learning (ML) can be used to tackle a broad range of (high-stakes) prediction problems. The complexity of the resulting models comes at the cost of transparency, meaning that it is difficult to understand the model by inspecting its parameters.
This opacity is considered problematic since it hampers the transfer of knowledge from the model, undermines the agency of individuals affected by algorithmic decisions, and makes it more challenging to expose non-robust or unethical behaviour.
To tackle the opacity of ML models, the field of interpretable machine learning (IML) has emerged. The field is motivated by the idea that if we could understand the model's behaviour -- either by making the model itself interpretable or by inspecting post-hoc explanations -- we could also expose unethical and non-robust behaviour, learn about the data generating process, and restore the agency of affected individuals. IML is not only a highly active area of research, but the developed techniques are also widely applied in both industry and the sciences.
Despite the popularity of IML, the field faces fundamental criticism, questioning whether IML actually helps in tackling the aforementioned problems of ML and even whether it should be a field of research in the first place:
First and foremost, IML is criticised for lacking a clear goal and, thus, a clear definition of what it means for a model to be interpretable. On a similar note, the meaning of existing methods is often unclear, and thus they may be misunderstood or even misused to hide unethical behaviour. Moreover, estimating conditional-sampling-based techniques poses a significant computational challenge.
With the contributions included in this thesis, we tackle these three challenges for IML.
We join a range of work by arguing that the field struggles to define and evaluate "interpretability" because incoherent interpretation goals are conflated. However, the different goals can be disentangled such that coherent requirements can inform the derivation of the respective target estimands. We demonstrate this with the examples of two interpretation contexts: recourse and scientific inference.
To tackle the misinterpretation of IML methods, we suggest deriving formal interpretation rules that link explanations to aspects of the model and data. In our work, we specifically focus on interpreting feature importance. Furthermore, we collect interpretation pitfalls and communicate them to a broader audience.
To efficiently estimate conditional-sampling-based interpretation techniques, we propose two methods that leverage the dependence structure in the data to simplify the estimation problems for Conditional Feature Importance (CFI) and SAGE.
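As a rough sketch of the quantity being estimated (an illustrative estimator with a simple linear-Gaussian conditional sampler, not the methods proposed in the thesis), CFI for a feature can be read as the rise in loss when that feature is replaced by draws from an approximation of its conditional distribution given the remaining features:

```python
# Hedged sketch of conditional feature importance (CFI): the increase in loss
# when feature j is replaced by draws from an approximation of P(X_j | X_-j),
# here a linear-Gaussian conditional sampler. Illustrative only.

import numpy as np
from sklearn.linear_model import LinearRegression

def cfi(model, X, y, j, loss, n_repeats=10, rng=None):
    rng = np.random.default_rng(rng)
    base = loss(y, model.predict(X))
    others = np.delete(X, j, axis=1)
    cond = LinearRegression().fit(others, X[:, j])            # X_j | X_-j (assumed linear)
    resid_sd = np.std(X[:, j] - cond.predict(others))
    rises = []
    for _ in range(n_repeats):
        X_tilde = X.copy()
        X_tilde[:, j] = cond.predict(others) + rng.normal(0, resid_sd, size=len(X))
        rises.append(loss(y, model.predict(X_tilde)) - base)
    return float(np.mean(rises))

# Toy usage: correlated features, a linear model, squared-error loss.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * rng.normal(size=500)
y = 2 * X[:, 0] + rng.normal(scale=0.1, size=500)
f = LinearRegression().fit(X, y)
mse = lambda a, b: float(np.mean((a - b) ** 2))
print(round(cfi(f, X, y, j=0, loss=mse, rng=1), 3))  # X0 matters even conditionally
```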
A causal perspective proved to be vital in tackling these challenges: first, because IML problems such as algorithmic recourse are inherently causal; second, because causality helps to disentangle the different aspects of model and data and, therefore, to distinguish the insights that different methods provide; and third, because algorithms developed for causal structure learning can be leveraged for the efficient estimation of conditional-sampling-based IML methods.
Evaluation of automated organ segmentation for total-body PET-CT
The ability to rapidly and accurately diagnose and treat patients is substantially facilitated by medical images. Radiologists' visual assessment of medical images is crucial to their study. Segmenting images for diagnostic purposes is a crucial step in the medical imaging process. The purpose of medical image segmentation is to locate and isolate ‘Regions of Interest’ (ROI) within a medical image. Several medical uses rely on this procedure, including diagnosis, patient management, and medical research. Medical image segmentation has applications beyond diagnosis and treatment planning: quantitative information can be extracted from medical images by segmentation and employed in research into new diagnostic and treatment procedures. In addition, image segmentation is a critical procedure in several image-processing tasks, including image fusion and registration. Image registration is used to construct a single, high-resolution, high-contrast image of an object or organ from several images. A more complete picture of the patient's anatomy can be obtained through image fusion, which entails integrating numerous images from different modalities such as computed tomography (CT) and magnetic resonance imaging (MRI). Once images are obtained using imaging technologies, they go through post-processing procedures before being analyzed. One of the primary and essential steps in post-processing is image segmentation, which involves dividing the images into parts and utilizing only the relevant sections for analysis. This project explores various imaging technologies and tools that can be utilized for image segmentation. Many open-source imaging tools are available for segmenting medical images across various applications. The objective of this study is to use the Jaccard index to evaluate the degree of similarity between the segmentations produced by various medical image visualization and analysis programs.
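For concreteness, the Jaccard index between two binary segmentation masks is the size of their intersection divided by the size of their union; a minimal sketch with made-up masks:

```python
# Minimal sketch of the Jaccard index (|A ∩ B| / |A ∪ B|) between two binary
# segmentation masks, the overlap measure named in the abstract. Masks are toy data.

import numpy as np

def jaccard(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(a, b).sum() / union)

seg_tool_1 = np.array([[0, 1, 1], [0, 1, 0]])
seg_tool_2 = np.array([[0, 1, 0], [0, 1, 1]])
print(round(jaccard(seg_tool_1, seg_tool_2), 3))  # 0.5 (2 shared of 4 labelled voxels)
```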
A scalable formulation of joint modelling for longitudinal and time to event data and its application on large electronic health record data of diabetes complications
INTRODUCTION:
Clinical decision-making in the management of diabetes and other chronic diseases depends upon individualised risk predictions of progression of the disease or complications of disease. With sequential measurements of biomarkers, it should be possible to make dynamic predictions that are updated as new data arrive. Since the 1990s, methods have been developed to jointly model longitudinal measurements of biomarkers and time-to-event data, aiming to facilitate predictions in various fields.
These methods offer a comprehensive approach to analysing both the longitudinal changes in biomarkers and the occurrence of events, allowing for a more integrated understanding of the underlying processes and improved predictive capabilities. The aim of this thesis is to investigate whether established methods for joint modelling are able to scale to large electronic health record datasets with multiple biomarkers measured asynchronously, and to evaluate the performance of a novel approach that overcomes the limitations of existing methods.
METHODS:
The epidemiological study design utilised in this research is a retrospective observational study. The data used for these analyses were obtained from a registry encompassing all individuals with type 1 diabetes (T1D) in Scotland, which is delivered by the Scottish Care Information - Diabetes Collaboration platform. The two outcomes studied were time to cardiovascular disease (CVD) and time to end-stage renal disease (ESRD) from T1D diagnosis. The longitudinal biomarkers examined in the study were glycosylated haemoglobin (HbA1c) and estimated glomerular filtration rate (eGFR). These biomarkers and endpoints were selected based on their prevalence in the T1D population and the established association between these biomarkers and the outcomes.
As a state-of-the-art method for joint modelling, Brilleman’s stan_jm() function was evaluated. This is an implementation of a shared-parameter joint model for longitudinal and time-to-event data in Stan, contributed to the rstanarm package. This was compared with a novel approach based on sequential Bayesian updating of a continuous-time state-space model for the biomarkers, with predictions generated by a Kalman filter algorithm (using the ctsem package) fed into a Poisson time-splitting regression model for the events. In contrast to the standard joint modelling approach, which can only fit a linear mixed model to the biomarkers, the ctsem package is able to fit a broader family of models that include terms for autoregressive drift and diffusion. As a baseline for comparison, a last-observation-carried-forward model was evaluated to predict time-to-event.
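For orientation, the usual single-biomarker shared-parameter joint model of the kind stan_jm() fits combines a linear mixed submodel for the biomarker with a proportional-hazards submodel linked through the current value of the latent trajectory; the notation below is a generic sketch rather than the exact multivariate formulation used in the thesis:

```latex
% Generic single-biomarker shared-parameter joint model (notation assumed):
% longitudinal submodel (linear mixed model) and hazard submodel linked
% through the current value of the latent trajectory m_i(t).
\begin{align*}
  y_i(t) &= m_i(t) + \varepsilon_i(t), \qquad
            m_i(t) = \mathbf{x}_i(t)^\top \boldsymbol{\beta}
                   + \mathbf{z}_i(t)^\top \mathbf{b}_i, \qquad
            \varepsilon_i(t) \sim \mathcal{N}(0,\sigma^2),\\
  h_i(t) &= h_0(t)\exp\!\bigl(\mathbf{w}_i^\top \boldsymbol{\gamma}
                   + \alpha\, m_i(t)\bigr), \qquad
            \mathbf{b}_i \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}).
\end{align*}
```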
RESULTS:
The analyses were conducted using renal replacement therapy outcome data on 29764 individuals and cardiovascular disease outcome data on 29479 individuals in Scotland (as per the 2019 national registry extract). The CVD dataset was reduced to 24779 individuals with both HbA1c and eGFR measured on the same date, owing to a limitation of the modelling function itself. The datasets include 799 events of renal replacement therapy (RRT) or death due to renal failure (6.71 years average follow-up) and 2274 CVD events (7.54 years average follow-up), respectively. The standard approach to joint modelling, which uses quadrature to integrate over the trajectories of the latent biomarker states and is implemented in rstanarm, was found to be too slow to use even with moderate-sized datasets: 17.5 hours for a subset of 2633 subjects, 35.9 hours for 5265 subjects, and more than 68 hours for 10532 subjects. The sequential Bayesian updating approach was much faster: it was able to analyse a dataset of 29121 individuals over 225598.3 person-years in 19 hours. Comparison of the fit of different longitudinal biomarker submodels showed that models that also included drift and diffusion terms fitted much better (AIC 51139 deviance units lower) than models that included only a linear mixed-model slope term. Despite this, adding terms for diffusion and drift improved predictive performance only slightly for CVD (C-statistic 0.680 to 0.696 for 2112 individuals) and only moderately for end-stage renal disease (C-statistic 0.88 to 0.91 for 2000 individuals). The predictive performance of joint modelling in these datasets was only slightly better than using last-observation-carried-forward in the Poisson regression model (C-statistic 0.819 over 8625 person-years).
CONCLUSIONS:
I have demonstrated that, unlike the standard approach to joint modelling implemented in rstanarm, the time-splitting joint modelling approach based on sequential Bayesian updating can scale to a large dataset and allows biomarker trajectories to be modelled with a wider family of models that fit better than simple linear mixed models. However, in this application, where the only biomarkers were HbA1c and eGFR and the outcomes were time-to-CVD and end-stage renal disease, the increment in the predictive performance of joint modelling compared with last-observation-carried-forward was slight. For other outcomes, where the ability to predict time-to-event depends upon modelling latent biomarker trajectories rather than just carrying the last observation forward, the advantages of joint modelling may be greater.
This thesis proceeds as follows. The first two chapters serve as an introduction to the joint modelling of longitudinal and time-to-event data and its relation to other methods for clinical risk prediction. Briefly, this part explores the rationale for utilising such an approach to better manage chronic diseases such as T1D. The methodological chapters of this thesis describe the mathematical formulation of a multivariate shared-parameter joint model and introduce its application and performance on a subset of individuals with T1D and data pertaining to CVD and ESRD outcomes.
Additionally, the mathematical formulation of an alternative time-splitting approach is demonstrated and compared to a conventional method for estimating longitudinal trajectories of clinical biomarkers used in risk prediction. The key features of the pipeline required to implement this approach are also outlined. The final chapters of the thesis present an applied example that demonstrates the estimation and evaluation of the alternative modelling approach and explores the types of inferences that can be obtained for a subset of individuals with T1D who might progress to ESRD. Finally, this thesis highlights the strengths and weaknesses of applying and scaling up more complex modelling approaches to facilitate dynamic risk prediction for precision medicine.