7 research outputs found

    Experimental Aspects of Synthesis

    Full text link
    We discuss the problem of experimentally evaluating linear-time temporal logic (LTL) synthesis tools for reactive systems. We first survey previous such work for the currently publicly available synthesis tools, and then draw conclusions by deriving useful schemes for future such evaluations. In particular, we explain why previous tools have incompatible scopes and semantics and provide a framework that reduces the impact of this problem for future experimental comparisons of such tools. Furthermore, we discuss which difficulties the complex workflows that begin to appear in modern synthesis tools induce on experimental evaluations and give answers to the question how convincing such evaluations can still be performed in such a setting.Comment: In Proceedings iWIGP 2011, arXiv:1102.374

    Visualization of medical concepts represented using word embeddings: a scoping review.

    No full text
    International audienceBackgroundAnalyzing the unstructured textual data contained in electronic health records (EHRs) has always been a challenging task. Word embedding methods have become an essential foundation for neural network-based approaches in natural language processing (NLP), to learn dense and low-dimensional word representations from large unlabeled corpora that capture the implicit semantics of words. Models like Word2Vec, GloVe or FastText have been broadly applied and reviewed in the bioinformatics and healthcare fields, most often to embed clinical notes or activity and diagnostic codes. Visualization of the learned embeddings has been used in a subset of these works, whether for exploratory or evaluation purposes. However, visualization practices tend to be heterogeneous, and lack overall guidelines.ObjectiveThis scoping review aims to describe the methods and strategies used to visualize medical concepts represented using word embedding methods. We aim to understand the objectives of the visualizations and their limits.MethodsThis scoping review summarizes different methods used to visualize word embeddings in healthcare. We followed the methodology proposed by Arksey and O’Malley (Int J Soc Res Methodol 8:19–32, 2005) and by Levac et al. (Implement Sci 5:69, 2010) to better analyze the data and provide a synthesis of the literature on the matter.ResultsWe first obtained 471 unique articles from a search conducted in PubMed, MedRxiv and arXiv databases. 30 of these were effectively reviewed, based on our inclusion and exclusion criteria. 23 articles were excluded in the full review stage, resulting in the analysis of 7 papers that fully correspond to our inclusion criteria. Included papers pursued a variety of objectives and used distinct methods to evaluate their embeddings and to visualize them. Visualization also served heterogeneous purposes, being alternatively used as a way to explore the embeddings, to evaluate them or to merely illustrate properties otherwise formally assessed.ConclusionsVisualization helps to explore embedding results (further dimensionality reduction, synthetic representation). However, it does not exhaust the information conveyed by the embeddings nor constitute a self-sustaining evaluation method of their pertinence

    Specifications for the Routine Implementation of Federated Learning in Hospitals Networks

    Get PDF
    International audienceWe collected user needs to define a process for setting up Federated Learning in a network of hospitals. We identified seven steps: consortium definition, architecture implementation, clinical study definition, data collection, initialization, model training and results sharing. This process adapts certain steps from the classical centralized multicenter framework and brings new opportunities for interaction thanks to the architecture of the Federated Learning algorithms. It is open for completion to cover a variety of scenarios

    A Decentralized Framework for Biostatistics and Privacy Concerns

    Get PDF
    International audienceBiostatistics and machine learning have been the cornerstone of a variety of recent developments in medicine. In order to gather large enough datasets, it is often necessary to set up multi-centric studies; yet, centralization of measurements can be difficult, either for practical, legal or ethical reasons. As an alternative, federated learning enables leveraging multiple centers' data without actually collating them. While existing works generally require a center to act as a leader and coordinate computations, we propose a fully decentralized framework where each center plays the same role. In this paper, we apply this framework to logistic regression, including confidence intervals computation. We test our algorithm on two distinct clinical datasets split among different centers, and show that it matches results from the centralized framework. In addition, we discuss possible privacy leaks and potential protection mechanisms, paving the way towards further research

    Effects of Immunoglobulins G From Systemic Sclerosis Patients in Normal Dermal Fibroblasts: A Multi-Omics Study

    No full text
    International audienceAutoantibodies (Aabs) are frequent in systemic sclerosis (SSc). Although recognized as potent biomarkers, their pathogenic role is debated. This study explored the effect of purified immunoglobulin G (IgG) from SSc patients on protein and mRNA expression of dermal fibroblasts (FBs) using an innovative multi-omics approach. Dermal FBs were cultured in the presence of sera or purified IgG from patients with diffuse cutaneous SSc (dcSSc), limited cutaneous SSc or healthy controls (HCs). The FB proteome and transcriptome were explored using liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) and microarray assays, respectively. Proteomic analysis identified 3,310 proteins. SSc sera and purified IgG induced singular protein profile patterns. These FB proteome changes depended on the Aab serotype, with a singular effect observed with purified IgG from anti-topoisomerase-I autoantibody (ATA) positive patients compared to HC or other SSc serotypes. IgG from ATA positive SSc patients induced enrichment in proteins involved in focal adhesion, cadherin binding, cytosolic part, or lytic vacuole. Multi-omics analysis was performed in two ways: first by restricting the analysis of the transcriptomic data to differentially expressed proteins; and secondly, by performing a global statistical analysis integrating proteomics and transcriptomics. Transcriptomic analysis distinguished 764 differentially expressed genes and revealed that IgG from dcSSc can induce extracellular matrix (ECM) remodeling changes in gene expression profiles in FB. Global statistical analysis integrating proteomics and transcriptomics confirmed that IgG from SSc can induce ECM remodeling and activate FB profiles. This effect depended on the serotype of the patient, suggesting that SSc Aab might play a pathogenic role in some SSc subsets

    Linking Biomedical Data Warehouse Records With the National Mortality Database in France: Large-scale Matching Algorithm

    No full text
    International audienceBackground: Often missing from or uncertain in a biomedical data warehouse (BDW), vital status after discharge is central to the value of a BDW in medical research. The French National Mortality Database (FNMD) offers open-source nominative records of every death. Matching large-scale BDWs records with the FNMD combines multiple challenges: absence of unique common identifiers between the 2 databases, names changing over life, clerical errors, and the exponential growth of the number of comparisons to compute.Objective: We aimed to develop a new algorithm for matching BDW records to the FNMD and evaluated its performance.Methods: We developed a deterministic algorithm based on advanced data cleaning and knowledge of the naming system and the Damerau-Levenshtein distance (DLD). The algorithm's performance was independently assessed using BDW data of 3 university hospitals: Lille, Nantes, and Rennes. Specificity was evaluated with living patients on January 1, 2016 (ie, patients with at least 1 hospital encounter before and after this date). Sensitivity was evaluated with patients recorded as deceased between January 1, 2001, and December 31, 2020. The DLD-based algorithm was compared to a direct matching algorithm with minimal data cleaning as a reference.Results: All centers combined, sensitivity was 11% higher for the DLD-based algorithm (93.3%, 95% CI 92.8-93.9) than for the direct algorithm (82.7%, 95% CI 81.8-83.6; P<.001). Sensitivity was superior for men at 2 centers (Nantes: 87%, 95% CI 85.1-89 vs 83.6%, 95% CI 81.4-85.8; P=.006; Rennes: 98.6%, 95% CI 98.1-99.2 vs 96%, 95% CI 94.9-97.1; P<.001) and for patients born in France at all centers (Nantes: 85.8%, 95% CI 84.3-87.3 vs 74.9%, 95% CI 72.8-77.0; P<.001). The DLD-based algorithm revealed significant differences in sensitivity among centers (Nantes, 85.3% vs Lille and Rennes, 97.3%, P<.001). Specificity was >98% in all subgroups. Our algorithm matched tens of millions of death records from BDWs, with parallel computing capabilities and low RAM requirements. We used the Inseehop open-source R script for this measurement.Conclusions: Overall, sensitivity/recall was 11% higher using the DLD-based algorithm than that using the direct algorithm. This shows the importance of advanced data cleaning and knowledge of a naming system through DLD use. Statistically significant differences in sensitivity between groups could be found and must be considered when performing an analysis to avoid differential biases. Our algorithm, originally conceived for linking a BDW with the FNMD, can be used to match any large-scale databases. While matching operations using names are considered sensitive computational operations, the Inseehop package released her
    corecore