27 research outputs found

    Simultaneous Measurement Imputation and Outcome Prediction for Achilles Tendon Rupture Rehabilitation

    Achilles Tendon Rupture (ATR) is a typical soft tissue injury. Rehabilitation after such a musculoskeletal injury remains a prolonged process with a highly variable outcome. Accurately predicting the rehabilitation outcome is crucial for treatment decision support. However, it is challenging to train an automatic method for predicting the ATR rehabilitation outcome from treatment data, because a large share of the entries recorded from ATR patients are missing and the relations between measurements and outcomes are complex and nonlinear. In this work, we design an end-to-end probabilistic framework to impute missing data entries and predict rehabilitation outcomes simultaneously. We evaluate our model on a real-life ATR clinical cohort, comparing it with various baselines. The proposed method clearly outperforms traditional methods, which typically perform imputation and prediction in two separate stages.

    Matcha-TTS: A fast TTS architecture with conditional flow matching

    We introduce Matcha-TTS, a new encoder-decoder architecture for speedy TTS acoustic modelling, trained using optimal-transport conditional flow matching (OT-CFM). This yields an ODE-based decoder capable of high output quality in fewer synthesis steps than models trained using score matching. Careful design choices additionally ensure each synthesis step is fast to run. The method is probabilistic, non-autoregressive, and learns to speak from scratch without external alignments. Compared to strong pre-trained baseline models, the Matcha-TTS system has the smallest memory footprint, rivals the speed of the fastest models on long utterances, and attains the highest mean opinion score in a listening test. Please see https://shivammehta25.github.io/Matcha-TTS/ for audio examples, code, and pre-trained models. Comment: 5 pages, 3 figures. Submitted to ICASSP 202
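As background for the OT-CFM training mentioned in this abstract, the objective is commonly written in the flow matching literature in roughly the following form. This is a sketch under stated assumptions, not a formula taken from the abstract: the symbols $x_0$ (Gaussian noise sample), $x_1$ (data sample), $t \sim \mathcal{U}[0,1]$, $\sigma_{\min}$ (a small constant), and $v_\theta$ (the learned vector field) are all introduced here for illustration, and any conditioning on the text encoder output is omitted.

```latex
\begin{align}
\phi_t(x_0) &= \bigl(1 - (1 - \sigma_{\min})\,t\bigr)\,x_0 + t\,x_1, \\
u_t &= \frac{\mathrm{d}}{\mathrm{d}t}\,\phi_t(x_0) = x_1 - (1 - \sigma_{\min})\,x_0, \\
\mathcal{L}(\theta) &= \mathbb{E}_{t,\,x_0,\,x_1}
  \bigl\lVert v_\theta\bigl(\phi_t(x_0),\,t\bigr) - u_t \bigr\rVert^2 .
\end{align}
```

Because the target $u_t$ is constant along each straight path from $x_0$ to $x_1$, an ODE solver can follow the learned field in relatively few steps, which is consistent with the abstract's claim of fewer synthesis steps.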

    Neuropathic Pain Diagnosis Simulator for Causal Discovery Algorithm Evaluation

    Discovery of causal relations from observational data is essential for many disciplines of science and real-world applications. However, unlike other machine learning algorithms, whose development has been greatly fostered by a large number of available benchmark datasets, causal discovery algorithms are notoriously difficult to evaluate systematically because few datasets with known ground-truth causal relations are available. In this work, we address the problem of evaluating causal discovery algorithms by building a flexible simulator in the medical setting. We develop a neuropathic pain diagnosis simulator, inspired by the fact that the biological processes of neuropathic pathophysiology are well studied, with well-understood causal influences. Our simulator exploits the causal graph of the neuropathic pain pathology, and the parameters of its generator are estimated from real-life patient cases. We show that the data generated by our simulator have statistics similar to those of real-world data. As a clear advantage, the simulator can produce an unlimited number of samples without jeopardizing the privacy of real-world patients. Our simulator provides a natural tool for evaluating various types of causal discovery algorithms, including those that deal with practical issues in causal discovery, such as unknown confounders, selection bias, and missing data. Using our simulator, we have extensively evaluated causal discovery algorithms under various settings. Comment: Accepted by NeurIPS 2019, 6 figures, 10 table

    Controllable Motion Synthesis and Reconstruction with Autoregressive Diffusion Models

    Data-driven and controllable human motion synthesis and prediction are active research areas with various applications in interactive media and social robotics. Challenges remain in these fields in generating diverse motions given past observations and in dealing with imperfect poses. This paper introduces MoDiff, an autoregressive probabilistic diffusion model over motion sequences conditioned on control contexts of other modalities. Our model integrates a cross-modal Transformer encoder and a Transformer-based decoder, which are found to be effective in capturing temporal correlations in motion and control modalities. We also introduce a new data dropout method based on the diffusion forward process to provide richer data representations and robust generation. We demonstrate the superior performance of MoDiff in controllable motion synthesis for locomotion with respect to two baselines, and show the benefits of diffusion data dropout for robust synthesis and reconstruction of high-fidelity motion close to recorded data.

    Causal discovery in the presence of missing data

    Missing data are ubiquitous in many domains such as healthcare. Depending on how they are missing, the (conditional) independence relations in the observed data may differ from those in the complete data generated by the underlying causal process (which are not fully observable). As a consequence, simply applying existing causal discovery methods to the observed data may give wrong conclusions. It is therefore essential to extend existing causal discovery approaches to find the true underlying causal structure from such incomplete data. In this thesis, we aim to solve this problem for data that are missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR). With missingness mechanisms represented by the Missingness Graph, we present conditions under which additional correction is needed to derive conditional independence/dependence relations in the complete data. Combined with the correction method, which gives closed-form, consistent tests of conditional independence, the proposed causal discovery method, an extension of the PC algorithm, is shown to give asymptotically correct results. Experimental results illustrate that, with further reasonable assumptions, the proposed algorithm can correct the conditional independence relations for values that are MCAR, MAR, and rather general cases of MNAR.
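To make the three missingness mechanisms named in this abstract concrete, here is a minimal Python sketch using their standard textbook definitions. It is only an illustration and not the thesis's method: the function names `make_missing` and `observed_mean`, the toy model `y = x + noise`, and the drop rates are all assumptions introduced here.

```python
import random


def make_missing(rows, mechanism, rate=0.3, seed=0):
    """Return a copy of rows [(x, y), ...] with some y values replaced by None.

    MCAR: y is dropped with a fixed probability, independent of x and y.
    MAR:  the drop probability depends only on the always-observed x.
    MNAR: the drop probability depends on the (possibly unobserved) y itself.
    """
    rng = random.Random(seed)
    out = []
    for x, y in rows:
        if mechanism == "MCAR":
            p = rate
        elif mechanism == "MAR":
            p = 2 * rate if x > 0 else 0.0
        elif mechanism == "MNAR":
            p = 2 * rate if y > 0 else 0.0
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        out.append((x, None if rng.random() < p else y))
    return out


def observed_mean(rows):
    """Mean of y over the rows where y was actually recorded."""
    vals = [y for _, y in rows if y is not None]
    return sum(vals) / len(vals)


# Toy data (hypothetical): x is standard normal, y = x + standard normal noise.
_rng = random.Random(1)
data = []
for _ in range(20000):
    x = _rng.gauss(0, 1)
    data.append((x, x + _rng.gauss(0, 1)))

mcar = make_missing(data, "MCAR")
mar = make_missing(data, "MAR")
mnar = make_missing(data, "MNAR")

print("full data :", round(observed_mean(data), 3))
print("MCAR      :", round(observed_mean(mcar), 3))
print("MNAR      :", round(observed_mean(mnar), 3))
```

In this toy setting the observed mean of `y` stays close to the full-data mean under MCAR but shifts clearly under MNAR, a simple instance of statistics on observed data differing from those of the complete data.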

    A Further Step of Causal Discovery towards Real-World Impacts

    The goal of many sciences is to find causal relationships and understand underlying mechanisms. Randomized experiments, the gold standard for finding causal relationships, can be difficult or impossible in some applications; hence, determining underlying causal relationships purely from observational data, i.e., causal discovery, has attracted increasing attention in many domains, such as earth science, biology, and healthcare. On the one hand, computational methods of causal discovery have been developed and improved significantly over the past three decades. On the other hand, many challenges remain in both practice and theory before further real-world impact can be achieved. This thesis introduces the typical methods and challenges of causal discovery and then elaborates on the contributions of the included papers, which take steps towards more real-world impact for causal discovery. It mainly covers four challenges: practical issues, understanding and generalizing the restrictive assumptions, the lack of benchmark data sets, and applications of causality in machine learning topics. Each included paper contributes to one of the challenges. In the first paper, regarding causal discovery in the presence of missing data as one of the practical issues, we theoretically study the influence of missing values on causal discovery methods and then correct the errors in their results. Under mild assumptions, our proposed method provides asymptotically correct results. In the second paper, we investigate the understanding of assumptions in a class of causal discovery methods. Such methods impose substantial constraints on the functional classes and distributions of causal processes in order to determine causal relationships; however, the constraints are restrictive and not well understood. Therefore, we introduce a new dynamical-system view for understanding the methods and their constraints by connecting optimal transport and causal discovery. Furthermore, we provide a causal discovery criterion and a robust optimal transport-based algorithm. In the third paper, the evaluation of causal discovery methods is discussed. While it is too simplistic to evaluate causal discovery methods on synthetic data generated from random causal graphs, real-world benchmark data sets with ground-truth causal relations are in great demand and always include practical issues. Thus, we create a neuropathic pain diagnosis simulator based on real-world patient records and domain knowledge. The simulator provides ground-truth causal relations and generates simulation data that medical experts cannot distinguish from real data. Finally, we explore an application of causality: fairness in machine learning. Many fairness works are based on constraints on static statistical measures across different demographic groups. It turns out that decisions under such constraints can lead to a pernicious long-term impact on the disadvantaged group. Therefore, we consider the underlying causal processes, theoretically analyze the equilibrium states of dynamical systems under various fairness constraints, show their impact on equilibrium states, and introduce potentially effective interventions to improve the equilibrium states.

    Optimal transport for causal discovery

    To determine causal relationships between two variables, approaches based on Functional Causal Models (FCMs) have been proposed that properly restrict model classes; however, their performance is sensitive to the model assumptions, which makes them difficult to use. In this paper, we provide a novel dynamical-system view of FCMs and propose a new framework for identifying causal direction in the bivariate case. We first show the connection between FCMs and optimal transport, and then study optimal transport under the constraints of FCMs. Furthermore, by exploiting the dynamical interpretation of optimal transport under the FCM constraints, we determine the underlying dynamical process corresponding to the static cause-effect pair data. This provides a new dimension for describing static causal discovery tasks while allowing more freedom in modeling the quantitative causal influences. In particular, we show that Additive Noise Models (ANMs) correspond to volume-preserving pressureless flows. Consequently, based on their velocity field divergence, we introduce a criterion for determining causal direction. With this criterion, we propose a novel optimal transport-based algorithm for ANMs that is robust to the choice of models, and we extend it to post-nonlinear models. Our method demonstrated state-of-the-art results on both synthetic and causal discovery benchmark datasets.
