7,751 research outputs found

    Modeling Complex Disease Trajectories using Deep Generative Models with Semi-Supervised Latent Processes

    Full text link
    In this paper, we propose a deep generative time series approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories. We aim to find meaningful temporal latent representations of an underlying generative process that explain the observed disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent temporal processes, we develop a semi-supervised approach for disentangling the latent space using established medical concepts. By combining the generative approach with medical knowledge, we leverage the ability to discover novel aspects of the disease while integrating medical concepts into the model. We show that the learned temporal latent processes can be utilized for further data analysis and clinical hypothesis testing, including finding similar patients and clustering the disease into new sub-types. Moreover, our method enables personalized online monitoring and prediction of multivariate time series including uncertainty quantification. We demonstrate the effectiveness of our approach in modeling systemic sclerosis, showcasing the potential of our machine learning model to capture complex disease trajectories and acquire new medical knowledge.Comment: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 23 page

    A framework for automated anomaly detection in high frequency water-quality data from in situ sensors

    Full text link
    River water-quality monitoring is increasingly conducted using automated in situ sensors, enabling timelier identification of unexpected values. However, anomalies caused by technical issues confound these data, while the volume and velocity of data prevent manual detection. We present a framework for automated anomaly detection in high-frequency water-quality data from in situ sensors, using turbidity, conductivity and river level data. After identifying end-user needs and defining anomalies, we ranked their importance and selected suitable detection methods. High priority anomalies included sudden isolated spikes and level shifts, most of which were classified correctly by regression-based methods such as autoregressive integrated moving average models. However, using other water-quality variables as covariates reduced performance due to complex relationships among variables. Classification of drift and periods of anomalously low or high variability improved when we applied replaced anomalous measurements with forecasts, but this inflated false positive rates. Feature-based methods also performed well on high priority anomalies, but were also less proficient at detecting lower priority anomalies, resulting in high false negative rates. Unlike regression-based methods, all feature-based methods produced low false positive rates, but did not and require training or optimization. Rule-based methods successfully detected impossible values and missing observations. Thus, we recommend using a combination of methods to improve anomaly detection performance, whilst minimizing false detection rates. Furthermore, our framework emphasizes the importance of communication between end-users and analysts for optimal outcomes with respect to both detection performance and end-user needs. Our framework is applicable to other types of high frequency time-series data and anomaly detection applications

    Energy Analytics for Infrastructure: An Application to Institutional Buildings

    Get PDF
    abstract: Commercial buildings in the United States account for 19% of the total energy consumption annually. Commercial Building Energy Consumption Survey (CBECS), which serves as the benchmark for all the commercial buildings provides critical input for EnergyStar models. Smart energy management technologies, sensors, innovative demand response programs, and updated versions of certification programs elevate the opportunity to mitigate energy-related problems (blackouts and overproduction) and guides energy managers to optimize the consumption characteristics. With increasing advancements in technologies relying on the ‘Big Data,' codes and certification programs such as the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE), and the Leadership in Energy and Environmental Design (LEED) evaluates during the pre-construction phase. It is mostly carried out with the assumed quantitative and qualitative values calculated from energy models such as Energy Plus and E-quest. However, the energy consumption analysis through Knowledge Discovery in Databases (KDD) is not commonly used by energy managers to perform complete implementation, causing the need for better energy analytic framework. The dissertation utilizes Interval Data (ID) and establishes three different frameworks to identify electricity losses, predict electricity consumption and detect anomalies using data mining, deep learning, and mathematical models. The process of energy analytics integrates with the computational science and contributes to several objectives which are to 1. Develop a framework to identify both technical and non-technical losses using clustering and semi-supervised learning techniques. 2. Develop an integrated framework to predict electricity consumption using wavelet based data transformation model and deep learning algorithms. 3. Develop a framework to detect anomalies using ensemble empirical mode decomposition and isolation forest algorithms. With a thorough research background, the first phase details on performing data analytics on the demand-supply database to determine the potential energy loss reduction potentials. Data preprocessing and electricity prediction framework in the second phase integrates mathematical models and deep learning algorithms to accurately predict consumption. The third phase employs data decomposition model and data mining techniques to detect the anomalies of institutional buildings.Dissertation/ThesisDoctoral Dissertation Civil, Environmental and Sustainable Engineering 201
    • …
    corecore