7 research outputs found

    Pointed subspace approach to incomplete data

    Incomplete data are often represented as vectors with filled missing attributes joined with flag vectors indicating missing components. In this paper, we generalize this approach and represent incomplete data as pointed affine subspaces. This allows various affine transformations of the data, such as whitening or dimensionality reduction, to be performed. Moreover, this representation preserves the information about which coordinates were missing. To use our representation in practical classification tasks, we embed such generalized missing data into a vector space and define the scalar product of the embedding space. Our representation is easy to implement and can be used together with typical kernel methods. Experiments show that applying an SVM classifier to the proposed subspace approach yields highly accurate results.
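The baseline representation mentioned in the first sentence of this abstract (filled attributes joined with a missing-value flag vector) can be sketched as follows. This is a minimal illustration only, not the pointed-subspace method of the paper; the helper names `column_means` and `fill_and_flag`, the use of `None` to mark missing values, and the choice of column means as fill values are all assumptions made for the example.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]


def column_means(rows: Sequence[Row]) -> List[float]:
    """Mean of the observed (non-None) values in each column.

    A column with no observed values falls back to 0.0 (an arbitrary choice).
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def fill_and_flag(row: Row, fill_values: Sequence[float]) -> List[float]:
    """Return the filled attribute vector joined with a 0/1 missing-flag vector."""
    filled = [fill_values[j] if v is None else float(v) for j, v in enumerate(row)]
    flags = [1.0 if v is None else 0.0 for v in row]
    return filled + flags


# Hypothetical toy data: None marks a missing attribute.
data = [
    [1.0, None, 3.0],
    [2.0, 4.0, None],
    [3.0, 6.0, 5.0],
]
means = column_means(data)
encoded = [fill_and_flag(r, means) for r in data]
```

Each encoded row has twice the original dimension: the first half holds the filled values and the second half records which coordinates were missing, so that information is not lost after imputation.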

    Learning Better Clinical Risk Models.

    Risk models are used to estimate a patient’s risk of suffering particular outcomes throughout clinical practice. These models are important for matching patients to the appropriate level of treatment, for effective allocation of resources, and for fairly evaluating the performance of healthcare providers. The application and development of methods from the field of machine learning has the potential to improve patient outcomes and reduce healthcare spending with more accurate estimates of patient risk. This dissertation addresses several limitations of currently used clinical risk models, through the identification of novel risk factors and through the training of more effective models. As wearable monitors become more effective and less costly, the previously untapped predictive information in a patient’s physiology over time has the potential to greatly improve clinical practice. However, translating these technological advances into real-world clinical impact will require computational methods to identify high-risk structure in the data. This dissertation presents several approaches to learning risk factors from physiological recordings, through the discovery of latent states using topic models, and through the identification of predictive features using convolutional neural networks. We evaluate these approaches on patients from a large clinical trial and find that these methods not only outperform prior approaches to leveraging heart rate for cardiac risk stratification, but also improve overall prediction of cardiac death when considered alongside standard clinical risk factors. We also demonstrate the utility of this work for learning a richer description of sleep recordings. Additionally, we consider the development of risk models in the presence of missing data, which is ubiquitous in real-world medical settings. We present a novel method for jointly learning risk and imputation models in the presence of missing data, and find significant improvements relative to standard approaches when evaluated on a large national registry of trauma patients.
    PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/113326/1/alexve_1.pd

    Cybersecurity and safety analysis in online social networks

    This research work deals with security and safety issues related to the use of online social networks and presents AI-based solutions to address these issues.

    Predictive Learning from Real-World Medical Data: Overcoming Quality Challenges

    Randomized controlled trials (RCTs) are pivotal in medical research and are regarded as the gold standard, but they face challenges, especially with specific groups such as pregnant women and newborns. Real-world data (RWD), from sources like electronic medical records and insurance claims, complements RCTs in areas like disease risk prediction and diagnosis. However, RWD's retrospective nature leads to issues such as missing values and data imbalance, requiring intensive data preprocessing. To enhance RWD's quality for predictive modeling, this thesis introduces a suite of algorithms developed to automatically resolve RWD's quality issues. The AMI-Net method is introduced first; it treats samples as bags of feature-value pairs and unifies them in an embedding space using a multi-instance neural network. It excels in handling incomplete datasets, a frequent issue in real-world scenarios, and shows resilience to noise and class imbalances. AMI-Net's capability to discern informative instances minimizes the effects of low-quality data. The enhanced version, AMI-Net+, improves instance selection, boosting performance and generalization. However, the AMI-Net series initially processes only binary input features, a constraint overcome by AMI-Net3, which supports binary, nominal, ordinal, and continuous features. Despite these advancements, challenges like missing values, data inconsistencies, and labeling errors persist in real-world data. The AMI-Net series also shows promise for regression and multi-task learning, potentially mitigating low-quality data issues. Tested on various hospital datasets, these methods prove effective, though risks of overfitting and bias remain, necessitating further research. Overall, while these methods are promising for clinical studies and other applications, ensuring data quality and reliability is crucial for their success.