51,376 research outputs found

    Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values

    Full text link
    This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: noisy, with missing entries, with imbalance in classes of interests, leading to serious bias in predictive modeling. Since standard data mining methods often produce poor performance measures, we argue for development of specialized techniques of data-preprocessing and classification. In this paper, we propose a new method to simultaneously classify large datasets and reduce the effects of missing values. It is based on a multilevel framework of the cost-sensitive SVM and the expected maximization imputation method for missing values, which relies on iterated regression analyses. We compare classification results of multilevel SVM-based algorithms on public benchmark datasets with imbalanced classes and missing values as well as real data in health applications, and show that our multilevel SVM-based method produces fast, and more accurate and robust classification results.Comment: arXiv admin note: substantial text overlap with arXiv:1503.0625

    Medical data processing and analysis for remote health and activities monitoring

    Get PDF
    Recent developments in sensor technology, wearable computing, Internet of Things (IoT), and wireless communication have given rise to research in ubiquitous healthcare and remote monitoring of human\u2019s health and activities. Health monitoring systems involve processing and analysis of data retrieved from smartphones, smart watches, smart bracelets, as well as various sensors and wearable devices. Such systems enable continuous monitoring of patients psychological and health conditions by sensing and transmitting measurements such as heart rate, electrocardiogram, body temperature, respiratory rate, chest sounds, or blood pressure. Pervasive healthcare, as a relevant application domain in this context, aims at revolutionizing the delivery of medical services through a medical assistive environment and facilitates the independent living of patients. In this chapter, we discuss (1) data collection, fusion, ownership and privacy issues; (2) models, technologies and solutions for medical data processing and analysis; (3) big medical data analytics for remote health monitoring; (4) research challenges and opportunities in medical data analytics; (5) examples of case studies and practical solutions

    Modeling Big Medical Survival Data Using Decision Tree Analysis with Apache Spark

    Get PDF
    In many medical studies, an outcome of interest is not only whether an event occurred, but when an event occurred; and an example of this is Alzheimer’s disease (AD). Identifying patients with Mild Cognitive Impairment (MCI) who are likely to develop Alzheimer’s disease (AD) is highly important for AD treatment. Previous studies suggest that not all MCI patients will convert to AD. Massive amounts of data from longitudinal and extensive studies on thousands of Alzheimer’s patients have been generated. Building a computational model that can predict conversion form MCI to AD can be highly beneficial for early intervention and treatment planning for AD. This work presents a big data model that contains machine-learning techniques to determine the level of AD in a participant and predict the time of conversion to AD. The proposed framework considers one of the widely used screening assessment for detecting cognitive impairment called Montreal Cognitive Assessment (MoCA). MoCA data set was collected from different centers and integrated into our large data framework storage using a Hadoop Data File System (HDFS); the data was then analyzed using an Apache Spark framework. The accuracy of the proposed framework was compared with a semi-parametric Cox survival analysis model

    An intelligent information forwarder for healthcare big data systems with distributed wearable sensors

    Get PDF
    © 2016 IEEE. An increasing number of the elderly population wish to live an independent lifestyle, rather than rely on intrusive care programmes. A big data solution is presented using wearable sensors capable of carrying out continuous monitoring of the elderly, alerting the relevant caregivers when necessary and forwarding pertinent information to a big data system for analysis. A challenge for such a solution is the development of context-awareness through the multidimensional, dynamic and nonlinear sensor readings that have a weak correlation with observable human behaviours and health conditions. To address this challenge, a wearable sensor system with an intelligent data forwarder is discussed in this paper. The forwarder adopts a Hidden Markov Model for human behaviour recognition. Locality sensitive hashing is proposed as an efficient mechanism to learn sensor patterns. A prototype solution is implemented to monitor health conditions of dispersed users. It is shown that the intelligent forwarders can provide the remote sensors with context-awareness. They transmit only important information to the big data server for analytics when certain behaviours happen and avoid overwhelming communication and data storage. The system functions unobtrusively, whilst giving the users peace of mind in the knowledge that their safety is being monitored and analysed

    Predicting diabetes-related hospitalizations based on electronic health records

    Full text link
    OBJECTIVE: To derive a predictive model to identify patients likely to be hospitalized during the following year due to complications attributed to Type II diabetes. METHODS: A variety of supervised machine learning classification methods were tested and a new method that discovers hidden patient clusters in the positive class (hospitalized) was developed while, at the same time, sparse linear support vector machine classifiers were derived to separate positive samples from the negative ones (non-hospitalized). The convergence of the new method was established and theoretical guarantees were proved on how the classifiers it produces generalize to a test set not seen during training. RESULTS: The methods were tested on a large set of patients from the Boston Medical Center - the largest safety net hospital in New England. It is found that our new joint clustering/classification method achieves an accuracy of 89% (measured in terms of area under the ROC Curve) and yields informative clusters which can help interpret the classification results, thus increasing the trust of physicians to the algorithmic output and providing some guidance towards preventive measures. While it is possible to increase accuracy to 92% with other methods, this comes with increased computational cost and lack of interpretability. The analysis shows that even a modest probability of preventive actions being effective (more than 19%) suffices to generate significant hospital care savings. CONCLUSIONS: Predictive models are proposed that can help avert hospitalizations, improve health outcomes and drastically reduce hospital expenditures. The scope for savings is significant as it has been estimated that in the USA alone, about $5.8 billion are spent each year on diabetes-related hospitalizations that could be prevented.Accepted manuscrip
    corecore