1,383 research outputs found

    Time-Series Embedded Feature Selection Using Deep Learning: Data Mining Electronic Health Records for Novel Biomarkers

    As health information technologies continue to advance, routine collection and digitisation of patient health records in the form of electronic health records present an ideal opportunity for data-mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse domain of patient outcomes. Patient records have continually become more widely available through various initiatives enabling open access whilst maintaining critical patient privacy. In spite of such progress, health records are still not widely adopted within the current clinical statistical analysis domain due to challenging issues derived from such “big data”. Deep learning based temporal modelling approaches present an ideal solution to health record challenges through automated self-optimisation of representation learning, able to manageably compose the high-dimensional domain of patient records into data representations able to model complex data associations. Such representations can serve to condense and reduce dimensionality to emphasise feature sparsity and importance through novel embedded feature selection approaches. Accordingly, application towards patient records enables complex modelling and analysis of the full domain of clinical features to select biomarkers of predictive relevance. Firstly, we propose a novel entropy regularised neural network ensemble able to highlight risk factors associated with hospitalisation risk of individuals with dementia. Its application reduced a large domain of unique medical events to a small set of relevant risk factors while maintaining hospitalisation discrimination. Following on, we continue our work on ensemble architecture approaches with a novel cascading LSTM ensemble to predict severe sepsis onset within critical patients in an ICU critical care centre. 
We demonstrate state-of-the-art performance capabilities able to outperform that of current related literature. Finally, we propose a novel embedded feature selection application dubbed 1D convolution feature selection using sparsity regularisation. Said methodology was evaluated on both domains of dementia and sepsis prediction objectives to highlight model capability and generalisability. We further report a selection of potential biomarkers for the aforementioned case study objectives highlighting clinical relevance and potential novelty value for future clinical analysis. Accordingly, we demonstrate the effective capability of embedded feature selection approaches through the application of temporal based deep learning architectures in the discovery of effective biomarkers across a variety of challenging clinical applications
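
    As a generic illustration of how sparsity regularisation can act as embedded feature selection, the sketch below applies the soft-thresholding operator associated with an L1 penalty to a set of feature weights. It is not the thesis's 1D convolution method; the feature names, weight values, and threshold `lam` are all invented for illustration.

```python
def soft_threshold(weight, lam):
    """Proximal operator of the L1 penalty: shrink a weight towards zero by lam."""
    if weight > lam:
        return weight - lam
    if weight < -lam:
        return weight + lam
    return 0.0  # weights with magnitude <= lam are set exactly to zero


# Hypothetical feature weights (names and values are illustrative only).
weights = {"feature_a": 0.90, "feature_b": -0.05, "feature_c": 0.02, "feature_d": -0.60}
lam = 0.10  # assumed regularisation strength

# Shrink every weight, then keep only the features whose weight survived.
shrunk = {name: soft_threshold(w, lam) for name, w in weights.items()}
selected = [name for name, w in shrunk.items() if w != 0.0]
```

    Features whose weights fall below the threshold are zeroed out and dropped, which is the sense in which a sparsity penalty can select features as a by-product of training.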

    Prediction of Concurrent Hypertensive Disorders in Pregnancy and Gestational Diabetes Mellitus Using Machine Learning Techniques

    Gestational diabetes mellitus and hypertensive disorders in pregnancy are serious maternal health conditions with immediate and lifelong mother-child health consequences. These obstetric pathologies have been widely investigated, but mostly in silos, while studies focusing on their simultaneous occurrence rarely exist. This is especially the case in the machine learning domain. This retrospective study sought to investigate, construct, evaluate, compare, and isolate a supervised machine learning predictive model for the binary classification of co-occurring gestational diabetes mellitus and hypertensive disorders in pregnancy in a cohort of otherwise healthy pregnant women. To accomplish the stated aims, this study analyzed an extract (n=4624, n_features=38) of a labelled maternal perinatal dataset (n=9967, n_fields=79) collected by the PeriData.Net® database from a participating community hospital in Southeast Wisconsin between 2013 and 2018. The datasets were named, “WiseSample” and “WiseSubset” respectively in this study. Thirty-three models were constructed with the six supervised machine learning algorithms explored on the extracted dataset: logistic regression, random forest, decision tree, support vector machine, StackingClassifier, and KerasClassifier, which is a deep learning classification algorithm; all were evaluated using the StratifiedKfold cross-validation (k=10) method. The Synthetic Minority Oversampling Technique was applied to the training data to resolve the class imbalance that was noted in the sub-sample at the preprocessing phase. A wide range of evidence-based feature selection techniques were used to identify the best predictors of the comorbidity under investigation. Multiple model performance evaluation metrics that were employed to quantitatively evaluate and compare model performance quality include accuracy, F1, precision, recall, and the area under the receiver operating characteristic curve. 
Support Vector Machine objectively emerged as the most generalizable model for identifying the gravidae in WiseSubset who may develop concurrent gestational diabetes mellitus and hypertensive disorders in pregnancy, scoring 100.00% (mean) in recall. The model consisted of 9 predictors extracted by recursive feature elimination with cross-validation using random forest. Findings from this study show that appropriate machine learning methods can reliably predict comorbid gestational diabetes and hypertensive disorders in pregnancy, using readily available routine prenatal attributes. Six of the nine most predictive factors of the comorbidity were also in the top 6 selections of at least one other feature selection method examined. The six predictors are healthy weight prepregnancy BMI, mother’s educational status, husband’s educational status, husband’s occupation in one year before the current pregnancy, mother’s blood group, and mother’s age range between 34 and 44 years. Insight from this analysis would support clinical decision making of obstetric experts when they are caring for 1.) nulliparous women, since they would have no obstetric history that could prompt their care providers for feto-maternal medical surveillance; and 2.) the experienced mothers with no obstetric history suggestive of any of the disease(s) under this study. Hence, among other benefits, the artificial-intelligence-backed tool designed in this research would likely improve maternal and child care quality outcomes
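
    The evaluation metrics named in this abstract can be made concrete with a small sketch. The confusion-matrix counts below are invented for illustration and are not results from the study; the function name `classification_metrics` is likewise hypothetical. The example shows why a recall of 100% is best read alongside precision: a model can catch every positive case and still raise many false alarms.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: every true case is caught (fn = 0), with 30 false alarms.
example = classification_metrics(tp=20, fp=30, fn=0, tn=150)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

    With these made-up counts, recall is perfect while precision is well below one half, so the F1 score sits between the two.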

    Clinical Big Data and Deep Learning: Applications, Challenges, and Future Outlooks

    The explosion of digital healthcare data has led to a surge of data-driven medical research based on machine learning. In recent years, as a powerful technique for big data, deep learning has gained a central position in machine learning circles for its great advantages in feature representation and pattern recognition. This article presents a comprehensive overview of studies that employ deep learning methods to deal with clinical data. Firstly, based on the analysis of the characteristics of clinical data, various types of clinical data (e.g., medical images, clinical notes, lab results, vital signs and demographic informatics) are discussed and details provided of some public clinical datasets. Secondly, a brief review of common deep learning models and their characteristics is conducted. Then, considering the wide range of clinical research and the diversity of data types, several deep learning applications for clinical data are illustrated: auxiliary diagnosis, prognosis, early warning, and other tasks. Although there are challenges involved in applying deep learning techniques to clinical data, it is still worthwhile to look forward to a promising future for deep learning applications in clinical big data in the direction of precision medicine

    Center for Research on Sustainable Forests 2013 Annual Report

    Together, all of the scientists associated with the CRSF brought a total of $1.85 million in outside revenue to support forest research in Maine and the northern forest. Of that, $1.36 million (or 73%) was spent directly on the research. The Maine Economic Improvement Fund (MEIF) provides base operating funds for the CRSF. The $144K investment by MEIF this year leveraged another $1.71 million from outside sources to support the CRSF mission, thus providing a 12:1 return. A hallmark of the success of the CRSF research effort is also measured by the 130 organizations that collaborated directly in the research presented in this report. Results from CRSF research were presented this year in 32 journal articles; 32 book chapters, theses, and research reports; and 130 presentations at conferences and meetings
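
    The percentages quoted in this summary can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract; the variable names are illustrative.

```python
# Figures quoted in the report summary (amounts in dollars).
total_outside_revenue = 1_850_000
spent_on_research = 1_360_000
meif_investment = 144_000
leveraged_outside = 1_710_000

# Share of outside revenue spent directly on research.
research_share = spent_on_research / total_outside_revenue

# Outside dollars leveraged per dollar of MEIF base funding.
leverage_ratio = leveraged_outside / meif_investment

print(f"Share spent directly on research: {research_share:.1%}")
print(f"Leverage per MEIF dollar: {leverage_ratio:.2f}:1")
```

    The share works out to roughly 73.5% and the leverage to just under 11.9 to 1, consistent with the rounded 73% and 12:1 figures in the summary.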

    Evaluating social equity and conservation attitudes in community based conservation: a case study of the controlled hunting area program in the Bale Mountains of Ethiopia

    2021 Spring. Includes bibliographical references. This dissertation research examines perceptions of social equity and conservation attitudes in community-based conservation (CBC) programs in the Bale Mountains, Ethiopia. While there has been an increasing shift towards inclusive and participatory approaches in conservation over the past 40 years, the social and environmental outcomes of CBC programs remain limited. One reason for this is the failure to recognize the diversity of local actors involved in CBC programs, the different costs and benefits they face, and how embedded power relations shape participation and empowerment in CBC programs. Devising effective and fair CBC programs requires putting social equity concerns at the core of conservation, which should in turn improve both social and conservation outcomes. This dissertation makes conceptual, methodological, and empirical contributions to the fields of social equity and CBC by implementing a mixed methods assessment of perceptions of social equity and conservation attitudes, as indicators of long-term conservation outcomes, and the factors that influence these perceptions and attitudes. Specifically, Chapter 1 provides an overview of the dissertation starting with a background of the underlying premises and implementation challenges of CBC programs globally and in Ethiopia. The chapter introduces social equity and conservation attitudes as central themes of the dissertation, gives a backdrop of the community-based controlled hunting area program in the Bale Mountains, and highlights the key research questions. In Chapter 2, this dissertation draws from a multi-dimensional social equity framework to generate a nuanced understanding of different groups' perceptions of equity in the distribution of benefits and costs, the processes of engagement and participation, and the recognition of needs and priorities in a CBC program. 
I conducted 15 focus group discussions in different communities and applied grounded theory to elicit locals' nuanced perceptions of social equity. The chapter underscores the need to evaluate local actors' diverse and contextualized relationships with other actors and the natural world and give recognition to how perceptions of equity interplay with broader social and environmental processes, in designing and implementing CBC programs. For Chapter 3, I conducted household surveys in four communities. This chapter builds on the previous qualitative analysis by assessing the effects of socio-economic and institutional factors in shaping perceptions of equity across different communities and CBC program models. I integrate the Sustainable Livelihoods Framework (SLF) to assess how access to various capital assets influences equity perceptions. The results signify the need to address the heterogeneity among local actors affected by conservation programs in equity design and assessment. These findings further highlight the need to strengthen weak institutional ties with external organizations, facilitate intra-community organization, and design programs that emphasize transparency to facilitate more equitable conservation outcomes. Finally, in Chapter 4, I use household survey responses to assess how conservation attitudes vary across different communities based on different social, economic, and/or institutional characteristics. I also examine the role of social equity in mediating how social capital affects conservation attitudes. To foster positive conservation attitudes, results suggest CBC programs need to build on and strengthen internal communal institutions and external links with conservation organizations. The findings also emphasize the need for adopting equity conscious designs that recognize the needs and priorities of marginalized groups. Overall, this dissertation contributes to the science and practice of CBC in Ethiopia and beyond. 
Empirically, the dissertation advances the contribution of mixed methods in assessing the complex construct of social equity. The focus group discussions with different community members and the use of grounded theory helped elicit local people's nuanced and contextualized perceptions of social equity. Informed by these qualitative findings, I developed locally relevant indicators to quantitatively measure equity perceptions across communities and program models. This contributes to the literature on social equity by adopting and refining existing frameworks in ways that are pertinent to specific contextual realities. From a policy perspective, the findings suggest that CBC programs in Ethiopia need to critically address differences in access to resources and decision-making power and to reframe notions of benefits to encapsulate multiple dimensions of equity. Additionally, the findings from this dissertation suggest that CBC programs more broadly will benefit from building internal social capital and strengthening links with external conservation organizations and resource management agencies, as social capital is key in crafting more equitable CBC programs and influencing positive conservation outcomes

    The development of psychiatric disorders and adverse behaviors : from context to prediction

    Psychiatric disorders by definition cause significant impairment in an individual’s daily functioning. Certain disorders, such as borderline personality disorder (BPD) and eating disorders, have worse prognosis and high mortality rates compared to other psychiatric disorders. Similarly, adverse behaviors such as self-harm, suicide, and crime are often present in individuals with psychiatric disorders. It is of interest to further understand the etiology and associations of BPD and eating disorders to uncover potential avenues and opportunities for intervention. Moreover, prediction modeling has recently come of interest to psychiatric epidemiologists with the rise of large data sets. Prediction modeling may provide valuable information about the nature of risk factors and eventually aid clinical diagnostics and prognostics. Thus, the studies included in this thesis seek to examine the etiology, associations, and prediction approaches of psychiatric disorders and adverse behaviors. Study I examined the individual and familial association between type 1 diabetes (T1D) and eating disorder diagnoses. We used national health care records from Denmark (n = 1,825,920) and Sweden (n = 2,517,277) to calculate the association within individuals, full siblings, half siblings, full cousins, and half cousins. Individuals with T1D had twice the hazard rate ratio of being diagnosed with an eating disorder compared to the general population. There was conflicting evidence for the risk of an eating disorder in full siblings of T1D patients. However, there was no evidence to support a further familial relationship between the two conditions. Study II aimed to illuminate the nature of the correlates for BPD across time, sex, and for their full siblings. We examined 87 variables across psychiatric disorders, somatic illnesses, trauma, and adverse behaviors (such as self-harm). 
In a sample of 1,969,839 Swedes with 12,175 individuals diagnosed with BPD, we found that BPD was associated with nearly all of the examined variables. The associations were largely consistent across time and between the sexes. Finally, we found that having a sibling diagnosed with BPD was associated with psychiatric disorders, trauma, and adverse behaviors but not somatic illnesses. Study III created a prediction model that could predict who would have high or low psychiatric symptoms at age 15 based on data from parental reports and national health care registers collected at age 9 or 12. Additionally, we compared multiple types of machine learning algorithms to assess predictive performance. The sample included 7,638 twins from the Child and Adolescent Twin Study in Sweden (CATSS). Our model was able to predict the outcome with reasonable performance but is not suitable for use in clinics. Each model performed similarly indicating that researchers with similar data and research questions do not need to forgo standard logistic regression. Study IV aimed to determine if an individual will exhibit suicidal behaviour (self-harm or suicidal thoughts), aggressive behaviour, both, or neither before adulthood with prediction modeling. Through variable importance scores we examined the usefulness of genetic variables within the model. A total of 5,974 participants from CATSS and 2,702 participants from the Netherlands Twin Register (NTR) were included in the study. The model had adequate performance in both the CATSS and NTR datasets for all classes except for the suicidal behaviors class in the NTR, which did not perform better than chance. The included genetic data had higher variable importance scores than questionnaire data completed at age 9 or 12, indicating that genetic biomarkers can be useful when combined with other data types. 
In conclusion, the development of psychiatric disorders and symptoms is associated with many factors across somatic illnesses, other psychiatric disorders, trauma, and harmful behaviors. The results of this thesis demonstrate the limitations of prediction modeling in psychiatric clinics but highlight its use in research and on the path forward towards personalized medicine

    Affect experience in natural language collected with smartphones

    Recent technological advancements in computerized text and speech analysis as well as machine learning methods have sparked a growing body of research investigating the algorithmic recognition of affect from the ubiquitous digital traces of natural language data and corresponding affect-linked language variations. Also, commercial interest to leverage these new data using AI for affect inferences is on the rise. However, due to the challenges associated with collecting data on subjective affect experience and corresponding language samples, previous research studies and commercial products have mostly relied on data sets from labelled text or enacted speech and, thereby, are focused on affect expression. This work leverages new smartphone-based data collection methods to collect self-reports on in-situ subjective affect experience and corresponding language samples in the wild to investigate between-person differences and within-person fluctuations in affect experience. The present dissertation aims to achieve three goals: (1) to investigate if between-person differences and within-person fluctuations in subjective affect experience are associated with and predictable from cues in spoken and written natural language, (2) to identify specific language characteristics, such as the use of specific word categories or voice parameters, that are associated with and predictive of affect experience, and (3) to analyze the influence of the context of language production on the associations and predictions of affect experience from natural language. This work is comprised of two empirical studies that analyze self-reports on subjective affect experience and natural language data collected with smartphones. Study 1 investigates predictions of between-person differences and within-person fluctuations in subjective momentary affect experience in more than 23000 speech samples from over 1000 participants in two data sets from Germany and the United States. 
In contrast to voice acoustics, which contain limited predictive information for affective arousal, state-of-the-art word embeddings yield significant above-chance predictions for affective arousal and valence. Moreover, interpretable machine learning methods are used to identify those voice features (i.e., loudness and spectral features) that are most predictive of affect experience. Finally, the work suggests that affect predictions from voice cues from semi-structured free speech are superior to those from read-out predefined sentences and that the emotional sentiment of the spoken content has no effect on affect predictions from voice cues. Study 2 analyzes patterns in written language data logged through smartphones' keyboards to investigate how between-person differences and within-person fluctuations in affect experience manifest in and are predictable from logged text data across different time frames and communication contexts. From a data set of more than 10 million typed words, features regarding typing dynamics, word use based on word dictionaries, and emoji and emoticon use are computed. From the data, distinct affect-linked language variations across communication contexts (private messaging versus public posting) and time frames (trait, weekly, daily, momentary) are identified (e.g., the use of the 1st person singular). Predictions of affect experience from machine learning algorithms, however, are not significantly better than chance. Results of this study highlight the challenges of using occurrence-counts, such as word dictionaries, for the assessment of subjective affect experience. By leveraging novel smartphone-based experience sampling and on-device language data collection in everyday life, the present dissertation shows how characteristics of spoken and written language are associated with and predictive of subjective affect experience. 
Thereby, this work highlights the utility of smartphones for investigating subjective affect experience in natural language in the wild, overcoming the caveats of prior research methods. Prediction results, however, challenge the optimistic prediction performances reported in prior works on the recognition of affect expression experience. Using statistical methods from the areas of description, prediction, and explanation, the present dissertation also reveals specific affect-linked language characteristics. Finally, results underline the relevance of the context of language production on language characteristics and corresponding affect predictions. The promising applications and potential future directions of this technology come with multiple challenges with regard to the conceptualization of affect, interdisciplinarity, ethics, and data privacy and security. If these challenges can be overcome, natural language analysis based on data collected with smartphones represents a promising tool to monitor affective well-being and to advance the affective sciences