
    Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

    From medical charts to national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare now generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitized patient records, it is important to remember that volume represents only one aspect of the data. In fact, because it draws on an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques. Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 Pages, 1 Figure

    Random walk with barriers: Diffusion restricted by permeable membranes

    Restrictions to molecular motion by barriers (membranes) are ubiquitous in biological tissues, porous media and composite materials. A major challenge is to characterize the microstructure of a material or an organism nondestructively using a bulk transport measurement. Here we demonstrate how the long-range structural correlations introduced by permeable membranes give rise to distinct features of transport. We consider Brownian motion restricted by randomly placed and oriented permeable membranes and focus on the disorder-averaged diffusion propagator using a scattering approach. The renormalization group solution reveals a scaling behavior of the diffusion coefficient for large times, with a characteristically slow inverse square root time dependence. The predicted time dependence of the diffusion coefficient agrees well with Monte Carlo simulations in two dimensions. Our results can be used to identify permeable membranes as restrictions to transport in disordered materials and in biological tissues, and to quantify their permeability and surface area. Comment: 8 pages, 3 figures; origin of dispersion clarified, refs added

    Predicting cell types and genetic variations contributing to disease by combining GWAS and epigenetic data

    Genome-wide association studies (GWASs) identify single nucleotide polymorphisms (SNPs) that are enriched in individuals suffering from a given disease. Most disease-associated SNPs fall into non-coding regions, so that it is not straightforward to infer phenotype or function; moreover, many SNPs are in tight genetic linkage, so that a SNP identified as associated with a particular disease may not itself be causal, but rather signify the presence of a linked SNP that is functionally relevant to disease pathogenesis. Here, we present an analysis method that takes advantage of the recent rapid accumulation of epigenomics data to address these problems for some SNPs. Using asthma as a prototypic example, we show that non-coding disease-associated SNPs are enriched in genomic regions that function as regulators of transcription, such as enhancers and promoters. Identifying enhancers based on the presence of histone modification marks such as H3K4me1 in different cell types, we show that the location of enhancers is highly cell-type specific. We use these findings to predict which SNPs are likely to be directly contributing to disease based on their presence in regulatory regions, and in which cell types their effect is expected to be detectable. Moreover, we can also predict which cell types contribute to a disease based on overlap of the disease-associated SNPs with the locations of enhancers present in a given cell type. Finally, we suggest that it will be possible to re-analyze GWASs with much higher power by limiting the SNPs considered to those in coding or regulatory regions of cell types relevant to a given disease.

    X-ray imaging of the dynamic magnetic vortex core deformation

    Magnetic platelets with a vortex configuration are attracting considerable attention. The discovery that excitation with small in-plane magnetic fields or spin polarised currents can switch the polarisation of the vortex core not only opened the possibility of using such systems in magnetic memories, but also initiated the fundamental investigation of the core switching mechanism itself. Micromagnetic models predict that the switching is mediated by a vortex-antivortex pair, nucleated in a dynamically induced vortex core deformation. In the same theoretical framework, a critical core velocity is predicted, above which switching occurs. Although these models are extensively studied and generally accepted, experimental support has been lacking until now. In this work, we have used high-resolution time-resolved X-ray microscopy to study the detailed dynamics in vortex structures. We were able to reveal the dynamic vortex core deformation preceding the core switching. The threshold velocity could also be measured, allowing quantitative comparison with micromagnetic models.

    Intra-tumoural microvessel density in human solid tumours

    Over the last decade, assessment of angiogenesis has emerged as a potentially useful biological prognostic and predictive factor in human solid tumours. With the development of highly specific endothelial markers that can be assessed in histological archival specimens, several quantitative studies have been performed in various solid tumours. The majority of published studies have shown a positive correlation between intra-tumoural microvessel density, a measure of tumour angiogenesis, and prognosis in solid tumours. A minority of studies have not demonstrated an association and this may be attributed to significant differences in the methodologies employed for sample selection, immunostaining techniques, vessel counting and statistical analysis, although a number of biological differences may account for the discrepancy. In this review we evaluate the quantification of angiogenesis by immunohistochemistry, the relationship between tumour vascularity and metastasis, and the clinicopathological studies correlating intra-tumoural microvessel density with prognosis and response to anti-cancer therapy. In view of the extensive nature of this retrospective body of data, comparative studies are needed to identify the optimum technique and endothelial antigens (activated or pan-endothelial antigens), and subsequently prospective studies that allocate treatment on the basis of microvessel density are required.

    Crackling Noise

    Crackling noise arises when a system responds to changing external conditions through discrete, impulsive events spanning a broad range of sizes. A wide variety of physical systems exhibiting crackling noise have been studied, from earthquakes on faults to paper crumpling. Because these systems exhibit regular behavior over many decades of sizes, their behavior is likely independent of microscopic and macroscopic details, and progress can be made by the use of very simple models. The fact that simple models and real systems can share the same behavior on a wide range of scales is called universality. We illustrate these ideas using results for our model of crackling noise in magnets, explaining the use of the renormalization group and scaling collapses. This field is still developing: we describe a number of continuing challenges.

    The one dimensional Kondo lattice model at partial band filling

    The Kondo lattice model, introduced in 1977, describes a lattice of localized magnetic moments interacting with a sea of conduction electrons. It is one of the most important canonical models in the study of a class of rare earth compounds, called heavy fermion systems, and as such has been studied intensively by a wide variety of techniques for more than a quarter of a century. This review focuses on the one dimensional case at partial band filling, in which the number of conduction electrons is less than the number of localized moments. The theoretical understanding, based on the bosonized solution, of the conventional Kondo lattice model is presented in great detail. This review divides naturally into two parts, the first relating to the description of the formalism, and the second to its application. After an all-inclusive description of the bosonization technique, the bosonized form of the Kondo lattice Hamiltonian is constructed in detail. Next the double-exchange ordering, Kondo singlet formation, the RKKY interaction and spin polaron formation are described comprehensively. An in-depth analysis of the phase diagram follows, with special emphasis on the destruction of the ferromagnetic phase by spin-flip disorder scattering, and of recent numerical results. The results are shown to hold for both antiferromagnetic and ferromagnetic Kondo lattices. The general exposition is pedagogic in tone. Comment: Review, 258 pages, 19 figures

    Permo–Triassic boundary carbon and mercury cycling linked to terrestrial ecosystem collapse

    Records suggest that the Permo–Triassic mass extinction (PTME) involved one of the most severe terrestrial ecosystem collapses of the Phanerozoic. However, it has proved difficult to constrain the extent of the primary productivity loss on land, hindering our understanding of the effects on global biogeochemistry. We build a new biogeochemical model that couples the global Hg and C cycles to evaluate the distinct terrestrial contribution to atmosphere–ocean biogeochemistry separated from coeval volcanic fluxes. We show that the large short-lived Hg spike, and nadirs in δ²⁰²Hg and δ¹³C values at the marine PTME are best explained by a sudden, massive pulse of terrestrial biomass oxidation, while volcanism remains an adequate explanation for the longer-term geochemical changes. Our modelling shows that a massive collapse of terrestrial ecosystems linked to volcanism-driven environmental change triggered significant biogeochemical changes, and cascaded organic matter, nutrients, Hg and other organically-bound species into the marine system.

    Variability in Working Memory Performance Explained by Epistasis vs Polygenic Scores in the ZNF804A Pathway

    Importance: We investigated the variation in neuropsychological function explained by risk alleles at the psychosis susceptibility gene ZNF804A and its interacting partners using single nucleotide polymorphisms (SNPs), polygenic scores, and epistatic analyses. Of particular importance was the relative contribution of the polygenic score vs epistasis in variation explained. Objectives: To (1) assess the association between SNPs in ZNF804A and the ZNF804A polygenic score with measures of cognition in cases with psychosis and (2) assess whether epistasis within the ZNF804A pathway could explain additional variation above and beyond that explained by the polygenic score. Design, Setting, and Participants: Patients with psychosis (n = 424) were assessed in areas of cognitive ability impaired in schizophrenia including IQ, memory, attention, and social cognition. We used the Psychiatric GWAS Consortium 1 schizophrenia genome-wide association study to calculate a polygenic score based on identified risk variants within this genetic pathway. Cognitive measures significantly associated with the polygenic score were tested for an epistatic component using a training set (n = 170), which was used to develop linear regression models containing the polygenic score and 2-SNP interactions. The best-fitting models were tested for replication in 2 independent test sets of cases: (1) 170 individuals with schizophrenia or schizoaffective disorder and (2) 84 patients with broad psychosis (including bipolar disorder, major depressive disorder, and other psychosis). Main Outcomes and Measures: Participants completed a neuropsychological assessment battery designed to target the cognitive deficits of schizophrenia including general cognitive function, episodic memory, working memory, attentional control, and social cognition. Results: Higher polygenic scores were associated with poorer performance among patients on IQ, memory, and social cognition, explaining 1% to 3% of variation on these scores (range, P = .01 to .03). Using a narrow psychosis training set and independent test sets of narrow phenotype psychosis (schizophrenia and schizoaffective disorder), broad psychosis, and control participants (n = 89), the addition of 2 interaction terms containing 2 SNPs each increased the R² for spatial working memory strategy in the independent psychosis test sets from 1.2% using the polygenic score only to 4.8% (P = .11 and .001, respectively) but did not explain additional variation in control participants. Conclusions and Relevance: These data support a role for the ZNF804A pathway in IQ, memory, and social cognition in cases. Furthermore, we showed that epistasis increases the variation explained above the contribution of the polygenic score.

    Evidence of causal effect of major depression on alcohol dependence: findings from the psychiatric genomics consortium

    BACKGROUND: Despite established clinical associations among major depression (MD), alcohol dependence (AD), and alcohol consumption (AC), the nature of the causal relationship between them is not completely understood. We leveraged genome-wide data from the Psychiatric Genomics Consortium (PGC) and UK Biobank to test for the presence of shared genetic mechanisms and causal relationships among MD, AD, and AC. METHODS: Linkage disequilibrium score regression and Mendelian randomization (MR) were performed using genome-wide data from the PGC (MD: 135 458 cases and 344 901 controls; AD: 10 206 cases and 28 480 controls) and UK Biobank (AC-frequency: 438 308 individuals; AC-quantity: 307 098 individuals). RESULTS: Positive genetic correlation was observed between MD and AD (rg(MD, AD) = +0.47, P = 6.6 × 10⁻¹⁰). AC-quantity showed positive genetic correlation with both AD (rg(AD, AC-quantity) = +0.75, P = 1.8 × 10⁻¹⁴) and MD (rg(MD, AC-quantity) = +0.14, P = 2.9 × 10⁻⁷), while there was negative correlation of AC-frequency with MD (rg(MD, AC-frequency) = −0.17, P = 1.5 × 10⁻¹⁰) and a non-significant result with AD. MR analyses confirmed the presence of pleiotropy among these four traits. However, the MD-AD results reflect a mediated-pleiotropy mechanism (i.e. causal relationship) with an effect of MD on AD (beta = 0.28, P = 1.29 × 10⁻⁶). There was no evidence for reverse causation. CONCLUSION: This study supports a causal role for genetic liability of MD on AD based on genetic datasets including thousands of individuals. Understanding mechanisms underlying MD-AD comorbidity addresses important public health concerns and has the potential to facilitate prevention and intervention efforts.