
    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.

    Decision Support Based on Bio-PEPA Modeling and Decision Tree Induction: A New Approach, Applied to a Tuberculosis Case Study

    Selecting the determinant features that generate an appropriate model structure is a challenge in epidemiological modelling. Disease spread is highly complex, and experts develop their understanding of its dynamics over years. There is an increasing variety and volume of epidemiological data which adds to the potential confusion. We propose here to make use of that data to better understand disease systems. Decision tree techniques have been extensively used to extract pertinent information and improve decision making. In this paper, we propose an innovative structured approach combining decision tree induction with Bio-PEPA computational modelling, and illustrate the approach through application to tuberculosis. By using decision tree induction, the enhanced Bio-PEPA model shows considerable improvement over the initial model with regard to the simulated results matching observed data. The key finding is that the developer can express a realistic predictive model using relevant features; used as decision support, the approach empowers the epidemiologist in policy decision making.

    Network analysis of shared interests represented by social bookmarking behaviors

    Social bookmarking is a new phenomenon characterized by a number of features including active user participation, open and collective discovery of resources, and user-generated metadata. Among others, this study pays particular attention to its nature of being at the intersection of personal information space and social information space. While users of a social bookmarking site create and maintain their own bookmark collections, the users' personal information spaces, in aggregate, build up the information space of the site as a whole. The overall goal of this study is to understand how social information space may emerge when personal information spaces of users intersect and overlap with shared interests. The main purpose of the study is two-fold: first, to see whether and how we can identify shared interest space(s) within the general information space of a social bookmarking site; and second, to evaluate the applicability of social network analysis to this end. Delicious.com, one of the most successful instances of social bookmarking, was chosen as the case. The study was carried out in three phases asking separate yet interrelated questions concerning the overall level of interest overlap, the structural patterns in the network of users connected by shared interests, and the communities of interest within the network. The results indicate that, while individual users of delicious.com have a broad range of diverse interests, there is a considerable level of overlap and commonality, providing a ground for creating implicit networks of users with shared interests. The networks constructed based on common bookmarks revealed intriguing structural patterns commonly found in well-established social systems, including a core periphery structure with a high level of connectivity, which form a basis for efficient information sharing and knowledge transfer. 
Furthermore, an exploratory analysis of the network communities showed that each community has a distinct theme defining the shared interests of its members, at a high level of coherence. Overall, the results suggest that networks of people with shared interests can be induced from their social bookmarking behaviors and such networks can provide a venue for investigating social mechanisms of information sharing in this new information environment. Future research can be built upon the methods and findings of this study to further explore the implications of the emergent and implicit network of shared interests.

    Exploring the transition from traditional data analysis to machine and deep learning methods

    Data analysis methods based on machine- and deep learning approaches are continuously replacing traditional methods. Models based on deep learning (DL) are applicable to many problems and often have better prediction performance compared to traditional methods. One major difference between the traditional methods and machine learning (ML) approaches is the black box aspect often associated with ML and DL models. The use of ML and DL models offers many opportunities but also challenges. This thesis explores some of these opportunities and challenges of DL modelling with a focus on applications in spectroscopy. DL models are based on artificial neural networks (ANNs) and are known to automatically find complex relations in the data. In Paper I, this property is exploited by designing DL models to learn spectroscopic preprocessing based on classical preprocessing techniques. It is shown that the DL-based preprocessing has some merits with regard to prediction performance, but there is considerable extra effort required when training and tuning these DL models. The flexibility of ANN architecture designs is further studied in Paper II when a DL model for multiblock data analysis is proposed which can also quantify the importance of each data block. A drawback of the DL models is the lack of interpretability. To address this, a different modelling approach is taken in Paper III where the focus is to use DL models in such a way as to retain as much interpretability as possible. The paper presents the concept of non-linear error modelling, where the DL model is used to model the residuals of the linear model instead of the raw input data. The concept is essentially a shrinking of the black box aspect since the majority of the data modelling is done by a linear interpretable model. The final topic explored in this thesis is a more traditional modelling approach inspired by DL techniques. 
Data sometimes contain intrinsic subgroups which might be more accurately modelled separately than with a global model. Paper IV presents a modelling framework based on locally weighted models and fuzzy partitioning that automatically finds relevant clusters and combines the predictions of each local model. Compared to a DL model, the locally weighted modelling framework is more transparent. It is also shown how the framework can utilise DL techniques to be scaled to problems with huge amounts of data.

    Observational Data to Improve Clinical Decision Making in Acute Care

    The premise of Big Data in acute medicine is to make medicine more efficient and effective. However, the translation of large observational data to knowledge is difficult. This thesis explores and discusses the three main types of research questions which can be asked from large observational data: (1) What is current clinical practice? (2) What is best practice? (3) What patients need to be prioritised? This thesis will focus on traumatic brain injury and in-hospital cardiac arrest.

    Longitudinal associations between the retail food environment, diet quality, and chronic disease risk among Black and White young adults in the United States

    The recognition of the public health and economic consequences of nutrition-related chronic diseases has led to policies focused on improving the diets of the population subgroups at highest risk: low-income people and African Americans. Sometimes, however, the evidence base for these policies is weak. For example, in response to research suggesting that low-income areas have less access to affordable and nutritious foods, policy interventions have been proposed to address this access disparity; however, this association is far from well established. Similarly, the Dietary Guidelines for Americans (DGA) are widely promoted in public health campaigns, yet there is surprisingly little evidence that following them is effective at reducing risk of chronic disease in the general population, in part because most past studies have been cross-sectional, demographically homogeneous, or conceptually flawed (e.g., potential for reverse causality). Our research fills important substantive gaps in the understanding of these relationships using data from a geographically diverse cohort of Black and White young adults (followed from 1985 to 2005) and analytic methods that exploit the longitudinal structure of the data. First, we examined the longitudinal association between neighborhood socio-demographics and the availability of food stores. We found that poor and high minority areas had higher availability of supermarkets and grocery stores. This suggests that, contrary to common assumption, stores where healthy foods can typically be purchased are available to the subgroups at highest risk. Second, we examined the prospective association between adherence to the 2005 DGA and risk of major weight gain, diabetes, and progression of other cardio-metabolic risk factors. Overall, we found little evidence that a higher diet quality led to better long-term health (except for blood pressure and HDL cholesterol outcomes).
In Whites, higher diet quality was associated with less 20-year weight gain, but unrelated to diabetes incidence or insulin resistance. However, in Blacks, higher diet quality was associated with greater 20-year weight gain, and with slightly higher incidence of diabetes and increase in insulin resistance. Results of this research highlight the urgent need for effectiveness trials of the DGA.
