4,710 research outputs found

    Relating multi-sequence longitudinal intensity profiles and clinical covariates in new multiple sclerosis lesions

    Structural magnetic resonance imaging (MRI) can be used to detect lesions in the brains of multiple sclerosis (MS) patients. The formation of these lesions is a complex process involving inflammation, tissue damage, and tissue repair, all of which are visible on MRI. Here we characterize the lesion formation process on longitudinal, multi-sequence structural MRI from 34 MS patients and relate the longitudinal changes we observe within lesions to therapeutic interventions. In this article, we first outline a pipeline to extract voxel level, multi-sequence longitudinal profiles from four MRI sequences within lesion tissue. We then propose two models to relate clinical covariates to the longitudinal profiles. The first model is a principal component analysis (PCA) regression model, which collapses the information from all four profiles into a scalar value. We find that the score on the first PC identifies areas of slow, long-term intensity changes within the lesion at a voxel level, as validated by two experienced clinicians, a neuroradiologist and a neurologist. On a quality scale of 1 to 4 (4 being the highest) the neuroradiologist gave the score on the first PC a median rating of 4 (95% CI: [4,4]), and the neurologist gave it a median rating of 3 (95% CI: [3,3]). In the PCA regression model, we find that treatment with disease modifying therapies (p-value < 0.01), steroids (p-value < 0.01), and being closer to the boundary of abnormal signal intensity (p-value < 0.01) are associated with a return of a voxel to intensity values closer to that of normal-appearing tissue. The second model is a function-on-scalar regression, which allows for assessment of the individual time points at which the covariates are associated with the profiles. In the function-on-scalar regression both age and distance to the boundary were found to have a statistically significant association with the profiles.
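As a rough illustration of how a PCA step can collapse a multi-value profile into one scalar per voxel, the sketch below computes scores on the first principal component by power iteration. It is not the authors' pipeline: the function name `first_pc_scores`, the input layout (one row per voxel, holding its concatenated profile values), and the toy data are all assumptions made for this example.

```python
def first_pc_scores(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows: list of equal-length numeric lists, one per observation
    (hypothetically, one per voxel). Needs at least two rows.
    """
    n, d = len(rows), len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by cov converges to the
    # leading eigenvector, provided the start vector is not orthogonal to it.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The score is the projection of each centered row onto the direction.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy data lying exactly on the line y = 2x (made up for illustration).
direction, scores = first_pc_scores([[1, 2], [2, 4], [3, 6], [4, 8]])
```

In a PCA regression, scores like these would then serve as the scalar response regressed on the covariates; that second step is omitted here.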

    DNA expression microarrays may be the wrong tool to identify biological pathways

    DNA microarray expression signatures are expected to provide new insights into pathophysiological pathways. Numerous variant statistical methods have been described for each step of the signal analysis. We employed five similar statistical tests on the same data set at the level of gene selection. Inter-test agreement for the identification of biological pathways in BioCarta, KEGG and Reactome was calculated using Cohen's κ score. The identification of specific biological pathways showed only moderate agreement (0.30 < κ < 0.79) between the analysis methods used. Pathways identified by microarrays must be treated cautiously as they vary according to the statistical method used.
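For readers unfamiliar with the agreement statistic mentioned above, here is a minimal sketch of Cohen's kappa for two raters, computed as (observed agreement - chance agreement) / (1 - chance agreement). The function name `cohens_kappa` and the two label lists are hypothetical; they stand in for two statistical tests each flagging (1) or not flagging (0) the same ten pathways, and are not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up example: two tests marking 10 pathways as selected (1) or not (0).
test_1 = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
test_2 = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
kappa = cohens_kappa(test_1, test_2)
```

Here the tests agree on 7 of 10 pathways while chance agreement is 0.5, so kappa is about 0.4.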

    Selecting relevant predictors: impact of variable selection on model performance, uncertainty and applicability of models in environmental decision making

    One of the crucial steps when developing models is the selection of appropriate variables. In this research we assessed the impact of variable selection on model performance and model applicability. Regression trees were built to understand the relationship between the ecological water quality and the physical-chemical and hydromorphological variables. Different model parameterizations and three combinations of explanatory variables were used for developing the trees. Once constructed, they were integrated with the water quality model (PEGASE) and used to simulate the future ecological water quality. These simulations were summarized per combination of explanatory variables and compared. Three key messages summarize our conclusions. First, it was confirmed that different parameterizations alter the statistical reliability of the trees produced. Secondly, it was found that the statistical reliability of the models remained stable when different combinations of explanatory variables were implemented. The determination coefficient (R²) ranged from 0.68 to 0.86; the Kappa statistic (K) ranged from 0.15 to 0.46; and the percentage of Correctly Classified Instances (CCI) from 33 to 59%. Thirdly, when applying the models to an independent dataset consisting of future physical-chemical water quality data, different conclusions may be drawn, depending on the combination of variables used.

    What Works Better? A Study of Classifying Requirements

    Classifying requirements into functional requirements (FR) and non-functional ones (NFR) is an important task in requirements engineering. However, automated classification of requirements written in natural language is not straightforward, due to the variability of natural language and the absence of a controlled vocabulary. This paper investigates how automated classification of requirements into FR and NFR can be improved and how well several machine learning approaches work in this context. We contribute an approach for preprocessing requirements that standardizes and normalizes requirements before applying classification algorithms. Further, we report on how well several existing machine learning methods perform for automated classification of NFRs into sub-categories such as usability, availability, or performance. Our study is performed on 625 requirements provided by the OpenScience tera-PROMISE repository. We found that our preprocessing improved the performance of an existing classification method. We further found significant differences in the performance of approaches such as Latent Dirichlet Allocation, Biterm Topic Modeling, or Naive Bayes for the sub-classification of NFRs. Comment: 7 pages, the 25th IEEE International Conference on Requirements Engineering (RE'17)

    Automatic coding of short text responses via clustering in educational assessment

    Automatic coding of short text responses opens new doors in assessment. We implemented and integrated baseline methods of natural language processing and statistical modelling by means of software components that are available under open licenses. The accuracy of automatic text coding is demonstrated by using data collected in the Programme for International Student Assessment (PISA) 2012 in Germany. Free text responses of 10 items with [formula] responses in total were analyzed. We further examined the effect of different methods, parameter values, and sample sizes on the performance of the implemented system. The system reached fair to good up to excellent agreement with human codings [formula]. Especially items that are solved by naming specific semantic concepts appeared properly coded. The system performed equally well with [formula] and somewhat poorer but still acceptable down to [formula]. Based on our findings, we discuss potential innovations for assessment that are enabled by automatic coding of short text responses. (DIPF/Orig.)