Breast MRI radiomics and machine learning radiomics-based predictions of response to neoadjuvant chemotherapy -- how are they affected by variations in tumour delineation?
Manual delineation of volumes of interest (VOIs) by experts is considered the
gold-standard method in radiomics analysis. However, it suffers from inter- and
intra-operator variability. A quantitative assessment of the impact of
variations in these delineations on the performance of the radiomics predictors
is required to develop robust radiomics-based prediction models. In this study,
we developed radiomics models for the prediction of pathological complete
response to neoadjuvant chemotherapy in patients with two different breast
cancer subtypes based on contrast-enhanced magnetic resonance imaging acquired
prior to treatment (baseline MRI scans). Different mathematical operations such
as erosion, smoothing, dilation, randomization, and ellipse fitting were
applied to the original VOIs delineated by experts to simulate variations of
segmentation masks. The effects of such VOI modifications on various steps of
the radiomics workflow, including feature extraction, feature selection, and
prediction performance, were evaluated. Using manual tumor VOIs and radiomics
features extracted from baseline MRI scans, AUCs of up to 0.96 and 0.89 were
achieved for human epidermal growth factor receptor 2 positive and
triple-negative breast cancer, respectively. VOIs modified by smoothing and
erosion yielded the highest number of robust features and the best prediction
performance, while ellipse fitting and dilation led to the lowest robustness
and prediction performance for both breast cancer subtypes. At most 28% of the
selected features were similar to those selected with manual VOIs when
different VOI delineation data were used. Differences in VOI delineation affect
different steps of radiomics analysis, and their quantification is therefore
important for the development of standardized radiomics research.
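The idea of perturbing an expert mask by erosion or dilation and measuring how far it drifts from the original can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: the 2D toy mask, the 4-connected structuring element, and the helper names `erode`, `dilate`, and `dice` are all assumptions made for the example.

```python
# Illustrative sketch only: pure-Python binary erosion/dilation on a 2D mask.
# The toy mask and the 4-connected neighbourhood are assumptions, not the
# settings of the study described above.

def _neighbours(r, c):
    # The pixel itself plus its 4-connected neighbours.
    return [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]

def dilate(mask):
    """A pixel is set if it or any 4-connected neighbour is set."""
    rows, cols = len(mask), len(mask[0])
    return [[int(any(0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]
                     for rr, cc in _neighbours(r, c)))
             for c in range(cols)] for r in range(rows)]

def erode(mask):
    """A pixel stays set only if it and all 4-connected neighbours are set.
    Pixels outside the image count as background."""
    rows, cols = len(mask), len(mask[0])
    return [[int(all(0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]
                     for rr, cc in _neighbours(r, c)))
             for c in range(cols)] for r in range(rows)]

def dice(a, b):
    """Dice overlap between two binary masks of equal shape."""
    inter = sum(x and y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    size = sum(map(sum, a)) + sum(map(sum, b))
    return 2.0 * inter / size if size else 1.0

# Toy "manual" VOI: a 3x3 square inside a 7x7 image.
manual = [[1 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)]
          for r in range(7)]

for name, op in (("erosion", erode), ("dilation", dilate)):
    modified = op(manual)
    print(name, "area:", sum(map(sum, modified)),
          "Dice vs manual:", round(dice(manual, modified), 3))
```

In a real workflow the same comparison would be run on 3D masks, and radiomics features would be re-extracted from each modified VOI to check which features remain stable.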
Sekventiaalisen tiedon louhinta: segmenttirakenteita etsimässä (Mining sequential data: in search of segment structures)
Segmentation is a data mining technique yielding simplified representations of sequences of ordered points. A sequence is divided into some number of homogeneous blocks, and all points within a segment are described by a single value. The focus in this thesis is on piecewise-constant segments, where the most likely description for each segment and the most likely segmentation into some number of blocks can be computed efficiently. Representing sequences as segmentations is useful in, e.g., storage and indexing tasks in sequence databases, and segmentation can be used as a tool in learning about the structure of a given sequence.
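A minimal sketch of how an optimal piecewise-constant segmentation can be computed by dynamic programming is given below. It assumes one-dimensional points and squared error as the cost, so that each segment is described by its mean; the function name `segment` and the toy sequence are illustrative and not taken from the thesis.

```python
# Sketch under stated assumptions: 1-D data, squared-error cost, segment
# described by its mean. Runs in O(k * n^2) time using prefix sums.

def segment(seq, k):
    """Return (cost, bounds) for the least-squares split of seq into k
    contiguous segments; bounds is a list of half-open (start, end) pairs."""
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(seq)")

    # Prefix sums of values and squared values give O(1) segment costs.
    s1, s2 = [0.0], [0.0]
    for x in seq:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def sse(i, j):
        # Sum of squared deviations from the mean of seq[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[s][j]: best cost of covering seq[:j] with exactly s segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                c = cost[s - 1][i] + sse(i, j)
                if c < cost[s][j]:
                    cost[s][j] = c
                    back[s][j] = i

    # Walk the back-pointers to recover the segment boundaries.
    bounds, j = [], n
    for s in range(k, 0, -1):
        i = back[s][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return cost[k][n], bounds

total_cost, bounds = segment([1, 1, 1, 5, 5, 5, 9, 9], 3)
print(total_cost, bounds)
```

Choosing the number of segments `k` is a separate model selection question; the sketch simply takes `k` as given.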
The discussion in this thesis begins with basic questions related to segmentation analysis, such as choosing the number of segments, and evaluating the obtained segmentations. Standard model selection techniques are shown to perform well for the sequence segmentation task. Segmentation evaluation is proposed with respect to a known segmentation structure. Applying segmentation on certain features of a sequence is shown to yield segmentations that are significantly close to the known underlying structure.
Two extensions to the basic segmentation framework are introduced: unimodal segmentation and basis segmentation. The former is concerned with segmentations where the segment descriptions first increase and then decrease, and the latter with the interplay between different dimensions and segments in the sequence. These problems are formally defined and algorithms for solving them are provided and analyzed.
Practical applications for segmentation techniques include time series and data stream analysis, text analysis, and biological sequence analysis. In this thesis segmentation applications are demonstrated in analyzing genomic sequences.
Segmentation is a data mining method that produces simple descriptions of a sequence consisting of an ordered series of points. The points may be one- or multidimensional.
In segmentation, a sequence is divided into a given number of homogeneous regions, called segments, and the points within each region are described by a single value.
The thesis focuses on finding piecewise-constant segment structures. For such structures, the best description of each segment and the best division of the whole sequence into segments can be computed efficiently. Modeling data by segmentation is useful, for example, when data are stored and indexed in sequence databases, and when one wants to learn more about the overall structure of a given sequence.
The thesis first addresses basic questions related to segmentation: choosing the number of segments and evaluating segmentation results.
Existing model selection methods are shown to be well suited to choosing the number of segments. The evaluation of segmentations is considered with respect to a known segment structure. It can be shown that segmenting a sequence with respect to certain of its features yields segmentations whose similarity to the known structure is significant.
Two extensions to the traditional segmentation framework are introduced: unimodal segmentation and basis segmentation. In unimodal segmentation, the segment descriptions take values that first increase and then decrease. Basis segmentation, in turn, models the relationships between the segments and the different dimensions of the sequence. The thesis defines these two new segmentation problems. In addition, algorithms for solving them are both given and analyzed.
Segmentation methods are applied in practice to, among other things, the analysis of time series, data streams, text, and biological sequences.
As an example, the thesis discusses the application of segmentation to the analysis of genomic sequences.
Multi texture analysis of colorectal cancer continuum using multispectral imagery
Purpose
This paper proposes to characterize the continuum of colorectal cancer (CRC) using multiple texture features extracted from multispectral optical microscopy images. Three types of pathological tissues (PT) are considered: benign hyperplasia, intraepithelial neoplasia and carcinoma.
Materials and Methods
In the proposed approach, the region of interest containing PT is first extracted from multispectral images using active contour segmentation. This region is then encoded using texture features based on the Laplacian-of-Gaussian (LoG) filter, discrete wavelets (DW) and gray level co-occurrence matrices (GLCM). To assess the significance of textural differences between PT types, a statistical analysis based on the Kruskal-Wallis test is performed. The usefulness of texture features is then evaluated quantitatively in terms of their ability to predict PT types using various classifier models.
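To make the GLCM step concrete, the sketch below builds a symmetric, normalized co-occurrence matrix for a single pixel offset and derives one texture statistic (contrast) from it. The function names `glcm` and `contrast`, the horizontal offset, and the 4-level toy image are assumptions chosen for illustration; the paper's actual offsets, quantization, and feature set are not stated here.

```python
# Illustrative sketch: gray level co-occurrence matrix for one offset,
# plus the contrast statistic. Not the authors' implementation.

def glcm(image, levels, dr=0, dc=1):
    """Symmetric, normalized GLCM of an integer image for offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                a, b = image[r][c], image[rr][cc]
                counts[a][b] += 1
                counts[b][a] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Sum over (i, j) of (i - j)^2 * p[i][j]; zero for a flat region."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))

# Toy 4-level image.
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
p = glcm(toy, levels=4)
print("contrast:", round(contrast(p), 4))
```

Per-region statistics such as this one would then be compared across tissue types, for instance with a Kruskal-Wallis test, before being passed to a classifier.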
Results
Preliminary results show significant texture differences between PT types, for all texture features (p-value < 0.01). Individually, GLCM texture features outperform LoG and DW features in terms of PT type prediction. However, a higher performance can be achieved by combining all texture features, resulting in a mean classification accuracy of 98.92%, sensitivity of 98.12%, and specificity of 99.67%.
Conclusions
These results demonstrate the efficiency and effectiveness of combining multiple texture features for characterizing the continuum of CRC and discriminating between pathological tissues in multispectral images.
Robust machine learning segmentation for large-scale analysis of heterogeneous clinical brain MRI datasets
Every year, millions of brain MRI scans are acquired in hospitals, which is a
figure considerably larger than the size of any research dataset. Therefore,
the ability to analyse such scans could transform neuroimaging research. Yet,
their potential remains untapped, since no automated algorithm is robust enough
to cope with the high variability in clinical acquisitions (MR contrasts,
resolutions, orientations, artefacts, subject populations). Here we present
SynthSeg+, an AI segmentation suite that enables, for the first time, robust
analysis of heterogeneous clinical datasets. In addition to whole-brain
segmentation, SynthSeg+ also performs cortical parcellation, intracranial
volume estimation, and automated detection of faulty segmentations (mainly
caused by scans of very low quality). We demonstrate SynthSeg+ in seven
experiments, including an ageing study on 14,000 scans, where it accurately
replicates atrophy patterns observed on data of much higher quality. SynthSeg+
is publicly released as a ready-to-use tool to unlock the potential of
quantitative morphometry.
Comment: under review, extension of MICCAI 2022 paper
Analyzing data from single-case alternating treatments designs
Alternating treatments designs (ATDs) have received comparatively less attention than other single-case experimental designs in terms of data analysis, as most analytical proposals and illustrations have been made in the context of designs including phases with several consecutive measurements in the same condition. One of the specific features of ATDs is the rapid (and usually randomly determined) alternation of conditions, which requires adapting the analytical techniques. First, we review the methodologically desirable features of ATDs, as well as the characteristics of the published single-case research using an ATD, which are relevant for data analysis. Second, we review several existing options for ATD data analysis. Third, we propose two new procedures, suggested as alternatives that address some of the limitations of extant analytical techniques. Fourth, we illustrate the application of existing techniques and the new proposals in order to discuss their differences and similarities. We advocate for the use of the new proposals in ATDs, because they entail meaningful comparisons between the conditions without assumptions about the design or the data pattern. We provide R code for all computations and for the graphical representation of the comparisons involved.
Weighted Consensus Segmentations
The problem of segmenting linearly ordered data is frequently encountered in time-series analysis, computational biology, and natural language processing. Segmentations obtained independently from replicate data sets or from the same data with different methods or parameter settings pose the problem of computing an aggregate or consensus segmentation. This Segmentation Aggregation problem amounts to finding a segmentation that minimizes the sum of distances to the input segmentations. It is again a segmentation problem and can be solved by dynamic programming. The aim of this contribution is (1) to gain a better mathematical understanding of the Segmentation Aggregation problem and its solutions and (2) to demonstrate that consensus segmentations have useful applications. Extending previously known results, we show that for a large class of distance functions only breakpoints present in at least one input segmentation appear in the consensus segmentation. Furthermore, we derive a bound on the size of consensus segments. As showcase applications, we investigate a yeast transcriptome and show that consensus segments provide a robust means of identifying transcriptomic units. This approach is particularly suited for dense transcriptomes with polycistronic transcripts, operons, or a lack of separation between transcripts. As a second application, we demonstrate that consensus segmentations can be used to robustly identify growth regimes from sets of replicate growth curves.
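The objective described above, a segmentation minimizing the sum of distances to the inputs, can be illustrated with a deliberately simple stand-in. The sketch below assumes a distance that counts breakpoints present in exactly one of two segmentations (symmetric difference), which is not necessarily one of the distances studied in the paper, and it uses brute force over subsets of the input breakpoints rather than the dynamic programming the abstract refers to. The names `distance` and `consensus` are hypothetical.

```python
# Toy illustration of the aggregation objective, NOT the paper's algorithm.
# Assumptions: a segmentation is a list of breakpoint positions; the distance
# is the size of the symmetric difference of two breakpoint sets; the search
# is brute force and only practical for a handful of candidate breakpoints.
from itertools import combinations

def distance(a, b):
    """Number of breakpoints present in exactly one of the two segmentations."""
    return len(set(a) ^ set(b))

def consensus(segmentations):
    """Return (breakpoints, cost) minimizing the summed distance to the inputs,
    searching only over breakpoints that occur in at least one input."""
    candidates = sorted(set().union(*segmentations))
    best, best_cost = None, None
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            cost = sum(distance(subset, s) for s in segmentations)
            if best_cost is None or cost < best_cost:
                best, best_cost = list(subset), cost
    return best, best_cost

replicates = [[3, 7], [3, 8], [3, 7, 12]]
print(consensus(replicates))
```

Under this particular distance the optimum keeps exactly the breakpoints supported by a majority of inputs, which makes the toy case easy to check by hand.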