31 research outputs found

    Aboveground biomass density models for NASA's Global Ecosystem Dynamics Investigation (GEDI) lidar mission

    NASA's Global Ecosystem Dynamics Investigation (GEDI) is collecting spaceborne full waveform lidar data with a primary science goal of producing accurate estimates of forest aboveground biomass density (AGBD). This paper presents the development of the models used to create GEDI's footprint-level (~25 m) AGBD (GEDI04_A) product, including a description of the datasets used and the procedure for final model selection. The data used to fit our models are from a compilation of globally distributed, spatially and temporally coincident field and airborne lidar datasets, whereby we simulated GEDI-like waveforms from airborne lidar to build a calibration database. We used this database to expand the geographic extent of past waveform lidar studies, and divided the globe into four broad strata by Plant Functional Type (PFT) and six geographic regions. GEDI's waveform-to-biomass models take the form of parametric Ordinary Least Squares (OLS) models with simulated Relative Height (RH) metrics as predictor variables. From an exhaustive set of candidate models, we selected the best input predictor variables and data transformations for each geographic stratum in the GEDI domain to produce a set of comprehensive predictive footprint-level models. We report four main findings. First, model selection frequently favored combinations of RH metrics at the 98th, 90th, 50th, and 10th height above ground-level percentiles (RH98, RH90, RH50, and RH10, respectively), but inclusion of lower RH metrics (e.g. RH10) did not markedly improve model performance. Second, forced inclusion of RH98 in all models was important and did not degrade model performance, and the best performing models were parsimonious, typically having only 1-3 predictors. Third, stratification by geographic domain (PFT, geographic region) improved model performance in comparison to global models without stratification. Fourth, for the vast majority of strata, the best performing models were fit using square root transformation of field AGBD and/or height metrics. There was considerable variability in model performance across geographic strata, and areas with sparse training data and/or high AGBD values had the poorest performance. These models are used to produce global predictions of AGBD, but will be improved in the future as more and better training data become available.
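As a rough illustration of the model form described in this abstract, the sketch below fits a single-predictor OLS model in square-root space, sqrt(AGBD) = b0 + b1 * sqrt(RH98), and back-transforms by squaring. The function names, the toy data and the choice of a single predictor are assumptions made for illustration; this is not the GEDI04_A implementation, and any correction for back-transformation bias is omitted.

```python
import math

def fit_sqrt_ols(rh98, agbd):
    """Fit sqrt(AGBD) = b0 + b1 * sqrt(RH98) by closed-form ordinary least squares."""
    x = [math.sqrt(v) for v in rh98]
    y = [math.sqrt(v) for v in agbd]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def predict_agbd(b0, b1, rh98):
    """Naive back-transform: square the prediction made in square-root space."""
    return (b0 + b1 * math.sqrt(rh98)) ** 2

# Hypothetical toy data (not from the paper), built so that
# sqrt(AGBD) = 2 * sqrt(RH98) holds exactly.
rh98_m = [4.0, 9.0, 16.0, 25.0]          # heights in metres
agbd_mg_ha = [16.0, 36.0, 64.0, 100.0]   # biomass density in Mg/ha

b0, b1 = fit_sqrt_ols(rh98_m, agbd_mg_ha)
print(b0, b1, predict_agbd(b0, b1, 36.0))
```

In a stratified setting, one such fit would be made per stratum (PFT and region), each with its own predictors and transformations.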

    Evaluation of 3D GANs for Lung Tissue Modelling in Pulmonary CT

    GANs are able to accurately model the distribution of complex, high-dimensional datasets, e.g. images. This makes high-quality GANs useful for unsupervised anomaly detection in medical imaging. However, differences in training datasets such as output image dimensionality and appearance of semantically meaningful features mean that GAN models from the natural image domain may not work 'out-of-the-box' for medical imaging, necessitating re-implementation and re-evaluation. In this work we adapt three GAN models to the task of modelling 3D healthy image patches for pulmonary CT and evaluate them. To the best of our knowledge, this is the first time that such an evaluation has been performed. The DCGAN, styleGAN and bigGAN architectures were investigated due to their ubiquity and high performance in natural image processing. We train different variants of these methods and assess their performance using the Fréchet Inception Distance (FID) score. In addition, the quality of the generated images was evaluated by a human observer study, the ability of the networks to model 3D domain-specific features was investigated, and the structure of the GAN latent spaces was analysed. Results show that the 3D styleGAN produces realistic-looking images with meaningful 3D structure, but suffers from mode collapse, which must be addressed during training to obtain sample diversity. Conversely, the 3D DCGAN models show a greater capacity for image variability, but at the cost of poor-quality images. The 3D bigGAN models provide an intermediate level of image quality, but most accurately model the distribution of selected semantically meaningful features. The results suggest that future development is required to realise a 3D GAN with sufficient capacity for patch-based lung CT anomaly detection, and we offer recommendations for future areas of research, such as experimenting with other architectures and incorporation of position-encoding.

    The pitfalls of sample selection: a case study on lung nodule classification

    Using publicly available data to determine the performance of methodological contributions is important as it facilitates reproducibility and allows scrutiny of the published results. In lung nodule classification, for example, many works report results on the publicly available LIDC dataset. In theory, this should allow a direct comparison of the performance of proposed methods and an assessment of the impact of individual contributions. When analyzing seven recent works, however, we find that each employs a different data selection process, leading to widely varying total numbers of samples and ratios between benign and malignant cases. As each subset will have different characteristics with varying difficulty for classification, a direct comparison between the proposed methods is thus not always possible, nor fair. We study the particular effect of truthing when aggregating labels from multiple experts. We show that specific choices can have a severe impact on the data distribution, such that it may be possible to achieve superior performance on one sample distribution but not on another. While we show that we can further improve on the state-of-the-art on one sample selection, we also find that on a more challenging sample selection, on the same database, the more advanced models underperform with respect to very simple baseline methods, highlighting that the selected data distribution may play an even more important role than the model architecture. This raises concerns about the validity of claimed methodological contributions. We believe the community should be aware of these pitfalls, and we make recommendations on how these can be avoided in future work.
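The effect of truthing can be illustrated with a small sketch. It assumes each nodule carries malignancy ratings on a 1 to 5 scale from several readers and compares two aggregation rules; the rules, the thresholds and the toy ratings are hypothetical and are not the selection procedures analysed in the paper.

```python
from collections import Counter
from statistics import mean, median

def label_by_median(ratings):
    """Median rule: drop nodules whose median rating is exactly 3 (indeterminate)."""
    m = median(ratings)
    if m == 3:
        return None  # excluded from the dataset
    return "malignant" if m > 3 else "benign"

def label_by_mean(ratings):
    """Mean rule: keep every nodule; a mean above 3 counts as malignant."""
    return "malignant" if mean(ratings) > 3 else "benign"

# Hypothetical ratings from four readers per nodule.
nodules = {
    "n1": [1, 2, 2, 1],
    "n2": [3, 3, 4, 2],
    "n3": [5, 4, 4, 5],
    "n4": [2, 3, 3, 4],
}

median_labels = {k: label_by_median(v) for k, v in nodules.items()}
mean_labels = {k: label_by_mean(v) for k, v in nodules.items()}

# Count class sizes under each rule, ignoring excluded nodules.
median_counts = Counter(l for l in median_labels.values() if l is not None)
mean_counts = Counter(mean_labels.values())
print(median_counts, mean_counts)
```

On this toy input the median rule keeps two nodules (one benign, one malignant), while the mean rule keeps all four (three benign, one malignant), so both the sample count and the class ratio change with the aggregation choice.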

    The effect of the loss on generalization: empirical study on synthetic lung nodule data

    Convolutional Neural Networks (CNNs) are widely used for image classification in a variety of fields, including medical imaging. While most studies deploy cross-entropy as the loss function in such tasks, a growing number of approaches have turned to a family of contrastive learning-based losses. Even though performance metrics such as accuracy, sensitivity and specificity are regularly used for the evaluation of CNN classifiers, the features that these classifiers actually learn are rarely identified and their effect on the classification performance on out-of-distribution test samples is insufficiently explored. In this paper, motivated by the real-world task of lung nodule classification, we investigate the features that a CNN learns when trained and tested on different distributions of a synthetic dataset with controlled modes of variation. We show that different loss functions lead to different features being learned and consequently affect the generalization ability of the classifier on unseen data. This study provides some important insights into the design of deep learning solutions for medical imaging tasks.
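For readers unfamiliar with the two loss families, the sketch below contrasts softmax cross-entropy, which scores one sample against its class label, with a classic pairwise margin-based contrastive loss, which scores the embedding distance between two samples. The abstract does not say which contrastive losses were studied, so the pairwise form, the margin value and the function names here are assumptions chosen for illustration.

```python
import math

def cross_entropy(logits, label):
    """Negative log-probability of the true class under a softmax."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]

def pairwise_contrastive(distance, same_class, margin=1.0):
    """Pull same-class pairs together; push other pairs at least `margin` apart."""
    if same_class:
        return distance ** 2
    return max(0.0, margin - distance) ** 2

print(cross_entropy([0.0, 0.0], 0))
print(pairwise_contrastive(0.5, same_class=True))
print(pairwise_contrastive(2.0, same_class=False))
```

Cross-entropy only constrains the output probabilities, whereas the contrastive term constrains the geometry of the embedding space, which is one plausible reason the two could lead to different learned features.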

    Sap flow, leaf-level gas exchange and spectral responses to drought in Pinus sylvestris, Pinus pinea and Pinus halepensis

    In a climate change scenario, Mediterranean forest species such as pines may be endangered by rising temperatures and reduced precipitation, thus calling for studies on the transpiration and water balance in pines. In this paper, the response of young plants of Pinus sylvestris L., Pinus pinea L. and Pinus halepensis Mill. to different irrigation treatments has been studied. Significant differences were found in water potential, sap flow, leaf-level gas exchange and spectral variables. P. sylvestris had higher pre-dawn and midday water potentials, sap flow rates and leaf-level gas exchange rates compared to the other two species in well-watered conditions. Vapor pressure gradient correlated with stomatal conductance, net assimilation and transpiration, but the association between stomatal conductance and sap flow was weak. The environmental variables most strongly associated with sap flow were solar radiation and reference evapo-transpiration, especially in the well-watered plants, but those associations were weaker in the stressed plants. All three pine species showed the isohydric, drought-avoiding strategy common in the genus Pinus, maintaining relatively high water potentials in dry conditions. Nevertheless, P. halepensis showed a water-saving strategy, closing its stomata under drought. Stomatal regulation was less strict in P. sylvestris, closer to a water-spending pattern, while P. pinea showed an intermediate behavior. Significant differences were recorded among species in spectral reflectance in the visible and infra-red regions. Photochemical Reflectance Index, Normalized Difference Vegetation Index and combinations of other ratios allowed discrimination among the three pine species. These spectral variables showed association with sap flow rate, water potential and leaf-level gas exchange variables. Both cluster analysis and k-means classification separated Scots pine and Aleppo pine into two different groups. On the other hand, Stone pine showed differences in spectral behavior depending on the hydric status of the plants. Well-watered Stone pine plants had the same spectral behavior as Scots pine, while the plants subjected to drought stress were closer to Aleppo pine plants in spectral response. These findings may help to quantify the impacts of early and mid-summer water deficit on Mediterranean pines in future climate regimes.
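The two named spectral indices are both normalized differences of reflectance values, which the sketch below computes using their standard definitions: NDVI from near-infrared and red reflectance, and PRI from reflectance at 531 nm and 570 nm. The reflectance values are hypothetical and are not measurements from the study.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    return (a - b) / (a + b)

def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)

def pri(r531, r570):
    """Photochemical Reflectance Index from reflectance at 531 nm and 570 nm."""
    return normalized_difference(r531, r570)

# Hypothetical reflectance values for a single plant.
sample_ndvi = ndvi(nir=0.5, red=0.1)
sample_pri = pri(r531=0.04, r570=0.05)
print(round(sample_ndvi, 3), round(sample_pri, 3))
```

Index values computed this way per plant could then serve as the input features for a clustering step such as k-means.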