31 research outputs found
A simple approach to forest structure classification using airborne laser scanning that can be adopted across bioregions
Reliable assessment of forest structural types (FSTs) aids sustainable forest management. We developed a methodology for the identification of FSTs using airborne laser scanning (ALS), and demonstrate its generality by applying it to forests from Boreal, Mediterranean and Atlantic biogeographical regions. First, hierarchical clustering analysis (HCA) was applied and clusters (FSTs) were determined in coniferous and deciduous forests using four forest structural variables obtained from forest inventory data: quadratic mean diameter (QMD), Gini coefficient (GC), basal area larger than mean (BALM) and density of stems (N). Then, classification and regression tree analysis (CART) was used to extract the empirical threshold values for discriminating those clusters. Based on the classification trees, GC and BALM were the most important variables in the identification of FSTs. Low, medium and high values of GC and BALM characterize single storey FSTs, multi-layered FSTs and exponentially decreasing size distributions (reversed J), respectively. Within each of these main FST groups, we also identified young/mature and sparse/dense subtypes using QMD and N. Then we used similar structural predictors derived from ALS, namely maximum height (Max), L-coefficient of variation (Lcv), L-skewness (Lskew), and percentage of penetration (cover), and a nearest neighbour method to predict the FSTs. We obtained a greater overall accuracy in deciduous forest (0.87) as compared to the coniferous forest (0.72). Our methodology proves the usefulness of ALS data for structural heterogeneity assessment of forests across biogeographical regions. Our simple two-tier approach to FST classification paves the way toward transnational assessments of forest structure across bioregions.
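As an illustration of the ALS predictors named in the abstract, the following is a minimal sketch of how Max, Lcv and Lskew could be computed from the return heights of one plot, using the standard sample L-moment estimators. The function names and the example heights are hypothetical, and the abstract does not say how the authors computed these metrics, so this should be read as one plausible formulation rather than their implementation.

```python
def sample_l_moments(values):
    """Return the first three sample L-moments (l1, l2, l3) of a sequence."""
    x = sorted(values)
    n = len(x)
    if n < 3:
        raise ValueError("at least three values are needed")
    # Unbiased probability-weighted moments b0, b1, b2 (i is the 0-based rank).
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0


def als_height_metrics(return_heights):
    """Max, Lcv and Lskew of a list of ALS return heights (hypothetical helper)."""
    l1, l2, l3 = sample_l_moments(return_heights)
    if l1 == 0 or l2 == 0:
        raise ValueError("Lcv and Lskew are undefined when l1 or l2 is zero")
    return {"Max": max(return_heights), "Lcv": l2 / l1, "Lskew": l3 / l2}


# Hypothetical heights (m) of ALS returns within one plot, not data from the study.
metrics = als_height_metrics([1.0, 2.0, 3.0, 4.0, 5.0])
```

Lcv is the ratio of the second to the first L-moment and Lskew the ratio of the third to the second, so a symmetric height distribution such as the toy example gives an Lskew of zero.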
Enhancing of accuracy assessment for forest above-ground biomass estimates obtained from remote sensing via hypothesis testing and overfitting evaluation
The evaluation of accuracy is essential for assuring the reliability of ecological models. Usually, the accuracy of above-ground biomass (AGB) predictions obtained from remote sensing is assessed by the mean differences (MD), the root mean squared differences (RMSD), and the coefficient of determination (R^2) between observed and predicted values. In this article we propose a more thorough analysis of accuracy, including a hypothesis test to evaluate the agreement between observed and predicted values, and an assessment of the degree of overfitting to the sample employed for model training. Using the estimation of forest AGB from LIDAR and spectral sensors as a case study, we compared alternative prediction and variable selection methods using several statistical measures to evaluate their accuracy. We showed that the hypothesis tests provide an objective method to infer the statistical significance of agreement. We also observed that overfitting can be assessed by comparing the inflation in residual sums of squares experienced when carrying out a cross-validation. Our results suggest that this method may be more effective than analysing the deflation in R^2. We proved that overfitting needs to be specifically addressed since, in light of MD, RMSD and R^2 alone, predictions may seem reliable even in clearly unrealistic circumstances, for instance when including too many predictor variables. Moreover, Theil's partial inequality coefficients, which are employed to resolve the proportions of the total errors due to the unexplained variance, the slope and the bias, may become useful to detect averaging effects common in remote sensing predictions of AGB. We concluded that statistical measures of accuracy, precision and agreement are necessary but insufficient for model evaluation. We therefore advocate for incorporating evaluation measures specifically devoted to testing observed-versus-predicted fit, and to assessing the degree of overfitting.
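The measures listed in the abstract can be sketched as follows. This is a minimal illustration, assuming that differences are taken as observed minus predicted, that R^2 is computed as one minus the ratio of the sum of squared differences to the total sum of squares, and that Theil's partial inequality coefficients come from the usual decomposition based on regressing observed on predicted values. The function name and these conventions are assumptions, not details confirmed by the abstract.

```python
import math


def agreement_measures(observed, predicted):
    """MD, RMSD, R^2 and Theil's partial inequality coefficients (sketch)."""
    n = len(observed)
    if n != len(predicted) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    diffs = [o - p for o, p in zip(observed, predicted)]  # assumed sign convention
    ssd = sum(d * d for d in diffs)  # sum of squared differences
    sst = sum((o - mean_o) ** 2 for o in observed)
    spp = sum((p - mean_p) ** 2 for p in predicted)
    if ssd == 0 or sst == 0 or spp == 0:
        raise ValueError("degenerate input: zero differences or zero variance")
    # Least-squares regression of observed on predicted values.
    slope = sum((p - mean_p) * (o - mean_o)
                for o, p in zip(observed, predicted)) / spp
    intercept = mean_o - slope * mean_p
    sse = sum((o - (intercept + slope * p)) ** 2
              for o, p in zip(observed, predicted))
    return {
        "MD": sum(diffs) / n,
        "RMSD": math.sqrt(ssd / n),
        "R2": 1.0 - ssd / sst,
        # Proportions of the squared differences due to bias, slope and
        # unexplained variance; the three terms sum to one.
        "U_bias": n * (mean_o - mean_p) ** 2 / ssd,
        "U_slope": (slope - 1.0) ** 2 * spp / ssd,
        "U_error": sse / ssd,
    }


# Hypothetical AGB values (Mg/ha), not data from the study.
result = agreement_measures([10.0, 20.0, 30.0, 40.0], [12.0, 19.0, 33.0, 38.0])
```

A large U_slope relative to U_error would point to the kind of averaging effect the abstract mentions, where low values are overpredicted and high values underpredicted.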
Aboveground biomass density models for NASA's Global Ecosystem Dynamics Investigation (GEDI) lidar mission
NASA's Global Ecosystem Dynamics Investigation (GEDI) is collecting spaceborne full waveform lidar data with a primary science goal of producing accurate estimates of forest aboveground biomass density (AGBD). This paper presents the development of the models used to create GEDI's footprint-level (~25 m) AGBD (GEDI04_A) product, including a description of the datasets used and the procedure for final model selection. The data used to fit our models are from a compilation of globally distributed spatially and temporally coincident field and airborne lidar datasets, whereby we simulated GEDI-like waveforms from airborne lidar to build a calibration database. We used this database to expand the geographic extent of past waveform lidar studies, and divided the globe into four broad strata by Plant Functional Type (PFT) and six geographic regions. GEDI's waveform-to-biomass models take the form of parametric Ordinary Least Squares (OLS) models with simulated Relative Height (RH) metrics as predictor variables. From an exhaustive set of candidate models, we selected the best input predictor variables and data transformations for each geographic stratum in the GEDI domain to produce a set of comprehensive predictive footprint-level models. First, we found that model selection frequently favored combinations of RH metrics at the 98th, 90th, 50th, and 10th height above ground-level percentiles (RH98, RH90, RH50, and RH10, respectively), but that inclusion of lower RH metrics (e.g. RH10) did not markedly improve model performance. Second, forced inclusion of RH98 in all models was important and did not degrade model performance, and the best performing models were parsimonious, typically having only 1-3 predictors. Third, stratification by geographic domain (PFT, geographic region) improved model performance in comparison to global models without stratification. Fourth, for the vast majority of strata, the best performing models were fit using square root transformation of field AGBD and/or height metrics. There was considerable variability in model performance across geographic strata, and areas with sparse training data and/or high AGBD values had the poorest performance. These models are used to produce global predictions of AGBD, but will be improved in the future as more and better training data become available.
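To make the model form concrete, here is a minimal sketch of an OLS fit with square root transformations of both AGBD and a single height metric. It assumes one predictor (RH98) for brevity, whereas the abstract describes models with one to three predictors per stratum. The function names and the calibration values are hypothetical, and the naive back-transformation by squaring ignores any bias correction the actual product may apply.

```python
import math


def fit_sqrt_ols(rh98, agbd):
    """Fit sqrt(AGBD) = intercept + slope * sqrt(RH98) by least squares."""
    x = [math.sqrt(v) for v in rh98]
    y = [math.sqrt(v) for v in agbd]
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_agbd(rh98_value, intercept, slope):
    """Predict AGBD by squaring the fitted value (no bias correction)."""
    return (intercept + slope * math.sqrt(rh98_value)) ** 2


# Synthetic calibration set built so that sqrt(AGBD) = 1 + 2 * sqrt(RH98);
# these numbers are illustrative only and do not come from the GEDI database.
rh98 = [4.0, 9.0, 16.0, 25.0]
agbd = [25.0, 49.0, 81.0, 121.0]
intercept, slope = fit_sqrt_ols(rh98, agbd)
prediction = predict_agbd(36.0, intercept, slope)
```

In a stratified setup, one such model would be fitted per combination of plant functional type and region, with the predictors and transformations chosen separately for each stratum.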
Evaluation of 3D GANs for Lung Tissue Modelling in Pulmonary CT
GANs are able to accurately model the distribution of complex, high-dimensional datasets, e.g. images. This makes high-quality GANs useful for unsupervised anomaly detection in medical imaging. However, differences in training datasets such as output image dimensionality and appearance of semantically meaningful features mean that GAN models from the natural image domain may not work 'out-of-the-box' for medical imaging, necessitating re-implementation and re-evaluation. In this work we adapt three GAN models to the task of modelling 3D healthy image patches for pulmonary CT and evaluate them. To the best of our knowledge, this is the first time that such an evaluation has been performed. The DCGAN, styleGAN and bigGAN architectures were investigated due to their ubiquity and high performance in natural image processing. We train different variants of these methods and assess their performance using the FID score. In addition, the quality of the generated images was evaluated by a human observer study, the ability of the networks to model 3D domain-specific features was investigated, and the structure of the GAN latent spaces was analysed. Results show that the 3D styleGAN produces realistic-looking images with meaningful 3D structure, but suffers from mode collapse, which must be addressed during training to obtain sample diversity. Conversely, the 3D DCGAN models show a greater capacity for image variability, but at the cost of poor-quality images. The 3D bigGAN models provide an intermediate level of image quality, but most accurately model the distribution of selected semantically meaningful features. The results suggest that future development is required to realise a 3D GAN with sufficient capacity for patch-based lung CT anomaly detection, and we offer recommendations for future areas of research, such as experimenting with other architectures and incorporating position encoding.
The pitfalls of sample selection: a case study on lung nodule classification
Using publicly available data to determine the performance of methodological contributions is important as it facilitates reproducibility and allows scrutiny of the published results. In lung nodule classification, for example, many works report results on the publicly available LIDC dataset. In theory, this should allow a direct comparison of the performance of proposed methods and an assessment of the impact of individual contributions. When analyzing seven recent works, however, we find that each employs a different data selection process, leading to widely varying total numbers of samples and ratios between benign and malignant cases. As each subset will have different characteristics with varying difficulty for classification, a direct comparison between the proposed methods is thus not always possible, nor fair. We study the particular effect of truthing when aggregating labels from multiple experts. We show that specific choices can have a severe impact on the data distribution, such that it may be possible to achieve superior performance on one sample distribution but not on another. While we show that we can further improve on the state-of-the-art on one sample selection, we also find that on a more challenging sample selection, on the same database, the more advanced models underperform with respect to very simple baseline methods, highlighting that the selected data distribution may play an even more important role than the model architecture. This raises concerns about the validity of claimed methodological contributions. We believe the community should be aware of these pitfalls, and we make recommendations on how these can be avoided in future work.
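The effect of truthing choices can be illustrated with a small sketch. It assumes each nodule carries several expert malignancy ratings on a 1 to 5 scale, that an aggregated score below 3 counts as benign and above 3 as malignant, and that nodules scoring exactly 3 are dropped as indeterminate. The aggregation rules, thresholds, function names and example ratings are all hypothetical and are not taken from the seven works the abstract analyses.

```python
from statistics import mean, median


def aggregate_label(ratings, rule="median"):
    """Turn several expert ratings (1-5) into 'benign', 'malignant' or None."""
    if rule == "median":
        score = median(ratings)
    elif rule == "mean":
        score = mean(ratings)
    else:
        raise ValueError("rule must be 'median' or 'mean'")
    if score < 3:
        return "benign"
    if score > 3:
        return "malignant"
    return None  # indeterminate nodules are excluded from the sample


def class_counts(nodules, rule):
    """Count the benign and malignant nodules that survive the selection."""
    counts = {"benign": 0, "malignant": 0}
    for ratings in nodules.values():
        label = aggregate_label(ratings, rule)
        if label is not None:
            counts[label] += 1
    return counts


# Hypothetical ratings from four experts per nodule, not real LIDC annotations.
nodules = {
    "n1": [1, 1, 2, 2],
    "n2": [2, 3, 3, 5],
    "n3": [1, 2, 4, 4],
    "n4": [4, 5, 5, 5],
    "n5": [3, 3, 3, 3],
}
by_median = class_counts(nodules, "median")
by_mean = class_counts(nodules, "mean")
```

On this toy set the median rule keeps two nodules and the mean rule keeps four, so the same database yields samples of different size and composition depending on a choice that is easy to leave unreported.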
The effect of the loss on generalization: empirical study on synthetic lung nodule data
Convolutional Neural Networks (CNNs) are widely used for image classification in a variety of fields, including medical imaging. While most studies deploy cross-entropy as the loss function in such tasks, a growing number of approaches have turned to a family of contrastive learning-based losses. Even though performance metrics such as accuracy, sensitivity and specificity are regularly used for the evaluation of CNN classifiers, the features that these classifiers actually learn are rarely identified and their effect on the classification performance on out-of-distribution test samples is insufficiently explored. In this paper, motivated by the real-world task of lung nodule classification, we investigate the features that a CNN learns when trained and tested on different distributions of a synthetic dataset with controlled modes of variation. We show that different loss functions lead to different features being learned and consequently affect the generalization ability of the classifier on unseen data. This study provides some important insights into the design of deep learning solutions for medical imaging tasks.
Sap flow, leaf-level gas exchange and spectral responses to drought in Pinus sylvestris, Pinus pinea and Pinus halepensis
In a climate change scenario, Mediterranean forest species such as pines may be endangered by rising temperatures and reduced precipitation, thus calling for studies on the transpiration and water balance in pines. In this paper, the response of young plants of Pinus sylvestris L., Pinus pinea L. and Pinus halepensis Mill. to different irrigation treatments was studied. Significant differences were found in water potential, sap flow, leaf-level gas exchange and spectral variables. P. sylvestris had higher pre-dawn and midday water potentials, sap flow rates and leaf-level gas exchange rates compared to the other two species in well-watered conditions. Vapor pressure gradient correlated with stomatal conductance, net assimilation and transpiration, but the association between stomatal conductance and sap flow was weak. The environmental variables most strongly associated with sap flow were solar radiation and reference evapotranspiration, especially in the well-watered plants, but those associations were weaker in the stressed plants. All three pine species showed the isohydric, drought-avoiding strategy common in the genus Pinus, maintaining relatively high water potentials in dry conditions. Nevertheless, P. halepensis showed a water-saving strategy, with stomatal closure under drought. Stomatal regulation was less strict in P. sylvestris, closer to a water-spending pattern, while P. pinea showed an intermediate behavior. Significant differences were recorded among species in spectral reflectance in the visible and infra-red regions. Photochemical Reflectance Index, Normalized Difference Vegetation Index and combinations of other ratios permitted the discrimination among the three pine species. These spectral variables showed association with sap flow rate, water potential and leaf-level gas exchange variables. Both cluster analysis and k-means classification discriminated Scots pine and Aleppo pine into two different groups. On the other hand, Stone pine showed differences in spectral behavior depending on the hydric status of the plants. Well-watered Stone pine plants had the same spectral behavior as Scots pine, while the plants subjected to drought stress were closer to Aleppo pine plants in spectral response. These findings may help to quantify the impacts of early and mid-summer water deficit on Mediterranean pines in future climate regimes.
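The two spectral indices named in the abstract are both normalized differences of reflectance at two wavelengths, which can be sketched as below. The sketch assumes the common formulations NDVI = (NIR - Red) / (NIR + Red) and PRI = (R531 - R570) / (R531 + R570). The abstract does not state which bands or sign convention the authors used, so the function names, band choices and reflectance values here are assumptions for illustration.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b) for two reflectance values."""
    if a + b == 0:
        raise ValueError("the sum of the two reflectances must be non-zero")
    return (a - b) / (a + b)


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def pri(r531, r570):
    """Photochemical Reflectance Index from reflectance at 531 and 570 nm."""
    return normalized_difference(r531, r570)


# Hypothetical leaf reflectance values, not measurements from the study.
example_ndvi = ndvi(nir=0.5, red=0.1)
example_pri = pri(r531=0.05, r570=0.06)
```

Index values computed this way per plant could then serve as input features for a cluster analysis or k-means classification like the ones the abstract reports.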