16,611 research outputs found

    Classification of airborne laser scanning point clouds based on binomial logistic regression analysis

    This article presents a newly developed procedure for the classification of airborne laser scanning (ALS) point clouds, based on binomial logistic regression analysis. By using a feature space containing a large number of adaptable geometrical parameters, the procedure can be applied to point clouds covering different types of topography and variable point densities, and it can be adapted to different user requirements. A binomial logistic model is estimated for each a priori defined class, using a training set of manually classified points. For each point, a value is calculated defining the probability that the point belongs to a certain class; the class with the highest probability is used for the final point classification. In addition, the use of statistical methods enables a thorough model evaluation through well-founded inference criteria, and the interpretation of these inference analyses can also suggest the definition of additional sub-classes. The use of a large number of geometrical parameters is an important advantage of this procedure compared with current classification algorithms: it allows more user modifications for the large variety of ALS point clouds while still achieving comparable classification results, since parameters can be evaluated as degrees of freedom and removed or added as a function of the type of study area. The performance of this procedure is successfully demonstrated by classifying two different ALS point sets from an urban and a rural area. Moreover, the potential of the proposed classification procedure is explored for terrestrial data.
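
    A minimal sketch of the one-binary-model-per-class scheme described in this abstract, assuming scikit-learn; the class names, feature matrix and training labels are invented placeholders, not the paper's geometrical parameters.

```python
# One binomial (binary) logistic model per class; a point takes the class
# with the highest membership probability.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
classes = ["ground", "vegetation", "building"]

# Toy training set: rows = points, columns = geometrical features.
X_train = rng.normal(size=(300, 4))
y_train = rng.choice(classes, size=300)
X_new = rng.normal(size=(10, 4))

# Fit one class-vs-rest binary logistic model per class.
models = {}
for c in classes:
    m = LogisticRegression(max_iter=1000)
    m.fit(X_train, (y_train == c).astype(int))
    models[c] = m

# Probability that each new point belongs to each class.
probs = np.column_stack([models[c].predict_proba(X_new)[:, 1] for c in classes])

# Final label = class with the highest membership probability.
labels = [classes[i] for i in probs.argmax(axis=1)]
print(labels)
```

    Fitting the classes as separate binary models, rather than one multinomial model, mirrors the abstract's per-class probability comparison and makes it easy to add or drop parameters for a single class.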

    Evaluating subset selection methods for use case points estimation

    When the Use Case Points method is used for software effort estimation, users face low model accuracy, which limits its practical application. This study investigates the significance of subset selection methods for the prediction accuracy of multiple linear regression models obtained by the stepwise approach. K-means, Spectral Clustering, the Gaussian Mixture Model and Moving Window are evaluated as subset selection techniques. The methods were assessed according to several evaluation criteria and then statistically tested. Evaluation was performed on two independent datasets, which differ in project types and size; both were split by the hold-out method. When clustering was used, the training sets were clustered into 3 classes and an independent regression model was created for each class; these models were later used for the prediction of the testing sets. When Moving Window was used, windows of sizes 5, 10 and 15 were tested. The results show that clustering techniques decrease prediction errors significantly compared with the Use Case Points or Moving Window methods. Spectral Clustering was selected as the best-performing solution, achieving a Sum of Squared Errors reduction of 32% for the first dataset and 98% for the second dataset. For the second dataset, the Mean Absolute Percentage Error is less than 1% for Spectral Clustering, 9% for Moving Window and 27% for Use Case Points. For the first dataset, prediction errors are significantly higher: 53% for Spectral Clustering, while Use Case Points produces 165%. It can be concluded that this study establishes subset selection techniques as a significant means of improving the prediction ability of linear regression models used for software development effort prediction, and that the clustering methods perform better than the Moving Window method.
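
    A rough sketch of the cluster-then-regress idea evaluated here: cluster the training projects into 3 groups, fit an independent linear model per cluster, and route each test project to its nearest cluster. K-means is used for brevity (Spectral Clustering has no native predict step for new points); the features, effort values and cluster count are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X_train = rng.uniform(50, 400, size=(60, 3))                # toy size drivers
y_train = X_train.sum(axis=1) * 8 + rng.normal(0, 50, 60)   # toy effort values
X_test = rng.uniform(50, 400, size=(10, 3))

# Cluster the training projects into 3 classes.
km = KMeans(n_clusters=3, n_init=10, random_state=1).fit(X_train)

# One independent regression model per cluster.
models = {c: LinearRegression().fit(X_train[km.labels_ == c],
                                    y_train[km.labels_ == c])
          for c in range(3)}

# Predict each test project with the model of its nearest cluster.
test_clusters = km.predict(X_test)
y_pred = np.array([models[c].predict(x.reshape(1, -1))[0]
                   for c, x in zip(test_clusters, X_test)])
print(y_pred.round(1))
```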

    Categorical variable segmentation model for software development effort estimation

    This paper proposes a new software development effort estimation model. The model's design is based on function point analysis, categorical variable segmentation (CVS) and stepwise regression; the stepwise regression method is used to create a unique estimation model for each segment. The estimation accuracy of the proposed model is compared to clustering-based models and the International Function Point Users Group model. It is shown that the proposed model increases estimation accuracy compared to the baseline methods: non-clustered function point analysis and clustering-based models. The new CVS model achieves significantly higher accuracy than the baseline methods. © 2013 IEEE. Faculty of Applied Informatics, Tomas Bata University in Zlin [RO30186021025/2102
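
    An illustrative sketch of the CVS idea, assuming scikit-learn: projects are split by a categorical attribute and a separate stepwise-style regression is fitted per segment. Forward selection stands in for classical stepwise regression here, and the segment variable, features and effort values are hypothetical, not the paper's data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "segment": rng.choice(["new_dev", "enhancement"], size=80),  # categorical variable
    "fp": rng.uniform(100, 1000, 80),                            # function points
    "team": rng.integers(2, 12, 80),
    "duration": rng.uniform(3, 24, 80),
})
df["effort"] = 6 * df["fp"] + 120 * df["team"] + rng.normal(0, 300, 80)

features = ["fp", "team", "duration"]
models = {}
for seg, part in df.groupby("segment"):
    X, y = part[features], part["effort"]
    # Forward selection as a stand-in for stepwise regression.
    sfs = SequentialFeatureSelector(LinearRegression(), n_features_to_select=2,
                                    direction="forward").fit(X, y)
    cols = list(X.columns[sfs.get_support()])
    # Unique estimation model per segment, built on the selected predictors.
    models[seg] = (cols, LinearRegression().fit(X[cols], y))
    print(seg, "selected:", cols)
```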

    An evaluation of the signature extension approach to large area crop inventories utilizing space image data

    The author has identified the following significant results. Two haze correction algorithms were tested: CROP-A and XSTAR. CROP-A was tested in a unitemporal mode on data collected in 1973-74 over ten sample segments in Kansas. Because of the uniformly low level of haze present in these segments, no conclusion could be reached about CROP-A's ability to compensate for haze. It was noted, however, that in some cases CROP-A made serious errors which actually degraded classification performance. The haze correction algorithm XSTAR was tested in a multitemporal mode on 1975-76 LACIE sample segment data over 23 blind sites in Kansas and 18 sample segments in North Dakota, providing a wide range of haze levels and other conditions for algorithm evaluation. It was found that this algorithm substantially improved signature extension classification accuracy when a sum-of-likelihoods classifier was used with an alien rejection threshold.
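
    A hedged sketch of a sum-of-likelihoods classifier with an alien-rejection threshold, the classifier named in the evaluation above: each pixel gets a Gaussian likelihood per crop class; if the summed likelihood falls below a threshold the pixel is rejected as alien, otherwise it takes the most likely class. The class signatures and threshold value are invented for illustration and are not the LACIE statistics.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)

# Per-class Gaussian signatures (mean, covariance) in 4 spectral bands.
signatures = {
    "wheat":  (rng.normal(60, 5, 4), np.eye(4) * 20),
    "fallow": (rng.normal(40, 5, 4), np.eye(4) * 25),
}
pixels = rng.normal(55, 15, size=(6, 4))
reject_threshold = 1e-6  # illustrative value, not from the report

for x in pixels:
    # Likelihood of the pixel under each class signature.
    likes = {c: multivariate_normal(mean=m, cov=S).pdf(x)
             for c, (m, S) in signatures.items()}
    if sum(likes.values()) < reject_threshold:
        label = "alien (rejected)"        # pixel fits no trained signature
    else:
        label = max(likes, key=likes.get)  # most likely class
    print(label)
```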

    Explainable AI in Fintech and Insurtech

    The growing application of black-box Artificial Intelligence algorithms in many real-world applications raises the importance of understanding how models make their decisions. The research field that aims to look into the inner workings of the black box and to make predictions more interpretable is referred to as eXplainable Artificial Intelligence (XAI). Over recent years, the research domain of XAI has seen important contributions and continuous development, achieving strong results with theoretically sound applied methodologies. These achievements enable both industry and regulators to improve existing models and their supervision in terms of explainability, which is the main purpose of these models, but they also bring new possibilities, namely the employment of explainable AI models and their outputs as an intermediate step towards new applications, greatly expanding their usefulness beyond the explainability of model decisions. This thesis is composed of six chapters: an introduction and a conclusion plus four self-contained sections reporting the corresponding papers. Chapter 1 proposes the use of Shapley values in similarity networks and clustering models in order to bring out new pieces of information, useful for classification and analysis of the customer base, in an insurtech setting. Chapter 2 compares SHAP and LIME, two of the most important XAI models, evaluating their parameter attribution methodologies and the information they are able to include, in the estimation of the Probability of Default (PD) of Italian Small and Medium Enterprises, with balance sheet data as inputs. Chapter 3 introduces the use of Shapley values in feature selection techniques, analysing wrapper and embedded feature selection algorithms and their ability to select relevant features from both raw data and their Shapley values, again in the setting of SME PD estimation. Chapter 4 introduces a new methodology of model selection based on the Lorenz Zonoid, highlighting similarities with the game-theoretical concept of Shapley values and their attribution of variability decomposition to the independent variables, as well as some advantages in terms of model comparability and standardization. These properties are explored through both a simulated example and an application to a real-world dataset provided by the EU-certified rating agency Modefinance.
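
    A minimal sketch of the Shapley-value workflow these chapters build on, assuming the shap package: train a PD-style classifier on balance-sheet-like features and attribute each prediction to its inputs. The feature names and data are invented; the thesis' own datasets (e.g. the Modefinance ratings) are not reproduced here.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X = pd.DataFrame({
    "leverage": rng.uniform(0, 3, 500),
    "roa": rng.normal(0.05, 0.1, 500),
    "current_ratio": rng.uniform(0.5, 3, 500),
})
# Synthetic default flag: higher leverage and lower ROA raise default odds.
y = (X["leverage"] - 5 * X["roa"] + rng.normal(0, 0.5, 500) > 1.5).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=4).fit(X, y)

# Shapley-value attributions of the predicted default probability to the inputs.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:10])

# Depending on the shap version this is a list (one array per class) or a
# 3-D array; either way each row attributes one firm's PD to its features.
print(shap_values)
```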

    Detection of internal quality in kiwi with time-domain diffuse reflectance spectroscopy

    Time-domain diffuse reflectance spectroscopy (TRS), a medical sensing technique, was used to evaluate internal kiwifruit quality. The application of this pulsed-laser spectroscopic technique was studied as a possible new non-destructive method to detect different quality parameters optically: firmness, sugar content and acidity. The main difference from other spectroscopic techniques is that TRS estimates light absorption and scattering inside the sample separately and at the same time, at each wavelength, allowing simultaneous estimation of firmness and chemical contents. Standard tests (flesh puncture, compression with a ball, °Brix, total acidity, skin color) were used as references to build estimation models using a multivariate statistical approach. Classification of the fruits into three groups achieved a performance of 75% correctly classified fruits for firmness, 60% for sugar content and 97% for acidity. The results demonstrate good potential for this technique to be used in the development of new sensors for non-destructive quality assessment.
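
    A sketch of the kind of three-group classification step described above, assuming scikit-learn: fruits are assigned to firmness classes from simulated TRS absorption and reduced scattering coefficients using linear discriminant analysis as a stand-in for the paper's multivariate classification functions. The coefficients, class boundaries and wavelength are simulated, not the study's measurements.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n = 120
# Columns: absorption coefficient mu_a and reduced scattering mu_s' at one
# wavelength (illustrative units, cm^-1).
X = np.column_stack([rng.normal(0.05, 0.02, n), rng.normal(10, 2, n)])
# Three firmness classes loosely tied to scattering, as a toy ground truth:
# 0 = soft, 1 = medium, 2 = firm.
y = np.digitize(X[:, 1], bins=[9, 11])

# Classification functions from the optical coefficients.
lda = LinearDiscriminantAnalysis()
scores = cross_val_score(lda, X, y, cv=5)
print("cross-validated accuracy:", scores.mean().round(2))
```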