Automated design of robust discriminant analysis classifier for foot pressure lesions using kinematic data
In recent years, the use of motion tracking systems to acquire functional biomechanical gait data has received increasing interest owing to the richness and accuracy of the measured kinematic information. However, costs frequently restrict the number of subjects employed, so the dimensionality of the collected data is far higher than the number of available samples. This paper applies discriminant analysis algorithms to the classification of patients with different types of foot lesions, in order to establish an association between foot motion and lesion formation. With primary attention to small sample size situations, we compare different types of Bayesian classifiers and evaluate their performance with various dimensionality reduction techniques for feature extraction, as well as search methods for selection of raw kinematic variables. Finally, we propose a novel integrated method which fine-tunes the classifier parameters and selects the most relevant kinematic variables simultaneously. Performance comparisons are made using robust resampling techniques such as bootstrap and k-fold cross-validation. Results from experiments with subjects suffering from pathological plantar hyperkeratosis show that the proposed method can lead to correct classification rates with less than 10% of the original features
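The small-sample setting described in this abstract (many more variables than subjects) can be illustrated with a minimal sketch. It is not the paper's integrated method: a Fisher-score filter and a nearest-centroid rule stand in for the variable search and the Bayesian classifiers, the data are synthetic, and all function names are hypothetical. The one point it shows is that variable selection is repeated inside each training fold of the k-fold cross-validation, so the held-out subjects never influence which variables are kept.

```python
import random
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Two-class Fisher score per feature: squared mean gap over summed variances."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12))
    return scores


def select_features(X, y, n_keep):
    """Indices of the n_keep highest-scoring features (assumed filter, not the paper's search)."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:n_keep]


def fit_centroids(X, y, features):
    """Class centroids restricted to the selected features."""
    return {
        c: [mean(row[j] for row, label in zip(X, y) if label == c) for j in features]
        for c in sorted(set(y))
    }


def predict(centroids, features, row):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(features, centroids[c]))
    return min(centroids, key=dist)


def kfold_accuracy(X, y, k=5, n_keep=2, seed=0):
    """k-fold cross-validated accuracy with feature selection redone in every fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        features = select_features(X_train, y_train, n_keep)
        centroids = fit_centroids(X_train, y_train, features)
        correct += sum(predict(centroids, features, X[i]) == y[i] for i in fold)
    return correct / len(X)


def make_toy_data(n_samples=20, n_features=30, seed=1):
    """Synthetic data: only features 0 and 1 carry class information."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 4.0 * label
        row[1] -= 4.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
accuracy = kfold_accuracy(X, y)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Selecting variables once on the full data set and cross-validating afterwards would tend to give an optimistic accuracy estimate, which matters most when samples are as scarce as described above.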
Optimal Exploitation of the Sentinel-2 Spectral Capabilities for Crop Leaf Area Index Mapping
The continuously increasing demand for accurate, quantitative, high-quality information on land surface properties will be addressed by a new generation of environmental Earth observation (EO) missions. One current example with a high potential to contribute to those demands is the multi-spectral ESA Sentinel-2 (S2) system. The present study evaluates the spectral information content needed for crop leaf area index (LAI) mapping in view of future sensors. Data from a field campaign were used to determine the optimal spectral sampling from the available S2 bands by inverting a radiative transfer model (PROSAIL) with look-up table (LUT) and artificial neural network (ANN) approaches. The overall LAI retrieval performance of the proposed LUT approach (LUTN₅₀) was comparable to that of a tested and approved ANN method. Employing seven- and eight-band combinations, the LUTN₅₀ approach obtained an LAI RMSE of 0.53 and a normalized LAI RMSE of 0.12, which was comparable to the results of the ANN. However, the LUTN₅₀ method showed a higher robustness and insensitivity to different band settings. The most frequently selected wavebands were located in the near infrared and red edge spectral regions. In conclusion, our results emphasize the potential benefits of the Sentinel-2 mission for agricultural applications
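Look-up table inversion of a radiative transfer model, as mentioned in this abstract, can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: `toy_reflectance` is a hypothetical stand-in for PROSAIL, the band coefficients are invented, and reading "N₅₀" as "average the 50 best-matching LUT entries" is an assumption about the naming rather than something the abstract states.

```python
import math
import random


def toy_reflectance(lai, band_coeffs=(0.3, 0.5, 0.8, 1.2)):
    """Hypothetical forward model: reflectance saturates with LAI in every band."""
    return [1.0 - math.exp(-k * lai) for k in band_coeffs]


def build_lut(n_entries=2000, seed=0):
    """Simulate spectra for randomly drawn LAI values and store (LAI, spectrum) pairs."""
    rng = random.Random(seed)
    lut = []
    for _ in range(n_entries):
        lai = rng.uniform(0.0, 7.0)
        lut.append((lai, toy_reflectance(lai)))
    return lut


def rmse(a, b):
    """Root mean square error between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def invert(lut, observed, n_best=50):
    """Estimate LAI as the mean over the n_best entries closest to the observation."""
    ranked = sorted(lut, key=lambda entry: rmse(entry[1], observed))
    best = ranked[:n_best]
    return sum(lai for lai, _ in best) / len(best)


lut = build_lut()
estimate = invert(lut, toy_reflectance(3.0))
print(f"estimated LAI: {estimate:.2f}")
```

Averaging over several good matches rather than taking the single best one is a common way to make the inversion less sensitive to noise; testing different band combinations would amount to restricting the spectra to a subset of indices before computing the RMSE.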
Histopathological image analysis : a review
Over the past decade, dramatic increases in computational power and improvement in image analysis algorithms have allowed the development of powerful computer-assisted analytical approaches to radiological data. With the recent advent of whole slide digital scanners, tissue histopathology slides can now be digitized and stored in digital image form. Consequently, digitized tissue histopathology has now become amenable to the application of computerized image analysis and machine learning techniques. Analogous to the role of computer-assisted diagnosis (CAD) algorithms in medical imaging to complement the opinion of a radiologist, CAD algorithms have begun to be developed for disease detection, diagnosis, and prognosis prediction to complement the opinion of the pathologist. In this paper, we review the recent state of the art CAD technology for digitized histopathology. This paper also briefly describes the development and application of novel image analysis technology for a few specific histopathology related problems being pursued in the United States and Europe
Evolutionary computation-based feature selection for finding a stable set of features in high-dimensional data
Evolutionary Computation (EC) algorithms have proved to work well for feature selection because they are powerful search techniques and can produce multiple good solutions. However, they suffer from some limitations in real-world applications. Firstly, ECs require high computation time as they evaluate many solutions at each iteration. Secondly, a classifier is usually used as their fitness function, which causes the selected subset to perform well only on the utilised classifier (i.e. classifier bias). Lastly, ECs, as stochastic search methods, return a different final subset in different runs, which poses a problem for finding a stable set of features (i.e. the stability issue). To address the computation time and classifier-bias limitations, this thesis proposes a new two-stage selection approach called filter/filter, in which two filter feature selection algorithms are combined. In the first stage, a ranking algorithm forms a reduced dataset by selecting the most informative features from the original dataset. In the second stage, the reduced dataset is fed to a novel EC algorithm to select the final feature subset. This new EC algorithm is a Tabu search hybridised with an Asexual Genetic Algorithm, called TAGA. TAGA benefits from new search components and a solution representation which can effectively reduce computation time. To select a classifier-unbiased final subset, a statistical criterion is used as the fitness function, which evaluates the subset independently of any classifier. Experiments show that the proposed filter/filter approach requires an acceptable computation time and selects more classifier-unbiased features than state-of-the-art methods. To find a stable set of features, a novel Generalisation Power Index (GPI) is proposed to analyse the generalisation power of the final subsets of an EC over several runs. Generalisation power refers to the performance capability of a subset over a wide range of classifiers.
Computation results confirm that GPI is able to find a stable set of features which achieves near-optimal accuracy when used to train various classifiers. To examine the suitability of the proposed methods for real-world applications, the filter/filter approach and GPI are integrated to select a stable set of features for the METABRIC breast cancer subtype classification problem. Experimental results show that this integration not only addresses the limitations of ECs for a real-world biomedical feature selection problem but also performs better than alternative methods
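The stability issue raised in this abstract (a stochastic search returning a different subset on each run) can be made concrete with a small sketch. It does not implement the thesis's Generalisation Power Index, whose definition is not given here; it uses mean pairwise Jaccard similarity as a generic stability measure and a simple frequency vote as a consensus rule. The function names and the example subsets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature subsets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(subsets):
    """Average similarity over all pairs of runs; 1.0 means identical subsets every run."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def consensus_features(subsets, min_fraction=0.5):
    """Features selected in at least min_fraction of the runs."""
    counts = Counter(f for s in subsets for f in set(s))
    return sorted(f for f, c in counts.items() if c / len(subsets) >= min_fraction)


# Feature indices returned by four hypothetical runs of a stochastic selector.
runs = [{1, 4, 7}, {1, 4, 9}, {1, 4, 7}, {1, 5, 7}]
stability = mean_pairwise_jaccard(runs)
consensus = consensus_features(runs)
print(f"stability = {stability:.3f}, consensus = {consensus}")
```

A frequency vote only measures agreement between runs; it says nothing about how well the consensus subset performs across classifiers, which is the aspect the abstract attributes to GPI.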
Chemometrics Methods for Specificity, Authenticity and Traceability Analysis of Olive Oils: Principles, Classifications and Applications
Background. Olive oils (OOs) show high chemical variability due to several factors of genetic, environmental and anthropic types. Genetic and environmental factors are responsible for natural compositions and polymorphic diversification resulting in different varietal patterns and phenotypes. Anthropic factors, however, are at the origin of different blends' preparation leading to normative, labelled or adulterated commercial products. Control of complex OO samples requires their (i) characterization by specific markers; (ii) authentication by fingerprint patterns; and (iii) monitoring by traceability analysis. Methods. These quality control and management aims require the use of several multivariate statistical tools: specificity highlighting requires ordination methods; authentication checking calls for classification and pattern recognition methods; traceability analysis implies the use of network-based approaches able to separate or extract mixed information and memorized signals from complex matrices. Results. This chapter presents a review of different chemometrics methods applied for the control of OO variability from metabolic and physical-chemical measured characteristics. The different chemometrics methods are illustrated by different study cases on monovarietal and blended OO originated from different countries. Conclusion. Chemometrics tools offer multiple ways for quantitative evaluations and qualitative control of complex chemical variability of OO in relation to several intrinsic and extrinsic factors
Fuzzy clustering of univariate and multivariate time series by genetic multiobjective optimization
Given a set of time series, it is of interest to discover subsets that share similar properties. For instance, this may be useful for identifying and estimating a single model that may conveniently fit several time series, instead of performing the usual identification and estimation steps for each one. On the other hand, time series in the same cluster are related with respect to the measures assumed for cluster analysis and are suitable for building multivariate time series models. Though many approaches to clustering time series exist, in this view the most effective method seems to rely on choosing some features relevant to the problem at hand and seeking clusters according to their measurements, for instance the autoregressive coefficients, spectral measures or the eigenvectors of the covariance matrix. Some new indexes based on goodness-of-fit criteria will be proposed in this paper for fuzzy clustering of multivariate time series. A general purpose fuzzy clustering algorithm may be used to estimate the proper cluster structure according to some internal criteria of cluster validity. Such indexes are known to measure distinct and often conflicting cluster properties, such as compactness or connectedness, or distribution, orientation, size and shape. It is argued that multiobjective optimization supported by genetic algorithms is a most effective choice in such a difficult context. In this paper we use the Xie-Beni index and the C-means functional as objective functions to evaluate the cluster validity in a multiobjective optimization framework. The concept of Pareto optimality in multiobjective genetic algorithms is used to evolve a set of potential solutions towards a set of optimal non-dominated solutions. Genetic algorithms are well suited for implementing difficult optimization problems where objective functions do not usually have good mathematical properties such as continuity, differentiability or convexity.
In addition, genetic algorithms, as population-based methods, may yield a complete Pareto front at each step of the iterative evolutionary procedure. The method is illustrated by means of a set of real data and an artificial multivariate time series data set.
Keywords: Fuzzy clustering, Internal criteria of cluster validity, Genetic algorithms, Multiobjective optimization, Time series, Pareto optimality
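The two objective functions named in this abstract have standard textbook forms, which the sketch below computes for a given fuzzy partition, together with a simple filter for non-dominated solutions. It is a hedged illustration, not the paper's genetic algorithm: the points stand for feature vectors extracted from time series (for example autoregressive coefficients), the fuzzifier default m = 2 is an assumption, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cmeans_functional(points, centers, memberships, m=2.0):
    """Fuzzy C-means functional: J_m = sum_i sum_k u_ik^m * ||x_k - v_i||^2."""
    return sum(
        memberships[i][k] ** m * sq_dist(points[k], centers[i])
        for i in range(len(centers))
        for k in range(len(points))
    )


def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni index: compactness J_m divided by n times the minimum center separation."""
    min_sep = min(
        sq_dist(centers[i], centers[j])
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return cmeans_functional(points, centers, memberships, m) / (len(points) * min_sep)


def pareto_front(solutions):
    """Objective tuples not dominated by any other tuple (all objectives minimised)."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Four feature vectors forming two well-separated groups, with a crisp partition.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
memberships = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]

j_m = cmeans_functional(points, centers, memberships)
xb = xie_beni(points, centers, memberships)
front = pareto_front([(1, 5), (2, 2), (3, 3), (5, 1)])
print(j_m, xb, front)
```

Both objectives are minimised, and they can conflict: the C-means functional keeps decreasing as clusters are added, while the Xie-Beni index penalises centers that lie close together, which is why a set of non-dominated trade-offs is kept rather than a single optimum.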
Multivariate NIR studies of seed-water interaction in Scots Pine Seeds (Pinus sylvestris L.)
This thesis describes seed-water interaction using near infrared (NIR) spectroscopy, multivariate regression models and Scots pine seeds. The presented research covers classification of seed viability, prediction of seed moisture content, selection of NIR wavelengths and interpretation of seed-water interaction modelled and analysed by principal component analysis, ordinary least squares (OLS), partial least squares (PLS), bi-orthogonal least squares (BPLS) and genetic algorithms. The potential of using multivariate NIR calibration models for seed classification was demonstrated using filled viable and non-viable seeds that could be separated with an accuracy of 98-99%. It was also shown that multivariate NIR calibration models gave low errors (0.7% and 1.9%) in prediction of seed moisture content for bulk seed and single seeds, respectively, using either NIR reflectance or transmittance spectroscopy. Genetic algorithms selected three to eight wavelength bands in the NIR region and these narrow bands gave about the same prediction of seed moisture content (0.6% and 1.7%) as using the whole NIR interval in the PLS regression models. The selected regions were simulated as NIR filters in OLS regression resulting in predictions of the same quality (0.7 % and 2.1%). This finding opens possibilities to apply NIR sensors in fast and simple spectrometers for the determination of seed moisture content. Near infrared (NIR) radiation interacts with overtones of vibrating bonds in polar molecules. The resulting spectra contain chemical and physical information. This offers good possibilities to measure seed-water interactions, but also to interpret processes within seeds. It is shown that seed-water interaction involves both transitions and changes mainly in covalent bonds of O-H, C-H, C=O and N-H emanating from ongoing physiological processes like seed respiration and protein metabolism. 
I propose that BPLS analysis, which has orthonormal loadings and orthogonal scores and gives the same predictions as conventional PLS regression, should be used as a standard to harmonise the interpretation of NIR spectra
Multivariate Analysis in Metabolomics
Metabolomics aims to provide a global snapshot of all small-molecule metabolites in cells and biological fluids, free of observational biases inherent to more focused studies of metabolism. However, the staggeringly high information content of such global analyses introduces a challenge of its own; efficiently forming biologically relevant conclusions from any given metabolomics dataset indeed requires specialized forms of data analysis. One approach to finding meaning in metabolomics datasets involves multivariate analysis (MVA) methods such as principal component analysis (PCA) and partial least squares projection to latent structures (PLS), where spectral features contributing most to variation or separation are identified for further analysis. However, as with any mathematical treatment, these methods are not a panacea; this review discusses the use of multivariate analysis for metabolomics, as well as common pitfalls and misconceptions