905 research outputs found
Recommended from our members
Tissue-specific and interpretable sub-segmentation of whole tumour burden on CT images by unsupervised fuzzy clustering.
BACKGROUND: Cancer typically exhibits genotypic and phenotypic heterogeneity, which can have prognostic significance and influence therapy response. Computed Tomography (CT)-based radiomic approaches calculate quantitative features of tumour heterogeneity at a mesoscopic level, regardless of macroscopic areas of hypo-dense (i.e., cystic/necrotic), hyper-dense (i.e., calcified), or intermediately dense (i.e., soft tissue) portions. METHOD: With the goal of achieving the automated sub-segmentation of these three tissue types, we present here a two-stage computational framework based on unsupervised Fuzzy C-Means Clustering (FCM) techniques. No existing approach has specifically addressed this task so far. Our tissue-specific image sub-segmentation was tested on ovarian cancer (pelvic/ovarian and omental disease) and renal cell carcinoma CT datasets using both overlap-based and distance-based metrics for evaluation. RESULTS: On all tested sub-segmentation tasks, our two-stage segmentation approach outperformed conventional segmentation techniques: fixed multi-thresholding, the Otsu method, and automatic cluster number selection heuristics for the K-means clustering algorithm. In addition, experiments showed that the integration of the spatial information into the FCM algorithm generally achieves more accurate segmentation results, whilst the kernelised FCM versions are not beneficial. The best spatial FCM configuration achieved average Dice similarity coefficient values starting from 81.94±4.76 and 83.43±3.81 for hyper-dense and hypo-dense components, respectively, for the investigated sub-segmentation tasks. CONCLUSIONS: The proposed intelligent framework could be readily integrated into clinical research environments and provides robust tools for future radiomic biomarker validation
Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications
Due to complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are highly needed to help humans to understand the inherent mechanism of diseases. For biomedical classification problems, typically it is impossible to build a perfect classifier with 100% prediction accuracy. Hence a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive to state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support on disease diagnoses due to their easy interpretability. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from huge microarray gene expression data. With fuzzy granulation, information loss in the process of gene selection is decreased. As a result, more informative genes for cancer classification are selected and more accurate classifiers can be modeled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification. And hence we expect that genes being selected can be more helpful for further biological studies
Technical and Fundamental Features Analysis for Stock Market Prediction with Data Mining Methods
Predicting stock prices is an essential objective in the financial world. Forecasting stock returns and their risk represents one of the most critical concerns of market decision makers. This thesis investigates the stock price forecasting with three approaches from the data mining concept and shows how different elements in the stock price can help to enhance the accuracy of our prediction. For this reason, the first and second approaches capture many fundamental indicators from the stocks and implement them as explanatory variables to do stock price classification and forecasting. In the third approach, technical features from the candlestick representation of the share prices are extracted and used to enhance the accuracy of the forecasting. In each approach, different tools and techniques from data mining and machine learning are employed to justify why the forecasting is working.
Furthermore, since the idea is to evaluate the potential of features in the stock trend forecasting, therefore we diversify our experiments using both technical and fundamental features. Therefore, in the first approach, a three-stage methodology is developed while in the first step, a comprehensive investigation of all possible features which can be effective on stocks risk and return are identified. Then, in the next stage, risk and return are predicted by applying data mining techniques for the given features. Finally, we develop a hybrid algorithm, based on some filters and function-based clustering; and re-predicted the risk and return of stocks.
In the second approach, instead of using single classifiers, a fusion model is proposed based on the use of multiple diverse base classifiers that operate on a common input and a meta-classifier that learns from base classifiers’ outputs to obtain a more precise stock return and risk predictions. A set of diversity methods, including Bagging, Boosting, and AdaBoost, is applied to create diversity in classifier combinations. Moreover, the number and procedure for selecting base classifiers for fusion schemes are determined using a methodology based on dataset clustering and candidate classifiers’ accuracy.
Finally, in the third approach, a novel forecasting model for stock markets based on the wrapper ANFIS (Adaptive Neural Fuzzy Inference System) – ICA (Imperialist Competitive Algorithm) and technical analysis of Japanese Candlestick is presented. Two approaches of Raw-based and Signal-based are devised to extract the model’s input variables and buy and sell signals are considered as output variables.
To illustrate the methodologies, for the first and second approaches, Tehran Stock Exchange (TSE) data for the period from 2002 to 2012 are applied, while for the third approach, we used General Motors and Dow Jones indexes.Predicting stock prices is an essential objective in the financial world. Forecasting stock returns and their risk represents one of the most critical concerns of market decision makers. This thesis investigates the stock price forecasting with three approaches from the data mining concept and shows how different elements in the stock price can help to enhance the accuracy of our prediction. For this reason, the first and second approaches capture many fundamental indicators from the stocks and implement them as explanatory variables to do stock price classification and forecasting. In the third approach, technical features from the candlestick representation of the share prices are extracted and used to enhance the accuracy of the forecasting. In each approach, different tools and techniques from data mining and machine learning are employed to justify why the forecasting is working.
Furthermore, since the idea is to evaluate the potential of features in the stock trend forecasting, therefore we diversify our experiments using both technical and fundamental features. Therefore, in the first approach, a three-stage methodology is developed while in the first step, a comprehensive investigation of all possible features which can be effective on stocks risk and return are identified. Then, in the next stage, risk and return are predicted by applying data mining techniques for the given features. Finally, we develop a hybrid algorithm, based on some filters and function-based clustering; and re-predicted the risk and return of stocks.
In the second approach, instead of using single classifiers, a fusion model is proposed based on the use of multiple diverse base classifiers that operate on a common input and a meta-classifier that learns from base classifiers’ outputs to obtain a more precise stock return and risk predictions. A set of diversity methods, including Bagging, Boosting, and AdaBoost, is applied to create diversity in classifier combinations. Moreover, the number and procedure for selecting base classifiers for fusion schemes are determined using a methodology based on dataset clustering and candidate classifiers’ accuracy.
Finally, in the third approach, a novel forecasting model for stock markets based on the wrapper ANFIS (Adaptive Neural Fuzzy Inference System) – ICA (Imperialist Competitive Algorithm) and technical analysis of Japanese Candlestick is presented. Two approaches of Raw-based and Signal-based are devised to extract the model’s input variables and buy and sell signals are considered as output variables.
To illustrate the methodologies, for the first and second approaches, Tehran Stock Exchange (TSE) data for the period from 2002 to 2012 are applied, while for the third approach, we used General Motors and Dow Jones indexes.154 - Katedra financívyhově
Vision-based neural network classifiers and their applications
A thesis submitted for the degree of Doctor of Philosophy of University of LutonVisual inspection of defects is an important part of quality assurance in many fields of production. It plays a very useful role in industrial applications in order to relieve human inspectors and improve the inspection accuracy and hence increasing productivity. Research has previously been done in defect classification of wood veneers using techniques such as neural networks, and a certain degree of success has been achieved. However, to improve results in tenus of both classification accuracy and running time are necessary if the techniques are to be widely adopted in industry, which has motivated this research.
This research presents a method using rough sets based neural network with fuzzy input (RNNFI). Variable precision rough set (VPRS) method is proposed to remove redundant features utilising the characteristics of VPRS for data analysis and processing. The reduced data is fuzzified to represent the feature data in a more suitable foml for input to an improved BP neural network classifier. The improved BP neural network classifier is improved in three aspects: additional momentum, self-adaptive learning rates and dynamic error segmenting. Finally, to further consummate the classifier, a uniform design CUD) approach is introduced to optimise the key parameters because UD can generate a minimal set of uniform and representative design points scattered within the experiment domain. Optimal factor settings are achieved using a response surface (RSM) model and the nonlinear quadratic programming algorithm (NLPQL).
Experiments have shown that the hybrid method is capable of classifying the defects of wood veneers with a fast convergence speed and high classification accuracy, comparing with other methods such as a neural network with fuzzy input and a rough sets based neural network. The research has demonstrated a methodology for visual inspection of defects, especially for situations where there is a large amount of data and a fast running speed is required. It is expected that this method can be applied to automatic visual inspection for production lines of other products such as ceramic tiles and strip steel
Neuro-Fuzzy Based Intelligent Approaches to Nonlinear System Identification and Forecasting
Nearly three decades back nonlinear system identification consisted of several ad-hoc approaches, which were restricted to a very limited class of systems. However, with the advent of the various soft computing methodologies like neural networks and the fuzzy logic combined with optimization techniques, a wider class of systems can be handled at present. Complex systems may be of diverse characteristics and nature. These systems may be linear or nonlinear, continuous or discrete, time varying or time invariant, static or dynamic, short term or long term, central or distributed, predictable or unpredictable, ill or well defined. Neurofuzzy hybrid modelling approaches have been developed as an ideal technique for utilising linguistic values and numerical data. This Thesis is focused on the development of advanced neurofuzzy modelling architectures and their application to real case studies. Three potential requirements have been identified as desirable characteristics for such design: A model needs to have minimum number of rules; a model needs to be generic acting either as Multi-Input-Single-Output (MISO) or Multi-Input-Multi-Output (MIMO) identification model; a model needs to have a versatile nonlinear membership function.
Initially, a MIMO Adaptive Fuzzy Logic System (AFLS) model which incorporates a prototype defuzzification scheme, while utilising an efficient, compared to the Takagi–Sugeno–Kang (TSK) based systems, fuzzification layer has been developed for the detection of meat spoilage using Fourier transform infrared (FTIR) spectroscopy. The identification strategy involved not only the classification of beef fillet samples in their respective quality class (i.e. fresh, semi-fresh and spoiled), but also the simultaneous prediction of their associated microbiological population directly from FTIR spectra. In the case of AFLS, the number of memberships for each input variable was directly associated to the number of rules, hence, the “curse of dimensionality” problem was significantly reduced. Results confirmed the advantage of the proposed scheme against Adaptive Neurofuzzy Inference System (ANFIS), Multilayer Perceptron (MLP) and Partial Least Squares (PLS) techniques used in the same case study.
In the case of MISO systems, the TSK based structure, has been utilized in many neurofuzzy systems, like ANFIS. At the next stage of research, an Adaptive Fuzzy Inference Neural
Network (AFINN) has been developed for the monitoring the spoilage of minced beef utilising multispectral imaging information. This model, which follows the TSK structure,
incorporates a clustering pre-processing stage for the definition of fuzzy rules, while its final fuzzy rule base is determined by competitive learning. In this specific case study, AFINN model was also able to predict for the first time in the literature, the beef’s temperature directly from imaging information. Results again proved the superiority of the adopted model. By extending the line of research and adopting specific design concepts from the previous case studies, the Asymmetric Gaussian Fuzzy Inference Neural Network (AGFINN) architecture has been developed. This architecture has been designed based on the above design principles. A clustering preprocessing scheme has been applied to minimise the number of fuzzy rules. AGFINN incorporates features from the AFLS concept, by having the
same number of rules as well as fuzzy memberships. In spite of the extensive use of the standard symmetric Gaussian membership functions, AGFINN utilizes an asymmetric
function acting as input linguistic node. Since the asymmetric Gaussian membership function’s variability and flexibility are higher than the traditional one, it can partition the input space more effectively. AGFINN can be built either as an MISO or as an MIMO system. In the MISO case, a TSK defuzzification scheme has been implemented, while two different learning algorithms have been implemented. AGFINN has been tested on real datasets related to electricity price forecasting for the ISO New England Power Distribution System. Its performance was compared against a number of alternative models, including ANFIS, AFLS, MLP and Wavelet Neural Network (WNN), and proved to be superior. The concept of asymmetric functions proved to be a valid hypothesis and certainly it can find application to other architectures, such as in Fuzzy Wavelet Neural Network models, by designing a suitable flexible wavelet membership function. AGFINN’s MIMO characteristics also make the proposed architecture suitable for a larger range of applications/problems
- …