18 research outputs found

    Data mining in soft computing framework: a survey

    The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.

    Genetic Algorithm Optimization for Determining Fuzzy Measures from Fuzzy Data

    Fuzzy measures and fuzzy integrals have been successfully used in many real applications. Determining fuzzy measures is a very difficult problem in these applications. Although several methodologies exist for solving this problem, such as genetic algorithms, gradient descent algorithms, neural networks, and particle swarm algorithms, it is hard to say which one is more appropriate and more feasible. Each method has its advantages. Most existing work can only deal with data consisting of classic numbers, which may give rise to limitations in practical applications. It is not reasonable to assume that all data are real data before we elicit them from practical data. Sometimes fuzzy data may exist, such as in pharmacological, financial and sociological applications. Thus, we make an attempt to determine a more generalized type of general fuzzy measures from fuzzy data by means of genetic algorithms and Choquet integrals. In this paper, we make the first effort to define the σ-λ rules. Furthermore, we define and characterize the Choquet integrals of interval-valued functions and fuzzy-number-valued functions based on σ-λ rules. In addition, we design a special genetic algorithm to determine a type of general fuzzy measures from fuzzy data.

    Medical data mining using Bayesian network and DNA sequence analysis.

    Lee Kit Ying. Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. Includes bibliographical references (leaves 115-117). Abstracts in English and Chinese.

    Abstract
    Acknowledgement
    Chapter 1: Introduction
        1.1 Project Background
        1.2 Problem Specifications
        1.3 Contributions
        1.4 Thesis Organization
    Chapter 2: Background
        2.1 Medical Data Mining
            2.1.1 General Information
            2.1.2 Related Research
            2.1.3 Characteristics and Difficulties Encountered
        2.2 DNA Sequence Analysis
        2.3 Hepatitis B Virus
            2.3.1 Virus Characteristics
            2.3.2 Important Findings on the Virus
        2.4 Bayesian Network and its Classifiers
            2.4.1 Formal Definition
            2.4.2 Existing Learning Algorithms
            2.4.3 Evolutionary Algorithms and Hybrid EP (HEP)
            2.4.4 Bayesian Network Classifiers
            2.4.5 Learning Algorithms for BN Classifiers
    Chapter 3: Bayesian Network Classifier for Clinical Data
        3.1 Related Work
        3.2 Proposed BN-augmented Naive Bayes Classifier (BAN)
            3.2.1 Definition
            3.2.2 Learning Algorithm with HEP
            3.2.3 Modifications on HEP
        3.3 Proposed General Bayesian Network with Markov Blanket (GBN)
            3.3.1 Definition
            3.3.2 Learning Algorithm with HEP
        3.4 Findings on Bayesian Network Parameters Calculation
            3.4.1 Situation and Errors
            3.4.2 Proposed Solution
        3.5 Performance Analysis on Proposed BN Classifier Learning Algorithms
            3.5.1 Experimental Methodology
            3.5.2 Benchmark Data
            3.5.3 Clinical Data
            3.5.4 Discussion
        3.6 Summary
    Chapter 4: Classification in DNA Analysis
        4.1 Related Work
        4.2 Problem Definition
        4.3 Proposed Methodology Architecture
            4.3.1 Overall Design
            4.3.2 Important Components
        4.4 Clustering
        4.5 Feature Selection Algorithms
            4.5.1 Information Gain
            4.5.2 Other Approaches
        4.6 Classification Algorithms
            4.6.1 Naive Bayes Classifier
            4.6.2 Decision Tree
            4.6.3 Neural Networks
            4.6.4 Other Approaches
        4.7 Important Points on Evaluation
            4.7.1 Errors
            4.7.2 Independent Test
        4.8 Performance Analysis on Classification of DNA Data
            4.8.1 Experimental Methodology
            4.8.2 Using Naive-Bayes Classifier
            4.8.3 Using Decision Tree
            4.8.4 Using Neural Network
            4.8.5 Discussion
        4.9 Summary
    Chapter 5: Adaptive HEP for Learning Bayesian Network Structure
        5.1 Background
            5.1.1 Objective
            5.1.2 Related Work - AEGA
        5.2 Feasibility Study
        5.3 Proposed A-HEP Algorithm
            5.3.1 Structural Dissimilarity Comparison
            5.3.2 Dynamic Population Size
        5.4 Evaluation on Proposed Algorithm
            5.4.1 Experimental Methodology
            5.4.2 Comparison on Running Time
            5.4.3 Comparison on Fitness of Final Network
            5.4.4 Comparison on Similarity to the Original Network
            5.4.5 Parameter Study
        5.5 Applications on Medical Domain
            5.5.1 Discussion
            5.5.2 An Example
        5.6 Summary
    Chapter 6: Conclusion
        6.1 Summary
        6.2 Future Work
    Bibliography

    Machine Learning towards General Medical Image Segmentation

    The quality of patient care associated with diagnostic radiology is proportionate to a physician's workload. Segmentation is a fundamental limiting precursor to diagnostic and therapeutic procedures. Advances in machine learning aim to increase diagnostic efficiency by replacing single applications with generalized algorithms. We approached segmentation as a multitask shape regression problem, simultaneously predicting coordinates on an object's contour while jointly capturing global shape information. Shape regression models inherent point correlations to recover ambiguous boundaries not supported by clear edges and region homogeneity. Its capabilities were investigated using multi-output support vector regression (MSVR) on head and neck (HaN) CT images. Subsequently, we incorporated multiplane and multimodality spinal images and presented the first deep learning multiapplication framework for shape regression, the holistic multitask regression network (HMR-Net). The performance of MSVR and HMR-Net was comparable or superior to state-of-the-art algorithms. Multiapplication frameworks bridge technical knowledge gaps and increase workflow efficiency.

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units located in Portugal using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, and so to investigate the main causes of inefficiency. Several suggestions for improving efficiency are offered for each hotel studied.

    Studies on Water Management Issues

    This book shares knowledge gained through water management related research. It describes a broad range of approaches and technologies that have been developed and used by researchers for managing water resource problems. This multidisciplinary book covers water management issues under the subtopics of surface water management, groundwater management, water quality management, and water resource planning management. The main objective of this book is to enable a better understanding of these perspectives relating to water management practices. This book is expected to be useful to researchers, policy-makers, and non-governmental organizations working on water related projects in countries worldwide.

    Statistical Analysis and Forecasting of Economic Structural Change

    In 1984, the University of Bonn (FRG) and IIASA created a joint research group to analyze the relationship between economic growth and structural change. The research team was to examine the commodity composition as well as the size and direction of commodity and credit flows among countries and regions. Krelle (1988) reports on the results of this "Bonn-IIASA" research project. At the same time, an informal IIASA Working Group was initiated to deal with problems of the statistical analysis of economic data in the context of structural change: What tools do we have to identify nonconstancy of model parameters? What types of models are particularly applicable to nonconstant structure? How is forecasting affected by the presence of nonconstant structure? What problems should be anticipated in applying these tools and models? Some 50 experts, mainly statisticians or econometricians from about 15 countries, came together in Lodz, Poland (May 1985); Berlin, GDR (June 1986); and Sulejov, Poland (September 1986) to present and discuss their findings. This volume contains a selected set of those conference contributions as well as several specially invited chapters. The introductory chapter, "What can statistics contribute to the analysis of economic structural change?", discusses not only the role of statistics in the detection and assimilation of structural changes, but also the relevance of respective methods in the evaluation of econometric models. Trends in the development of these methods are indicated, and the contributions to the present volume are put into a broader context of empirical economics to help to bridge the gap between economists and statisticians. The chapters in the first section are concerned with the detection of parameter nonconstancy. The procedures discussed range from classical methods, such as the CUSUM test, to new concepts, particularly those based on nonparametric statistics. Several chapters assess the conditions under which these methods can be applied and their robustness under such conditions. The second section addresses models that are in some sense generalizations of nonconstant-parameter models, so that they can assimilate structural changes. The last section deals with real-life structural change situations.