1,722 research outputs found
Wavelet feature extraction and genetic algorithm for biomarker detection in colorectal cancer data
Biomarkers which predict patient’s survival can play an important role in medical diagnosis and
treatment. How to select the significant biomarkers from hundreds of protein markers is a key step in
survival analysis. In this paper a novel method is proposed to detect the prognostic biomarkers ofsurvival in colorectal cancer patients using wavelet analysis, genetic algorithm, and Bayes classifier. One dimensional discrete wavelet transform (DWT) is normally used to reduce the dimensionality of biomedical data. In this study one dimensional continuous wavelet transform (CWT) was proposed to extract the features of colorectal cancer data. One dimensional CWT has no ability to reduce
dimensionality of data, but captures the missing features of DWT, and is complementary part of DWT. Genetic algorithm was performed on extracted wavelet coefficients to select the optimized features, using Bayes classifier to build its fitness function. The corresponding protein markers were
located based on the position of optimized features. Kaplan-Meier curve and Cox regression model 2 were used to evaluate the performance of selected biomarkers. Experiments were conducted on colorectal cancer dataset and several significant biomarkers were detected. A new protein biomarker CD46 was found to significantly associate with survival time
Automated design of robust discriminant analysis classifier for foot pressure lesions using kinematic data
In the recent years, the use of motion tracking systems for acquisition of functional biomechanical gait data, has received increasing interest due to the richness and accuracy of the measured kinematic information. However, costs frequently restrict the number of subjects employed, and this makes the dimensionality of the collected data far higher than the available samples. This paper applies discriminant analysis algorithms to the classification of patients with different types of foot lesions, in order to establish an association between foot motion and lesion formation. With primary attention to small sample size situations, we compare different types of Bayesian classifiers and evaluate their performance with various dimensionality reduction techniques for feature extraction, as well as search methods for selection of raw kinematic variables. Finally, we propose a novel integrated method which fine-tunes the classifier parameters and selects the most relevant kinematic variables simultaneously. Performance comparisons are using robust resampling techniques such as Bootstrapand k-fold cross-validation. Results from experimentations with lesion subjects suffering from pathological plantar hyperkeratosis, show that the proposed method can lead tocorrect classification rates with less than 10% of the original features
A Survey on Data Mining Techniques Applied to Energy Time Series Forecasting
Data mining has become an essential tool during the last decade to analyze large sets of data. The variety of techniques it includes and the successful results obtained in many application fields, make this family of approaches powerful and widely used. In particular, this work explores the application of these techniques to time series forecasting. Although classical statistical-based methods provides reasonably good results, the result of the application of data mining outperforms those of classical ones. Hence, this work faces two main challenges: (i) to provide a compact mathematical formulation of the mainly used techniques; (ii) to review the latest works of time series forecasting and, as case study, those related to electricity price and demand markets.Ministerio de Economía y Competitividad TIN2014-55894-C2-RJunta de Andalucía P12- TIC-1728Universidad Pablo de Olavide APPB81309
The 1995 Goddard Conference on Space Applications of Artificial Intelligence and Emerging Information Technologies
This publication comprises the papers presented at the 1995 Goddard Conference on Space Applications of Artificial Intelligence and Emerging Information Technologies held at the NASA/Goddard Space Flight Center, Greenbelt, Maryland, on May 9-11, 1995. The purpose of this annual conference is to provide a forum in which current research and development directed at space applications of artificial intelligence can be presented and discussed
Intelligent maintenance management in a reconfigurable manufacturing environment using multi-agent systems
Thesis (M. Tech.) -- Central University of Technology, Free State, 2010Traditional corrective maintenance is both costly and ineffective. In some situations it is more cost effective to replace a device than to maintain it; however it is far more likely that the cost of the device far outweighs the cost of performing routine maintenance. These device related costs coupled with the profit loss due to reduced production levels, makes this reactive maintenance approach unacceptably inefficient in many situations. Blind predictive maintenance without considering the actual physical state of the hardware is an improvement, but is still far from ideal. Simply maintaining devices on a schedule without taking into account the operational hours and workload can be a costly mistake.
The inefficiencies associated with these approaches have contributed to the development of proactive maintenance strategies. These approaches take the device health state into account. For this reason, proactive maintenance strategies are inherently more efficient compared to the aforementioned traditional approaches. Predicting the health degradation of devices allows for easier anticipation of the required maintenance resources and costs. Maintenance can also be scheduled to accommodate production needs.
This work represents the design and simulation of an intelligent maintenance management system that incorporates device health prognosis with maintenance schedule generation. The simulation scenario provided prognostic data to be used to schedule devices for maintenance. A production rule engine was provided with a feasible starting schedule. This schedule was then improved and the process was determined by adhering to a set of criteria. Benchmarks were conducted to show the benefit of optimising the starting schedule and the results were presented as proof.
Improving on existing maintenance approaches will result in several benefits for an organisation. Eliminating the need to address unexpected failures or perform maintenance prematurely will ensure that the relevant resources are available when they are required. This will in turn reduce the expenditure related to wasted maintenance resources without compromising the health of devices or systems in the organisation
FlashProfile: A Framework for Synthesizing Data Profiles
We address the problem of learning a syntactic profile for a collection of
strings, i.e. a set of regex-like patterns that succinctly describe the
syntactic variations in the strings. Real-world datasets, typically curated
from multiple sources, often contain data in various syntactic formats. Thus,
any data processing task is preceded by the critical step of data format
identification. However, manual inspection of data to identify the different
formats is infeasible in standard big-data scenarios.
Prior techniques are restricted to a small set of pre-defined patterns (e.g.
digits, letters, words, etc.), and provide no control over granularity of
profiles. We define syntactic profiling as a problem of clustering strings
based on syntactic similarity, followed by identifying patterns that succinctly
describe each cluster. We present a technique for synthesizing such profiles
over a given language of patterns, that also allows for interactive refinement
by requesting a desired number of clusters.
Using a state-of-the-art inductive synthesis framework, PROSE, we have
implemented our technique as FlashProfile. Across tasks over large
real datasets, we observe a median profiling time of only s.
Furthermore, we show that access to syntactic profiles may allow for more
accurate synthesis of programs, i.e. using fewer examples, in
programming-by-example (PBE) workflows such as FlashFill.Comment: 28 pages, SPLASH (OOPSLA) 201
- …