Data Science and Ebola
Data Science: Today, everybody and everything produces data. People produce
large amounts of data in social networks and in commercial transactions.
Medical, corporate, and government databases continue to grow. Sensors continue
to get cheaper and are increasingly connected, creating an Internet of Things,
and generating even more data. In every discipline, large, diverse, and rich
data sets are emerging, from astrophysics, to the life sciences, to the
behavioral sciences, to finance and commerce, to the humanities and to the
arts. In every discipline people want to organize, analyze, optimize and
understand their data to answer questions and to deepen insights. The science
that is transforming this ocean of data into a sea of knowledge is called data
science. This lecture discusses how data science changed the response to one
of the most visible public health challenges, the 2014 Ebola outbreak in
West Africa. Comment: Inaugural lecture Leiden Universit
A Fuzzy Association Rule Mining Expert-Driven (FARME-D) approach to Knowledge Acquisition
The Fuzzy Association Rule Mining Expert-Driven (FARME-D) approach to knowledge acquisition is proposed in this paper as a viable solution to two challenges in building a fuzzy rule-based expert system: an unwieldy rule base and the sharp boundary problem. The fuzzy models are based on domain experts' opinions about the data description. The proposed approach aims to model a compact fuzzy rule-based expert system and to provide a platform for instantly updating the knowledge base when new knowledge is discovered. The paper discusses the strategies and underlying assumptions of the new approach, the structure of FARME-D, its practical application in the medical domain, and the modalities for validating the approach.
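The sharp boundary problem mentioned in this abstract can be illustrated with a minimal sketch. The threshold and ramp limits below are hypothetical values chosen for illustration only; they are not taken from the FARME-D paper, and the function names are likewise assumptions.

```python
def crisp_high(value, threshold=140.0):
    """Crisp set: full membership at or above the threshold, none below it."""
    return 1.0 if value >= threshold else 0.0


def fuzzy_high(value, low=120.0, high=160.0):
    """Fuzzy set with a linear ramp: 0 at or below `low`, 1 at or above `high`."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


# Two nearly identical (hypothetical) measurements on either side of the threshold.
for reading in (139.9, 140.1):
    print(reading, crisp_high(reading), round(fuzzy_high(reading), 3))
```

With the crisp set, the two readings fall into opposite classes; with the fuzzy set, their membership degrees differ only slightly, which is the behaviour a fuzzy rule base is meant to capture.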
Decision Making in the Medical Domain: Comparing the Effectiveness of GP-Generated Fuzzy Intelligent Structures
In this work, we examine the effectiveness of two intelligent models in medical domains. Namely, we apply grammar-guided genetic programming to produce fuzzy intelligent structures, such as fuzzy rule-based systems and fuzzy Petri nets, in medical data mining tasks. First, we use two context-free grammars to describe fuzzy rule-based systems and fuzzy Petri nets with genetic programming. Then, we apply cellular encoding in order to express fuzzy Petri nets of arbitrary size and topology. The models are examined thoroughly on four real-world medical data sets. Results are presented in detail, and the competitive advantages and drawbacks of the selected methodologies are discussed with respect to the nature of each application domain. Conclusions are drawn on the effectiveness and efficiency of the presented approach.
Modeling Stroke Diagnosis with the Use of Intelligent Techniques
The purpose of this work is to test the efficiency of specific intelligent classification algorithms in the domain of stroke diagnosis. The dataset consists of patient records from the "Acute Stroke Unit", Alexandra Hospital, Athens, Greece, describing patients suffering from one of 5 different stroke types, diagnosed using 127 diagnostic attributes/symptoms collected during the first hours of the emergency stroke situation as well as during the hospitalization and recovery phase. Prior to the application of the intelligent classifier, the dimensionality of the dataset is further reduced using a variety of classic and state-of-the-art dimensionality reduction techniques so as to capture the intrinsic dimensionality of the data. The results obtained indicate that the proposed methodology achieves prediction accuracy levels comparable to those obtained by intelligent classifiers trained on the original feature space.
Prediction of progression in idiopathic pulmonary fibrosis using CT scans at baseline: A quantum particle swarm optimization - Random forest approach
Idiopathic pulmonary fibrosis (IPF) is a fatal lung disease characterized by an unpredictable progressive decline in lung function. Natural history of IPF is unknown and the prediction of disease progression at the time of diagnosis is notoriously difficult. High resolution computed tomography (HRCT) has been used for the diagnosis of IPF, but not generally for monitoring purpose. The objective of this work is to develop a novel predictive model for the radiological progression pattern at voxel-wise level using only baseline HRCT scans. Mainly, there are two challenges: (a) obtaining a data set of features for region of interest (ROI) on baseline HRCT scans and their follow-up status; and (b) simultaneously selecting important features from high-dimensional space, and optimizing the prediction performance. We resolved the first challenge by implementing a study design and having an expert radiologist contour ROIs at baseline scans, depending on its progression status in follow-up visits. For the second challenge, we integrated the feature selection with prediction by developing an algorithm using a wrapper method that combines quantum particle swarm optimization to select a small number of features with random forest to classify early patterns of progression. We applied our proposed algorithm to analyze anonymized HRCT images from 50 IPF subjects from a multi-center clinical trial. We showed that it yields a parsimonious model with 81.8% sensitivity, 82.2% specificity and an overall accuracy rate of 82.1% at the ROI level. These results are superior to other popular feature selections and classification methods, in that our method produces higher accuracy in prediction of progression and more balanced sensitivity and specificity with a smaller number of selected features. Our work is the first approach to show that it is possible to use only baseline HRCT scans to predict progressive ROIs at 6 months to 1 year follow-ups using artificial intelligence.
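For readers unfamiliar with the reported metrics, the sketch below shows how sensitivity, specificity, and accuracy are conventionally computed from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: progressive ROIs classified correctly/incorrectly.
    tn/fp: non-progressive ROIs classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)          # share of true positives detected
    specificity = tn / (tn + fp)          # share of true negatives detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, not taken from the paper.
sens, spec, acc = classification_metrics(tp=8, fn=2, tn=9, fp=1)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Sensitivity and specificity that are close to each other, as reported in the abstract, indicate that the classifier does not favour one class over the other.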
Association Rules Mining Based Clinical Observations
Healthcare institutes keep enriching their repositories of patients'
disease-related information, which would be more useful if relational analysis
were carried out on it. Data mining algorithms have proven useful for finding
correlations in large data repositories. In this paper we implement a novel
idea, based on association rule mining, for finding co-occurrences of diseases
carried by a patient using the healthcare repository. We have developed a
system prototype for Clinical State Correlation Prediction (CSCP), which
extracts data from the patients' healthcare database and transforms the OLTP
data into a data warehouse by generating association rules. The CSCP system
helps reveal relations among diseases. It predicts the correlation(s) between
the primary disease (the disease for which the patient visits the doctor) and
secondary disease(s) (other associated diseases carried by the same patient).
Comment: 5 pages, MEDINFO 2010, C. Safran et al. (Eds.), IOS Pres
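As a rough illustration of the idea, the sketch below computes the two standard association-rule measures, support and confidence, for a rule of the form "primary disease -> secondary disease". The patient records, disease names, and the `rule_stats` helper are hypothetical; the abstract does not describe the CSCP implementation itself.

```python
def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    `records` is a list of sets, one set of diagnosed diseases per patient.
    """
    if not records:
        return 0.0, 0.0
    with_antecedent = [r for r in records if antecedent in r]
    with_both = [r for r in with_antecedent if consequent in r]
    support = len(with_both) / len(records)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical patient records for illustration only.
records = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension", "asthma"},
    {"diabetes"},
    {"asthma"},
]
support, confidence = rule_stats(records, "diabetes", "hypertension")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A rule is usually kept only when both measures exceed user-chosen minimum thresholds, which limits the output to co-occurrences that are both frequent and reliable.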
A systematic review of data quality issues in knowledge discovery tasks
The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.