1,248 research outputs found
Classification Based Analysis on Cancer Datasets Using Predictor Measures
Cancer is a life-threatening disease. Probably the most effective way to reduce cancer deaths is to detect the disease earlier. Early diagnosis needs an accurate and reliable procedure that physicians can use to distinguish benign tumors from malignant ones without resorting to surgical biopsy. Data mining offers a solution for such problems, where a large quantity of information about patients and their conditions is stored in clinical databases. This paper focuses on the prediction of diseases such as leukemia and breast cancer. Naïve Bayes and SVM prediction models are built for prediction and classification. The proposed models produced significant results of above 96% when compared with other models in terms of accuracy, computational time and convergence. Keywords: Prediction, Data Mining, Diagnosis, Cancer, Naïve Bayes, Support Vector Machine (SVM). DOI: 10.7176/CEIS/10-6-05 Publication date: July 31st 201
Machine Learning Approach for Cancer Entities Association and Classification
According to the World Health Organization (WHO), cancer is the second
leading cause of death globally. Scientific research on different types of
cancers grows at an ever-increasing rate, publishing large volumes of research
articles every year. Insight into and knowledge of the drugs,
diagnostics, risks, symptoms, treatments, etc., related to genes are significant
factors that help explore and advance cancer research. Manual
screening of such a large volume of articles to formulate any hypothesis is
very laborious and time-consuming. The study uses two non-trivial
Natural Language Processing (NLP) functions, Named Entity Recognition and
text classification, to discover knowledge from biomedical literature. Named
Entity Recognition (NER) recognizes and extracts the predefined entities
related to cancer from unstructured text with the support of a user-friendly
interface and built-in dictionaries. Text classification helps to explore the
insights into the text and simplifies data categorization, querying, and
article screening. Machine learning classifiers are also used to build the
classification model, and Structured Query Language (SQL) is used to identify
the hidden relations that may lead to significant predictions.
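The dictionary-supported entity recognition mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not list the study's dictionaries or describe its implementation, so the dictionary contents, the labels, and the function names below are all hypothetical, and plain regular-expression matching stands in for whatever NER method the authors used.

```python
import re

# Hypothetical mini-dictionaries; the study's built-in dictionaries are not
# given in the abstract, so these entries are illustrative assumptions.
DICTIONARIES = {
    "GENE": ["BRCA1", "TP53"],
    "DRUG": ["tamoxifen", "cisplatin"],
    "SYMPTOM": ["fatigue"],
}


def build_pattern(terms):
    """Compile one case-insensitive, whole-word pattern for a term list."""
    # Longest terms first so longer entries win over their own prefixes.
    ordered = sorted(terms, key=len, reverse=True)
    body = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


PATTERNS = {label: build_pattern(terms) for label, terms in DICTIONARIES.items()}


def recognize_entities(text):
    """Return (start, end, label, matched_text) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), label, match.group(0)))
    return sorted(hits)


if __name__ == "__main__":
    sentence = "Tamoxifen is used in BRCA1 carriers who report fatigue."
    for hit in recognize_entities(sentence):
        print(hit)
```

The extracted tuples could then be loaded into a relational table and queried with SQL, which is one plausible reading of how the abstract's relation discovery step fits in.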
Automatic Image Detection of Halloysite Clay Nanotubes as a Future Ultrasound Theranostic Agent for Tumoral Cell Targeting and Treatment
Halloysite clay nanotubes (HNTs) are nanomaterials composed of double-layered aluminosilicate minerals with a hollow tubular structure in the submicron range. They have a wide range of applications in anticancer therapy as delivery agents. In this work we aim to investigate the automatic detection features of HNTs through advanced quantitative ultrasound imaging, employing different concentrations (3-5 mg/mL) at a conventional clinical frequency, i.e. 7 MHz. Different tissue-mimicking samples of HNT-containing agarose gel were imaged through a commercially available echographic system, suitably combined with an ultrasound signal analysis research platform for extracting the raw ultrasound radiofrequency (RF) signals. Acquired data were stored and analyzed by means of an in-house developed algorithm based on wavelet decomposition, in order to identify the specific spectrum contribution of the HNTs and generate the corresponding image mapping. Sensitivity and specificity of the HNT detection were quantified. Average specificity (94.36%) was very high with reduced dependency on HNT concentration, while sensitivity showed a proportional increase with concentration, with an average of 46.78%. However, automatic detection performance is currently under investigation for further improvement, taking into account image enhancement and biocompatibility issues.
Casciaro Sergio; Soloperto Giulia; Conversano Francesco; Casciaro Ernesto; Greco Antonio; Leporatti Stefano; Lay-Ekuakille Aime; Gigli Giuseppe
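For readers unfamiliar with the two detection metrics reported in this abstract, the following minimal sketch shows how sensitivity and specificity are computed from confusion counts. The function names and the example counts are hypothetical and are not data from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that were detected: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive samples to evaluate")
    return true_pos / total


def specificity(true_neg, false_pos):
    """Fraction of actual negatives that were correctly rejected: TN / (TN + FP)."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative samples to evaluate")
    return true_neg / total


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the paper).
    print(f"sensitivity = {sensitivity(45, 55):.2%}")
    print(f"specificity = {specificity(90, 10):.2%}")
```

A detector that rarely fires tends to score high on specificity and low on sensitivity, which is the pattern the abstract reports.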
Big data analytics for preventive medicine
© 2019, Springer-Verlag London Ltd., part of Springer Nature. Medical data is one of the most rewarding and yet most complicated kinds of data to analyze. How can healthcare providers use modern data analytics tools and technologies to analyze and create value from complex data? Data analytics promises to efficiently discover valuable patterns by analyzing large amounts of unstructured, heterogeneous, non-standard and incomplete healthcare data. It not only forecasts but also helps in decision making, and is increasingly seen as a breakthrough in ongoing advancement, with the goal of improving the quality of patient care and reducing healthcare costs. The aim of this study is to provide a comprehensive and structured overview of extensive research on the advancement of data analytics methods for disease prevention. This review first introduces disease prevention and its challenges, followed by traditional prevention methodologies. We summarize state-of-the-art data analytics algorithms used for classification of disease, clustering (unusually high incidence of a particular disease), anomaly detection (detection of disease) and association, as well as their respective advantages, drawbacks and guidelines for selecting a specific model, followed by a discussion of recent developments and successful applications of disease prevention methods. The article concludes with open research challenges and recommendations
UNDERSTANDING CONDITIONAL MODES OF ACTIONS IN CHEMICAL-INDUCED TOXICITY USING RULE MODELS
It is estimated that 115 million animals are used in experimental testing each year. Hence,
shifting efforts toward alternative methods for toxicity assessment is essential. However, slow regulatory acceptance of new approaches is governed by knowledge gaps in toxicity modes of action. In this thesis, I describe these challenges and the use of in vitro screening as an alternative to animal testing. I also discuss common data-based methods to derive hypotheses about toxicity modes of action, and the associated limitations in capturing multiple biological perturbations.
I applied novel data-based workflows, using rule models, to prioritize in vitro assays predictive of toxicity as well as to detect significant polypharmacology profiles. I explain how constraints were applied to rule-based models to inform meaningful mechanistic interpretation for two toxicity endpoints: rat hepatotoxicity and acute toxicity. I compared assays selected, by rules, for predicting hepatotoxicity with endpoints used in in
vitro models from commercial sources. An overlap was observed including cytochrome
activity, mitochondrial toxicity and immunological responses. However, nuclear receptor
activity, identified in rules, is not currently covered in commercial setups. I also demonstrate that endocrine disruption endpoints extrapolate better into in vivo toxicity when a set of specific conditions are met, such as physicochemical properties associated with good bioavailability.
Next, I examined synergistic interactions between conditions in rules describing acute toxicity. I gained novel insights into how specific stressors potentiate the perturbation by known key events, such as acetylcholinesterase inhibition and neuro-signalling disruption. I show that examining polypharmacology profiles is particularly important at low bioactive potencies.
Further, the overall predictive performance of rules describing acute toxicity was tested against a benchmark Random Forest model in a conformal prediction framework. Irrespective of the data type used in training, the models were prone to bias related to compound promiscuity: highly promiscuous compounds were more likely to be predicted as toxic.
Overall, the studies conducted in this thesis provide novel insights into molecular mechanisms of toxicity, namely hepatotoxicity and acute toxicity, with regard to chemical properties and polypharmacology. This knowledge can be used to improve the utility and design of alternative methods for toxicity assessment and hence accelerate regulatory acceptance.
Islamic Development Bank
Cambridge Trust Fun
Granular Support Vector Machines Based on Granular Computing, Soft Computing and Statistical Learning
With the emergence of biomedical informatics, Web intelligence, and E-business, new challenges are arising for knowledge discovery and data mining modeling problems. In this dissertation work, a framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with a specific focus on binary classification problems. In general, GSVM works in 3 steps. Step 1 is granulation, which builds a sequence of information granules from the original dataset or from the original feature space. Step 2 is modeling Support Vector Machines (SVM) in some of these information granules when necessary. Finally, step 3 is aggregation, which consolidates the information in these granules at a suitable level of abstraction. A good granulation method that finds suitable granules is crucial for modeling a good GSVM. Under this framework, many different granulation algorithms, including the GSVM-CMW (cumulative margin width) algorithm, the GSVM-AR (association rule mining) algorithm, a family of GSVM-RFE (recursive feature elimination) algorithms, the GSVM-DC (data cleaning) algorithm and the GSVM-RU (repetitive undersampling) algorithm, are designed for binary classification problems with different characteristics. Empirical studies in the biomedical domain and many other application domains demonstrate that the framework is promising. As a preliminary step, this dissertation work will be extended in the future to build a Granular Computing based Predictive Data Modeling framework (GrC-PDM) with which we can create hybrid adaptive intelligent data mining systems for high quality prediction
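The three GSVM steps described in this abstract (granulation, per-granule SVM modeling, aggregation) can be sketched roughly as follows. This is a toy illustration under strong assumptions, not any of the dissertation's algorithms: granulation is a simple median split on one feature, the SVM is a small linear model trained by subgradient descent on the hinge loss, and aggregation routes each query to the model of its granule. All function names and parameter values are hypothetical.

```python
from statistics import median


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM fitted by subgradient descent on the L2-regularised hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # shrinkage from the L2 penalty
            if margin < 1.0:  # inside the margin or misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def fit_gsvm(X, y, feature=0):
    # Step 1 (granulation): assumed here to be a median split on one feature.
    threshold = median(x[feature] for x in X)
    granules = {True: ([], []), False: ([], [])}
    for xi, yi in zip(X, y):
        gx, gy = granules[xi[feature] <= threshold]
        gx.append(xi)
        gy.append(yi)
    # Step 2 (modeling): train an SVM only where a granule mixes both classes.
    default = max(set(y), key=y.count)  # fallback label for an empty granule
    models = {}
    for key, (gx, gy) in granules.items():
        if len(set(gy)) <= 1:
            models[key] = ("constant", gy[0] if gy else default)
        else:
            models[key] = ("svm", train_linear_svm(gx, gy))
    return {"feature": feature, "threshold": threshold, "models": models}


def predict_gsvm(model, x):
    # Step 3 (aggregation): route the query to its granule's model.
    kind, params = model["models"][x[model["feature"]] <= model["threshold"]]
    if kind == "constant":
        return params
    w, b = params
    return 1 if dot(w, x) + b >= 0 else -1


if __name__ == "__main__":
    # XOR-like toy data: no single linear boundary separates the two classes.
    X = [(-1, -1), (-2, -2), (-1, 1), (-2, 2), (1, -1), (2, -2), (1, 1), (2, 2)]
    y = [-1, -1, 1, 1, 1, 1, -1, -1]
    model = fit_gsvm(X, y)
    print([predict_gsvm(model, x) for x in X])
```

The XOR-like data is chosen because a single linear SVM cannot separate it, while each granule on its own is linearly separable, which is the kind of situation where splitting before modeling helps.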
DES-mutation : system for exploring links of mutations and diseases
During cellular division DNA replicates, and this process is the basis for passing genetic information to the next generation. However, the DNA copy process sometimes produces a copy that is not perfect, that is, one with mutations. The collection of all such mutations in the DNA copy of an organism makes it unique and determines the organism's phenotype. However, mutations are often the cause of diseases. Thus, it is useful to have the capability to explore links between mutations and disease. We approached this problem by analyzing a vast amount of published information linking mutations to disease states. Based on such information, we developed the DES-Mutation knowledgebase, which allows for exploration of not only mutation-disease links, but also links between mutations and concepts from 27 topic-specific dictionaries such as human genes/proteins, toxins, pathogens, etc. This allows for a more detailed insight into mutation-disease links and context. On a sample of 600 predicted and curated mutation-disease associations, our system achieves a precision of 72.83%. To demonstrate the utility of DES-Mutation, we provide case studies related to known or potentially novel information involving disease mutations. To our knowledge, this is the first mutation-disease knowledgebase dedicated to the exploration of this topic through text-mining and data-mining of different mutation types and their associations with terms from multiple thematic dictionaries
- …