Machine Learning for Survival Prediction in Breast Cancer

Author: Vanneschi Leonardo
Publication venue: Instituto Superior de Estatística e Gestão de Informação da Universidade Nova de Lisboa. NOVA Information Management School (NOVA IMS)
Publication date: 01/01/2021

In the last few years, machine learning has proved to be an important instrument for supporting decision making in oncology. This manuscript presents an application of several machine learning algorithms to the prediction of the survival rate of breast cancer patients. Before presenting the results, the manuscript gives a rather basic introduction to the foundations of machine learning, which may be useful for medical doctors who are not experts in the area. The experiments were carried out using the well-known 70-gene signature dataset for breast cancer. The presented results highlight that genetic programming has interesting advantages compared to other machine learning algorithms, both in terms of prediction accuracy and in terms of model interpretability.

Repositório da Universidade Nova de Lisboa

Genetic programming for mining DNA chip data from cancer patients

Author: Buxton BF
Langdon WB
Publication date: 01/01/2004

In machine learning terms, DNA (gene) chip data is unusual in having thousands of attributes (the gene expression values) but few (<100) records (the patients). A GP-based method for both feature selection and generating simple models based on a few genes is demonstrated on cancer data.

CiteSeerX

UCL Discovery

Recommended from our members

Disease modelling using evolved discriminate function

Author: Kalganova T
Werner J C
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2003

Early diagnosis increases survival time and patient quality of life. Diagnosis is a binary classification problem that has been studied exhaustively in the literature. This paper innovates by proposing the application of genetic programming to obtain a discriminate function. This function captures the disease dynamics and is used to classify patients with as few false-negative diagnoses as possible. If its value is greater than zero, the patient is classified as ill; otherwise, as healthy. A graphical representation is proposed to show the influence of each dataset attribute on the discriminate function. The experiments deal with the diagnosis of Breast Cancer and of Thrombosis & Collagen diseases. The main conclusion is that the discriminate function is able to classify patients using numerical clinical data, and that the graphical representation displays patterns that allow the model to be understood.
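The decision rule described in this abstract (ill when the function value is greater than zero, healthy otherwise) can be sketched as follows. This is only an illustration under stated assumptions: the expression in `example_discriminant`, its coefficients, and the sample records are hypothetical and are not taken from the paper, where the function is evolved by genetic programming.

```python
from typing import Callable, Sequence

Patient = Sequence[float]  # numerical clinical attributes for one patient


def classify(discriminant: Callable[[Patient], float], patient: Patient) -> str:
    """Apply the zero-threshold rule: value > 0 means ill, otherwise healthy."""
    return "ill" if discriminant(patient) > 0 else "healthy"


def false_negatives(
    discriminant: Callable[[Patient], float],
    patients: Sequence[Patient],
    labels: Sequence[str],
) -> int:
    """Count ill patients that the rule classifies as healthy."""
    return sum(
        1
        for patient, label in zip(patients, labels)
        if label == "ill" and classify(discriminant, patient) == "healthy"
    )


def example_discriminant(x: Patient) -> float:
    # Hypothetical stand-in for an evolved expression; not from the paper.
    return 0.8 * x[0] - 0.5 * x[1] - 1.0


# Made-up records: two attributes per patient, with known diagnoses.
patients = [(3.0, 1.0), (1.0, 2.0), (1.5, 1.0)]
labels = ["ill", "healthy", "ill"]

predictions = [classify(example_discriminant, p) for p in patients]
missed = false_negatives(example_discriminant, patients, labels)
```

A search procedure such as genetic programming would compare candidate functions by a score like `missed`, preferring candidates that leave fewer ill patients undetected.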

Brunel University Research Archive

Disease modeling using Evolved Discriminate Function

Author: Kalganova T
Werner JC
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2003

CiteSeerX

Brunel University Research Archive

A System for Accessible Artificial Intelligence

Author: D. A. Ferrucci
EM Ronald
F Pedregosa
Ignacio Arnaldo
Jason H. Moore
JH Moore
Karthik Kannappan
Randal S. Olson
RS Olson
Sara Silva
William La Cava
Publication date: 10/08/2017

While artificial intelligence (AI) has become widespread, many commercial AI systems are not yet accessible to individual researchers or the general public due to the deep knowledge of the systems required to use them. We believe that AI has matured to the point where it should be an accessible technology for everyone. We present an ongoing project whose ultimate goal is to deliver an open source, user-friendly AI system that is specialized for machine learning analysis of complex data in the biomedical and health care domains. We discuss how genetic programming can aid in this endeavor, and highlight specific examples where genetic programming has automated machine learning analyses in previous projects.

Comment: 14 pages, 5 figures, submitted to the Genetic Programming Theory and Practice 2017 workshop

arXiv.org e-Print Archive

Crossref

Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

Author: Boulesteix Anne-Laure
Janitza Silke
Kruppa Jochen
König Inke R.
Publication date: 25/07/2012

The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables, and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Special attention is given to practical aspects such as the selection of parameters, available RF implementations, and important pitfalls and biases of RF and its variable importance measures (VIMs). The paper surveys recent developments of the methodology relevant to bioinformatics as well as some representative examples of RF applications in this context and possible directions for future research.

Crossref

Open Access LMU

A practical, bioinformatic workflow system for large data sets generated by next-generation sequencing

Author: Aaron R. Jex
Altschul
Anja Joachim
Ashburner
Bentley
Bethony
Björnberg
Blaxter
Boag
Bronwyn E. Campbell
Caffrey
Campbell
Cantacessi
Chan
Chang
Cinzia Cantacessi
Clifton
Conesa
Cottee
Datu
DeRisi
Doyle
Flicek
Freigofas
Gasser
Golden
Greene
Gupta
Hawdon
Hopkins
Hotez
Hu
Huang
Hunter
Iseli
Jackson
Joachim
Keil
Krasky
Letunic
Li
Lipinski
Makedonka Mitreva
Margulies
Matthew J. Nolan
McKay
Metzker
Miller
Mizuarai
Moreno
Morozova
Moser
Mufson
Mulvenna
Nagaraj
Neil D. Young
Nikolaou
Nisbet
Olson
Parkinson
Paul W. Sternberg
Pong
Portman
Ranganathan
Ren
Robertson
Robin B. Gasser
Robinson
Ross S. Hall
Sahar Abubucker
Sanger
Santos
Shoba Ranganathan
Soderlund
Stathopoulos
Stockdale
Tanaka
Vibranovski
Wang
Williamson
Wilson
Wu
Young
Zhan
Zhong
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2010

Transcriptomics (at the level of single cells, tissues and/or whole organisms) underpins many fields of biomedical science, from understanding basic cellular function in model organisms, to the elucidation of the biological events that govern the development and progression of human diseases, and the exploration of the mechanisms of survival, drug resistance and virulence of pathogens. Next-generation sequencing (NGS) technologies are contributing to a massive expansion of transcriptomics in all fields and are reducing the cost, time and performance barriers presented by conventional approaches. However, bioinformatic tools for the analysis of the sequence data sets produced by these technologies can be daunting to researchers with limited or no expertise in bioinformatics. Here, we constructed a semi-automated bioinformatic workflow system and critically evaluated it for the analysis and annotation of large-scale sequence data sets generated by NGS. We demonstrated its utility for the exploration of differences in the transcriptomes among various stages and both sexes of an economically important parasitic worm (Oesophagostomum dentatum), as well as for the prediction and prioritization of essential molecules (including GTPases, protein kinases and phosphatases) as novel drug target candidates. This workflow system provides a practical tool for the assembly, annotation and analysis of NGS data sets, including for researchers with limited bioinformatic expertise. The custom-written Perl, Python and Unix shell scripts used can be readily modified or adapted to suit many different applications. This system is now used routinely for the analysis of data sets from pathogens of major socio-economic importance and can, in principle, be applied to transcriptomic data sets from any organism.

CiteSeerX

ResearchOnline@JCU

Crossref

ResearchOnline at James Cook University

PubMed Central

Digital Commons@Becker

Caltech Authors

UGD Academic Repository

Macquarie University ResearchOnline

University of Melbourne Institutional Repository