
14,603 research outputs found

Learning from the past with experiment databases

Author: C. Perlich
D. Brain
H. Blockeel
I.H. Witten
J. Vanschoren
M. Someren Van
R. Holte
Y. Peng
Publication venue: University of Waikato, Department of Computer Science
Publication date: 01/01/2008

Thousands of Machine Learning research papers contain experimental comparisons that have usually been conducted with a single focus of interest, and detailed results are usually lost after publication. Once past experiments are collected in experiment databases, they allow for additional and possibly much broader investigation. In this paper, we show how to use such a repository to answer various interesting research questions about learning algorithms and to verify a number of recent studies. Alongside performing elaborate comparisons and rankings of algorithms, we also investigate the effects of algorithm parameters and data properties, and study the learning curves and bias-variance profiles of algorithms to gain deeper insights into their behavior.

CiteSeerX

Crossref

Research Commons@Waikato

PRESISTANT: Learning based assistant for data pre-processing

Author: Abelló Alberto
Aluja-Banet Tomàs
Bilalli Besim
Wrembel Robert
Publication date: 02/03/2018

Data pre-processing is one of the most time-consuming and relevant steps in a data analysis process (e.g., a classification task). A given data pre-processing operator (e.g., a transformation) can have a positive, negative, or zero impact on the final result of the analysis. Expert users have the knowledge required to find the right pre-processing operators. Non-experts, however, are overwhelmed by the number of pre-processing operators, and it is challenging for them to find operators that would positively impact their analysis (e.g., increase the predictive accuracy of a classifier). Existing solutions either assume that users have expert knowledge, or they recommend pre-processing operators that are only "syntactically" applicable to a dataset, without taking into account their impact on the final analysis. In this work, we aim to assist non-expert users by recommending data pre-processing operators that are ranked according to their impact on the final analysis. We developed a tool, PRESISTANT, that uses Random Forests to learn the impact of pre-processing operators on the performance (e.g., predictive accuracy) of five classification algorithms: J48, Naive Bayes, PART, Logistic Regression, and Nearest Neighbor. Extensive evaluations of the recommendations provided by our tool show that PRESISTANT can effectively help non-experts achieve improved results in their analytical tasks.

arXiv.org e-Print Archive

UPCommons. Portal del coneixement obert de la UPC

Convolutional Sparse Kernel Network for Unsupervised Medical Image Analysis

Author: Ahn Euijoon
Feng Dagan
Fulham Michael
Kim Jinman
Kumar Ashnil
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

The availability of large-scale annotated image datasets and recent advances in supervised deep learning methods enable the end-to-end derivation of representative image features that can impact a variety of image analysis problems. Such supervised approaches, however, are difficult to implement in the medical domain, where large volumes of labelled data are difficult to obtain due to the complexity of manual annotation and inter- and intra-observer variability in label assignment. We propose a new convolutional sparse kernel network (CSKN), a hierarchical unsupervised feature learning framework that addresses the challenge of learning representative visual features in medical image analysis domains where there is a lack of annotated training data. Our framework has three contributions: (i) We extend kernel learning to identify and represent invariant features across image sub-patches in an unsupervised manner. (ii) We initialise our kernel learning with a layer-wise pre-training scheme that leverages the sparsity inherent in medical images to extract initial discriminative features. (iii) We adapt a multi-scale spatial pyramid pooling (SPP) framework to capture subtle geometric differences between learned visual features. We evaluated our framework in medical image retrieval and classification on three public datasets. Our results show that our CSKN had better accuracy than other conventional unsupervised methods and comparable accuracy to methods that used state-of-the-art supervised convolutional neural networks (CNNs). Our findings indicate that our unsupervised CSKN provides an opportunity to leverage unannotated big data in medical imaging repositories. Comment: Accepted by Medical Image Analysis (with a new title 'Convolutional Sparse Kernel Network for Unsupervised Medical Image Analysis'). The manuscript is available from the following link (https://doi.org/10.1016/j.media.2019.06.005).

arXiv.org e-Print Archive

ResearchOnline at James Cook University

Pre and Post-hoc Diagnosis and Interpretation of Malignancy from Breast DCE-MRI

Author: Bradley Andrew P.
Carneiro Gustavo
Maicas Gabriel
Nascimento Jacinto C.
Reid Ian
Publication date: 03/02/2019

We propose a new method for breast cancer screening from DCE-MRI based on a post-hoc approach that is trained using weakly annotated data (i.e., labels are available only at the image level without any lesion delineation). Our proposed post-hoc method automatically diagnoses the whole volume and, for positive cases, localises the malignant lesions that led to the diagnosis. In contrast, traditional methods follow a pre-hoc approach that first localises suspicious areas, which are subsequently classified to establish the breast malignancy; this approach is trained using strongly annotated data (i.e., it needs a delineation and classification of all lesions in an image). Another goal of this paper is to establish the advantages and disadvantages of both approaches when applied to breast screening from DCE-MRI. Relying on experiments on a breast DCE-MRI dataset that contains scans of 117 patients, our results show that the post-hoc method is more accurate for diagnosing the whole volume per patient, achieving an AUC of 0.91, while the pre-hoc method achieves an AUC of 0.81. However, localising the malignant lesions remains challenging for the post-hoc method due to the weakly labelled dataset employed during training. Comment: Submitted to Medical Image Analysis

arXiv.org e-Print Archive

Queensland University of Technology ePrints Archive

Personal life event detection from social media

Author: Alani Harith
Choudhury Smitashree
Publication venue: CEUR
Publication date: 01/09/2014

Creating video clips out of personal content from social media is on the rise. MuseumOfMe, Facebook Lookback, and Google Awesome are some popular examples. One core challenge in creating such life summaries is identifying personal events and their time frames. Such videos can greatly benefit from automatically distinguishing between social media content that is about someone's own wedding from that week, an old wedding, or the wedding of a friend. In this paper, we describe our approach for identifying a number of common personal life events from social media content (we used Twitter for our tests), using multiple feature-based classifiers. Results show that a combination of linguistic and social interaction features increases the overall classification accuracy for most events, while some events are relatively more difficult to detect than others (e.g., new born, with a mean precision of 0.6 across all three models).

CiteSeerX

Open Research Online (The Open University)

Artificial intelligence and computer-aided diagnosis in colonoscopy: current evidence and future directions

Author: Angermann
Barclay
Bernal
Brandao
Byrne
Byrne
Chen
Chen
Coe
Corley
Corley
Crockett
East
Ehteshami Bejnordi
Esteva
Facciorusso
Ferlitsch
Fernández-Esparrach
Filip
Gross
Hazewinkel
Ignjatovic
Kaminski
Kaminski
Karkanis
Kominami
Kudo
Kuiper
Ladabaum
Le Clercq
Lee
Lee
Litjens
McGill
Mesejo
Misawa
Misawa
Mori
Mori
Mori
Morris
Prior
Rath
Rees
Rembacken
Rex
Rex
Russakovsky
Samek
Sawhney
Shen
Silver
Stanek
Tajbakhsh
Takeda
Takemura
Takemura
Tischendorf
Torre
Urban
van der Vlugt
Van Rijn
Wang
Wang
Winawer
Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

Computer-aided diagnosis offers a promising solution to reduce variation in colonoscopy performance. Pooled miss rates for polyps are as high as 22%, and associated interval colorectal cancers after colonoscopy are of concern. Optical biopsy, whereby in-vivo classification of polyps based on enhanced imaging replaces histopathology, has not been incorporated into routine practice because it is limited by interobserver variability and generally only meets accepted standards in expert settings. Real-time decision-support software has been developed to detect and characterise polyps, and also to offer feedback on the technical quality of inspection. Some of the current algorithms, particularly with recent advances in artificial intelligence techniques, match human expert performance for optical biopsy. In this Review, we summarise the evidence for clinical applications of computer-aided diagnosis and artificial intelligence in colonoscopy.

Crossref

UCL Discovery

White Rose Research Online