63 research outputs found

Conformal prediction of biological activity of chemical compounds

Author: A Gammerman
Alexander Gammerman
AN Jain
C-C Chang
DC Weis
EY Chang
F Pedregosa
G Shafer
Ilia Nouretdinov
J-L Faulon
K Woodsend
Paolo Toccaceli
V Monev
V Vovk
Y Wang
Y You
Publication venue: Springer Science and Business Media LLC
Publication date: 01/10/2017

Crossref

Royal Holloway - Pure

Criteria of efficiency for set-valued classification

Author: Alex Gammerman
EL Lehmann
Ilia Nouretdinov
Ivan Petej
J Lei
RM Karp
S Martello
T Gneiting
V Fedorova
V Vovk
Valentina Fedorova
Vladimir Vovk
Y Le Cun
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Crossref

Springer - Publisher Connector

Royal Holloway - Pure

PlantProm: a database of plant promoter sequences

Author: Bramley P M
Gammerman A J
Hancock J M
Shahmuradov I A
Solovyev V V
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2003

PlantProm DB, a plant promoter database, is an annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start site(s), TSS, from various plant species. The first release (2002.01) of PlantProm DB contains 305 entries including 71, 220 and 14 promoters from monocot, dicot and other plants, respectively. It provides the DNA sequence of the promoter regions (-200:+51) with TSS on the fixed position +201, taxonomic/promoter type classification of promoters and Nucleotide Frequency Matrices (NFM) for promoter elements: TATA-box, CCAAT-box and TSS-motif (Inr). Analysis of TSS-motifs revealed that their composition is different in dicots and monocots, as well as for TATA and TATA-less promoters. The database serves as a learning set in developing plant promoter prediction programs. One such program (TSSP) based on discriminant analysis has been created by Softberry Inc., and the application of a support vector machine approach for promoter identification is under development.

CiteSeerX

Royal Holloway Research Online

Crossref

Royal Holloway - Pure

PubMed Central

Early Detection of Ovarian Cancer in Samples Pre-Diagnosis Using CA125 and MALDI-MS Peaks

Author: Burford B
Camuzeaux S
Cramer R
Devetyarov D
Ford J
Gammerman A
Gentry-Maharaj A
Hallett R
Jacobs I
Luo ZY
Mccurrie K
Menon U
Nouretdinov I
Smith C
Timms JF
Tiss A
Vovk V
Publication venue: INT INST ANTICANCER RESEARCH
Publication date: 01/01/2011

Aim: A nested case-control discovery study was undertaken to test whether information within the serum peptidome can improve on the utility of CA125 for early ovarian cancer detection. Materials and Methods: High-throughput matrix-assisted laser desorption ionisation mass spectrometry (MALDI-MS) was used to profile 295 serum samples from women pre-dating their ovarian cancer diagnosis and from 585 matched control samples. Classification rules incorporating CA125 and MS peak intensities were tested for discriminating ability. Results: Two peaks were found which in combination with CA125 discriminated cases from controls up to 15 and 11 months before diagnosis, respectively, and earlier than using CA125 alone. One peak was identified as connective tissue-activating peptide III (CTAPIII), whilst the other was putatively identified as platelet factor 4 (PF4). ELISA data supported the down-regulation of PF4 in early cancer cases. Conclusion: Serum peptide information with CA125 improves lead time for early detection of ovarian cancer. The candidate markers are platelet-derived chemokines, suggesting a link between platelet function and tumour development.

Central Archive at the University of Reading

LSHTM Research Online

UCL Discovery

The University of Manchester - Institutional Repository

Transductive Learning for Spatial Data Classification

Author: A. Appice
A. Frank
A. Gammerman
A. Mukerjee
D. Malerba
D. McIver
F. Esposito
G. Góra
J. Han
J. Sander
J.A. Robinson
K. Koperski
K.P. Bennett
L. Džeroski
L. Raedt De
M. Ceci
M. Ester
M. Krogel
M. Kukar
M.-A. Krogel
M.J. Egenhofer
N. Lavrač
P. Legendre
R.S. Michalski
S. Muggleton
S. Shekhar
T. Joachims
T. Mitchell
V. Vapnik
W. Klösgen
Y. Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Learning classifiers of spatial data presents several issues, such as the heterogeneity of spatial objects, the implicit definition of spatial relationships among objects, the spatial autocorrelation and the abundance of unlabelled data which potentially convey a large amount of information. The first three issues are due to the inherent structure of spatial units of analysis, which can be easily accommodated if a (multi-)relational data mining approach is considered. The fourth issue calls for the adoption of a transductive setting, which aims to make predictions for a given set of unlabelled data. Transduction is also motivated by the contiguity of the concept of positive autocorrelation, which typically affects spatial phenomena, with the smoothness assumption which characterizes the transductive setting. In this work, we investigate a relational approach to spatial classification in a transductive setting. Computational solutions to the main difficulties met in this approach are presented. In particular, a relational upgrade of the naïve Bayes classifier is proposed as a discriminative model, an iterative algorithm is designed for the transductive classification of unlabelled data, and a distance measure between relational descriptions of spatial objects is defined in order to determine the k-nearest neighbors of each example in the dataset. Computational solutions have been tested on two real-world spatial datasets. The transformation of spatial data into a multi-relational representation and experimental results are reported and commented on.

Crossref

Archivio istituzionale della ricerca - Università di Bari

Kent Academic Repository

Measuring the functional sequence complexity of proteins

Author: A Gammerman
AD Ellington
AKC Wong
C Shannon
C Tuerk
David KY Chiu
David L Abel
DKY Chiu
DL Abel
DL Robertson
G Ertem
G Steinman
H Kobayashi
H Liao
HP Yockey
J Griesemer
Jack T Trevors
JF Chaparro-Riggers
JW Szostak
Kirk K Durston
KK Durston
L Gao
LM Rocha
M Barbieri
M Oti
M Ronshaugen
MB Gerstein
O Weiss
PD Karp
R Backofen
S Oyama
WJL Cook
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Abel and Trevors have delineated three aspects of sequence complexity, Random Sequence Complexity (RSC), Ordered Sequence Complexity (OSC) and Functional Sequence Complexity (FSC) observed in biosequences such as proteins. In this paper, we provide a method to measure functional sequence complexity. Methods and Results: We have extended Shannon uncertainty by incorporating the data variable with a functionality variable. The resulting measured unit, which we call Functional bit (Fit), is calculated from the sequence data jointly with the defined functionality variable. To demonstrate the relevance to functional bioinformatics, a method to measure functional sequence complexity was developed and applied to 35 protein families. Considerations were made in determining how the measure can be used to correlate functionality when relating to the whole molecule and sub-molecule. In the experiment, we show that when the proposed measure is applied to the aligned protein sequences of ubiquitin, 6 of the 7 highest value sites correlate with the binding domain. Conclusion: For future extensions, measures of functional bioinformatics may provide a means to evaluate potential evolving pathways from effects such as mutations, as well as analyzing the internal structural and functional relationships within the 3-D structure of proteins.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Conformal predictors in early diagnostics of ovarian and breast cancers

Author: Burford B
Camuzeaux S
Chervonenkis A
Cramer R
Devetyarov D
Gammerman A
Gentry-Maharaj A
Hallett R
Jacobs I
Luo Z
Menon U
Nouretdinov I
Sinclair J
Smith C
Timms JF
Tiss A
Vovk V
Waterfield MD
Publication date: 01/09/2012

The paper describes an application of a recently developed machine learning technique called Mondrian predictors to risk assessment of ovarian and breast cancers. The analysis is based on mass spectrometry profiling of human serum samples that were collected in the United Kingdom Collaborative Trial of Ovarian Cancer Screening. The paper describes the technique and presents the results of classification (diagnosis) and the corresponding measures of confidence of the diagnostics. The main advantage of this approach is the proven validity of its predictions. The paper also describes an approach to improve early diagnosis of ovarian and breast cancers, since the data in the United Kingdom Collaborative Trial of Ovarian Cancer Screening were collected over a period of seven years and allow changes in human serum to be observed over that period. The significance of the improvement is confirmed statistically (for up to 11 months for ovarian cancer and 9 months for breast cancer). In addition, the methodology allowed us to pinpoint the same mass spectrometry peaks as previously detected as carrying statistically significant information for discrimination between healthy and diseased patients. The results are discussed.

UCL Discovery

Autoantibodies to aberrantly glycosylated MUC1 in early stage breast cancer are associated with a better prognosis

Author: AL Sørensen
Alex Gammerman
BB Hermsen
BJ Campbell
Brian Burford
C Chapman
C Desmetz
Deanna Bueti
Diane Allen
DW Kufe
DY Wang
F Lumachi
HH Wandall
I Brockhausen
Ian Fentiman
J Burchell
JM Burchell
Joy M Burchell
Joyce Taylor-Papadimitriou
L Zhong
M Bäckström
MA Tarp
ME Weksler
Michael Hollingsworth
N Ohyabu
O Blixt
O Türeci
Ola Blixt
PJ Mintz
PJ Sabbatini
R Lubin
S Hanash
S von Mensdorff-Pouilly
SJ Storr
SP Pinheiro
Sylvain Julien
T Iwai
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

Copenhagen University Research Information System

King's Research Portal

A Hybrid Color Space for Skin Detection Using Genetic Algorithm Heuristic Search and Principal Component Analysis Technique

Author: A Gammerman
A Martinez
AA Zaidan
AM Aibinu
Anazida Zainal
C Chen
C Kim
CF Juang
D Chai
D Pascale
E Casiraghi
Gajendra P. S. Raghava
I Jolliffe
J Shlens
JG Wang
JM Chaves-González
JS Lee
JY Choi
KHB Ghazali
L Breiman
L Tao
LM Bergasa
M Kawulok
Mahdi Maktabdar Oghaz
MH Yang
MM Oghaz
Mohd Aizaini Maarof
Mohd Foad Rohani
P Shih
PJ Phillips
R Kemp
R Khan
RC Gonzalez
S Haykin
S. Hadi Yaghoubyan
SJ Schmugge
SK Singh
SL Phung
V Vezhnevets
WR Tan
Y Xu
Z Liu
Publication venue: Public Library of Science (PLoS)
Publication date: 01/08/2015

Color is one of the most prominent features of an image and is used in many skin and face detection applications. Color space transformation is widely used by researchers to improve face and skin detection performance. Despite the substantial research efforts in this area, choosing a proper color space in terms of skin and face classification performance which can address issues like illumination variations, various camera characteristics and diversity in skin color tones has remained an open issue. This research proposes a new three-dimensional hybrid color space termed SKN by employing the Genetic Algorithm heuristic and Principal Component Analysis to find the optimal representation of human skin color in over seventeen existing color spaces. The Genetic Algorithm heuristic is used to find the optimal color component combination setup in terms of skin detection accuracy, while the Principal Component Analysis projects the optimal Genetic Algorithm solution to a less complex dimension. Pixel-wise skin detection was used to evaluate the performance of the proposed color space. We have employed four classifiers including Random Forest, Naïve Bayes, Support Vector Machine and Multilayer Perceptron in order to generate the human skin color predictive model. The proposed color space was compared to some existing color spaces and shows superior results in terms of pixel-wise skin detection accuracy. Experimental results show that by using the Random Forest classifier, the proposed SKN color space obtained an average F-score and True Positive Rate of 0.953 and False Positive Rate of 0.0482, which outperformed the existing color spaces in terms of pixel-wise skin detection accuracy. The results also indicate that among the classifiers used in this study, Random Forest is the most suitable classifier for pixel-wise skin detection applications.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

Anglia Ruskin Research

PubMed Central

Universiti Teknologi Malaysia Institutional Repository

FigShare