1,567 research outputs found

Separability-Oriented Subclass Discriminant Analysis

Author: Guo Gongde
Wan Huan
Wang Hui
Wei Xin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/02/2017

Sparse Discriminant Analysis

Author: Clemmensen Line Katrine Harder
Ersbøll Bjarne Kjær
Hastie Trevor
Publication venue: Technical University of Denmark, DTU Informatics, Building 321
Publication date: 01/01/2008

Classification in high-dimensional feature spaces where interpretation and dimension reduction are of great importance is common in biological and medical applications. For these applications standard methods as microarrays, 1D NMR, and spectroscopy have become everyday tools for measuring thousands of features in samples of interest. Furthermore, the samples are often costly and therefore many such problems have few observations in relation to the number of features. Traditionally such data are analyzed by first performing a feature selection before classification. We propose a method which performs linear discriminant analysis with a sparseness criterion imposed such that the classification, feature selection and dimension reduction is merged into one analysis. The sparse discriminant analysis is faster than traditional feature selection methods based on computationally heavy criteria such as Wilk's lambda, and the results are better with regards to classification rates and sparseness. The method is extended to mixtures of Gaussians which is useful when e.g. biological clusters are present within each class. Finally, the methods proposed provide low-dimensional views of the discriminative directions.

CiteSeerX

Algebraic Comparison of Partial Lists in Bioinformatics

Author: A Gobbi
A Kalousis
A Kossenkov
A Sboner
AC Haury
AL Boulesteix
Arkady B. Khodursky
B Di Camillo
B Efron
B Schowe
C Cortes
C Furlanello
C Schneider
C Soneson
C Yao
Cesare Furlanello
Consortium The MicroArray Quality Control (MAQC)
D Albanese
D Cai
D Corrada
D Critchlow
D Saari
D Witten
G Guzzetta
G Jurman
G Lance
G Smyth
Giuseppe Jurman
GS Cheon
I Guyon
I Jeffery
I Lönnstedt
J Bar-Ilan
J Borda
J Chen
J Ioannidis
J Neter
J Storey
L Ein-Dor
L Kuncheva
L Yu
L Zhang
M Desarkar
M Kauers
M Kendall
M Schimek
M Slawski
M Villarino
O Bousquet
P Baldi
P Diaconis
P Hall
P Krízek
PC Boutros
R Fagin
R Gentleman
R Graham
R Pearson
R Pique-Regi
R Simon
Roberto Visintainer
S Abramov
S Dudoit
S Lin
S Mukherjee
S Setlur
S Simić
S Vanderlooy
Samantha Riccadonna
SK Lau
T Bø
T Calders
V Tusher
W Fury
W Hoeffding
W Shi
X Wang
X Yang
Y Xiao
Z He
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 08/04/2010

The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or just within a meta-analysis comparison, instead of one list it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler

Directory of Open Access Journals

Cluster-Based Supervised Classification

Author: Wan Huan
Publication date: 01/11/2020

Ulster University's Research Portal

Simultaneous prediction of wrist/hand motion via wearable ultrasound sensing

Author: Fang Yinfeng
Liu Honghai
Yan Jipeng
Yang Xingchen
Zhou Dalin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/03/2020

Nonlinear Supervised Dimensionality Reduction via Smooth Regular Embeddings

Author: Ornek Cem
Vural Elif
Publication date: 28/05/2018

The recovery of the intrinsic geometric structures of data collections is an important problem in data analysis. Supervised extensions of several manifold learning approaches have been proposed in recent years. However, existing methods primarily focus on the embedding of the training data, and the generalization of the embedding to initially unseen test data is largely ignored. In this work, we build on recent theoretical results on the generalization performance of supervised manifold learning algorithms. Motivated by these performance bounds, we propose a supervised manifold learning method that computes a nonlinear embedding while constructing a smooth and regular interpolation function that extends the embedding to the whole data space in order to achieve satisfactory generalization. The embedding and the interpolator are jointly learnt such that the Lipschitz regularity of the interpolator is imposed while ensuring the separation between different classes. Experimental results on several image data sets show that the proposed method outperforms traditional classifiers and the compared supervised dimensionality reduction algorithms in terms of classification accuracy in most settings.

arXiv.org e-Print Archive

Within-class Multimodal Classification

Author: Liu Jun
Ng Wing
Scotney Bryan
Wan Huan
Wang Hui
Publication date: 22/06/2020

Ulster University's Research Portal

Diagnostic prediction of complex diseases using phase-only correlation based on virtual sample template

Author: Fang Jianwen
Fang Yaping
Wang Shu-Lin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Motivation: Complex diseases induce perturbations to interaction and regulation networks in living systems, resulting in dynamic equilibrium states that differ for different diseases and also from normal states. Thus identifying gene expression patterns corresponding to different equilibrium states is of great benefit to the diagnosis and treatment of complex diseases. However, it remains a major challenge to deal with the high dimensionality and small size of available complex disease gene expression datasets currently used for discovering gene expression patterns. Results: Here we present a phase-only correlation (POC) based classification method for recognizing the type of complex diseases. First, a virtual sample template is constructed for each subclass by averaging all samples of each subclass in a training dataset. Then the label of a test sample is determined by measuring the similarity between the test sample and each template. This novel method can detect the similarity of overall patterns emerging from the differentially expressed genes or proteins while ignoring small mismatches. Conclusions: The experimental results obtained on seven publicly available complex disease datasets including microarray and protein array data demonstrate that the proposed POC-based disease classification method is effective and robust for diagnosing complex diseases with regard to the number of initially selected features, and its recognition accuracy is better than or comparable to other state-of-the-art machine learning methods. In addition, the proposed method does not require parameter tuning and data scaling, which can effectively reduce the occurrence of over-fitting and bias.

Springer - Publisher Connector
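
The two steps named in the abstract above (average each subclass into a template, then label a test sample by its similarity to each template) can be illustrated with a minimal sketch. It assumes a plain 1-D phase-only correlation computed with NumPy's FFT and uses the correlation peak as the similarity score; the abstract does not specify these details, and the function names `phase_only_correlation`, `build_templates`, and `classify` are hypothetical, not taken from the paper.

```python
import numpy as np


def phase_only_correlation(a, b, eps=1e-12):
    """Return the peak of the 1-D phase-only correlation of two vectors.

    Assumption: the peak value is used as the similarity score.
    """
    cross = np.fft.fft(a) * np.conj(np.fft.fft(b))
    # Discard magnitude so that only the phase difference remains.
    cross /= np.abs(cross) + eps
    return np.real(np.fft.ifft(cross)).max()


def build_templates(X, y):
    """Average the training rows of each label into one template vector."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}


def classify(sample, templates):
    """Pick the label whose template correlates best with the sample."""
    return max(
        templates,
        key=lambda label: phase_only_correlation(sample, templates[label]),
    )
```

In this sketch, `X` is a samples-by-features array and `y` holds one subclass label per row; there is nothing to tune apart from the small constant `eps`, which only guards against division by zero.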

Performance of Feature Selection Methods

Author: Dougherty Edward R
Hua Jianping
Sima Chao
Publication venue: Bentham Science Publishers Ltd.
Publication date: 01/01/2009

High-throughput biological technologies offer the promise of finding feature sets to serve as biomarkers for medical applications; however, the sheer number of potential features (genes, proteins, etc.) means that there needs to be massive feature selection, far greater than that envisioned in the classical literature. This paper considers performance analysis for feature-selection algorithms from two fundamental perspectives: How does the classification accuracy achieved with a selected feature set compare to the accuracy when the best feature set is used, and what is the optimal number of features that should be used? The criteria manifest themselves in several issues that need to be considered when examining the efficacy of a feature-selection algorithm: (1) the correlation between the classifier errors for the selected feature set and the theoretically best feature set; (2) the regressions of the aforementioned errors upon one another; (3) the peaking phenomenon, that is, the effect of sample size on feature selection; and (4) the analysis of feature selection in the framework of high-dimensional models corresponding to high-throughput data.

CiteSeerX