Evolutionary and deep mining models for effective biomarker discovery

Author: Alzubaidi AHA
Publication date: 01/09/2019

With the advent of high-throughput biology, large amounts of molecular data are available for purposeful analysis and evaluation. Extracting relevant knowledge from high-throughput biomedical datasets has become a common goal of current approaches to personalised cancer medicine and to understanding cancer genotype and phenotype. However, the datasets are characterised by high dimensionality and relatively small sample sizes with low signal-to-noise ratios. Extracting and interpreting relevant knowledge from such complex datasets therefore remains a significant challenge for the fields of machine learning and data mining. This is evidenced by the limited success these methods have had in detecting robust and reliable biomarkers for cancers and other complicated diseases. It could also explain the lack of generic biomarkers among the published genes identified for identical diseases or clinical conditions. This thesis proposes and evaluates the efficacy of two novel feature mining models, established on the basis of the evolutionary computation and deep learning paradigms, to position and solve biomarker discovery as an optimisation problem. Deep learning methods lack the transparency and interpretability found in the evolutionary paradigm. To overcome the inherent issue of poor explanatory power associated with deep learning, this research also introduces a novel deep mining model that helps to deconstruct the internal state of such deep learning models to reveal the key determinants underlying their latent representations and so aid feature selection. As a result, salient biomarkers for breast cancer and for the positivity of the estrogen and progesterone receptors are discovered robustly and validated reliably across a wide range of independently generated breast cancer data samples.

Nottingham Trent Institutional Repository (IRep)

Statistical strategies for avoiding false discoveries in metabolomics and related experiments

Author: David I. Broadhurst
Douglas B. Kell
Publication venue: Springer Science and Business Media LLC

Crossref

Kernelized partial least squares for feature reduction and classification of gene microarray data

Author: Borgia Jeffrey A
Deng Youping
Ford William S
Land Walker H
Margolis Daniel E
Paquette Christopher T
Perez-Rogers Joseph F
Qiao Xingye
Yang Jack Y
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

Identification of Single- and Multiple-Class Specific Signature Genes from Gene Expression Profiles by Group Marker Index

Author: I-Fang Chung
Kripamoy Aguan
Nikhil R. Pal
Sumitra Deb
Yu-Shuen Tsai
Publication venue: Public Library of Science
Publication date: 01/09/2011

Informative genes from microarray data can be used to construct prediction models and investigate biological mechanisms. Differentially expressed genes, the main targets of most gene selection methods, can be classified as single- and multiple-class specific signature genes. Here, we present a novel gene selection algorithm based on a Group Marker Index (GMI), which is intuitive, of low computational complexity, and efficient in identifying both types of genes. Most gene selection methods identify only single-class specific signature genes and cannot easily identify multiple-class specific signature genes. Our algorithm can detect de novo certain conditions of multiple-class specificity of a gene and makes use of a novel non-parametric indicator to assess the discrimination ability between classes. Our method is effective even when the sample size is small and when the class sizes are significantly different. To compare effectiveness and robustness, we formulate an intuitive template-based method and use four well-known datasets. We demonstrate that our algorithm outperforms the template-based method in difficult cases with unbalanced distribution. Moreover, the multiple-class specific genes are good biomarkers and play important roles in biological pathways. Our literature survey supports the finding that the proposed method identifies unique multiple-class specific marker genes (not reported earlier to be related to cancer) in the Central Nervous System data. It also discovers unique biomarkers indicating the intrinsic difference between subtypes of lung cancer. We also associate the pathway information with the multiple-class specific signature genes and cross-reference it to published studies. We find that the identified genes participate in pathways directly involved in cancer development in the leukemia data.
Our method offers a promising way to find genes that may be involved in pathways of multiple diseases, and hence opens up the possibility of using an existing drug on other diseases as well as designing a single drug for multiple diseases.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Gene Regulatory Network Analysis and Web-based Application Development

Author: Yang Yi
Publication venue: The Aquila Digital Community
Publication date: 01/12/2013

Microarray data is a valuable source for gene regulatory network analysis. Using earthworm microarray data analysis as an example, this dissertation demonstrates that a bioinformatics-guided reverse engineering approach can be applied to time-series data to uncover the underlying molecular mechanism. My network reconstruction results reinforce previous findings that certain neurotransmitter pathways are the target of two chemicals, carbaryl and RDX. This study also concludes that perturbations to these pathways by sublethal concentrations of these two chemicals were temporary, and that earthworms were capable of fully recovering. Moreover, differential networks (DNs) analysis indicates that many pathways other than those related to synaptic and neuronal activities were altered during the exposure phase. A novel DNs approach is developed in this dissertation to connect pathway perturbation with toxicity threshold setting from Live Cell Array (LCA) data. Findings from this proof-of-concept study suggest that this DNs approach has great potential to provide a novel and sensitive tool for threshold setting in chemical risk assessment. In addition, a web-based tool, “Web-BLOM”, was developed for the reconstruction of gene regulatory networks from time-series gene expression profiles, including microarray and LCA data. This tool consists of several modular components: a database, the gene network reconstruction model and a user interface. The Bayesian Learning and Optimization Model (BLOM), originally implemented in MATLAB, was adopted by Web-BLOM to provide online reconstruction of large-scale gene regulation networks. Compared to other network reconstruction models, BLOM can infer larger networks with comparable accuracy, identify hub genes, and is much more computationally efficient.

Aquila Digital Community (University of Southern Mississippi, USM)

Navigating the Human Metabolome for Biomarker Identification and Design of Pharmaceutical Molecules

Author: Kouskoumvekaki Irene
Panagiotou Gianni
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2010

Metabolomics is a rapidly evolving discipline that involves the systematic study of endogenous small molecules that characterize the metabolic pathways of biological systems. The study of metabolism at a global level has the potential to contribute significantly to biomedical research, clinical medical practice, and drug discovery. In this paper, we present the most up-to-date metabolite and metabolic pathway resources, and we summarize the statistical and machine-learning tools used for the analysis of data from clinical metabolomics. Through specific applications to cancer, diabetes, neurological and other diseases, we demonstrate how these tools can facilitate diagnosis and the identification of potential biomarkers for use in disease diagnosis. Additionally, we discuss the increasing importance of integrating metabolomics data into drug discovery. In a case study based on the Human Metabolome Database (HMDB) and the Chinese Natural Product Database (CNPD), we demonstrate the close relatedness of the two sets of compounds, and we further illustrate how structural similarity with human metabolites could assist in the design of novel pharmaceuticals and the elucidation of the molecular mechanisms of medicinal plants.

Crossref

Directory of Open Access Journals

PubMed Central

Online Research Database In Technology

HKU Scholars Hub

Large-scale dimensionality reduction using perturbation theory and singular vectors

Author: Afshar Majid
Publication venue: Memorial University of Newfoundland
Publication date: 01/03/2021

Massive volumes of high-dimensional data have become pervasive, with the number of features significantly exceeding the number of samples in many applications. This has resulted in a bottleneck for data mining applications and amplified the computational burden of machine learning algorithms that perform classification or pattern recognition. Dimensionality reduction can handle this problem in two ways, namely feature selection (FS) and feature extraction. In this thesis, we focus on FS because, in many applications such as bioinformatics, domain experts need to validate a set of original features to corroborate the hypotheses of the prediction models. In processing high-dimensional data, FS mainly involves detecting a limited number of important features among tens or hundreds of thousands of irrelevant and redundant features. We start by filtering the irrelevant features using our proposed Sparse Least Squares (SLS) method, in which a score is assigned to each feature and the low-scoring features are removed using a soft threshold. To demonstrate the effectiveness of SLS, we used it to augment well-known FS methods, thereby achieving substantially reduced running times while improving or at least maintaining the prediction accuracy of the models. We developed a linear FS method (DRPT) which, upon data reduction by SLS, clusters the reduced data using perturbation theory to detect correlations between the remaining features. Important features are ultimately selected from each cluster, discarding the redundant features. To extend the applicability of clustering in grouping redundant features, we proposed a new Singular Vectors FS (SVFS) method that is capable of both removing the irrelevant features and effectively clustering the remaining features. As such, the features in each cluster exhibit correlations solely with each other. The important features selected independently from different clusters comprise the final rank.
Devising thresholds for filtering irrelevant and redundant features has facilitated the adaptability of our model to the particular needs of various applications. A comprehensive evaluation based on benchmark biological and image datasets shows the superiority of our proposed methods over state-of-the-art FS methods in terms of classification accuracy, running time, and memory usage.

Memorial University Research Repository

Development of statistical tools for integrating time course ‘omics’ data

Author: Straube Jasmin
Publication venue: University of Queensland Library
Publication date: 08/05/2017

University of Queensland eSpace

Promises and pitfalls of deep neural networks in neuroimaging-based psychiatric research

Author: Eitel Fabian
Ritter Kerstin
Schulz Marc-André
Seiler Moritz
Walter Henrik
Publication venue: Elsevier BV
Publication date: 20/01/2023

By promising more accurate diagnostics and individual treatment recommendations, deep neural networks, and in particular convolutional neural networks, have become a powerful tool in medical imaging. Here, we first introduce key methodological concepts and the resulting methodological promises, including representation and transfer learning as well as the modelling of domain-specific priors. After reviewing recent applications within neuroimaging-based psychiatric research, such as the diagnosis of psychiatric diseases, delineation of disease subtypes, normative modeling, and the development of neuroimaging biomarkers, we discuss current challenges. These include, for example, the difficulty of training models on small, heterogeneous and biased data sets, the lack of validity of clinical labels, algorithmic bias, and the influence of confounding variables.

arXiv.org e-Print Archive