Image auto-annotation with automatic selection of the annotation length

Author: A Llorente
A Makadia
A Smeulders
A Wichert
AM Tousch
C Carson
DG Lowe
E Chang
G Boccignone
G Carneiro
H Kwasnicka
H Tamura
Halina Kwasnicka
J Huang
J Shi
J Verbeek
K Michalak
K Mikolajczyk
M Lux
Michal Stanek
NB Aoun
Oskar Maier
P Duygulu
R Datta
RM Haralick
SE Grigorescu
SF Chang
SL Feng
T Deselaers
X Zhang
Y Jin
Publication venue: 'Springer Science and Business Media LLC'

Imaging genetics : Methodological approaches to overcoming high dimensional barriers

Author: Roshchupkin G.V. (Gennady)
Publication date: 25/09/2018

Imaging genetics is still a fairly new area of research that attempts to discover how genetic factors affect brain structure and function. In this thesis, using various methodological approaches, I showed how it can contribute to our understanding of the complex genetic architecture of the human brain.

EUR Research Repository

Erasmus University Digital Repository

Handbook of Mathematical Geosciences

Publication venue: 'Springer Science and Business Media LLC'

This Open Access handbook, published at the IAMG's 50th anniversary, presents a compilation of invited path-breaking research contributions by award-winning geoscientists who have been instrumental in shaping the IAMG. It contains 45 chapters categorized broadly into five parts: (i) theory, (ii) general applications, (iii) exploration and resource estimation, (iv) reviews, and (v) reminiscences. The chapters cover related topics such as mathematical geosciences, mathematical morphology, geostatistics, fractals and multifractals, spatial statistics, multipoint geostatistics, compositional data analysis, informatics, geocomputation, numerical methods, and chaos theory in the geosciences.

OAPEN Library

Statistical approaches of gene set analysis with quantitative trait loci for high-throughput genomic studies.

Author: Das Samarendra
Publication venue: ThinkIR: The University of Louisville's Institutional Repository
Publication date: 01/01/2020

Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies such as microarrays, bulk RNA-sequencing, and single-cell RNA-sequencing. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. However, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on gene ontology terms, known biological pathways, etc., which may not establish any formal relation between genotype and trait-specific phenotype. Further, in plant biology and breeding, gene set analysis with trait-specific Quantitative Trait Loci data is considered a great source for biological knowledge discovery. Therefore, innovative statistical approaches are developed for analyzing and interpreting gene expression data from microarray and RNA-sequencing studies in the context of gene sets with trait-specific Quantitative Trait Loci. The utility of the developed approaches is studied on multiple real gene expression datasets obtained from various microarray and RNA-sequencing studies. The selection of gene sets through differential expression analysis is the primary step of gene set analysis, and it can be achieved using gene selection methods. The existing methods for such analysis in high-throughput studies, such as microarray and RNA-sequencing studies, suffer from serious limitations. For instance, for microarrays, most of the available methods are based on either relevancy or redundancy measures. Through these methods, the ranking of genes is done on single microarray expression data, which leads to the selection of spuriously associated and redundant gene sets. Therefore, newer and innovative differential expression analysis methods have been developed for microarray and single-cell RNA-sequencing studies to identify gene sets and successfully carry out gene set and other downstream analyses. Furthermore, several methods specifically designed for single-cell data have been developed in the literature for differential expression analysis. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to review the performance of the existing methods. Hence, a comprehensive overview, classification, and comparative study of the available single-cell methods is undertaken to study their unique features, underlying statistical models, and their shortcomings in real applications. Moreover, to address one of the shortcomings (i.e., higher dropout events due to lower cell capture rates), an improved statistical method for downstream analysis of single-cell data has been developed. From the users’ point of view, the different developed statistical methods are implemented in various software tools and made publicly available. These methods and tools will help experimental biologists and genome researchers to analyze their experimental data more objectively and efficiently. Moreover, the limitations and shortcomings of the available methods are reported in this study, and these need to be addressed by statisticians and biologists collectively to develop efficient approaches. These new approaches will be able to analyze high-throughput genomic data more efficiently to better understand biological systems and increase the specificity, sensitivity, utility, and relevance of high-throughput genomic studies.

University of Louisville

KRISHI Publications and Data Repository

Large-scale Machine Learning in High-dimensional Datasets

Author: Hansen Toke Jansen
Publication venue: Technical University of Denmark
Publication date: 01/01/2013

Online Research Database In Technology

Hidden Markov Models

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Hidden Markov Models (HMMs), although known for decades, are now widely used and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environmental protection, and engineering. I hope that readers will find this book useful and helpful for their own research.

Directory of Open Access Books (DOAB)

Data Mining

Publication venue: 'IntechOpen'
Publication date: 27/07/2022

The availability of big data due to computerization and automation has generated an urgent need for new techniques to analyze and convert big data into useful information and knowledge. Data mining is a promising and leading-edge technology for mining large volumes of data, looking for hidden information, and aiding knowledge discovery. It can be used for characterization, classification, discrimination, anomaly detection, association, clustering, trend or evolution prediction, and much more in fields such as science, medicine, economics, engineering, computers, and even business analytics. This book presents basic concepts, ideas, and research in data mining

Directory of Open Access Books (DOAB)

Analysing functional genomics data using novel ensemble, consensus and data fusion techniques

Author: Glaab Enrico
Publication date: 15/10/2011

Motivation: Rapid technological development in the biosciences and in computer science over the last decade has enabled the analysis of high-dimensional biological datasets on standard desktop computers. However, in spite of these technical advances, common properties of the new high-throughput experimental data, such as small sample sizes in relation to the number of features, high noise levels, and outliers, also pose novel challenges. Ensemble and consensus machine learning techniques and data integration methods can alleviate these issues, but often provide overly complex models which lack generalization capability and interpretability. The goal of this thesis was therefore to develop new approaches to combine algorithms and large-scale biological datasets, including novel approaches to integrate analysis types from different domains (e.g. statistics, topological network analysis, machine learning and text mining), to exploit their synergies in a manner that provides compact and interpretable models for inferring new biological knowledge. Main results: The main contributions of the doctoral project are new ensemble, consensus and cross-domain bioinformatics algorithms, and new analysis pipelines combining these techniques within a general framework. This framework is designed to enable the integrative analysis of both large-scale gene and protein expression data (including the tools ArrayMining, Top-scoring pathway pairs and RNAnalyze) and general gene and protein sets (including the tools TopoGSA, EnrichNet and PathExpand), by combining algorithms for different statistical learning tasks (feature selection, classification and clustering) in a modular fashion. Ensemble and consensus analysis techniques employed within the modules are redesigned such that the compactness and interpretability of the resulting models are optimized in addition to the predictive accuracy and robustness. The framework was applied to real-world biomedical problems, with a focus on cancer biology, providing the following main results: (1) the identification of a novel tumour marker gene in collaboration with the Nottingham Queens Medical Centre, facilitating the distinction between two clinically important breast cancer subtypes (framework tool: ArrayMining); (2) the prediction of novel candidate disease genes for Alzheimer’s disease and pancreatic cancer using an integrative analysis of cellular pathway definitions and protein interaction data (framework tool: PathExpand, collaboration with the Spanish National Cancer Centre); (3) the prioritization of associations between disease-related processes and other cellular pathways using a new rule-based classification method integrating gene expression data and pathway definitions (framework tool: Top-scoring pathway pairs); and (4) the discovery of topological similarities between differentially expressed genes in cancers and cellular pathway definitions mapped to a molecular interaction network (framework tool: TopoGSA, collaboration with the Spanish National Cancer Centre). In summary, the framework combines the synergies of multiple cross-domain analysis techniques within a single easy-to-use software package and has provided new biological insights in a wide variety of practical settings.

Nottingham eTheses

Integrative genetic and network approaches to identify key regulators of cardiac fibrosis

Author: Moreno Moral Aida
Publication venue: Institute of Clinical Science, Imperial College London
Publication date: 01/02/2016

Excessive fibrogenic response is a pathological hallmark of chronic complex diseases, including cardiovascular disease. To date, very few gene targets for cardiac fibrosis that led to effective treatments have been identified in humans. In this thesis I study and dissect the genetic component underlying cardiac fibrosis. This study integrates histomorphometric measurements of fibrosis in the rat left ventricle (LV) with gene expression (RNA-seq from LV) and genetic data in a panel of recombinant inbred (RI) rat strains (n=30). In addition, I integrated RNA-seq LV and genetic data in humans (n=187, healthy and dilated cardiomyopathy (DCM) patients), as well as DCM genome-wide association studies (GWAS) data. I started by carrying out an unbiased co-expression network analysis in the rat heart. The reconstructed cardiac transcriptional modules were associated with quantitative levels of fibrosis. Co-expression networks were also independently built in the heart of DCM patients. Using the rat data, I prioritised co-expression networks that were associated with fibrosis, conserved across rats and humans, and not present in control human heart. In the prioritised networks, I also analysed their cardiac cell type specificity, differential expression after TGFβ induction, potential driving transcription factors, and conservation in other fibrotic diseases by analysing human data collected from other organs. Furthermore, I aimed to identify common genetic regulators of the networks (also called master genetic regulators) by using Bayesian multivariate regression approaches. Finally, I integrated GWAS data in DCM (n=2,287) to dissect the genetic basis of DCM. This systems genetics study provides evidence that transcriptional processes involved in the human cardiac fibrogenic response are conserved across rats and humans, some of them also underlying DCM aetiology. In an attempt to suggest new gene targets for cardiac fibrosis, I also identified the WWP2 gene as a novel trans-acting genetic regulator of cardiac fibrosis.

Spiral - Imperial College Digital Repository

Analysing functional genomics data using novel ensemble, consensus and data fusion techniques

Author: Glaab Enrico

Nottingham ePrints