123 research outputs found

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

A comparison of graph- and kernel-based –omics data integration algorithms for classifying complex traits

Author: Pang HMH
Yan K
Zhao H
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

HKU Scholars Hub

Modern Views of Machine Learning for Precision Psychiatry

Author: Bigio Benedetta
Chen Zhe Sage
Galatzer-Levy Isaac R.
Kulkarni Prathamesh
Nasca Carla
Zhang Yu
Publication date: 04/04/2022

In light of the NIMH's Research Domain Criteria (RDoC), the advent of functional neuroimaging, novel technologies and methods provide new opportunities to develop precise and personalized prognosis and diagnosis of mental disorders. Machine learning (ML) and artificial intelligence (AI) technologies are playing an increasingly critical role in the new era of precision psychiatry. Combining ML/AI with neuromodulation technologies can potentially provide explainable solutions in clinical practice and effective therapeutic treatment. Advanced wearable and mobile technologies also call for the new role of ML/AI for digital phenotyping in mobile mental health. In this review, we provide a comprehensive overview of the ML methodologies and applications by combining neuroimaging, neuromodulation, and advanced mobile technologies in psychiatry practice. Additionally, we review the role of ML in molecular phenotyping and cross-species biomarker identification in precision psychiatry. We further discuss explainable AI (XAI) and causality testing in a closed-human-in-the-loop manner, and highlight the ML potential in multimedia information extraction and multimodal data fusion. Finally, we discuss conceptual and practical challenges in precision psychiatry and highlight ML opportunities in future research.

arXiv.org e-Print Archive

Learning to cope with small noisy data in software effort estimation

Author: Song Liyan
Publication date: 01/07/2019

Though investigated for decades, Software Effort Estimation (SEE) remains a challenging problem in software project management. Several factors hinder the practical use of SEE models. One major factor is the scarcity of software projects that are used to construct SEE models due to the long process of software development. Even given a large number of projects, the collected effort values are usually corrupted by noise due to the participation of humans. Furthermore, even given enough noise-free software projects, SEE models may have sensitive parameters to tune, possibly causing a model sensitivity problem. The thesis focuses on tackling these three issues. It proposes a synthetic data generator to tackle the data scarcity problem, introduces and constructs uncertain effort estimators to tackle the data noise problem, and analyses the sensitivity of popular SEE models to parameter settings. The main contributions of the thesis include: 1. Propose a synthetic project generator and provide an understanding of when and why it improves the prediction performance of which baseline models. 2. Introduce the relevance vector machine for uncertain effort estimation. 3. Propose a better uncertain estimation method based on an ensemble strategy. 4. Provide a better understanding of the impact of parameter tuning for SEE methods.

University of Birmingham Research Archive, E-theses Repository

Hybrid Advanced Optimization Methods with Evolutionary Computation Techniques in Energy Forecasting

Author: Wei-Chiang Hong (Ed.)
Publication venue: 'MDPI AG'
Publication date: 01/01/2018

More accurate and precise energy demand forecasts are required when energy decisions are made in a competitive environment. Particularly in the Big Data era, forecasting models are always based on a complex function combination, and energy data are always complicated. Examples include seasonality, cyclicity, fluctuation, dynamic nonlinearity, and so on. These forecasting models have resulted in an over-reliance on the use of informal judgment and higher expenses when lacking the ability to determine data characteristics and patterns. The hybridization of optimization methods and superior evolutionary algorithms can provide important improvements via good parameter determinations in the optimization process, which is of great assistance to actions taken by energy decision-makers. This book aimed to attract researchers with an interest in the research areas described above. Specifically, it sought contributions to the development of any hybrid optimization methods (e.g., quadratic programming techniques, chaotic mapping, fuzzy inference theory, quantum computing, etc.) with advanced algorithms (e.g., genetic algorithms, ant colony optimization, particle swarm optimization algorithm, etc.) that have superior capabilities over the traditional optimization approaches to overcome some embedded drawbacks, and the application of these advanced hybrid approaches to significantly improve forecasting accuracy.

Directory of Open Access Books (DOAB)

Bayesian Inference for Genomic Data Integration Reduces Misclassification Rate in Predicting Protein-Protein Interactions

Author: A Elefsinioti
A Valencia
AJ Enright
AK Ramani
AL Hopkins
BA Shoemaker
C von Mering
CC Wu
Christos A. Ouzounis
Chuanhua Xing
CS Goh
David B. Dunson
DB Dunson
DR Rhodes
EC Butcher
EM Marcotte
F Browne
F Pazos
GT Hart
H Huang
H Ishwaran
H Yu
I Lee
IW Taylor
J Saric
J Sun
JS Bader
L Hakes
L Hood
L Lu
LJ Jensen
LJ Lu
LV Zhang
M Huang
M Persico
MA Yildirim
MP Brown
MS Scott
N Lin
OG Troyanskaya
P Aloy
P Bork
P Pagel
P Sham
R Chowdhary
R Jansen
R Malik
R Mrowka
S Dolma
S Kim
S Tsoka
SV Date
Y Qi
Publication venue: Public Library of Science
Publication date: 01/07/2011

Protein-protein interactions (PPIs) are essential to most fundamental cellular processes. There has been increasing interest in reconstructing PPI networks. However, several critical difficulties exist in obtaining reliable predictions. Notably, false positive rates can exceed 80%. Error correction from each generating source can be both time-consuming and inefficient due to the difficulty of covering the errors from multiple levels of data processing procedures within a single test. We propose a novel Bayesian integration method, termed nonparametric Bayes ensemble learning (NBEL), to lower the misclassification rate (both false positives and negatives) through automatically up-weighting data sources that are most informative, while down-weighting less informative and biased sources. Extensive studies indicate that NBEL is significantly more robust than the classic naïve Bayes to unreliable, error-prone and contaminated data. On a large human data set our NBEL approach predicts many more PPIs than naïve Bayes. This suggests that previous studies may have large numbers of not only false positives but also false negatives. The validation on two high-quality human PPI datasets supports our observations. Our experiments demonstrate that it is feasible to predict high-throughput PPIs computationally with substantially reduced false positives and false negatives. The ability to predict large numbers of PPIs both reliably and automatically may inspire people to use computational approaches to correct data errors in general, and may speed up high-quality PPI prediction. Such a reliable prediction may provide a solid platform for other studies such as protein function prediction and the roles of PPIs in disease susceptibility.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Machine learning for brain stroke: a review

Author: Câmara Joana
Fermé Eduardo
Sirsat Manisha Sanjay
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

Machine Learning (ML) delivers accurate and quick predictions and has become a powerful tool in health settings, offering personalized clinical care for stroke patients. The application of ML and Deep Learning in health care is growing; however, some research areas do not receive enough scientific attention despite a real need for research. Therefore, the aim of this work is to classify the state of the art in ML techniques for brain stroke into 4 categories based on their functionalities or similarity, and then to review the studies in each category systematically. A total of 39 studies on ML for brain stroke from 2007 to 2019 were identified from the ScienceDirect database. Support Vector Machine (SVM) was the optimal model in 10 studies of stroke problems. The largest number of studies addresses stroke diagnosis, while the smallest number addresses stroke treatment, which identifies a research gap for further investigation. Similarly, CT images are a frequently used dataset in stroke research. Finally, SVM and Random Forests are efficient techniques used under each category. The present study showcases the contribution of various ML approaches applied to brain stroke.

Repositório Digital da Universidade da Madeira

Novel Methods for Approximate Bayesian Inference of Independent and Evolutionarily Dependent Data

Author: Fearn James A
Publication date: 02/12/2021

Explore Bristol Research

Probabilistic multiple kernel learning

Author: Damoulas Theodoros
Publication date: 01/01/2009

The integration of multiple and possibly heterogeneous information sources for an overall decision-making process has been an open and unresolved research direction in computing science since its very beginning. This thesis attempts to address parts of that direction by proposing probabilistic data integration algorithms for multiclass decisions where an observation of interest is assigned to one of many categories based on a plurality of information channels.

Glasgow Theses Service