30,663 research outputs found

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those already associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Supervised cross-modal factor analysis for multiple modal data classification

Author: Bensmail Halima
Duan Kanghong
Wang Jim Jing-Yan
Wang Jingbin
Zhou Yihua
Publication date: 18/08/2015

In this paper, we study the problem of learning from multimodal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two modalities into a shared data space, so that the classification of an image or a text can be performed directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both the image and text modalities of documents. We project both image and text data into a shared data space by factor analysis, and then train a class label predictor in the shared space to use the class label information. The factor analysis parameter and the predictor parameter are learned jointly by solving a single objective function. With this objective function, we minimize both the distance between the projections of the image and text of the same document and the classification error of the projections, measured by the hinge loss function. The objective function is optimized by an alternating optimization strategy in an iterative algorithm. Experiments on two different multimodal document data sets show the advantage of the proposed algorithm over other CFA methods.
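The joint objective described in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption made for illustration: the function name `supervised_cfa_objective`, the linear projections `U` and `V`, the linear predictor `(w, b)`, the trade-off weight `C`, binary labels in {-1, +1}, and the omission of any constraints or regularizers. It only shows how an alignment term and a hinge-loss term might be combined into one value.

```python
import numpy as np

def supervised_cfa_objective(X, Y, U, V, w, b, labels, C=1.0):
    """Illustrative joint objective (assumed form, not the paper's exact one).

    X: (n, dx) image features, Y: (n, dy) text features.
    U: (dx, k) and V: (dy, k) project each modality into a shared k-dim space.
    w: (k,), b: scalar define a linear predictor in the shared space.
    labels: (n,) array with entries in {-1, +1}.
    C: hypothetical weight trading off alignment against classification error.
    """
    PX = X @ U  # image projections in the shared space
    PY = Y @ V  # text projections in the shared space

    # Alignment term: squared distance between the two projections
    # of the same document, summed over documents.
    alignment = np.sum((PX - PY) ** 2)

    # Classification term: hinge loss of the shared-space predictor,
    # applied to both the image and the text projection.
    hinge_img = np.maximum(0.0, 1.0 - labels * (PX @ w + b))
    hinge_txt = np.maximum(0.0, 1.0 - labels * (PY @ w + b))

    return alignment + C * (hinge_img.sum() + hinge_txt.sum())
```

An alternating scheme of the kind the abstract mentions would presumably hold `(w, b)` fixed while updating `U` and `V`, then hold the projections fixed while updating the predictor, repeating until this value stops decreasing.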

arXiv.org e-Print Archive

CiteSeerX

Crossref

On Geometric Alignment in Low Doubling Dimension

Author: Ding Hu
Ye Mingquan
Publication date: 18/11/2018

In the real world, many problems can be formulated as the alignment between two geometric patterns. Previously, a great amount of research has focused on the alignment of 2D or 3D patterns, especially in the field of computer vision. Recently, the alignment of geometric patterns in high dimensions has found several novel applications and has attracted increasing attention. However, the research is still rather limited in terms of algorithms. To the best of our knowledge, most existing approaches for high-dimensional alignment are just simple extensions of their counterparts for the 2D and 3D cases, and often suffer from issues such as high complexity. In this paper, we propose an effective framework to compress high-dimensional geometric patterns while approximately preserving the alignment quality. As a consequence, existing alignment approaches can be applied to the compressed geometric patterns, and thus the time complexity is significantly reduced. Our idea is inspired by the observation that high-dimensional data often has a low intrinsic dimension. We adopt the widely used notion of "doubling dimension" to measure the extent of our compression and the resulting approximation. Finally, we test our method on both random and real datasets. The experimental results reveal that running the alignment algorithm on compressed patterns achieves quality similar to that obtained on the original patterns, while the running times (including the time spent on compression) are substantially lower.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

An interactome-centered protein discovery approach reveals novel components involved in mitosome function and homeostasis in giardia lamblia

Author: Samuel Rout
Jon Paulin Zumthor
Elisabeth M. Schraner
Carmen Faso
Adrian B. Hehl
Krishan Kumar
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/1995

Protozoan parasites of the genus Giardia are highly prevalent globally, and infect a wide range of vertebrate hosts including humans, with proliferation and pathology restricted to the small intestine. This narrow ecological specialization entailed extensive structural and functional adaptations during host-parasite co-evolution. An example is the streamlined mitosomal proteome with iron-sulphur protein maturation as the only biochemical pathway clearly associated with this organelle. Here, we applied techniques in microscopy and protein biochemistry to investigate the mitosomal membrane proteome in relation to mitosome homeostasis. Live cell imaging revealed a highly immobilized array of 30–40 physically distinct mitosome organelles in trophozoites. We provide direct evidence for the single giardial dynamin-related protein as a contributor to mitosomal morphogenesis and homeostasis. To overcome inherent limitations that have hitherto severely hampered the characterization of these unique organelles, we applied a novel interaction-based proteome discovery strategy using forward and reverse protein co-immunoprecipitation. This allowed generation of organelle proteome data strictly in a protein-protein interaction context. We built an initial Tom40-centered outer membrane interactome by co-immunoprecipitation experiments, identifying small GTPases, factors with dual mitosome and endoplasmic reticulum (ER) distribution, as well as novel matrix proteins. Through iterative expansion of this protein-protein interaction network, we were able to i) significantly extend this interaction-based mitosomal proteome to include other membrane-associated proteins with possible roles in mitosome morphogenesis and connection to other subcellular compartments, and ii) identify novel matrix proteins which may shed light on mitosome-associated metabolic functions other than Fe-S cluster biogenesis. Functional analysis also revealed conceptual conservation of protein translocation despite the massive divergence and reduction of protein import machinery in Giardia mitosomes.

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Directory of Open Access Journals

ZORA

Bern Open Repository and Information System (BORIS)

International Migration, Integration and Social Cohesion online publications

FigShare

Dissertations of the University of Groningen