11 research outputs found

Federated Ensemble Regression Using Classification

Author: A Ahmad
A Ali
A Koleti
CN Silla
E Dolgin
J Mendes-Moreira
L Breiman
L Breiman
N Japkowicz
N Rooney
NV Chawla
OI Orhobor
PA Futreal
R Dash
R Ihaka
S Sonnenburg
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Ensemble learning has been shown to significantly improve predictive accuracy in a variety of machine learning problems. For a given predictive task, the goal of ensemble learning is to improve predictive accuracy by combining the predictive power of multiple models. In this paper, we present an ensemble learning algorithm for regression problems which leverages the distribution of the samples in a learning set to achieve improved performance. We apply the proposed algorithm to a problem in precision medicine where the goal is to predict drug perturbation effects on genes in cancer cell lines. The proposed approach significantly outperforms the base case.

Crossref

Chalmers Research

Generating Explainable and Effective Data Descriptors Using Relational Learning: Application to Cancer Biology

Author: A Cherkasov
A Clare
A Gaulton
A Koleti
A Srinivasan
AE Hoerl
DS Wishart
EP Barracchia
I Olier
J Verma
JW Lloyd
L Breiman
L Dehaspe
M Ceci
M Zitnik
MP Menden
NP Tatonetti
R Tibshirani
RD King
RD King
S Fröhler
S Muggleton
S Sonnenburg
SJ Russell
T Dash
T Takeda
W Jeon
Y Chen
Y LeCun
Y Park
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

The key to success in machine learning is the use of effective data representations. The success of deep neural networks (DNNs) is based on their ability to utilize multiple neural network layers, and big data, to learn how to convert simple input representations into richer internal representations that are effective for learning. However, these internal representations are sub-symbolic and difficult to explain. In many scientific problems explainable models are required, and the input data is semantically complex and unsuitable for DNNs. This is true in the fundamental problem of understanding the mechanism of cancer drugs, which requires complex background knowledge about the functions of genes/proteins, their cells, and the molecular structure of the drugs. This background knowledge cannot be compactly expressed propositionally, and requires at least the expressive power of Datalog. Here we demonstrate the use of relational learning to generate new data descriptors in such semantically complex background knowledge. These new descriptors are effective: adding them to standard propositional learning methods significantly improves prediction accuracy. They are also explainable, and add to our understanding of cancer. Our approach can readily be expanded to include other complex forms of background knowledge, and combines the generality of relational learning with the efficiency of standard propositional learning.

Crossref

Chalmers Research

Evolving BioAssay Ontology (BAO): modularization, integration and applications

Author: Abeyruwan Saminda
Bittker Joshua A
Brudz Steve
Bureeva Svetlana
Chung Caty
Clemons Paul A
Koleti Amar
Küçük-McGinty Hande
Lemmon Vance
Mir Ahsan
Morales Arturo J
Romacker Martin
Sakurai Kunie
Schürer Stephan C
Siripala Anosha
Twomey David
Vempati Uma D
Visser Ubbo
Publication venue: BioMed Central
Publication date: 01/01/2014

The lack of established standards to describe and annotate biological assays and screening outcomes in the domain of drug and chemical probe discovery severely limits the use of public and proprietary drug screening data to their maximum potential. We have created the BioAssay Ontology (BAO) project (http://bioassayontology.org) to develop common reference metadata terms and definitions required for describing relevant information of low- and high-throughput drug and probe screening assays and results. The main objectives of BAO are to enable effective integration, aggregation, retrieval, and analyses of drug screening data. Since we first released BAO on the BioPortal in 2010, we have considerably expanded and enhanced BAO and have applied the ontology in several internal and external collaborative projects, for example the BioAssay Research Database (BARD). We describe the evolution of BAO with a design that enables modeling complex assays including profile and panel assays such as those in the Library of Integrated Network-based Cellular Signatures (LINCS). One of the critical questions in evolving BAO is the following: how can we efficiently reuse and share specific parts of our ontologies among various research projects without violating the integrity of the ontology and without creating redundancies? This paper provides a comprehensive answer to this question with a description of a methodology for ontology modularization using a layered architecture. Our modularization approach defines several distinct BAO components and separates internal from external modules and domain-level from structural components. This approach facilitates the generation/extraction of derived ontologies (or perspectives) that can suit particular use cases or software applications.
We describe the evolution of BAO related to its formal structures, engineering approaches, and content to enable modeling of complex assays and integration with other ontologies and datasets.

Crossref

Springer - Publisher Connector

PubMed Central

University of Miami: Scholarship Miami

Datasets2Tools, repository and search engine for bioinformatics datasets, tools and canned analyses

Author: A Koleti
AB Keenan
D Warde-Farley
E Chen
J Beel
KM Jagodnik
L Ohno-Machado
M Bostock
M Grinberg
MD Wilkinson
MV Kuleshov
N Clark
NF Fernandez
Q Duan
R Edgar
R Margolis
VJ Henry
Z Wang
Publication venue: Springer Science and Business Media LLC
Publication date: 27/02/2018

Biomedical data repositories such as the Gene Expression Omnibus (GEO) enable the search and discovery of relevant biomedical digital data objects. Similarly, resources such as OMICtools index bioinformatics tools that can extract knowledge from these digital data objects. However, systematic access to pre-generated ‘canned’ analyses applied by bioinformatics tools to biomedical digital data objects is currently not available. Datasets2Tools is a repository indexing 31,473 canned bioinformatics analyses applied to 6,431 datasets. The Datasets2Tools repository also indexes 4,901 published bioinformatics software tools and all the analyzed datasets. Datasets2Tools enables users to rapidly find datasets, tools, and canned analyses through an intuitive web interface, a Google Chrome extension, and an API. Furthermore, Datasets2Tools provides a platform for contributing canned analyses, datasets, and tools, as well as evaluating these digital objects according to their compliance with the findable, accessible, interoperable, and reusable (FAIR) principles. By incorporating community engagement, Datasets2Tools promotes sharing of digital resources to stimulate the extraction of knowledge from biomedical research data. Datasets2Tools is freely available from: http://amp.pharm.mssm.edu/datasets2tools

Crossref

University of Miami: Scholarship Miami

Validating Antibodies for Quantitative Western Blot Measurements with Microwestern Array

Author: A Koleti
B Haibe-Kains
BT Hennessy
J Bordeaux
J Bourbeillon
KA Janes
L Charboneau
M Bouhaddou
M Uhlen
M Zellner
MD Wilkinson
MF Ciaccio
MF Ciaccio
R Tibes
S Nishizuka
SC Taylor
SL Eaton
SM Corsello
Publication venue: Springer Science and Business Media LLC

Crossref

FAIR LINCS Data and Metadata powered by the CEDAR Framework

Author: Amar Koleti (3354098)
Avi Ma’ayan (581211)
Caroline Monteiro (3354107)
Caty Chung (3352190)
Christopher Mader (3354101)
Csongor I. Nyulas (3352646)
Daniel Cooper (3354113)
Dusica Vidovic (684990)
John Graybeal (3352631)
Mario Medvedovic (1779)
Mark A. Musen (107442)
Martin J. O'Connor (3352619)
Michele Forlin (1300680)
Rafael S. Gonçalves (3354116)
Stephan Schürer (3354080)
Vance Lemmon (515697)
Vasileios Stathias (680557)
Wen Niu (381056)
Publication date: 23/11/2016

The Library of Integrated Network-based Cellular Signatures (LINCS) program generates a wide variety of cell-based perturbation-response signatures using diverse assay technologies. For example, LINCS includes large-scale transcriptional profiling of genetic and small molecule perturbations, and various proteomics and imaging datasets. We have developed data processing pipelines and supporting informatics infrastructure to access, standardize and harmonize, register, and publish LINCS datasets and metadata from all Data and Signature Generating Centers (DSGCs). Metadata standards specifications provide a foundation for harmonizing and integrating LINCS data. Here we introduce a CEDAR-based LINCS Community Metadata Environment, an end-to-end metadata management framework that supports authoring, curation, validation, management, and sharing of LINCS metadata, while building upon the existing LINCS metadata standards and data-release workflows. Following this initial validation, our goal is to create reusable metadata modules with user-friendly templates for each of the LINCS metadata categories and to make our suite of tools compatible with the CEDAR metadata technologies. This should further simplify metadata handling in the LINCS consortium and facilitate a global metadata repository at CEDAR. As other projects apply the same approach, many more datasets will become cross-searchable and can be linked, optimizing the metadata pathway from submission to discovery.

University of Miami: Scholarship Miami

FigShare

Connecting omics signatures and revealing biological mechanisms with iLINCS

Author: Bennett Mark F
Biesiada Jacek
Clark Nicholas A
Clarke Daniel J. B
Davidson Sarah E
Fazel-Najafabadi Mehdi
Karim Rashid
Koleti Amar
Kouril Michal
Mahi Naim
Ma’ayan Avi
Medvedovic Mario
Meller Jarek
Niu Wen
Pilarczyk Marcin
Reichard John F
Ren Yan
Roberts Kurt
Schürer Stephan C
Shamsaei Behrouz
Stathias Vasileios
Vasiliauskas Juozas
Vidovic Dusica
White Shana
Xu Huan
Zhang Lixia
Publication venue: Nature Publishing Group
Publication date: 01/01/2022

There are only a few platforms that integrate multiple omics data types, bioinformatics tools, and interfaces for integrative analyses and visualization that do not require programming skills. Here we present iLINCS (http://ilincs.org), an integrative web-based platform for analysis of omics data and signatures of cellular perturbations. The platform facilitates mining and re-analysis of the large collection of omics datasets (>34,000), pre-computed signatures (>200,000), and their connections, as well as the analysis of user-submitted omics signatures of diseases and cellular perturbations. iLINCS analysis workflows integrate vast omics data resources and a range of analytics and interactive visualization tools into a comprehensive platform for analysis of omics signatures. iLINCS user-friendly interfaces enable execution of sophisticated analyses of omics signatures, mechanism of action analysis, and signature-driven drug repositioning. We illustrate the utility of iLINCS with three use cases involving analysis of cancer proteogenomic signatures, COVID-19 transcriptomic signatures, and mTOR signaling.

PubMed Central

University of Miami: Scholarship Miami

Computational Drug Repositioning for Gastric Cancer using Reversal Gene Expression Profiles

Author: A Basu
A Gaulton
A Gschwind
A Koleti
A Ramasamy
AB Keenan
B Seashore-Ludlow
BA Kidd
CR Chong
DD Kang
F Iorio
G Kauselmann
GC Tseng
GJ van Westen
H Xue
HQ Zhang
J Barretina
J Lamb
J Li
J Zhang
JA DiMasi
JK Choi
JM Park
L Cheng
LJ He
M Sirota
M Zhang
MH Chen
N Dovrolis
NS Jahchan
Q Duan
Q Gao
R Huang
R Ren
S Lu
S Oliver
SS Wong
T Barrett
V Law
V van Noort
WF Anderson
X Wang
YJ Bang
YY Janjigian
Publication venue: Springer Science and Business Media LLC

Crossref

Drug and disease signature integration identifies synergistic combinations in glioblastoma

Author: A Ianevski
A Koleti
A Subramanian
AB Keenan
AM Kurimchak
AM Mohammadi
AP Patel
B Mukherjee
B Yadav
BL Carlson
C Berthon
C Bliss
C Pastori
C Pastori
C Penas
CO de Groot
CY Fong
E Morgan
E Raymond
GL Johnson
J Sigmond
J Wang
J Zhang
JS Duncan
JS Logue
KC Wei
KR Kampen
M Niepel
M Yemisci
MA Qazi
MI Love
MO Jacus
Cancer Genome Atlas Research Network
N Mojas
ND Adams
P Ciceri
P Filippakopoulos
P Filippakopoulos
P Gautam
P Rathert
QT Ostrom
R Stupp
RGW Verhaak
S Loewe
S Shu
SK Carlsson
SK Tan
SY Lee
TJ Stuhlmiller
Y Zhu
Z An
Z Cheng
ZC Gersey
Publication venue: Springer Science and Business Media LLC

Crossref

Learning important features from multi-view data to predict drug side effects

Author: A Bonvin
A Koleti
A Mitchell
AJ Barsky
AP Davis
BE Trevor
C Ding
C Li
C Shi
C Xiong
D-S Cao
E Pauwels
F Nie
F Pedregosa
F Roubille
G Becker
H Iwata
H Jaeschke
H Luo
H Luo
H Yang
H Zou
J Dong
J Mojoo
J Yu
JJ Hornberg
Jun Li
K McArthur
KM Giacomini
Lingzhi Qu
M Belkin
M Belkin
M Kuhn
M Kuhn
M Liu
M Takeda
M Zhang
M-L Zhang
MG Coulthard
N Atias
P Langfelder
Pengfei Zhang
Q Wang
R Cerri
R-E Fan
RR Shah
S Lee
S Mizutani
S Modi
S Reid
S Wan
SM Ivanov
T Xia
U Consortium
V Law
X Chen
X Liang
X Wang
X Zhang
X-l Li
Xujun Liang
Y Nesterov
Y Wang
Y Xu
Y Yamanishi
Ying Fu
Yongheng Chen
Z Wang
Zhuchu Chen
Publication venue: Springer Science and Business Media LLC

Crossref