170 research outputs found

Bayesian non parametric modelling of Higgs pair production

Author: Alwall
Bruno Scarpa
Chipman
de Florian
Friedman
Hastie
Ishwaran
Ishwaran
Ishwaran
Lang
N. Brambilla
Polson
Tommaso Dorigo
V. Kovalenko
Y. Foka
Publication venue: 'EDP Sciences'
Publication date: 01/01/2017

A Recurrent Neural Network Survival Model: Predicting Web User Return Time

Author: A Graves
AG Hawkes
B Efron
DR Cox
DR Cox
DR Cox
FE Harrell
H Ishwaran
JD Kalbfleisch
JP Klein
M Han
N Breslow
R Chandra
S Hochreiter
X Cai
Y Bengio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/07/2018

The size of a website's active user base directly affects its value. Thus, it is important to monitor and influence a user's likelihood to return to a site. Essential to this is predicting when a user will return. Current state-of-the-art approaches to this problem come in two flavors: (1) Recurrent Neural Network (RNN) based solutions and (2) survival analysis methods. We observe that both techniques are severely limited when applied to this problem. Survival models can only incorporate aggregate representations of users instead of automatically learning a representation directly from a raw time series of user actions. RNNs can automatically learn features, but cannot be directly trained with examples of non-returning users, who have no target value for their return time. We develop a novel RNN survival model that removes the limitations of the state-of-the-art methods. We demonstrate that this model can successfully be applied to return time prediction on a large e-commerce dataset, and that it discriminates between returning and non-returning users better than either method applied in isolation. Comment: Accepted into ECML PKDD 2018; 8 figures and 1 table.

arXiv.org e-Print Archive

Crossref

Statistical Relational Learning with Formal Ontologies

Author: C. Kiefer
H. Ishwaran
I. Davidson
L.D. Raedt
M. Richardson
N. Fanizzi
N. Fanizzi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Crossref

Location Dependent Dirichlet Processes

Author: A Oliva
A Rodríguez
C Bishop
C Williams
CE Rasmussen
D Blei
D Blei
D Dunson
E Sudderth
F Zhu
H Ishwaran
J Duan
J Griffin
J Griffin
J Paisley
J Sethuraman
J Shi
L Ren
N Foti
P Orbanz
R Unnikrishnan
S Kumar
T Ferguson
X Sun
YW Teh
Publication venue
Publication date: 02/07/2017

Dirichlet processes (DP) are widely applied in Bayesian nonparametric modeling. However, in their basic form they do not directly integrate dependency information among data arising from space and time. In this paper, we propose location dependent Dirichlet processes (LDDP), which incorporate nonparametric Gaussian processes in the DP modeling framework to model such dependencies. We develop the LDDP in the context of mixture modeling, and develop a mean field variational inference algorithm for this mixture model. The effectiveness of the proposed modeling framework is shown on an image segmentation task.

arXiv.org e-Print Archive

Crossref

Overcoming data scarcity of Twitter: using tweets as bootstrap with application to autism-related topic content analysis

Author: Agarwal A.
Autism
Blei D.
Bollen J.
Chang J.
Danial J. T.
Harrington J. W.
Harshavardhan A.
Higashida N.
Himelboim I.
Hutchings C.
Hviid A.
Ishwaran H.
Jacobson J. W.
Jashinsky J.
Jiang L.
Paul M. J.
Paul M. J.
Robinson B.
Russell M. A.
Scanfeld D.
Teh Y. W.
Teh Y. W.
Trembath D.
Verma S.
Warren Z.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Notwithstanding recent work which has demonstrated the potential of using Twitter messages for content-specific data mining and analysis, the depth of such analysis is inherently limited by the scarcity of data imposed by the 140-character tweet limit. In this paper we describe a novel approach for targeted knowledge exploration which uses tweet content analysis as a preliminary step. This step is used to bootstrap more sophisticated data collection from directly related but much richer content sources. In particular, we demonstrate that valuable information can be collected by following URLs included in tweets. We automatically extract content from the corresponding web pages and, treating each web page as a document linked to the original tweet, show how a temporal topic model based on a hierarchical Dirichlet process can be used to track the evolution of a complex topic structure of a Twitter community. Using autism-related tweets, we demonstrate that our method is capable of capturing a much more meaningful picture of information exchange than user-chosen hashtags. Comment: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 201

arXiv.org e-Print Archive

Deakin Research Online

Crossref

Variable selection for large p small n regression models with incomplete data: Mapping QTL with epistases

Author: A Wagner
BS Yandell
CH Kao
CH Kao
CH Kao
d Carlborg
D Fourdrinier
Dabao Zhang
E Greenshtein
EI George
ES Lander
H Hastie
H Ishwaran
H Jeffreys
H Jeffreys
H Wang
J Fan
J Liu
JH Moore
JM Álvarez-Castro
KW Broman
LJ Leamy
M Bogdan
M Bogdan
M Zhang
M Zhang
M Żak
Martin T Wells
Min Zhang
N Yi
N Yi
N Yi
N Yi
P Huber
PJ Gaffney
R Sanjuán
RD Ball
RJ Tibshirani
RJA Little
RW Doerge
S Portnoy
S Xu
SD Tanksley
SM Williams
TJ Mitchell
W Bateson
W Shi
Y Eshed
YH Cui
ZB Zeng
ZB Zeng
ZB Zeng
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Identifying quantitative trait loci (QTL) for both additive and epistatic effects raises the statistical issue of selecting variables from a large number of candidates using a small number of observations. Missing trait and/or marker values prevent one from directly applying the classical model selection criteria such as Akaike's information criterion (AIC) and Bayesian information criterion (BIC). Results: We propose a two-step Bayesian variable selection method which deals with the sparse parameter space and the small sample size issues. The regression coefficient priors are flexible enough to incorporate the characteristic of "large p small n" data. Specifically, sparseness and possible asymmetry of the significant coefficients are dealt with by developing a Gibbs sampling algorithm to stochastically search through low-dimensional subspaces for significant variables. The superior performance of the approach is demonstrated via simulation study. We also applied it to real QTL mapping datasets. Conclusion: The two-step procedure coupled with Bayesian classification offers flexibility in modeling "large p small n" data, especially for the sparse and asymmetric parameter space. This approach can be extended to other settings characterized by high dimension and low sample size.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Assessing the clinical utility of cancer genomic and proteomic data across tumor types

Author: A Berchuck
A Margolin
Adam A Margolin
Ali Amin-Mansour
Artem Sokolov
BV Kallakury
C Mermel
C Sturgeon
DA Krueger
DR Johnson
DY Heng
E Bilal
Eliezer M Van Allen
EM Van Allen
EM Van Allen
F Harrell
G Iyer
Gad Getz
Gordon B Mills
H Faragalla
H Ishwaran
H Khella
Han Liang
J-P Brunet
John N Weinstein
Josh M Stuart
JS Falconer
JW Antoon
JY Douillard
K Ohashi
K Shih
KB Kim
Kenneth R Hess
L Garraway
L MacConaill
L Shi
Larsson Omberg
Lauren A Byers
Leng Han
Levi A Garraway
Lixia Diao
LM McShane
LM McShane
M Holdhoff
Michael S Lawrence
MS Zaman
MT Weigel
N Wagle
N Wagle
N Wagle
Nikhil Wagle
NL Henry
PA Jänne
R Tibshirani
S Liang
S Noguchi
T Sakurai
Xuelin Huang
Y Liu
Y Ni
Y Yuan
Yanxun Xu
Yuan Yuan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2014

Molecular profiling of tumors promises to advance the clinical management of cancer, but the benefits of integrating molecular data with traditional clinical variables have not been systematically studied. Here we retrospectively predict patient survival using diverse molecular data (somatic copy-number alteration, DNA methylation and mRNA, miRNA and protein expression) from 953 samples of four cancer types from The Cancer Genome Atlas project. We found that incorporating molecular data with clinical variables yielded statistically significantly improved predictions (FDR < 0.05) for three cancers, but those quantitative gains were limited (2.2–23.9%). Additional analyses revealed little predictive power across tumor types except for one case. In clinically relevant genes, we identified 10,281 somatic alterations across 12 cancer types in 2,928 of 3,277 patients (89.4%), many of which would not be revealed in single-tumor analyses. Our study provides a starting point and resources, including an open-access model evaluation platform, for building reliable prognostic and therapeutic strategies that incorporate molecular data.

Crossref

Harvard University - DASH

PubMed Central

eScholarship - University of California

Expectation-maximization algorithms for inference in Dirichlet processes mixture

Author: A. Doucet
AP Dempster
C Andrieu
DM Titterington
DM Titterington
H Ishwaran
M Sato
N Ueda
SS Dragomir
T. Kimura
T. Matsumoto
T. Nokajima
T. Tokuda
Y. Nakada
Z Liu
Z Zivkovic
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref

Satisfaction with web-based training in an integrated healthcare delivery network: do age, education, computer skills and attitudes matter?

Author: Andrew J Fishleder
Anil K Jain
AP Choules
Ashish Atreja
BP Kerfoot
C Urquhart
CM Harris
CP Friedman
DA Cook
DA Cook
DL Kirkpatrick
E Knebel
EA Nelson
G Singh
GA Debourgh
GS Letterie
Hemant Ishwaran
HS Chumley-Jones
J Davis
J Morrissey
JA Pereira
JC Anderson
JG Ruiz
JP Naidr
L Atack
L Breiman
L Howatson-Jones
LO Gostin
M Avital
M Hollander
MG Moore
Michel Avital
MJ Lewis
N Mehta
Neil B Mehta
PA Cohen
R Blair
R Ihaka
R Phipps
RA Kanten-McCoy
SG Lesh
TL Russell
TM Bishop
VR Curran
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Healthcare institutions spend enormous time and effort to train their workforce. Web-based training can potentially streamline this process. However, the deployment of web-based training in a large-scale setting with a diverse healthcare workforce has not been evaluated. The aim of this study was to evaluate the satisfaction of healthcare professionals with web-based training and to determine the predictors of such satisfaction, including age, education status and computer proficiency. Methods: Observational, cross-sectional survey of healthcare professionals from six hospital systems in an integrated delivery network. We measured overall satisfaction with web-based training and response to survey items measuring Website Usability, Course Usefulness, Instructional Design Effectiveness, Computer Proficiency and Self-learning Attitude. Results: A total of 17,891 healthcare professionals completed the web-based training on the HIPAA Privacy Rule; of these, 13,537 completed the survey (response rate 75.6%). Overall course satisfaction was good (median, 4; scale, 1 to 5), with more than 75% of the respondents satisfied with the training (rating 4 or 5) and 65% preferring web-based training over traditional instructor-led training (rating 4 or 5). Multivariable ordinal regression revealed 3 key predictors of satisfaction with web-based training: Instructional Design Effectiveness, Website Usability and Course Usefulness. Demographic predictors such as gender, age and education did not have an effect on satisfaction. Conclusion: The study shows that web-based training, when tailored to learners' background, is perceived as a satisfactory mode of learning by an interdisciplinary group of healthcare professionals, irrespective of age, education level or prior computer experience. Future studies should aim to measure the long-term outcomes of web-based training.

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Miami: Scholarship Miami

UvA-DARE

International Migration, Integration and Social Cohesion online publications