
302 research outputs found

Generalized extreme value regression for binary response data: An application to B2B electronic payments system adoption

Author: Dey Dipak K.
Wang Xia
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 07/01/2011

In information systems research, a question of particular interest is how to interpret and predict the probability that a firm adopts a new technology, so that market promotions can be targeted only at those firms that are more likely to adopt it. Typically, there is a significant difference between the observed numbers of "adopters" and "nonadopters," which are usually coded as a binary response. A critical issue in modeling such binary response data is the appropriate choice of link function in a regression model. In this paper we introduce a new flexible skewed link function for modeling binary response data based on the generalized extreme value (GEV) distribution. We show how the proposed GEV links provide more flexible and improved skewed link regression models than the existing skewed links, especially when dealing with imbalance between the observed numbers of 0's and 1's in a data set. The flexibility of the proposed model is illustrated through simulated data sets and a billing data set of the electronic payments system adoption from a Fortune 100 company in 2005. Comment: Published at http://dx.doi.org/10.1214/10-AOAS354 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Asymptotics of a Clustering Criterion for Smooth Distributions

Author: Bharath Karthik
Dey Dipak K
Pozdnyakov Vladimir
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2013

We develop a clustering framework for observations from a population with a smooth probability distribution function and derive its asymptotic properties. A clustering criterion based on a linear combination of order statistics is proposed. The asymptotic behavior of the point at which the observations are split into two clusters is examined. The results obtained can then be used to construct an interval estimate of the point that splits the data and to develop tests for bimodality and the presence of clusters.

arXiv.org e-Print Archive

Crossref

Model-Based Method for Social Network Clustering

Author: Dey Dipak K.
Ouyang Guang
Zhang Panpan
Publication date: 26/03/2018

In this note, we propose a simple mixed membership model for social network clustering. A flexible function is adopted to measure affinities among a set of entities in a social network. The model not only allows each entity in the network to possess more than one membership, but also provides accurate statistical inference about network structure. We estimate the membership parameters using an MCMC algorithm. We evaluate the performance of the proposed algorithm by applying our model to two empirical social network data sets, the Zachary club data and the bottlenose dolphin network data. We also conduct numerical studies on different types of simulated networks to assess the effectiveness of our algorithm. Finally, some concluding remarks and future work are addressed briefly.

arXiv.org e-Print Archive

Categorical data analysis using a skewed Weibull regression model

Author: Caron Renault
Dey Dipak
Polpo Adriano
Sinha Debajyoti
Publication venue: 'MDPI AG'
Publication date: 15/11/2017

In this paper, we present a Weibull link (skewed) model for categorical response data arising from binomial as well as multinomial models. We show that, for such types of categorical data, the most commonly used models (logit, probit and complementary log-log) can be obtained as limiting cases. We further compare the proposed model with some other asymmetrical models. Bayesian as well as frequentist estimation procedures for binomial and multinomial response data are presented in detail. Two data sets are analyzed to show the efficiency of the proposed model.

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Generalized Variable Selection Algorithms for Gaussian Process Models by LASSO-like Penalty

Author: Dey Dipak
Hu Zhiyong
Publication date: 08/09/2023

With the rapid development of modern technology, massive amounts of data with complex patterns are generated. Gaussian process models, which can easily fit non-linearity in data, are becoming increasingly popular. It is often the case that only a few features in a data set are important or active. However, unlike in classical linear models, it is challenging to identify active variables in Gaussian process models. One of the most commonly used methods for variable selection in Gaussian process models is automatic relevance determination, which is known to be open-ended. There is no rule of thumb to determine the threshold for dropping features, which makes variable selection in Gaussian process models ambiguous. In this work, we propose two variable selection algorithms for Gaussian process models, which use artificial nuisance columns as a baseline for identifying the active features. Moreover, the proposed methods work for both regression and classification problems. The algorithms are demonstrated using comprehensive simulation experiments and an application to multi-subject electroencephalography data that studies alcoholic levels of experimental subjects.

arXiv.org e-Print Archive

Wavelet modeling of priors on triangles

Author: Dey Dipak K.
Wang Yazhen
Publication venue: Elsevier Inc.
Publication date: 31/05/2004

Parameters in statistical problems often live in a geometry of certain shape. For example, count probabilities in a multinomial distribution belong to a simplex. For these problems, Bayesian analysis needs to model priors satisfying certain constraints imposed by the geometry. This paper investigates modeling of priors on triangles by use of wavelets constructed specifically for triangles. Theoretical analysis and numerical simulations show that our modeling is flexible and is superior to the commonly used Dirichlet prior.

Elsevier - Publisher Connector