Likelihood Inference for Models with Unobservables: Another View
There have been controversies among statisticians on (i) what to model and (ii) how to make inferences from models with unobservables. One such controversy concerns the difference between estimation methods for marginal means, which need not have a probabilistic basis, and statistical models with unobservables, which do. Another concerns likelihood-based inference for statistical models with unobservables. This requires an extended-likelihood framework, and we show how one such extension, hierarchical likelihood, allows it to be done. Modeling of unobservables leads to rich classes of new probabilistic models from which likelihood-type inferences can be made naturally with hierarchical likelihood.

Comment: This paper is discussed in [arXiv:1010.0804], [arXiv:1010.0807] and [arXiv:1010.0810], with a rejoinder at [arXiv:1010.0814]. Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/09-STS277.
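For readers unfamiliar with the hierarchical likelihood, a minimal sketch in the usual Lee-Nelder notation (assumed here, not quoted from the paper): with observed data y, unobservables v and fixed parameters θ, the h-likelihood adds the log density of the unobservables to the ordinary conditional log likelihood.

```latex
% Minimal sketch of the hierarchical (h-)likelihood, assuming the usual
% notation: y = observed data, v = unobservables, \theta = fixed parameters.
\[
  h(\theta, v; y) = \log f_\theta(y \mid v) + \log f_\theta(v)
\]
% Fixed parameters are typically estimated from an adjusted profile of h,
% while the unobservables v are predicted by maximizing h jointly.
```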
Resolving the induction problem: Can we state with complete confidence via induction that the sun rises forever?
Induction is a form of reasoning from the particular example to the general
rule. However, establishing the truth of a general proposition is problematic,
because it is always possible for a conflicting observation to occur. This
problem is known as the induction problem. The sunrise problem is a
quintessential example of the induction problem, which was first introduced by
Laplace (1814). However, in Laplace's solution, a zero probability was assigned
to the proposition that the sun will rise forever, regardless of the number of
observations made. Therefore, it has often been stated that complete confidence
regarding a general proposition can never be attained via induction. In this
study, we attempted to overcome this skepticism by using a recently developed
theoretically consistent procedure. The findings demonstrate that through
induction, one can rationally gain complete confidence in propositions based on
scientific theory.
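For context on Laplace's zero-probability assignment (a hedged numerical sketch, not the paper's new procedure): under a uniform prior on the daily sunrise probability, n observed sunrises give a Beta(n+1, 1) posterior, and the probability of m further consecutive sunrises is (n+1)/(n+m+1), which vanishes as m grows.

```python
# Hedged sketch of Laplace's rule of succession (not the paper's procedure):
# with a uniform Beta(1, 1) prior on the daily sunrise probability p, observing
# n sunrises in a row gives the posterior Beta(n+1, 1), and the probability of
# m further consecutive sunrises is (n + 1) / (n + m + 1).
def prob_next_m_sunrises(n: int, m: int) -> float:
    return (n + 1) / (n + m + 1)

n = 2_000_000  # roughly 5,000+ years of daily sunrises (illustrative figure)
for m in (1, n, 100 * n, 10**12):
    print(m, prob_next_m_sunrises(n, m))
# As m grows without bound the probability tends to 0, which is Laplace's
# zero probability for the proposition "the sun rises forever".
```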
Estimation of multivariate normal mean and its application to mixed linear models
Let X = (x_1, x_2, ..., x_p)' be a multivariate normal random variable with mean vector θ in a space Θ and variance matrix I.

From Strawderman's (1971) class of estimators, we derive a minimax admissible estimator for θ. It has a relatively simple form when p is greater than or equal to five. We also extend Stein's (1973) technique to evaluate unbiased estimators of risks for discontinuous estimators. Then, we show the exact risks of a preliminary test estimator and of compromised or mixture estimators. We develop estimators that shrink towards some subspace of Θ and show the relationship between shrinkage functionals and variance component estimators in balanced mixed linear models. We also investigate the asymptotic behavior of shrinkage estimators. By choosing an appropriate subspace, we show that our estimator and ridge regression estimators achieve stability of prediction in a particular data example.

References:
Strawderman, W. E. 1971. Proper Bayes minimax estimators of the multivariate normal mean. The Annals of Mathematical Statistics 42: 385-388.
Stein, C. 1973. Estimation of the mean of a multivariate distribution. Proceedings of the Prague Symposium on Asymptotic Statistics: 345-387.
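For a concrete sense of the shrinkage estimators discussed, here is a hedged sketch of the classic James-Stein estimator toward the origin; it is not the dissertation's Strawderman-class or subspace-shrinkage estimator, only the standard textbook construction for X ~ N(θ, I).

```python
import numpy as np

def james_stein(x: np.ndarray) -> np.ndarray:
    """Classic James-Stein estimator of a multivariate normal mean with
    identity covariance; shrinks the single observation toward the origin.
    A hedged illustration, not the minimax admissible estimator of the abstract."""
    p = x.size
    if p < 3:
        return x.copy()                      # no shrinkage gain below p = 3
    shrink = 1.0 - (p - 2) / np.dot(x, x)    # 1 - (p - 2) / ||x||^2
    return shrink * x

rng = np.random.default_rng(0)
theta = np.zeros(10)
x = rng.normal(theta, 1.0)                   # one observation X ~ N(theta, I)
# Squared-error loss of the raw observation vs. the shrinkage estimate.
print(np.sum((x - theta) ** 2), np.sum((james_stein(x) - theta) ** 2))
```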
μ-Oxido-bis[bis(pentafluorophenolato)(η5-pentamethylcyclopentadienyl)titanium(IV)]
The dinuclear title complex, [Ti2(C10H15)2(C6F5O)4O], features two Ti(IV) atoms bridged by an O atom, which lies on an inversion centre. The Ti(IV) atom is bonded to an η5-pentamethylcyclopentadienyl ring, to two pentafluorophenolate anions and to the bridging O atom. The environment around the Ti(IV) atom can be considered as a distorted tetrahedron. The cyclopentadienyl ring is disordered over two sets of sites [site occupancy = 0.824 (8) for the major component].
Deep Neural Networks for Semiparametric Frailty Models via H-likelihood
For prediction of clustered time-to-event data, we propose a new deep neural network-based gamma frailty model (DNN-FM). An advantage of the proposed model
is that the joint maximization of the new h-likelihood provides maximum
likelihood estimators for fixed parameters and best unbiased predictors for
random frailties. Thus, the proposed DNN-FM is trained by using a negative
profiled h-likelihood as a loss function, constructed by profiling out the
non-parametric baseline hazard. Experimental studies show that the proposed method improves prediction performance over existing methods. A real-data analysis shows that including subject-specific frailties improves prediction relative to the DNN-based Cox model (DNN-Cox).
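As a hedged sketch of the kind of loss function involved, the snippet below computes a Breslow-form negative Cox partial likelihood in which the baseline hazard has been profiled out and a per-cluster log-frailty enters as an offset to the network output. Function names and the toy data are assumptions for illustration, not the paper's exact profiled h-likelihood loss.

```python
import numpy as np

def neg_log_partial_likelihood(time, event, score):
    """Negative Cox partial likelihood with the baseline hazard profiled out
    (Breslow form). `score` is each subject's log relative risk, e.g. a DNN
    output plus a per-cluster log-frailty offset (an assumption made here,
    not the paper's exact DNN-FM loss)."""
    order = np.argsort(-time)                   # sort by descending survival time
    s = np.asarray(score)[order]
    d = np.asarray(event)[order].astype(bool)
    log_risk_set = np.logaddexp.accumulate(s)   # log sum over {j : t_j >= t_i}
    return -(s[d] - log_risk_set[d]).sum()

# Toy usage: 4 subjects in 2 clusters with assumed log-frailty offsets.
time = np.array([2.0, 5.0, 3.0, 8.0])
event = np.array([1, 0, 1, 1])
dnn_out = np.array([0.3, -0.1, 0.7, 0.2])       # stand-in for network output
log_frailty = np.array([0.1, 0.1, -0.2, -0.2])  # per-cluster offsets
print(neg_log_partial_likelihood(time, event, dnn_out + log_frailty))
```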
Super-sparse principal component analyses for high-throughput genomic data
Background: Principal component analysis (PCA) has gained popularity as a method for the analysis of high-dimensional genomic data. However, it is often difficult to interpret the results because the principal components are linear combinations of all variables, and the coefficients (loadings) are typically nonzero. These nonzero values also reflect poor estimation of the true loading vectors; for example, for gene expression data, biologically we expect only a portion of the genes to be expressed in any tissue, and an even smaller fraction to be involved in a particular process. Sparse PCA methods have recently been introduced for reducing the number of nonzero coefficients, but these existing methods are not satisfactory for high-dimensional data applications because they still give too many nonzero coefficients.

Results: Here we propose a new PCA method that uses two innovations to produce an extremely sparse loading vector: (i) a random-effect model on the loadings that leads to an unbounded penalty at the origin and (ii) shrinkage of the singular values obtained from the singular value decomposition of the data matrix. We develop a stable computing algorithm by modifying the nonlinear iterative partial least squares (NIPALS) algorithm, and illustrate the method with an analysis of the NCI cancer dataset that contains 21,225 genes.

Conclusions: The new method has better performance than several existing methods, particularly in the estimation of the loading vectors.
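Below is a minimal NIPALS-style sketch for the leading component, using a plain soft-threshold on the loadings as a stand-in for the paper's random-effect penalty and singular-value shrinkage; the function name, threshold value and toy data are assumptions for illustration only.

```python
import numpy as np

def sparse_nipals_first_pc(X, threshold=0.1, n_iter=100, tol=1e-8):
    """NIPALS-style extraction of the first principal component with
    soft-thresholding of the loading vector. The soft-threshold is a simple
    stand-in for the unbounded-at-the-origin penalty described in the abstract."""
    X = X - X.mean(axis=0)                      # column-centre the data matrix
    t = X[:, 0].copy()                          # initial score vector
    for _ in range(n_iter):
        v = X.T @ t / (t @ t)                   # loadings from regression on scores
        v = np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)  # soft-threshold
        norm = np.linalg.norm(v)
        if norm == 0:
            break                               # everything shrunk to zero
        v /= norm
        t_new = X @ v                           # updated score vector
        if np.linalg.norm(t_new - t) < tol:
            t = t_new
            break
        t = t_new
    return t, v                                 # scores and sparse loading vector

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 200))                  # toy high-dimensional data
scores, loadings = sparse_nipals_first_pc(X, threshold=0.05)
print((loadings != 0).sum(), "nonzero loadings out of", loadings.size)
```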