    Should Optimal Designers Worry About Consideration?

    Consideration set formation using non-compensatory screening rules is a vital component of real purchasing decisions with decades of experimental validation. Marketers have recently developed statistical methods that can estimate quantitative choice models that include consideration set formation via non-compensatory screening rules. But is capturing consideration within models of choice important for design? This paper reports on a simulation study of vehicle portfolio design in which households screen over vehicle body style, built to explore the importance of capturing consideration rules for optimal designers. We generate synthetic market share data, fit a variety of discrete choice models to the data, and then optimize design decisions using the estimated models. Model predictive power, design "error", and profitability relative to ideal profits are compared as the amount of market data available increases. We find that even when estimated compensatory models provide relatively good predictive accuracy, they can lead to sub-optimal design decisions when the population uses consideration behavior; convergence of compensatory models to non-compensatory behavior is likely to require unrealistic amounts of data; and modeling heterogeneity in non-compensatory screening is more valuable than heterogeneity in compensatory trade-offs. This supports the claim that designers should carefully identify consideration behaviors before optimizing product portfolios. We also find that higher model predictive power does not necessarily imply better design decisions; that is, different model forms can provide "descriptive" rather than "predictive" information that is useful for design. Comment: 5 figures, 26 pages. In Press at ASME Journal of Mechanical Design (as of 3/17/15)
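
The contrast between compensatory and consider-then-choose models in the abstract above can be illustrated with a minimal sketch. It is not the paper's model: the vehicle names, attributes, the `price_coef` value, and the utility form are all assumptions made for illustration. A household first screens out unacceptable body styles (the non-compensatory step), then applies a multinomial logit over whatever survives.

```python
import math

def logit_shares(utilities):
    """Multinomial logit choice shares for a dict {product: utility}."""
    if not utilities:
        return {}
    top = max(utilities.values())  # subtract the max for numerical stability
    weights = {name: math.exp(u - top) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def consider_then_choose(products, acceptable_styles, price_coef=-0.1):
    """Screen on body style first, then apply logit over the survivors.

    price_coef and the linear utility (price_coef * price + quality) are
    hypothetical choices, not values taken from the paper.
    """
    considered = {
        name: price_coef * price + quality
        for name, (style, price, quality) in products.items()
        if style in acceptable_styles
    }
    shares = logit_shares(considered)
    # Screened-out products get exactly zero share.
    return {name: shares.get(name, 0.0) for name in products}

# Hypothetical vehicles: name -> (body style, price in $1000s, quality score)
vehicles = {
    "sedan_a": ("sedan", 25.0, 3.0),
    "suv_b": ("suv", 35.0, 4.5),
    "truck_c": ("truck", 40.0, 5.0),
}

# A household that never considers trucks versus one with no screening rule.
screened = consider_then_choose(vehicles, {"sedan", "suv"})
compensatory = consider_then_choose(vehicles, {"sedan", "suv", "truck"})
print(screened)
print(compensatory)
```

In this sketch no improvement in the truck's price or quality can win back the screening household, whereas a purely compensatory model always assigns it a positive share. That is one way to see why the two model forms can recommend different designs.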

    Finite mixture clustering of human tissues with different levels of IGF-1 splice variants mRNA transcripts

    BACKGROUND: This study addresses a recurrent biological problem: defining a formal clustering structure for a set of tissues on the basis of the relative abundance of multiple alternatively spliced isoform mRNAs generated by the same gene. To this aim, we used a model-based clustering approach based on a finite mixture of multivariate Gaussian densities. However, because several technical replicates from the same tissue were available for each quantitative measurement, we also employed a finite mixture of linear mixed models with tissue-specific random effects. RESULTS: A panel of human tissues was analysed through quantitative real-time PCR methods to quantify the relative amount of mRNA encoding different IGF-1 alternative splicing variants. After an appropriate preliminary equalization of the quantitative data, we estimated the distribution of the observed concentrations of the different IGF-1 mRNA splice variants in the cohort of tissues using suitable kernel density estimators. We observed that the analysed IGF-1 mRNA splice variants were characterized by multimodal distributions, which could be interpreted as describing the presence of several sub-populations, i.e. potential tissue clusters. In this context, a formal clustering approach based on a finite mixture model (FMM) with Gaussian components is proposed. Because of the potential dependence between technical replicates (arising from repeated quantitative measurements of the same mRNA splice isoform in the same tissue), we also employed the finite mixture of linear mixed models (FMLMM), which allowed us to take this kind of within-tissue dependence into account. CONCLUSIONS: The FMM and the FMLMM provided a convenient yet formal setting for a model-based clustering of the human tissues into sub-populations characterized by homogeneous values of concentrations of the mRNAs for one or multiple IGF-1 alternative splicing isoforms. The proposed approaches can be applied to any cohort of tissues expressing several alternatively spliced mRNAs generated by the same gene, and can overcome the limitations of clustering methods based on simple comparisons between splice isoform expression levels.
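
As a rough illustration of model-based clustering with a Gaussian finite mixture, the sketch below fits a one-dimensional mixture by a generic EM algorithm. It is a simplification under stated assumptions: the abstract describes multivariate densities and mixed models with random effects, neither of which is reproduced here, and the data values, starting means, and function names are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(data, init_means, n_iter=100, min_var=1e-6):
    """Fit a 1-D Gaussian mixture by EM; returns weights, means, variances, labels."""
    k = len(init_means)
    n = len(data)
    means = list(init_means)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, min_var)] * k  # start every component wide
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            if total == 0.0:  # guard against underflow far from all components
                resp.append([1.0 / k] * k)
            else:
                resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)  # floor avoids degenerate spikes
    # Hard cluster labels from the responsibilities of the last E-step.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, means, variances, labels

# Hypothetical equalized expression values forming two well-separated groups.
values = [0.9, 1.0, 1.1, 1.2, 0.8, 4.9, 5.0, 5.1, 5.2, 4.8]
weights, means, variances, labels = fit_gmm_1d(values, init_means=[0.0, 6.0])
print(means, labels)
```

Each mixture component plays the role of a candidate tissue cluster, and a tissue is assigned to the component with the highest posterior probability. Handling replicate dependence, as the FMLMM does, would require adding tissue-level random effects, which this sketch omits.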

    Computational aspects of DNA mixture analysis

    Statistical analysis of DNA mixtures is known to pose computational challenges due to the enormous state space of possible DNA profiles. We propose a Bayesian network representation for genotypes, allowing computations to be performed locally, involving only a few alleles at each step. In addition, we describe a general method for computing the expectation of a product of discrete random variables using auxiliary variables and probability propagation in a Bayesian network, which in combination with the genotype network allows efficient computation of the likelihood function and various other quantities relevant to the inference. Lastly, we introduce a set of diagnostic tools for assessing the adequacy of the model for describing a particular dataset.

    Latent Variable Models with Applications to Spectral Data Analysis

    Recent technological advances in automatic data acquisition have created an ever increasing need to extract meaningful information from huge amounts of data. Multivariate predictive models have become important statistical tools in solving modern engineering problems. The purpose of this thesis is to develop novel predictive methods based on latent variable models and to validate these methods by applying them to spectral data analysis. In this thesis, hybrid models of principal components regression (PCR) and partial least squares regression (PLS) are proposed. The basic idea of the hybrid models is to develop more accurate prediction techniques by combining the merits of PCR and PLS. In the hybrid models, both the principal components of PCR and the latent variables of PLS are involved in a common regression process. Another major contribution of this work is the robust probabilistic multivariate calibration model (RPMC), proposed to overcome the drawback of the Gaussian assumption in most latent variable models. The RPMC was designed to be robust to outliers by adopting a Student-t distribution instead of the Gaussian distribution. An efficient Expectation-Maximization algorithm was derived for parameter estimation in the RPMC. It can also be shown that some popular latent variable models, such as probabilistic PCA (PPCA) and supervised probabilistic PCA (SPPCA), are special cases of the RPMC. Both predictive models developed in this thesis were assessed on real-life spectral datasets. The hybrid models were applied to the shaft misalignment prediction problem, and the RPMC was tested on a near-infrared (NIR) dataset. For the classification problem on the NIR data, the fusion of regularized discriminant analysis (RDA) and principal components analysis (PCA) was also proposed. The experimental results have shown the effectiveness and efficiency of the proposed methods.

    Detecting Differential Expression from RNA-seq Data with Expression Measurement Uncertainty

    High-throughput RNA sequencing (RNA-seq) has emerged as a revolutionary and powerful technology for expression profiling. Most proposed methods for detecting differentially expressed (DE) genes from RNA-seq are based on statistics that compare normalized read counts between conditions. However, few methods incorporate expression measurement uncertainty into DE detection. Moreover, most methods are only capable of detecting DE genes, and few methods are available for detecting DE isoforms. In this paper, a Bayesian framework (BDSeq) is proposed to detect DE genes and isoforms with consideration of expression measurement uncertainty. This expression measurement uncertainty provides useful information which can help to improve the performance of DE detection. Three real RNA-seq data sets are used to evaluate the performance of BDSeq, and the results show that the inclusion of expression measurement uncertainty improves accuracy in the detection of DE genes and isoforms. Finally, we develop a GamSeq-BDSeq RNA-seq analysis pipeline for users, which is freely available at the website http://parnec.nuaa.edu.cn/liux/GSBD/GamSeq-BDSeq.html. Comment: 20 pages, 9 figures

    Significance Regression: A Statistical Approach to Biased Linear Regression and Partial Least Squares

    This paper first examines the properties of biased regressors that proceed by restricting the search for the optimal regressor to a subspace. These properties suggest features that such biased regression methods should incorporate. Motivated by these observations, this work proposes a new formulation for biased regression derived from the principle of statistical significance. This new formulation, significance regression (SR), leads to partial least squares (PLS) under certain model assumptions and to more general methods under various other model assumptions. For models with multiple outputs, SR will be shown to have certain advantages over PLS. Using the new formulation, a significance test is advanced for determining the number of directions to be used; for PLS, cross-validation has been the primary method for determining this quantity. The prediction and estimation properties of SR are discussed. A brief numerical example illustrates the relationship between SR and PLS.
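
To make the idea of restricting the regressor to a subspace concrete, here is a minimal sketch of the first direction of ordinary single-output PLS (often called PLS1). It is not significance regression itself, and the function name and data are hypothetical. The regression vector is constrained to lie along the single direction `w` proportional to the covariance between the centered predictors and the response.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def pls1_first_direction(X, y):
    """One-direction PLS1 fit; returns (w, beta, intercept).

    Assumes the predictors have nonzero covariance with the response,
    so the direction w can be normalized.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Direction proportional to Xc^T yc, scaled to unit length.
    w = [dot([row[j] for row in Xc], yc) for j in range(p)]
    norm = math.sqrt(dot(w, w))
    w = [v / norm for v in w]
    # Scores along w, then a one-dimensional least-squares fit of yc on the scores.
    t = [dot(row, w) for row in Xc]
    b = dot(t, yc) / dot(t, t)
    beta = [b * v for v in w]  # regression vector confined to span{w}
    intercept = y_mean - dot(col_means, beta)
    return w, beta, intercept

# Hypothetical data: the response depends only on the first predictor.
X = [[-1.0, 1.0], [0.0, -2.0], [1.0, 1.0]]
y = [-2.0, 0.0, 2.0]
w, beta, intercept = pls1_first_direction(X, y)
print(w, beta, intercept)
```

Using more directions enlarges the subspace and reduces bias at the cost of variance. The abstract's significance test addresses how many directions to keep, which this sketch does not attempt.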