
    Variable selection for zero-inflated and overdispersed data with application to health care demand in Germany

    In health services and outcomes research, count outcomes are frequently encountered and often have a large proportion of zeros. The zero-inflated negative binomial (ZINB) regression model has important applications for this type of data. With many possible candidate risk factors, this paper proposes new variable selection methods for the ZINB model. We consider the maximum likelihood function plus a penalty, including the least absolute shrinkage and selection operator (LASSO), the smoothly clipped absolute deviation (SCAD), and the minimax concave penalty (MCP). An EM (expectation-maximization) algorithm is proposed for estimating the model parameters and conducting variable selection simultaneously. This algorithm consists of estimating penalized weighted negative binomial models and penalized logistic models via the coordinate descent algorithm. Furthermore, statistical properties, including the standard error formula, are provided. A simulation study shows that the new algorithm not only has more accurate, or at least comparable, estimation but is also more robust than traditional stepwise variable selection. The application is illustrated with a data set on health care demand in Germany. The proposed techniques have been implemented in the open-source R package mpath.
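
    As a concrete illustration of the penalized-likelihood idea (not the mpath implementation itself), the following minimal Python sketch writes down a LASSO-penalized ZINB negative log-likelihood and hands it to a generic optimizer; the paper's EM algorithm with coordinate descent, the SCAD/MCP penalties, and the estimation of the dispersion parameter are all omitted, and the data and function names are purely illustrative.

    import numpy as np
    from scipy.special import gammaln, expit
    from scipy.optimize import minimize

    def zinb_penalized_nll(params, X, Z, y, theta, lam):
        """LASSO-penalized negative log-likelihood of a ZINB model.
        X, Z  : design matrices for the count part and the zero-inflation part
        theta : NB dispersion, held fixed here for simplicity
        lam   : L1 penalty weight
        """
        p = X.shape[1]
        beta, gamma = params[:p], params[p:]
        mu = np.exp(X @ beta)          # NB mean
        pi = expit(Z @ gamma)          # zero-inflation probability
        log_nb = (gammaln(y + theta) - gammaln(theta) - gammaln(y + 1)
                  + theta * np.log(theta / (theta + mu))
                  + y * np.log(mu / (theta + mu)))
        ll = np.where(y == 0,
                      np.log(pi + (1 - pi) * np.exp(log_nb)),   # zeros: point mass or NB
                      np.log(1 - pi) + log_nb)                  # positives: NB only
        # penalize all coefficients except the two intercepts
        return -ll.sum() + lam * (np.abs(beta[1:]).sum() + np.abs(gamma[1:]).sum())

    # toy usage on synthetic zero-inflated counts (L-BFGS-B only approximates the
    # sparse solution; a coordinate-descent solver, as in the paper, is preferable)
    rng = np.random.default_rng(0)
    n = 500
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
    Z = X.copy()
    y = rng.poisson(2.0, size=n) * rng.binomial(1, 0.6, size=n)
    fit = minimize(zinb_penalized_nll, np.zeros(2 * X.shape[1]),
                   args=(X, Z, y, 1.0, 5.0), method="L-BFGS-B")
    print(fit.x.round(3))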

    Effect of variable shipping frequency on production-distribution policy in a vendor-buyer integrated system

    This paper investigates the effect of variable shipping frequency on production-distribution policy in a vendor-buyer integrated system. In a recent article, Chiu et al. [1] derived the optimal replenishment lot size for an economic production quantity problem with multiple deliveries and quality assurance, based on the assumption that the number of shipments is a given constant. However, in a vendor-buyer integrated system in a supply chain environment, joint determination of the replenishment lot size and the number of shipments may help such a system gain a significant competitive advantage in terms of becoming a low-cost producer as well as maintaining tight linkage to customers. For this reason, the present study extends the work of Chiu et al. [1] by treating shipping frequency as a decision variable and incorporating the customer's stock holding cost into the system cost analysis. The Hessian matrix is employed to verify the convexity of the cost function in its two decision variables, and the effect of variable shipping frequency on the production-distribution policy is investigated. A numerical example is provided to demonstrate practical usage of the research results.
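
    The joint optimization described above can be sketched generically: for each candidate integer number of shipments n, minimize the total cost over the lot size Q, then keep the best pair. The Python sketch below uses a placeholder cost function with made-up parameters (it is not the cost expression derived by Chiu et al. [1]); only the search pattern over the two decision variables is the point.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def total_cost(Q, n, D=4000, K=450, K1=90, h_v=0.6, h_b=1.2, P=10000):
        """Placeholder annual cost: setup plus per-shipment fixed costs, plus
        vendor and buyer holding costs that depend on both Q and n."""
        setup_and_delivery = (K + n * K1) * D / Q
        vendor_holding = h_v * (Q / 2) * (D / P)
        buyer_holding = h_b * Q / (2 * n)
        return setup_and_delivery + vendor_holding + buyer_holding

    best = None
    for n in range(1, 21):                      # enumerate integer shipment counts
        res = minimize_scalar(total_cost, bounds=(1.0, 50000.0),
                              args=(n,), method="bounded")
        if best is None or res.fun < best[2]:
            best = (res.x, n, res.fun)
    Q_star, n_star, cost_star = best
    print(f"Q* = {Q_star:.1f}, n* = {n_star}, annual cost = {cost_star:.2f}")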

    Solving finite production rate model with scrap and multiple shipments using algebraic approach

    This paper solves a finite production rate (FPR) model with scrap and multiple shipments using an algebraic method. The classic FPR model assumes a continuous inventory issuing policy to satisfy demand and perfect quality for all items produced. However, in a real-life vendor-buyer integrated production-inventory system, a multiple-shipment policy is used in practice in lieu of a continuous issuing policy, and the generation of defective items during a production run is inevitable. In this study, it is assumed that all defective items are scrap and that the perfect-quality items can only be delivered to customers once the whole lot has passed quality assurance at the end of the production run. A conventional approach to solving the FPR model is to apply differential calculus to the long-run average cost function, with the need to prove optimality first. This paper demonstrates that the optimal lot size and the overall costs for the aforementioned FPR model can be derived without derivatives. As a result, it enables students and practitioners who have little knowledge of calculus to understand and handle the real-life FPR model with ease.
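
    The derivative-free argument can be illustrated on the generic cost structure such models reduce to, TC(Q) = A/Q + B*Q + C with A, B, C > 0 (the actual A, B, C of the paper, which absorb the scrap rate and shipment costs, are not reproduced here). Completing the square gives TC(Q) = (sqrt(A/Q) - sqrt(B*Q))^2 + 2*sqrt(A*B) + C, so TC(Q) >= 2*sqrt(A*B) + C with equality exactly when A/Q = B*Q, i.e. Q* = sqrt(A/B), and no calculus is needed. The short Python check below uses illustrative constants.

    import numpy as np

    A, B, C = 1.8e6, 0.45, 1200.0            # illustrative constants, not from the paper

    def tc(Q):
        return A / Q + B * Q + C             # long-run average cost

    Q_star = np.sqrt(A / B)                  # algebraic optimum: A/Q = B*Q
    tc_star = 2 * np.sqrt(A * B) + C         # algebraic minimum cost

    # brute-force sanity check of the algebraic result
    grid = np.linspace(100.0, 20000.0, 200_000)
    print(Q_star, tc_star)                   # 2000.0  3000.0
    print(grid[np.argmin(tc(grid))], tc(grid).min())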

    Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations

    Large language models (LLMs) can generate fluent natural language texts when given relevant documents as background context. This ability has attracted considerable interest in developing industry applications of LLMs. However, LLMs are prone to generating hallucinations that are not supported by the provided sources. In this paper, we propose a hierarchical framework to detect and mitigate such ungrounded hallucinations. Our framework uses Chain of Natural Language Inference (CoNLI) for hallucination detection and for hallucination reduction via post-editing. Our approach achieves state-of-the-art performance on hallucination detection and enhances text quality through rewriting, using LLMs without any fine-tuning or domain-specific prompt engineering. We show that this simple plug-and-play framework can serve as an effective choice for hallucination detection and reduction, achieving competitive performance across various contexts. Comment: The source code is available at https://github.com/microsoft/CoNLI_hallucinatio
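
    A minimal sketch of the sentence-level detect-then-mitigate loop described above follows (this is not the released CoNLI code, whose details live in the linked repository). The nli_entails argument is a hypothetical hook for any natural language inference model that scores whether a source passage entails a claim; the mitigation step here simply drops unsupported sentences, whereas the paper post-edits the response with an LLM.

    import re
    from typing import Callable, List

    def split_sentences(text: str) -> List[str]:
        # crude splitter; a real system would use a proper sentence tokenizer
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def detect_ungrounded(response: str, sources: List[str],
                          nli_entails: Callable[[str, str], float],
                          threshold: float = 0.5) -> List[str]:
        """Return the response sentences that no source passage entails."""
        ungrounded = []
        for claim in split_sentences(response):
            support = max(nli_entails(src, claim) for src in sources)
            if support < threshold:
                ungrounded.append(claim)
        return ungrounded

    def reduce_hallucinations(response: str, ungrounded: List[str]) -> str:
        """Naive mitigation: keep only the grounded sentences."""
        return " ".join(s for s in split_sentences(response) if s not in ungrounded)

    # toy usage with a lexical-overlap stand-in for a real NLI model
    overlap = lambda premise, claim: len(set(claim.lower().split()) &
                                         set(premise.lower().split())) / max(len(claim.split()), 1)
    sources = ["The report was published in 2021 by the finance team."]
    response = "The report was published in 2021. It won three industry awards."
    print(detect_ungrounded(response, sources, overlap))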

    An Investigation of Telecom Mobile Data Billing Plans

    In recent years, mobile operators have provided many billing alternatives, such as limited and unlimited billing plans and shared and non-shared data plans, for users with different needs. A non-shared data plan is designed for a single user with a limited monthly data allowance. On the other hand, the monthly data allowance of a shared data plan is shared by a group of users with multiple devices. Mobile operators often conduct a primary price study to compare their billing plans, which relates the prices of the billing plans to fixed amounts of data usage. Although the primary price study can draw conclusions easily and quickly, it only provides rough billing plan suggestions. In reality, the amounts of data usage are not fixed and should therefore be measured from commercial mobile networks to reflect user behavior. This paper proposes an analytical approach that uses measured data from Chunghwa Telecom Co., Ltd. (CHT), the largest telecommunications company in Taiwan, to derive the expected payments of various billing plans. The results of the analytical model are more accurate than those of the primary price study and therefore provide better suggestions for billing plan selection. Other mobile operators can easily use our model to analyze billing alternatives with their own measured data.
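
    The core of such an analytical comparison can be sketched as follows: under a measured distribution of monthly data usage, the expected payment of a limited plan is its fixed fee plus the expected overage charge, which can then be compared with an unlimited plan's flat fee. The Python sketch below uses a synthetic lognormal usage sample and made-up tariff numbers purely for illustration; it is not the CHT data or the paper's model.

    import numpy as np

    def expected_payment_limited(usage_gb, monthly_fee, allowance_gb, overage_per_gb, cap=None):
        """Expected monthly payment of a limited plan under a usage sample."""
        bill = monthly_fee + np.maximum(usage_gb - allowance_gb, 0.0) * overage_per_gb
        if cap is not None:                  # some plans cap the monthly bill
            bill = np.minimum(bill, cap)
        return bill.mean()

    # synthetic stand-in for per-subscriber monthly usage measured from the network
    rng = np.random.default_rng(1)
    usage = rng.lognormal(mean=1.2, sigma=0.8, size=100_000)   # GB per month

    limited = expected_payment_limited(usage, monthly_fee=299, allowance_gb=6, overage_per_gb=90)
    unlimited = 599.0                        # flat fee of a hypothetical unlimited plan
    print(f"expected payment, limited plan : {limited:.1f}")
    print(f"payment, unlimited plan        : {unlimited:.1f}")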

    Floating Point Arithmetic Protocols for Constructing Secure Data Analysis Application

    A large variety of data mining and machine learning techniques are applied to a wide range of applications today. Therefore, there is a real need to develop technologies that allow data analysis while preserving the confidentiality of the data. Secure multi-party computation (SMC) protocols allow participants to cooperate on various computations while retaining the privacy of their own input data, which makes SMC an ideal solution to this issue. Although a number of frameworks have been developed in SMC to meet this challenge, they are either tailored to perform only specific tasks or provide very limited precision. In this paper, we develop protocols for floating point arithmetic based on secure scalar product protocols, which are required in many real-world applications. Our protocols follow most of the IEEE-754 standard, supporting the four fundamental arithmetic operations, namely addition, subtraction, multiplication, and division. We demonstrate the practicality of these protocols by performing various statistical calculations that are widely used in most data analysis tasks. Our experiments show that the performance of our framework is both practical and promising.
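
    The kind of SMC primitive that scalar-product-based protocols build on can be illustrated with a toy two-party additive secret-sharing example in Python (this is not the paper's floating point protocol; secure multiplication and the IEEE-754 handling require further machinery, such as a secure scalar product subprotocol or Beaver triples, which are omitted here).

    import secrets

    P = 2**61 - 1                    # prime modulus of the sharing field

    def share(x: int) -> tuple:
        """Split x into two additive shares with x = s1 + s2 (mod P)."""
        s1 = secrets.randbelow(P)
        return s1, (x - s1) % P

    def reconstruct(s1: int, s2: int) -> int:
        return (s1 + s2) % P

    # Secure addition is purely local: each party adds the shares it holds.
    a1, a2 = share(1234)
    b1, b2 = share(5678)
    c1, c2 = (a1 + b1) % P, (a2 + b2) % P    # no communication needed
    assert reconstruct(c1, c2) == 1234 + 5678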

    A Model of Technological Imagination and Creativity: Cognitive Task Analysis

    An integrated model of the cognitive tasks involved in the process of technological innovation is proposed based on the following theories: 1. the CDIO theory of technological innovation, 2. Wallas's creative thinking process, 3. Klahr and Simon's theory of scientific discovery, and 4. the conceptual combination theory of imagination. The central theme of this model is the proposition that three cognitive conditions are necessary for technological imagination and innovation: 1. cross-domain knowledge, 2. simple heuristics, and 3. pattern recognition ability. Although the required domain knowledge and implementation methods differ across domains, the heuristics that lead to a breakthrough at each phase of CDIO in a technological innovation are similar, with conceptual combination as the cognitive engine for generating original and imaginative ideas.

    Sample-Specific Debiasing for Better Image-Text Models

    Self-supervised representation learning on image-text data facilitates crucial medical applications, such as image classification, visual grounding, and cross-modal retrieval. One common approach involves contrasting semantically similar (positive) and dissimilar (negative) pairs of data points. Drawing negative samples uniformly from the training data set introduces false negatives, i.e., samples that are treated as dissimilar but belong to the same class. In healthcare data, the underlying class distribution is nonuniform, implying that false negatives occur at a highly variable rate. To improve the quality of learned representations, we develop a novel approach that corrects for false negatives. Our method can be viewed as a variant of debiased contrastive learning that uses estimated sample-specific class probabilities. We provide a theoretical analysis of the objective function and demonstrate the proposed approach on both image and paired image-text data sets. Our experiments demonstrate the empirical advantages of sample-specific debiasing.
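
    The flavor of correction described above can be sketched by taking the debiased contrastive estimator of Chuang et al. (2020) and replacing its single class-prior constant with a per-sample estimate; the paper's exact objective may differ, and the numbers below are synthetic.

    import numpy as np

    def debiased_contrastive_loss(sim_pos, sim_neg, tau_plus, t=0.5):
        """Debiased contrastive loss with a sample-specific class prior.
        sim_pos  : (B,)   similarity of each anchor to its positive
        sim_neg  : (B, N) similarities of each anchor to its N drawn negatives
        tau_plus : (B,)   estimated probability that a drawn negative actually
                          shares the anchor's class (per sample, not a constant)
        """
        pos = np.exp(sim_pos / t)
        neg = np.exp(sim_neg / t)
        N = sim_neg.shape[1]
        # remove the expected false-negative contribution from the negative term
        g = (neg.mean(axis=1) - tau_plus * pos) / (1.0 - tau_plus)
        g = np.maximum(g, np.exp(-1.0 / t))          # clamp as in Chuang et al.
        return -np.log(pos / (pos + N * g)).mean()

    # toy usage with random similarities and per-sample prior estimates
    rng = np.random.default_rng(0)
    print(debiased_contrastive_loss(sim_pos=rng.uniform(0.5, 1.0, 32),
                                    sim_neg=rng.uniform(-1.0, 1.0, (32, 64)),
                                    tau_plus=rng.uniform(0.01, 0.2, 32)))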