
    The nested dirichlet distribution and incomplete categorical data analysis

    The nested Dirichlet distribution (NDD) is an important distribution defined on the closed n-dimensional simplex. It includes the classical Dirichlet distribution and is useful in incomplete categorical data (ICD) analysis. In this article, we develop the distributional properties of NDD. New large-sample likelihood and small-sample Bayesian approaches for analyzing ICD are proposed and compared with existing likelihood/Bayesian strategies. We show that the new approaches have at least three advantages over existing approaches based on the traditional Dirichlet distribution in both frequentist and conjugate Bayesian inference for ICD. The new methods possess closed-form expressions for both the maximum likelihood and Bayes estimates when the likelihood function is in NDD form; produce computationally efficient EM and data augmentation algorithms when the likelihood is not in NDD form; and provide exact sampling procedures for some special cases. The methodologies are illustrated with simulated and real data.
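As a point of reference for the closed-form Bayes estimates mentioned above, the sketch below covers only the classical Dirichlet special case with fully observed category counts, where the posterior mean is available in closed form by conjugacy. It does not reproduce the NDD formulas from the article; the function name `dirichlet_posterior_mean` and the example numbers are illustrative assumptions.

```python
def dirichlet_posterior_mean(prior_alpha, counts):
    """Posterior mean of category probabilities under a Dirichlet prior.

    Classical (non-nested) Dirichlet-multinomial conjugacy only:
    a Dirichlet(alpha) prior combined with complete counts n gives a
    Dirichlet(alpha + n) posterior, whose mean is
    (alpha_i + n_i) / sum_j (alpha_j + n_j).
    """
    if len(prior_alpha) != len(counts):
        raise ValueError("prior_alpha and counts must have the same length")
    if any(a <= 0 for a in prior_alpha) or any(n < 0 for n in counts):
        raise ValueError("alphas must be positive and counts non-negative")
    posterior = [a + n for a, n in zip(prior_alpha, counts)]
    total = sum(posterior)
    return [a / total for a in posterior]


# Hypothetical example: uniform prior and made-up counts for three categories.
estimate = dirichlet_posterior_mean([1.0, 1.0, 1.0], [2, 3, 5])
print(estimate)
```

With incomplete counts (for example, observations known only to fall in a union of categories) this simple update no longer applies directly, which is the setting the abstract addresses.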

    Crack propagation in brittle solid containing 3D surface fracture under uniaxial compression

    Publication in a refereed journal (academic research, refereed), 2003-2004. Version of record, published.

    Further properties and new applications of the nested Dirichlet distribution

    Recently, Ng et al. (2009) studied a new family of distributions, namely the nested Dirichlet distributions. This family includes the traditional Dirichlet distribution as a special member and can be adopted to analyze incomplete categorical data. However, other important aspects of the family, such as marginal and conditional distributions and related properties, are not yet available in the literature. Moreover, diverse real-world applications of the family need to be explored further. In this paper, we first obtain the marginal and conditional distributions and other related properties of the nested Dirichlet distribution. We then present new applications of the family in fitting competing-risks models, analyzing incomplete categorical data and evaluating cancer diagnosis tests. Three real data sets involving failure times of radio transmitter receivers, attitudes toward the death penalty and ultrasound ratings for breast cancer metastasis are provided. © 2009 Elsevier B.V. All rights reserved.

    Confidence-interval construction for rate ratio in matched-pair studies with incomplete data

    Matched-pair design is often used in clinical trials to increase the efficiency of establishing equivalence between two treatments with binary outcomes. In this article, we consider such a design based on rate ratio in the presence of incomplete data. The rate ratio is one of the most frequently used indices in comparing the efficiency of two treatments in clinical trials. We propose 10 confidence-interval estimators for the rate ratio in incomplete matched-pair designs. A hybrid method that recovers variance estimates required for the rate ratio from the confidence limits for single proportions is proposed. It is noteworthy that confidence intervals based on this hybrid method have closed-form solutions. The performance of the proposed confidence intervals is evaluated with respect to their exact coverage probability, expected confidence interval width, and distal and mesial noncoverage probability. The results show that the hybrid Agresti–Coull confidence interval based on Fieller’s theorem performs satisfactorily for small to moderate sample sizes. Two real examples from clinical trials are used to illustrate the proposed confidence intervals.
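The hybrid method above starts from confidence limits for single proportions. As a minimal sketch of that building block only, the code below computes the standard Agresti–Coull interval for one binomial proportion. It is not the article's hybrid Fieller-based interval for the rate ratio, and the function name and example counts are illustrative assumptions.

```python
import math
from statistics import NormalDist


def agresti_coull_interval(successes, trials, confidence=0.95):
    """Agresti-Coull confidence interval for a single binomial proportion.

    Adds z^2/2 pseudo-successes and z^2/2 pseudo-failures, then applies
    the Wald formula to the adjusted proportion.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided quantile
    n_adj = trials + z * z
    p_adj = (successes + z * z / 2.0) / n_adj
    half_width = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    # Clip to [0, 1] because the unclipped limits can overshoot.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Hypothetical example: 8 responders out of 20 patients.
lower, upper = agresti_coull_interval(8, 20)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```

Because both limits are explicit formulas, any interval assembled from them stays in closed form, which is consistent with the property the abstract highlights.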

    A robust computational algorithm for inverse photomask synthesis in optical projection lithography

    Inverse lithography technology formulates the photomask synthesis as an inverse mathematical problem. To solve this, we propose a variational functional and develop a robust computational algorithm, where the proposed functional takes into account the process variations and incorporates several regularization terms that can control the mask complexity. We establish the existence of the minimizer of the functional, and in order to optimize it effectively, we adopt an alternating minimization procedure with Chambolle's fast duality projection algorithm. Experimental results show that our proposed algorithm is effective in synthesizing high quality photomasks as compared with existing methods.

    Cryo-EM structure of a helicase loading intermediate containing ORC-Cdc6-Cdt1-MCM2-7 bound to DNA

    In eukaryotes, the Cdt1-bound replicative helicase core MCM2-7 is loaded onto DNA by the ORC-Cdc6 ATPase to form a prereplicative complex (pre-RC) with an MCM2-7 double hexamer encircling DNA. Using purified components in the presence of ATP-γS, we have captured in vitro an intermediate in pre-RC assembly that contains a complex between the ORC-Cdc6 and Cdt1-MCM2-7 heteroheptamers called the OCCM. Cryo-EM studies of this 14-subunit complex reveal that the two separate heptameric complexes are engaged extensively, with the ORC-Cdc6 N-terminal AAA+ domains latching onto the C-terminal AAA+ motor domains of the MCM2-7 hexamer. The conformation of ORC-Cdc6 undergoes a concerted change into a right-handed spiral with helical symmetry that is identical to that of the DNA double helix. The resulting ORC-Cdc6 helicase loader shows a notable structural similarity to the replication factor C clamp loader, suggesting a conserved mechanism of action.

    SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods

    In the last few years, thousands of scientific papers have investigated sentiment analysis, several startups that measure opinions on real data have emerged, and a number of innovative products related to this theme have been developed. There are multiple methods for measuring sentiments, including lexical-based and supervised machine learning methods. Despite the vast interest in the theme and the wide popularity of some methods, it is unclear which one is better for identifying the polarity (i.e., positive or negative) of a message. Accordingly, there is a strong need to conduct a thorough apples-to-apples comparison of sentiment analysis methods, as they are used in practice, across multiple datasets originating from different data sources. Such a comparison is key for understanding the potential limitations, advantages, and disadvantages of popular methods. This article aims at filling this gap by presenting a benchmark comparison of twenty-four popular sentiment analysis methods (which we call the state-of-the-practice methods). Our evaluation is based on a benchmark of eighteen labeled datasets, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles. Our results highlight the extent to which the prediction performance of these methods varies considerably across datasets. Aiming at boosting the development of this research area, we open the methods' codes and datasets used in this article, deploying them in a benchmark system, which provides an open API for accessing and comparing sentence-level sentiment analysis methods.