
    Sequential category aggregation and partitioning approaches for multi-way contingency tables based on survey and census data

    Large contingency tables arise in many contexts but especially in the collection of survey and census data by government statistical agencies. Because the vast majority of the variables in this context have a large number of categories, agencies and users need a systematic way of constructing tables which are summaries of such contingency tables. We propose such an approach in this paper by finding members of a class of restricted log-linear models which maximize the likelihood of the data, and we use this to find a parsimonious means of representing the table. In contrast with more standard approaches for model search in hierarchical log-linear models (HLLM), our procedure systematically reduces the number of categories of the variables. Through a series of examples, we illustrate the extent to which it can preserve the interaction structure found with HLLMs and be used as a data simplification procedure prior to HLL modeling. A feature of the procedure is that it can easily be applied to many tables with millions of cells, providing a new way of summarizing large data sets in many disciplines. The focus is on information and description rather than statistical testing. The procedure may treat each variable in the table in a different way: preserving full detail, treating it as fully nominal, or preserving ordinality. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-AOAS175.
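    A minimal sketch of the general idea of sequential category aggregation, not the authors' restricted log-linear model class or software: greedily merge the pair of categories of one variable whose merge costs the least log-likelihood under a simple independence log-linear model for a two-way table. The function names and the toy table below are invented for illustration.

```python
import numpy as np
from itertools import combinations

def indep_loglik(table):
    """Poisson log-likelihood (constant terms dropped) of the independence
    log-linear model for a two-way contingency table."""
    n = table.sum()
    fitted = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    obs, fit = table[table > 0], fitted[table > 0]
    return float((obs * np.log(fit)).sum() - fitted.sum())

def merge_rows(table, i, j):
    """Collapse row categories i and j (i < j) into a single category."""
    merged = np.delete(table, j, axis=0)
    merged[i] += table[j]
    return merged

def greedy_row_aggregation(table, target_rows):
    """Sequentially merge the pair of row categories whose merge loses
    the least independence-model log-likelihood."""
    while table.shape[0] > target_rows:
        _, i, j = max(
            (indep_loglik(merge_rows(table, i, j)), i, j)
            for i, j in combinations(range(table.shape[0]), 2)
        )
        table = merge_rows(table, i, j)
    return table

# Toy example: reduce a 5x3 table to 3 row categories.
rng = np.random.default_rng(0)
print(greedy_row_aggregation(rng.poisson(20, size=(5, 3)), 3))
```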

    Solving the 100 Swiss Francs Problem

    In 2005, Sturmfels offered 100 Swiss Francs for the resolution of a conjecture concerning a special case of maximum likelihood estimation for a latent class model. This paper confirms the conjecture in the affirmative.
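    For context, the maximum likelihood problem behind the conjecture has the general latent class (rank-r mixture) form sketched below; the specific 4x4 data matrix and the conjectured maximizer of the 100 Swiss Francs problem are given in the paper and are not reproduced here.

```latex
% MLE for a latent class model on an m x n contingency table with
% counts u_{ij}: the cell probabilities are mixtures of r product
% (independence) components, and the log-likelihood is maximized
% over the mixture weights and the class-conditional distributions.
\[
  \max_{\pi,\alpha,\beta} \;
  \sum_{i=1}^{m}\sum_{j=1}^{n} u_{ij}\,
      \log\!\Bigl(\sum_{h=1}^{r} \pi_h\, \alpha_{ih}\, \beta_{jh}\Bigr)
  \quad\text{subject to}\quad
  \pi_h,\ \alpha_{ih},\ \beta_{jh} \ge 0,\;
  \sum_{h}\pi_h = \sum_{i}\alpha_{ih} = \sum_{j}\beta_{jh} = 1 .
\]
```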

    DRUG-NEM: Optimizing drug combinations using single-cell perturbation response to account for intratumoral heterogeneity.

    An individual malignant tumor is composed of a heterogeneous collection of single cells with distinct molecular and phenotypic features, a phenomenon termed intratumoral heterogeneity. Intratumoral heterogeneity poses challenges for cancer treatment, motivating the need for combination therapies. Single-cell technologies are now available to guide effective drug combinations by accounting for intratumoral heterogeneity through the analysis of the signaling perturbations of an individual tumor sample screened by a drug panel. In particular, Mass Cytometry Time-of-Flight (CyTOF) is a high-throughput single-cell technology that enables the simultaneous measurement of multiple (on the order of 40) intracellular and surface markers at the level of single cells for hundreds of thousands of cells in a sample. We developed a computational framework, entitled Drug Nested Effects Models (DRUG-NEM), to analyze CyTOF single-drug perturbation data for the purpose of individualizing drug combinations. DRUG-NEM optimizes drug combinations by choosing the minimum number of drugs that produce the maximal desired intracellular effects based on nested effects modeling. We demonstrate the performance of DRUG-NEM using single-cell drug perturbation data from tumor cell lines and primary leukemia samples.
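    As an illustration only of the combination-selection step, the toy sketch below greedily picks drugs until no additional desired intracellular effects are gained, favoring few drugs with maximal coverage. The drug and marker names and the per-drug effect sets are hypothetical; DRUG-NEM itself infers effect hierarchies with nested effects models before ranking combinations.

```python
from typing import Dict, Set, List

def select_combination(effects: Dict[str, Set[str]]) -> List[str]:
    """Greedily add the drug covering the most still-uncovered desired
    effects; stop when no remaining drug adds anything new."""
    chosen, covered = [], set()
    remaining = dict(effects)
    while remaining:
        drug, gain = max(remaining.items(), key=lambda kv: len(kv[1] - covered))
        if not (gain - covered):
            break
        chosen.append(drug)
        covered |= gain
        del remaining[drug]
    return chosen

# Hypothetical per-drug desired-effect sets derived from single-cell
# perturbation responses (e.g. which markers move in the wanted direction).
effects = {
    "dasatinib":   {"pSTAT5", "pERK"},
    "tofacitinib": {"pSTAT5"},
    "MEKi":        {"pERK", "pS6"},
}
print(select_combination(effects))  # e.g. ['dasatinib', 'MEKi']
```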

    Multiple‐systems analysis for the quantification of modern slavery: classical and Bayesian approaches

    Multiple systems estimation is a key approach for quantifying hidden populations such as the number of victims of modern slavery. The UK Government published an estimate of 10,000 to 13,000 victims, constructed by the present author, as part of the strategy leading to the Modern Slavery Act 2015. This estimate was obtained by a stepwise multiple systems method based on six lists. Further investigation shows that a small proportion of the possible models give rather different answers, and that other model-fitting approaches may choose one of these. Three data sets collected in the field of modern slavery, together with a data set about the death toll in the Kosovo conflict, are used to investigate the stability and robustness of various multiple systems estimation methods. The crucial aspect is the way that interactions between lists are modelled, because these can substantially affect the results. Model selection and Bayesian approaches are considered in detail, in particular to assess their stability and robustness when applied to real modern slavery data. A new Markov chain Monte Carlo Bayesian approach is developed; overall, this gives robust and stable results, at least for the examples considered. The software and data sets are freely and publicly available to facilitate wider implementation and further research.
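    A minimal sketch of classical multiple systems estimation with three lists A, B and C, under the simplest (main-effects) Poisson log-linear model: fit the model to the seven observed capture histories and predict the unobserved (0,0,0) cell. The counts below are hypothetical, and the abstract's central point is precisely that the choice of list-interaction terms (e.g. adding A:B) can change this estimate substantially.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical counts for each observed capture history across lists A, B, C.
df = pd.DataFrame({
    "A":     [1, 0, 0, 1, 1, 0, 1],
    "B":     [0, 1, 0, 1, 0, 1, 1],
    "C":     [0, 0, 1, 0, 1, 1, 1],
    "count": [120, 85, 60, 20, 14, 9, 3],
})

# Main-effects (no-interaction) log-linear model; interaction terms such as
# A:B would model dependence between lists.
fit = smf.glm("count ~ A + B + C", data=df, family=sm.families.Poisson()).fit()

# Predicted count for the never-observed history (0, 0, 0): the "dark figure".
unseen = float(fit.predict(pd.DataFrame({"A": [0], "B": [0], "C": [0]}))[0])
print("estimated hidden count:", round(unseen))
print("estimated total population:", df["count"].sum() + round(unseen))
```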

    Coded Parity Packet Transmission Method for Two Group Resource Allocation

    Gap value control is investigated when the number of source and parity packets is adjusted in a concatenated coding scheme whilst keeping the overall coding rate fixed. Packet-based outer codes, generated from bit-wise XOR combinations of the source packets, are used to adjust the numbers of both source and parity packets. Given the source packets, the number of parity packets, which are bit-wise XOR combinations of the source packets, can be adjusted so that the gap value, which measures the gap between the theoretical and the required signal-to-noise ratio (SNR), is controlled without changing the actual coding rate. Consequently, the required SNR is reduced, yielding a lower required energy to realize the transmission data rate. Integrating this coding technique with a two-group resource allocation scheme enables efficient utilization of the total energy to further improve the data rates. With a relatively small set of discrete data rates, the system throughput achieved by the proposed two-group loading scheme is observed to be approximately equal to that of the existing loading scheme, which operates with a much larger set of discrete data rates. The gain obtained by the proposed scheme over the existing equal-rate and equal-energy loading scheme is approximately 5 dB. Furthermore, a successive interference cancellation scheme is also integrated with this coding technique, which can be used to decode and provide consecutive symbols for inter-symbol interference (ISI) and multiple access interference (MAI) mitigation. With this integrated scheme, the computational complexity is significantly reduced by eliminating matrix inversions. In the same manner, the proposed coding scheme is also incorporated into a novel fixed energy loading, which distributes packets over parallel channels, to control the gap value of the data rates even though the SNR varies from one code channel to another.
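    A small sketch of the two ingredients mentioned above, stated under standard textbook assumptions rather than the paper's exact formulation: the SNR gap approximation R = log2(1 + SNR / gap), rearranged to give the SNR required for a target rate (a smaller gap means a lower required SNR, hence lower energy, at the same rate), and parity packets formed as bit-wise XOR combinations of source packets. All names and the example numbers are illustrative.

```python
import math

def required_snr_db(rate_bits_per_symbol: float, gap_db: float) -> float:
    """SNR (dB) needed to support the target rate under the gap approximation."""
    gap = 10 ** (gap_db / 10)
    snr = gap * (2 ** rate_bits_per_symbol - 1)
    return 10 * math.log10(snr)

print(required_snr_db(2.0, 6.0))   # larger gap -> higher required SNR
print(required_snr_db(2.0, 1.0))   # smaller gap -> lower required SNR

def xor_parity(packets: list[bytes]) -> bytes:
    """Bit-wise XOR of equal-length source packets, forming one parity packet."""
    out = bytearray(len(packets[0]))
    for p in packets:
        for i, b in enumerate(p):
            out[i] ^= b
    return bytes(out)

print(xor_parity([b"\x0f\xf0", b"\xff\x00"]).hex())  # -> 'f0f0'
```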

    Test of candidate light distributors for the muon (g-2) laser calibration system

    The new muon (g-2) experiment E989 at Fermilab will be equipped with a laser calibration system for all the 1296 channels of the calorimeters. An integrating sphere and an alternative system based on an engineered diffuser have been considered as possible light distributors for the experiment. We present here a detailed comparison of the two based on temporal response, spatial uniformity, transmittance and time stability. Comment: accepted to Nucl. Instrum. Meth.