Sequential category aggregation and partitioning approaches for multi-way contingency tables based on survey and census data
Large contingency tables arise in many contexts but especially in the
collection of survey and census data by government statistical agencies.
Because the vast majority of the variables in this context have a large number
of categories, agencies and users need a systematic way of constructing tables
which are summaries of such contingency tables. We propose such an approach in
this paper by finding members of a class of restricted log-linear models which
maximize the likelihood of the data and use this to find a parsimonious means
of representing the table. In contrast with more standard approaches for model
search in hierarchical log-linear models (HLLM), our procedure systematically
reduces the number of categories of the variables. Through a series of
examples, we illustrate the extent to which it can preserve the interaction
structure found with HLLMs and be used as a data simplification procedure prior
to HLL modeling. A feature of the procedure is that it can easily be applied to
many tables with millions of cells, providing a new way of summarizing large
data sets in many disciplines. The focus is on information and description
rather than statistical testing. The procedure may treat each variable in the
table in different ways, preserving full detail, treating it as fully nominal,
or preserving ordinality.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS175 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
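As a rough illustration of category aggregation, the sketch below works on a plain two-way table rather than the multi-way tables and restricted log-linear models of the paper. It greedily finds the pair of row categories whose merger loses the least of the likelihood-ratio statistic G^2 for independence. The function names (`g2`, `merge_rows`, `best_merge`) and the `ordinal` flag are hypothetical, and this is a simplified stand-in, not the authors' procedure.

```python
import math
from itertools import combinations


def g2(table):
    """Likelihood-ratio statistic G^2 for independence in a two-way table."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    total = 0.0
    for i, r in enumerate(table):
        for j, observed in enumerate(r):
            if observed > 0:  # empty cells contribute nothing
                total += observed * math.log(observed * n / (row[i] * col[j]))
    return 2.0 * total


def merge_rows(table, i, j):
    """Return a new table with rows i < j summed into one row at position i."""
    merged = [a + b for a, b in zip(table[i], table[j])]
    rest = [r for k, r in enumerate(table) if k not in (i, j)]
    return rest[:i] + [merged] + rest[i:]


def best_merge(table, ordinal=False):
    """Find the row pair whose merger loses the least G^2.

    With ordinal=True only adjacent rows may be merged (an assumption made
    here to mimic 'preserving ordinality'). Returns (loss, i, j).
    """
    if ordinal:
        pairs = [(i, i + 1) for i in range(len(table) - 1)]
    else:
        pairs = list(combinations(range(len(table)), 2))
    base = g2(table)
    return min((base - g2(merge_rows(table, i, j)), i, j) for i, j in pairs)


if __name__ == "__main__":
    # Rows 0 and 1 have proportional counts, so merging them is (nearly) free.
    example = [[10, 20], [20, 40], [30, 5]]
    print(best_merge(example))
```

Repeating `best_merge` until the loss exceeds some tolerance gives a sequential aggregation; the choice of tolerance is left open here.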
Solving the 100 Swiss Francs Problem
In 2005, Sturmfels offered 100 Swiss Francs for the resolution of a conjecture
concerning a special case of maximum likelihood estimation for a latent class
model. This paper confirms the conjecture.
DRUG-NEM: Optimizing drug combinations using single-cell perturbation response to account for intratumoral heterogeneity.
An individual malignant tumor is composed of a heterogeneous collection of single cells with distinct molecular and phenotypic features, a phenomenon termed intratumoral heterogeneity. Intratumoral heterogeneity poses challenges for cancer treatment, motivating the need for combination therapies. Single-cell technologies are now available to guide effective drug combinations by accounting for intratumoral heterogeneity through the analysis of the signaling perturbations of an individual tumor sample screened by a drug panel. In particular, Mass Cytometry Time-of-Flight (CyTOF) is a high-throughput single-cell technology that enables the simultaneous measurements of multiple ([Formula: see text]40) intracellular and surface markers at the level of single cells for hundreds of thousands of cells in a sample. We developed a computational framework, entitled Drug Nested Effects Models (DRUG-NEM), to analyze CyTOF single-drug perturbation data for the purpose of individualizing drug combinations. DRUG-NEM optimizes drug combinations by choosing the minimum number of drugs that produce the maximal desired intracellular effects based on nested effects modeling. We demonstrate the performance of DRUG-NEM using single-cell drug perturbation data from tumor cell lines and primary leukemia samples
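The selection criterion in the abstract (fewest drugs, maximal desired effects) can be illustrated with a toy set-cover search. This is only a sketch under strong assumptions: each drug is reduced to a known set of affected markers, whereas DRUG-NEM derives and scores these relationships through nested effects modeling. The function name `minimal_drug_set`, the drug labels and the marker labels are all hypothetical.

```python
from itertools import combinations


def minimal_drug_set(effects):
    """Smallest combination of drugs whose joint effects cover every effect
    that any single drug in the panel produces.

    effects: dict mapping a drug name to the set of markers it perturbs.
    Brute force over subset sizes, so only suitable for small panels.
    """
    target = set().union(*effects.values())
    drugs = sorted(effects)
    for size in range(len(drugs) + 1):
        for combo in combinations(drugs, size):
            covered = set().union(*(effects[d] for d in combo))
            if covered == target:
                return combo
    return tuple(drugs)


if __name__ == "__main__":
    # Drug B's effects are nested inside drug A's, so B adds nothing.
    panel = {"A": {"m1", "m2"}, "B": {"m2"}, "C": {"m3"}}
    print(minimal_drug_set(panel))
```

The nesting of B inside A is what makes B redundant in this toy panel; real data would require estimating that nesting with uncertainty.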
Multiple‐systems analysis for the quantification of modern slavery: classical and Bayesian approaches
Multiple systems estimation is a key approach for quantifying hidden populations such as the number of victims of modern slavery. The UK Government published an estimate of 10,000 to 13,000 victims, constructed by the present author, as part of the strategy leading to the Modern Slavery Act 2015. This estimate was obtained by a stepwise multiple systems method based on six lists. Further investigation shows that a small proportion of the possible models give rather different answers, and that other model fitting approaches may choose one of these. Three data sets collected in the field of modern slavery, together with a data set about the death toll in the Kosovo conflict, are used to investigate the stability and robustness of various multiple systems estimate methods. The crucial aspect is the way that interactions between lists are modelled, because these can substantially affect the results. Model selection and Bayesian approaches are considered in detail, in particular to assess their stability and robustness when applied to real modern slavery data. A new Markov Chain Monte Carlo Bayesian approach is developed; overall, this gives robust and stable results at least for the examples considered. The software and datasets are freely and publicly available to facilitate wider implementation and further research
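For orientation, the simplest instance of multiple systems estimation uses just two lists and assumes they are independent. The sketch below shows the classical Lincoln-Petersen estimator and Chapman's bias-corrected variant. It is not the six-list stepwise or Bayesian method of the paper, and the independence assumption is exactly what the abstract flags as critical when interactions between lists exist. Function names and the example counts are hypothetical.

```python
def lincoln_petersen(n1, n2, m):
    """Two-list population estimate: n1 and n2 are the list sizes and
    m is the number of individuals appearing on both lists."""
    if m == 0:
        raise ValueError("no overlap between lists; estimate is undefined")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-corrected version, defined even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def estimate_from_lists(list_a, list_b):
    """Convenience wrapper taking two collections of identifiers."""
    a, b = set(list_a), set(list_b)
    return chapman(len(a), len(b), len(a & b))


if __name__ == "__main__":
    # Hypothetical counts: 200 on list A, 150 on list B, 30 on both.
    print(lincoln_petersen(200, 150, 30))
    print(chapman(200, 150, 30))
```

The hidden population is the estimate minus the number of distinct individuals actually observed on either list.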
Coded Parity Packet Transmission Method for Two Group Resource Allocation
Gap value control is investigated when the number of source and parity packets
is adjusted in a concatenated coding scheme whilst keeping the overall coding
rate fixed. Packet-based outer codes, generated from bit-wise XOR
combinations of the source packets, are used to adjust the number of both source
and parity packets. Given the source packets, the number of parity packets, which are
bit-wise XOR combinations of the source packets, can be adjusted such that the
gap value, which measures the gap between the theoretical and the required
signal-to-noise ratio (SNR), is controlled without changing the actual coding
rate. Consequently, the required SNR reduces, yielding a lower required energy
to realize the transmission data rate. Integrating this coding technique with
a two-group resource allocation scheme renders efficient utilization of the total
energy to further improve the data rates. With a relatively small-sized set of
discrete data rates, the system throughput achieved by the proposed two-group
loading scheme is observed to be approximately equal to that of the existing
loading scheme, which is operated with a much larger set of discrete data rates.
The gain obtained by the proposed scheme over the existing equal rate and
equal energy loading scheme is approximately 5 dB. Furthermore, a successive
interference cancellation scheme is also integrated with this coding technique,
which can be used to decode and provide consecutive symbols for inter-symbol
interference (ISI) and multiple access interference (MAI) mitigation. With this
integrated scheme, the computational complexity is significantly reduced by
eliminating matrix inversions. In the same manner, the proposed coding scheme
is also incorporated into a novel fixed energy loading scheme, which distributes
packets over parallel channels, to control the gap value of the data rates even
though the SNR varies from one code channel to another.
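The parity construction described in this abstract, bit-wise XOR combinations of source packets, can be sketched as follows. The choice of which source packets enter each parity packet (`combos`) is an assumption made for illustration, as are all function names; the abstract does not specify the combination pattern, the concatenated inner code, or the loading scheme, and none of those are modeled here.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bit-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(source, combos):
    """Build one parity packet per index tuple in combos, each being the
    XOR of the selected source packets."""
    return [reduce(xor_bytes, (source[i] for i in combo)) for combo in combos]


def recover(parity_packet, known_packets):
    """Recover the single missing source packet of a combination by XORing
    the parity packet with the other (known) packets in that combination."""
    return reduce(xor_bytes, known_packets, parity_packet)


def outer_code_rate(num_source, num_parity):
    """Rate of the packet-level outer code: source / (source + parity)."""
    return num_source / (num_source + num_parity)


if __name__ == "__main__":
    source = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = make_parity(source, [(0, 1), (1, 2), (0, 1, 2)])
    # Suppose source[1] is lost; parity[0] and source[0] restore it.
    print(recover(parity[0], [source[0]]) == source[1])
    print(outer_code_rate(len(source), len(parity)))
```

Changing the number of entries in `combos` changes the number of parity packets for a given set of source packets, which is the adjustment the abstract refers to.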
Test of candidate light distributors for the muon (g-2) laser calibration system
The new muon (g-2) experiment E989 at Fermilab will be equipped with a laser
calibration system for all the 1296 channels of the calorimeters. An
integrating sphere and an alternative system based on an engineered diffuser
have been considered as possible light distributors for the experiment. We
present here a detailed comparison of the two based on temporal response,
spatial uniformity, transmittance and time stability.
Comment: accepted to Nucl. Instrum. Meth.