    Rediscovering a little known fact about the t-test and the F-test: Algebraic, Geometric, Distributional and Graphical Considerations

    We discuss the role that the null hypothesis should play in the construction of a test statistic used to make a decision about that hypothesis. To construct the test statistic for a point null hypothesis about a binomial proportion, a common recommendation is to act as if the null hypothesis is true. We argue that, on the surface, the one-sample t-test of a point null hypothesis about a Gaussian population mean does not appear to follow the recommendation. We show how simple algebraic manipulations of the usual t-statistic lead to an equivalent test procedure consistent with the recommendation. We provide geometric intuition regarding this equivalence and we consider extensions to testing nested hypotheses in Gaussian linear models. We discuss an application to graphical residual diagnostics where the form of the test statistic makes a practical difference. By examining the formulation of the test statistic from multiple perspectives in this familiar example, we provide simple, concrete illustrations of some important issues that can guide the formulation of effective solutions to more complex statistical problems. Comment: 22 pages, 5 figure
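A minimal numerical sketch of the kind of equivalence this abstract describes, under assumptions that are not taken from the paper: the "null-based" statistic is assumed to replace the sample standard deviation with the root mean squared deviation about the hypothesised mean `mu0`. The function name `t_statistics` and the simulated data are hypothetical. With that choice the two statistics are monotone functions of each other, via t0^2 = n t^2 / (n - 1 + t^2), so they order samples identically.

```python
import numpy as np


def t_statistics(x, mu0):
    """Return (t, t0): the usual one-sample t-statistic and a variant
    whose scale estimate is computed as if the null mean mu0 were true.
    The variant's form is an assumption made for illustration."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x.mean() - mu0                          # observed deviation from mu0
    s = x.std(ddof=1)                           # usual sample standard deviation
    sigma0 = np.sqrt(np.mean((x - mu0) ** 2))   # scale estimated about mu0
    t = np.sqrt(n) * d / s
    t0 = np.sqrt(n) * d / sigma0
    return t, t0


# Simulated data (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(loc=0.3, scale=1.0, size=20)
n = x.size

t, t0 = t_statistics(x, mu0=0.0)

# Check the algebraic link between the two statistics.
identity_gap = abs(t0**2 - n * t**2 / (n - 1 + t**2))
```

Because the map from t to t0 is strictly increasing, a rejection region defined on one statistic translates directly into a rejection region on the other.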

    On the variations of the principal eigenvalue with respect to a parameter in growth-fragmentation models

    We study the variations of the principal eigenvalue associated with a growth-fragmentation-death equation with respect to a parameter acting on growth and fragmentation. To this end, we use the probabilistic individual-based interpretation of the model. We study the variations of the survival probability of the stochastic model, using a generation-by-generation approach. Then, making use of the link between the survival probability and the principal eigenvalue established in a previous work, we deduce the variations of the eigenvalue with respect to the parameter of the model.

    Generalised Reichenbachian common cause systems

    The principle of the common cause claims that if an improbable coincidence has occurred, there must exist a common cause. This is generally taken to mean that positive correlations between non-causally related events should disappear when conditioning on the action of some underlying common cause. The extended interpretation of the principle, by contrast, urges that common causes should be called for in order to explain positive deviations between the estimated correlation of two events and the expected value of their correlation. The aim of this paper is to provide the extended reading of the principle with a general probabilistic model, capturing the simultaneous action of a system of multiple common causes. To this end, two distinct models are elaborated, and the necessary and sufficient conditions for their existence are determined.

    Marginal AMP Chain Graphs

    We present a new family of models that is based on graphs that may have undirected, directed and bidirected edges. We name these new models marginal AMP (MAMP) chain graphs because each of them is Markov equivalent to some AMP chain graph under marginalization of some of its nodes. However, MAMP chain graphs subsume not only AMP chain graphs but also multivariate regression chain graphs. We describe global and pairwise Markov properties for MAMP chain graphs and prove their equivalence for compositional graphoids. We also characterize when two MAMP chain graphs are Markov equivalent. For Gaussian probability distributions, we also show that every MAMP chain graph is Markov equivalent to some directed and acyclic graph with deterministic nodes under marginalization and conditioning on some of its nodes. This is important because it implies that the independence model represented by a MAMP chain graph can be accounted for by some data generating process that is partially observed and has selection bias. Finally, we modify MAMP chain graphs so that they are closed under marginalization for Gaussian probability distributions. This is a desirable feature because it guarantees parsimonious models under marginalization. Comment: Changes from v1 to v2: Discussion section got extended. Changes from v2 to v3: New Sections 3 and 5. Changes from v3 to v4: Example 4 added to discussion section. Changes from v4 to v5: None. Changes from v5 to v6: Some minor and major errors have been corrected. The latter include the definitions of descending route and pairwise separation base, and the proofs of Theorems 5 and

    Randomizing world trade. I. A binary network analysis

    The international trade network (ITN) has received renewed multidisciplinary interest due to recent advances in network theory. However, it is still unclear whether a network approach conveys additional, nontrivial information with respect to traditional international-economics analyses that describe world trade only in terms of local (first-order) properties. In this and in a companion paper, we employ a recently proposed randomization method to assess in detail the role that local properties have in shaping higher-order patterns of the ITN in all its possible representations (binary or weighted, directed or undirected, aggregated or disaggregated by commodity) and across several years. Here we show that, remarkably, the properties of all binary projections of the network can be completely traced back to the degree sequence, which is therefore maximally informative. Our results imply that explaining the observed degree sequence of the ITN, which has not received particular attention in economic theory, should instead become one of the main focuses of models of trade.

    ceylon: An R package for plotting the maps of Sri Lanka

    The rapid evolution in the fields of computer science, data science, and artificial intelligence has significantly transformed the utilisation of data for decision-making. Data visualisation plays a critical role in any work that involves data. Visualising data on maps is frequently encountered in many fields. Visualising data on maps not only transforms raw data into visually comprehensible representations but also converts complex spatial information into simple, understandable form. Locating the data files necessary for map creation can be a challenging task. Establishing a centralised repository can alleviate the challenging task of finding shape files, allowing users to efficiently discover geographic data. The ceylon R package is designed to make simple feature data related to Sri Lanka's administrative boundaries and rivers and streams accessible for a diverse range of R users. With straightforward functionalities, this package allows users to quickly plot and explore administrative boundaries and rivers and streams in Sri Lanka.

    The loss value of multilinear regression

    A formula for the Euclidean distance between a point and a linear subspace is presented. As a consequence, a formula for determinants of positive semidefinite Hermitian matrices is derived, along with a formula for the loss value of multilinear regression. Comment: 3 pages. arXiv admin note: text overlap with arXiv:1408.592
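For orientation, one classical way to compute the distance from a point to a linear subspace uses Gram determinants: the squared distance equals det G(b_1, ..., b_k, x) / det G(b_1, ..., b_k) for a linearly independent spanning set b_1, ..., b_k. Whether this is the formula the paper presents is an assumption, and the helper names `gram` and `distance_to_span` are hypothetical. The sketch below handles real vectors only and cross-checks the result against the least-squares residual, which is the loss value of the corresponding regression.

```python
import numpy as np


def gram(vectors):
    """Gram matrix of real vectors given as the rows of a 2-D array."""
    V = np.asarray(vectors, dtype=float)
    return V @ V.T


def distance_to_span(x, basis):
    """Distance from x to span(basis) via the ratio of Gram determinants.
    Assumes the rows of `basis` are linearly independent."""
    B = np.asarray(basis, dtype=float)
    det_basis = np.linalg.det(gram(B))
    det_extended = np.linalg.det(gram(np.vstack([B, x])))
    # Clamp tiny negative values caused by floating-point rounding.
    return np.sqrt(max(det_extended / det_basis, 0.0))


# Illustrative data: the basis spans the plane z = 0 in R^3.
basis = np.array([[1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0]])
x = np.array([2.0, -1.0, 3.0])

d_gram = distance_to_span(x, basis)

# Cross-check: norm of the least-squares residual when regressing x
# on the basis vectors (taken as columns of the design matrix).
coef, *_ = np.linalg.lstsq(basis.T, x, rcond=None)
d_lstsq = np.linalg.norm(x - basis.T @ coef)
```

The determinant ratio is convenient for small problems but can be numerically fragile when the basis is nearly dependent; a QR- or least-squares-based computation is usually the more stable choice.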

    Discussion of ‘Nonparametric generalized fiducial inference for survival functions under censoring’

    The following discussion is inspired by the paper ‘Nonparametric generalized fiducial inference for survival functions under censoring’ by Cui and Hannig. The discussion consists of comments on the results, but also indicates its importance more generally in the context of fiducial inference. A two-page introduction to fiducial inference is given to provide a context. Accepted version; locked until 12.8.2020 due to copyright restrictions. This is a pre-copyedited, author-produced version of an article accepted for publication in Biometrika following peer review. The version of record is available online at: https://doi.org/10.1093/biomet/asz02