1,255 research outputs found

The quantification of Simpsons paradox and other contributions to contingency table theory

Author: Teuscher Friedrich
Publication date: 07/04/2021

The analysis of contingency tables is a powerful statistical tool used in experiments with categorical variables. This study improves parts of the theory underlying the use of contingency tables. Specifically, the linkage disequilibrium parameter as a measure of two-way interactions applied to three-way tables makes it possible to quantify Simpson's paradox by a simple formula. Among tests for three-way interactions, only one determines whether the partial interactions of all variables agree or whether there is at least one variable whose partial interactions disagree. To date, there has been no test available that determines whether the partial interactions of a certain variable agree or disagree, and the presented work closes this gap. This work reveals the relation between the multiplicative and the additive measure of a three-way interaction. Another contribution addresses the question of which cells in a contingency table are fixed when the first- and second-order marginal totals are given. The proposed procedure detects not only fixed zero counts but also fixed positive counts. This impacts the determination of the degrees of freedom. Furthermore, limitations of methods that simulate contingency tables with given pairwise associations are addressed.

Comment: 36 pages
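To make the idea in the abstract above concrete, the following is a minimal sketch of how a disequilibrium-type parameter can expose Simpson's paradox in a 2 x 2 x 2 table. It assumes the textbook definition D = p11*p22 - p12*p21 for a 2 x 2 table of proportions; the abstract does not state the paper's own formula, so this is not it. The function names and the counts (the widely quoted kidney-stone treatment illustration) are choices made for this example only.

```python
def disequilibrium(table):
    """Textbook disequilibrium D = p11*p22 - p12*p21 for 2x2 counts.

    Assumption: this standard definition stands in for the paper's
    parameter, which the abstract does not spell out.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    return (a * d - b * c) / n**2


def collapse(strata):
    """Sum the 2x2 tables over the stratifying third variable."""
    return [[sum(t[i][j] for t in strata) for j in range(2)] for i in range(2)]


# Rows: treatment A, treatment B. Columns: success, failure.
# Illustrative counts only (kidney-stone example, split by stone size).
small = [[81, 6], [234, 36]]
large = [[192, 71], [55, 25]]

partial = [disequilibrium(t) for t in (small, large)]
marginal = disequilibrium(collapse([small, large]))

# Simpson's paradox here: both partial associations are positive,
# yet the association in the collapsed table is negative.
paradox = all(d > 0 for d in partial) and marginal < 0
print(partial, marginal, paradox)
```

The sign of D in each stratum versus the sign of D in the collapsed table is what signals the reversal; the size of the gap gives one rough way to quantify it.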

arXiv.org e-Print Archive

PubMed Central

Linking, Extending, and Using Existing Software Platforms

Author: Thiele Jan C.
Publication date: 08/12/2014

Georg-August-University Göttingen

Recent advances in directional statistics

Author: García-Portugués Eduardo
Pewsey Arthur
Publication date: 22/09/2020

Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments are discussed.

Comment: 61 pages

arXiv.org e-Print Archive

Crossref

Universidad Carlos III de Madrid e-Archivo

Design of Experiments for Screening

Author: A Boukouvalas
A Marrel
A Miller
A Saltelli
A.E. Vine
AB Owen
AC Atkinson
AM Dean
B Abraham
B Bettonvil
B. Tang
BA Jones
C Daniel
C Linkletter
C.F.J. Wu
CA Mauro
CE Rasmussen
CJ Marley
CR Rao
CS Cheng
D Draguljić
D Dupuy
D Scott-Drechsel
D. Xing
D.T. Voss
DA Bulutoglu
DJ Finney
DKJ Lin
EI George
F Campolongo
F Satterthwaite
FKH Phoa
G Damblin
G Pujol
G.S. Watson
GEP Box
GM James
H Moon
H. Wan
H. Xu
H. Yang
H.B.E. Wan
HA Chipman
JL Loeppky
JPC Kleijnen
K.Q. Ye
KHV Booth
KJ Ryan
KP Burnham
KT Fang
L Pronzato
L. Xiao
M Claeys-Bruno
M Hamada
M Johnson
M Liu
M.A. Wolters
MD McKay
MD Morris
MJ Hall
N Durrande
NA Butler
NK Nguyen
PR Scinto
PZG Qian
R Dorfman
R Jin
R Joseph
RB Gramacy
RK Meyer
RL Iman
RL Plackett
RV Lenth
S Ba
SC Cotter
SG Gilmour
SM Lewis
TJ Santner
VE Bowman
W DuMouchel
W Li
W.J. Welch
WA Brenneman
WW Li
X Qu
Y Benjamini
Y Liu
Publication date: 18/10/2015

The aim of this paper is to review methods of designing screening experiments, ranging from designs originally developed for physical experiments to those especially tailored to experiments on numerical models. The strengths and weaknesses of the various designs for screening variables in numerical models are discussed. First, classes of factorial designs for experiments to estimate main effects and interactions through a linear statistical model are described, specifically regular and nonregular fractional factorial designs, supersaturated designs and systematic fractional replicate designs. Generic issues of aliasing, bias and cancellation of factorial effects are discussed. Second, group screening experiments are considered, including factorial group screening and sequential bifurcation. Third, random sampling plans are discussed, including Latin hypercube sampling and sampling plans to estimate elementary effects. Fourth, a variety of modelling methods commonly employed with screening designs are briefly described. Finally, a novel study demonstrates six screening methods on two frequently used exemplars, and their performances are compared.
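Of the sampling plans named in the abstract above, Latin hypercube sampling is the easiest to illustrate. The sketch below is a generic, minimal version on the unit cube, assuming one uniformly placed point per stratum and independent random permutations per factor; it is not taken from the paper, and the function name and parameters are hypothetical.

```python
import random


def latin_hypercube(n_runs, n_factors, seed=0):
    """Return n_runs points in [0, 1)^n_factors forming a Latin hypercube.

    Each factor's range is cut into n_runs equal strata, and every
    stratum is used exactly once per factor.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_factors):
        strata = list(range(n_runs))
        rng.shuffle(strata)  # independent permutation for each factor
        # Place one point uniformly at random inside each stratum.
        columns.append([(s + rng.random()) / n_runs for s in strata])
    # Transpose columns (factors) into rows (runs).
    return [list(row) for row in zip(*columns)]


design = latin_hypercube(n_runs=8, n_factors=3)
for run in design:
    print(["%.3f" % x for x in run])
```

Projecting the design onto any single factor gives exactly one run in each of the eight intervals, which is the property that makes the plan attractive for screening many inputs with few runs.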

arXiv.org e-Print Archive

Crossref

Math inside : surprising mathematics

Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2008

Pure OAI Repository

A New Take on John Maynard Smith's Concept of Protein Space for Understanding Molecular Evolution

Author: Hartl Daniel
Ogbunugafor C. Brandon
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/10/2016

Much of the public lacks a proper understanding of Darwinian evolution, a problem that can be addressed with new learning and teaching approaches to be implemented both inside the classroom and in less formal settings. Few analogies have been as successful in communicating the basics of molecular evolution as John Maynard Smith’s protein space analogy (1970), in which he compared protein evolution to the transition between the terms WORD and GENE, changing one letter at a time to yield a different, meaningful word (in his example, the preferred path was WORD → WORE → GORE → GONE → GENE). Using freely available computer science tools (Google Books Ngram Viewer), we offer an update to Maynard Smith’s analogy and explain how it might be developed into an exploratory and pedagogical device for understanding the basics of molecular evolution and, more specifically, the adaptive landscape concept. We explain how the device works through several examples and provide resources that might facilitate its use in multiple settings, ranging from public engagement activities to formal instruction in evolution, population genetics, and computational biology.
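The one-letter-at-a-time rule in the analogy above can be checked mechanically. The sketch below only verifies the structure of the path quoted in the abstract (each step changes exactly one letter); it does not test whether the intermediate words are meaningful, which would need a dictionary or the Ngram data the authors mention. The helper names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_single_step_path(path):
    """True if every consecutive pair of words differs in exactly one letter."""
    return all(hamming(a, b) == 1 for a, b in zip(path, path[1:]))


# Path quoted in the abstract.
path = ["WORD", "WORE", "GORE", "GONE", "GENE"]

valid = is_single_step_path(path)
steps = len(path) - 1
distance = hamming(path[0], path[-1])
print(valid, steps, distance)
```

Because WORD and GENE differ in all four positions, a four-step path is the shortest possible: every step has to fix a different letter.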

DSpace@MIT

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

FigShare

From Biology to Mathematical Models and Back: Teaching Modeling to Biology Students, and Biology to Math and Engineering Students

Author: Ashcraft M. H.
Beer R. D.
Bihl F.
Britton N. F.
Chiel H. J.
Dahari H.
Dreger R. M.
Friel D. D.
Froyd J. E.
Ideker T.
Inada Y.
Jeong H.
Kollias S.
Kruger J.
Menke N. B.
Osborne J.
Pearson P. D.
Rinzel J.
Riviere B.
Tibbles P. M.
Tobias S.
Tyson J. J.
Ysseldyke J.
Publication venue: American Society for Cell Biology

We describe the development of a course to teach modeling and mathematical analysis skills to students of biology and to teach biology to students with strong backgrounds in mathematics, physics, or engineering. The two groups of students have different ways of learning material and often have strong negative feelings toward the area of knowledge that they find difficult. To give students a sense of mastery in each area, several complementary approaches are used in the course: 1) a “live” textbook that allows students to explore models and mathematical processes interactively; 2) benchmark problems providing key skills on which students make continuous progress; 3) assignment of students to teams of two throughout the semester; 4) regular one-on-one interactions with instructors throughout the semester; and 5) a term project in which students reconstruct, analyze, extend, and then write in detail about a recently published biological model. Based on student evaluations and comments, an attitude survey, and the quality of the students' term papers, the course has significantly increased the ability and willingness of biology students to use mathematical concepts and modeling tools to understand biological systems, and it has significantly enhanced engineering students' appreciation of biology.

Crossref

PubMed Central

Defining the Identity and Dynamics of Adult Gastric Isthmus Stem Cells.

Author: Andersson-Rolf Amanda
Basak Onur
Chatzeli Lemonia
Clevers Hans
Dabrowska Catherine
Fink Juergen
Han Seungmin
Josserand Manon
Jörg David J
Kim Hyunki
Kim Jong Kyoung
Koo Bon-Kyoung
Lee Eunmin
Lee Ji-Hyun
Merker Sebastian R
Mort Richard Lester
Naumann Ronald
Philpott Anna
Sasaki Nobuo
Simons Benjamin D
Stange Daniel E
Trendafilova Teodora
Yum Min Kyu
Publication venue: Cell Stem Cell
Publication date: 01/09/2019

The gastric corpus epithelium is the thickest part of the gastrointestinal tract and is rapidly turned over. Several markers have been proposed for gastric corpus stem cells in both isthmus and base regions. However, the identity of isthmus stem cells (IsthSCs) and the interaction between distinct stem cell populations are still under debate. Here, based on unbiased genetic labeling and biophysical modeling, we show that corpus glands are compartmentalized into two independent zones, with slow-cycling stem cells maintaining the base and actively cycling stem cells maintaining the pit-isthmus-neck region through a process of "punctuated" neutral drift dynamics. Independent lineage tracing based on Stmn1 and Ki67 expression confirmed that rapidly cycling IsthSCs maintain the pit-isthmus-neck region. Finally, single-cell RNA sequencing (RNA-seq) analysis is used to define the molecular identity and lineage relationship of a single, cycling, IsthSC population. These observations define the identity and functional behavior of IsthSCs.

Funding: Wellcome Trust, Royal Society

Apollo (Cambridge)

Lancaster E-Prints

MPG.PuRe

DGIST Library Institutional Repository

General methods for evolutionary quantitative genetic inference from generalized mixed models

Author: de Villemereuil Pierre
Morrissey Michael
Nakagawa Shinichi
Schielzeth Holger
Publication venue: 'Genetics Society of America'
Publication date: 02/09/2016

P.d.V. was supported by a doctoral studentship from the French Ministère de la Recherche et de l’Enseignement Supérieur. H.S. was supported by an Emmy Noether fellowship from the German Research Foundation (SCHI 1188/1-1). S.N. is supported by a Future Fellowship, Australia (FT130100268). M.M. is supported by a University Research Fellowship from the Royal Society (London). The collection of the Soay sheep data is supported by the National Trust for Scotland and QinetiQ, with funding from the Natural Environment Research Council, the Royal Society, and the Leverhulme Trust.

Methods for inference and interpretation of evolutionary quantitative genetic parameters, and for prediction of the response to selection, are best developed for traits with normal distributions. Many traits of evolutionary interest, including many life history and behavioural traits, have inherently non-normal distributions. The generalised linear mixed model (GLMM) framework has become a widely used tool for estimating quantitative genetic parameters for non-normal traits. However, whereas GLMMs provide inference on a statistically convenient latent scale, it is often desirable to express quantitative genetic parameters on the scale upon which traits are measured. The parameters of fitted GLMMs, despite being on a latent scale, fully determine all quantities of potential interest on the scale on which traits are expressed. We provide expressions for deriving each of these quantities, including population means, phenotypic (co)variances, variance components including additive genetic (co)variances, and parameters such as heritability. We demonstrate that fixed effects have a strong impact on those parameters and show how to deal with this by averaging or integrating over fixed effects. The expressions require integration of quantities determined by the link function, over distributions of latent values. In general cases, the required integrals must be solved numerically, but efficient methods are available and we provide an implementation in an R package, QGglmm. We show that known formulae for quantities such as heritability of traits with Binomial and Poisson distributions are special cases of our expressions. Additionally, we show how fitted GLMMs can be incorporated into existing methods for predicting evolutionary trajectories. We demonstrate the accuracy of the resulting method for evolutionary prediction by simulation, and apply our approach to data from a wild pedigreed vertebrate population.

Publisher PDF. Peer reviewed.
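The abstract's point that observed-scale quantities are integrals of link-function terms over the latent distribution can be illustrated with the simplest case, the population mean under a log link. The sketch below assumes a normally distributed latent value with mean `mu` and total latent variance `var`, integrates the inverse link numerically with a plain trapezoid rule, and compares the result with the well-known closed form exp(mu + var/2). It is written in Python for illustration, it does not reproduce the API of the authors' R package, and all names and parameter values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def observed_mean(inverse_link, mu, var, n=20001, width=10.0):
    """Trapezoid-rule estimate of E[inverse_link(l)] for l ~ N(mu, var).

    Assumption: truncating the integral at mu +/- width * sd is adequate
    for the chosen link and variance.
    """
    sd = math.sqrt(var)
    lo, hi = mu - width * sd, mu + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * inverse_link(x) * normal_pdf(x, mu, sd)
    return total * h


mu, var = 1.0, 0.25  # illustrative latent mean and variance
numeric = observed_mean(math.exp, mu, var)
closed_form = math.exp(mu + var / 2.0)  # mean of a log-normal variable
print(numeric, closed_form)
```

For links without a closed form, such as the logit, the same function can be called with a different `inverse_link`; only the log-link check against the closed form is shown here.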

Hal - Université Grenoble Alpes

PubMed Central

HAL Université de Savoie

University of St. Andrews - Pure

St Andrews Research Repository