Empowering statistical methods for cellular and molecular biologists.
We provide guidelines for using statistical methods to analyze the types of experiments reported in cellular and molecular biology journals such as Molecular Biology of the Cell. Our aim is to help experimentalists use these methods skillfully, avoid mistakes, and extract the maximum amount of information from their laboratory work. We focus on comparing the average values of control and experimental samples. A Supplemental Tutorial provides examples of how to analyze experimental data using R software.
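The comparison of control and experimental means that the abstract highlights can be sketched with a two-sample t test. The paper's tutorial uses R; the sketch below is a language-neutral Python version with made-up measurements, using a pooled-variance t statistic.

```python
import math
from statistics import mean, variance

def pooled_two_sample_t(control, treated):
    """Two-sample t statistic with a pooled variance estimate."""
    n1, n2 = len(control), len(treated)
    sp2 = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(treated) - mean(control)) / se

# Hypothetical measurements for a control and an experimental sample.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [6.0, 7.0, 8.0, 9.0, 10.0]

t = pooled_two_sample_t(control, treated)
# With n1 + n2 - 2 = 8 degrees of freedom, the two-sided 5% critical
# value is about 2.306, so |t| above that rejects equality of means.
print(t)  # 5.0 for these data
```

For these toy data the pooled variance is 2.5, the standard error is 1.0, and t = 5.0, well past the 5% cutoff.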
motifDiverge: a model for assessing the statistical significance of gene regulatory motif divergence between two DNA sequences
Next-generation sequencing technology enables the identification of thousands of gene regulatory sequences in many cell types and organisms. We consider the problem of testing if two such sequences differ in their number of binding site motifs for a given transcription factor (TF) protein. Binding site motifs impart regulatory function by providing TFs the opportunity to bind to genomic elements and thereby affect the expression of nearby genes. Evolutionary changes to such functional DNA are hypothesized to be major contributors to phenotypic diversity within and between species, but despite the importance of TF motifs for gene expression, no method exists to test for motif loss or gain. Assuming that motif counts are binomially distributed, and allowing for dependencies between motif instances in evolutionarily related sequences, we derive the probability mass function of the difference in motif counts between two nucleotide sequences. We provide a method to numerically estimate this distribution from genomic data and show through simulations that our estimator is accurate. Finally, we introduce the R package motifDiverge, which implements our methodology, and illustrate its application to gene regulatory enhancers identified by a mouse developmental time course experiment. While this study was motivated by the analysis of regulatory motifs, our results can be applied to any problem involving two correlated Bernoulli trials.
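The quantity the abstract models, the difference in motif counts between two dependent sequences, can be illustrated by Monte Carlo simulation. This is not the package's estimator; it is a toy construction in which each pair of sites shares a single Bernoulli draw with some probability (which makes the pairwise correlation equal to that sharing probability), and the probability mass function of the count difference is estimated empirically.

```python
import random
from collections import Counter

random.seed(1)

def correlated_bernoulli_pair(p, share):
    """One pair of Bernoulli(p) draws; with probability `share` the two
    sequences reuse the same draw, mimicking dependence between motif
    instances in evolutionarily related sequences."""
    if random.random() < share:
        z = int(random.random() < p)
        return z, z
    return int(random.random() < p), int(random.random() < p)

def simulate_count_difference(n_sites, p, share, n_sims=20000):
    """Monte Carlo estimate of the PMF of X - Y, where X and Y are motif
    counts of two sequences with n_sites potential sites each."""
    counts = Counter()
    for _ in range(n_sims):
        d = 0
        for _ in range(n_sites):
            x, y = correlated_bernoulli_pair(p, share)
            d += x - y
        counts[d] += 1
    return {d: c / n_sims for d, c in sorted(counts.items())}

pmf = simulate_count_difference(n_sites=10, p=0.3, share=0.5)
```

Under this symmetric construction the estimated distribution is centered near zero and supported on {-n_sites, ..., n_sites}; the paper derives the exact PMF rather than simulating it.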
Failure of an Educational Intervention to Improve Consultation and Implications for Healthcare Consultation.
INTRODUCTION:
Consultation of another physician for his or her specialized expertise regarding a patient's care is a common occurrence in most physicians' daily practice, especially in the emergency department (ED). Therefore, the ability to communicate effectively with another physician during a patient consultation is an essential skill. However, there has been limited research on standardized methods for physician-to-physician consultation, and little guidance on teaching consultation skills to physicians in training. The objective of our study was to measure the effect of a structured consultation intervention on both content standardization and quality of medical student consultations.
METHODS:
Senior medical students were assessed on a required emergency medicine rotation with a physician phone consultation during a standardized, simulated chest pain case. The intervention groups received a standard consult checklist as part of their orientation to the rotation, followed by a video recording of a good consult call and a bad consult call with commentary from an emergency physician. The intervention was given to students every other month, alternating with a control group who received no additional education. Recordings were reviewed by three second-year internal medicine residents pursuing a fellowship in cardiology. Each recording was evaluated by two of the three reviewers and scored using a standardized checklist.
RESULTS:
Providing a standardized consultation intervention did not improve students' ability to communicate with consultants. In addition, there was variability between evaluators with regard to how they received the same information and how they perceived the quality of the same recorded consultation calls. Evaluator inter-rater reliability (IRR) was poor on two questions: 1) would you have any other questions for the student calling the consult, and 2) did the student calling the consult provide an accurate account of the information and case details. The IRR was also poor on objective data, such as whether the student stated their name.
CONCLUSIONS:
A brief intervention may not be enough to change complex behavior such as physician-to-physician consultation communication. Importantly, even though consultants listened to the same audio recordings, they processed the information differently. Future investigations should focus on both those delivering and those receiving a consultation.
SIRT1 and SIRT3 deacetylate homologous substrates: AceCS1,2 and HMGCS1,2.
SIRT1 and SIRT3 are NAD+-dependent protein deacetylases that are evolutionarily conserved across mammals. These proteins are located in the cytoplasm/nucleus and mitochondria, respectively. Previous reports demonstrated that human SIRT1 deacetylates Acetyl-CoA Synthase 1 (AceCS1) in the cytoplasm, whereas SIRT3 deacetylates the homologous Acetyl-CoA Synthase 2 (AceCS2) in the mitochondria. We recently showed that 3-hydroxy-3-methylglutaryl CoA synthase 2 (HMGCS2) is deacetylated by SIRT3 in mitochondria, and we demonstrate here that SIRT1 deacetylates the homologous 3-hydroxy-3-methylglutaryl CoA synthase 1 (HMGCS1) in the cytoplasm. This novel pattern of substrate homology between cytoplasmic SIRT1 and mitochondrial SIRT3 suggests that considering evolutionary relationships between the sirtuins and their substrates may help to identify and understand the functions and interactions of this gene family. In this perspective, we take a first step by characterizing the evolutionary history of the sirtuins and these substrate families.
A New Partitioning Around Medoids Algorithm
Kaufman & Rousseeuw (1990) proposed a clustering algorithm, Partitioning Around Medoids (PAM), which maps a distance matrix into a specified number of clusters. A particularly nice property is that PAM allows clustering with respect to any specified distance metric. In addition, the medoids are robust representations of the cluster centers, which is particularly important in the common context that many elements do not belong well to any cluster. Based on our experience in clustering gene expression data, we have noticed that PAM does have problems recognizing relatively small clusters in situations where good partitions around medoids clearly exist. In this note, we propose to partition around medoids by maximizing the "Average Silhouette" criterion defined by Kaufman & Rousseeuw. We also propose a fast-to-compute approximation of the Average Silhouette. We implement these two new partitioning around medoids algorithms and illustrate their performance relative to existing partitioning methods in simulations.
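The criterion the note proposes to maximize can be illustrated with a brute-force toy version: enumerate every candidate set of k medoids, assign each point to its nearest medoid, and keep the medoid set with the highest average silhouette. This exhaustive search is only feasible for tiny data sets (the note proposes faster procedures and an approximation); the 1-D points below are invented.

```python
import itertools

def assign(points, medoids, dist):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda m: dist(p, points[medoids[m]]))
            for p in points]

def avg_silhouette(points, labels, dist):
    """Average silhouette width of a clustering (Kaufman & Rousseeuw):
    s(i) = (b - a) / max(a, b), with a the mean within-cluster distance
    and b the smallest mean distance to another cluster."""
    n = len(points)
    total = 0.0
    for i in range(n):
        same = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not same:        # convention: singleton clusters score 0
            continue
        a = sum(dist(points[i], points[j]) for j in same) / len(same)
        b = min(
            sum(dist(points[i], points[j]) for j in grp) / len(grp)
            for lab in set(labels) if lab != labels[i]
            for grp in [[j for j in range(n) if labels[j] == lab]]
        )
        total += (b - a) / max(a, b)
    return total / n

def pam_max_silhouette(points, k, dist):
    """Exhaustively pick the k medoids maximizing average silhouette."""
    best_score, best_labels = -1.0, None
    for medoids in itertools.combinations(range(len(points)), k):
        labels = assign(points, medoids, dist)
        score = avg_silhouette(points, labels, dist)
        if score > best_score:
            best_score, best_labels = score, labels
    return best_score, best_labels

# Three obvious groups on the real line; any sensible criterion recovers them.
points = [1.0, 1.1, 1.2, 5.0, 5.1, 9.0, 9.2, 9.4]
score, labels = pam_max_silhouette(points, 3, dist=lambda a, b: abs(a - b))
```

On these points the maximizer is the natural three-group partition, with an average silhouette close to 1.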
Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate
The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise error rate (FWER): the first procedure is based on maxima of test statistics (step-down maxT), while the second relies on minima of unadjusted p-values (step-down minP). A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which the step-down maxT and minP procedures asymptotically control the Type I error rate, for arbitrary data generating distributions, without the need for conditions such as subset pivotality. Inspired by this general characterization of a null distribution, we then propose as an explicit null distribution the asymptotic distribution of the vector of null-value shifted and scaled test statistics. Step-down procedures based on consistent estimators of the null distribution are shown to also provide asymptotic control of the Type I error rate. A general bootstrap algorithm is supplied to conveniently obtain consistent estimators of the null distribution.
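The step-down maxT idea can be sketched on simulated one-sample tests. The sketch below follows the recipe at a high level, bootstrapping column-centered data to approximate a null distribution of the test statistics (in the spirit of the null-value shifted statistics), then computing adjusted p-values from successive maxima over the not-yet-rejected hypotheses. The data, sample sizes, and number of bootstrap replicates are all invented for illustration.

```python
import random
from statistics import mean, stdev

random.seed(2)

# Simulated data: n observations of m variables; the first two variables
# carry a true mean shift, the remaining three are null.
n, m, B = 40, 5, 500
effects = [1.0, 1.0, 0.0, 0.0, 0.0]
data = [[random.gauss(effects[j], 1.0) for j in range(m)] for _ in range(n)]

def t_stats(rows):
    """One-sample t statistics for H0: mean = 0, one per column."""
    cols = list(zip(*rows))
    return [mean(c) / (stdev(c) / len(c) ** 0.5) for c in cols]

t_obs = t_stats(data)

# Null distribution: nonparametric bootstrap of column-centered data,
# so each bootstrap statistic is computed under a mean-zero shift.
col_means = [mean(c) for c in zip(*data)]
centered = [[row[j] - col_means[j] for j in range(m)] for row in data]
null_ts = []
for _ in range(B):
    sample = [centered[random.randrange(n)] for _ in range(n)]
    null_ts.append([abs(t) for t in t_stats(sample)])

# Step-down maxT adjusted p-values: order hypotheses by |t|, compare each
# observed |t| to the bootstrap maximum over the remaining hypotheses,
# and enforce monotonicity of the adjusted p-values.
order = sorted(range(m), key=lambda j: -abs(t_obs[j]))
adj_p = [0.0] * m
prev = 0.0
for rank, j in enumerate(order):
    rest = order[rank:]
    exceed = sum(max(ts[k] for k in rest) >= abs(t_obs[j]) for ts in null_ts)
    prev = max(prev, exceed / B)
    adj_p[j] = prev
```

With these settings the two shifted variables get adjusted p-values near zero while the null variables do not, the qualitative behavior FWER control is meant to deliver.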
Resampling-based Multiple Testing: Asymptotic Control of Type I Error and Applications to Gene Expression Data
We define a general statistical framework for multiple hypothesis testing and show that the correct null distribution for the test statistics is obtained by projecting the true distribution of the test statistics onto the space of mean zero distributions. For common choices of test statistics (based on an asymptotically linear parameter estimator), this distribution is asymptotically multivariate normal with mean zero and the covariance of the vector influence curve for the parameter estimator. This test statistic null distribution can be estimated by applying the non-parametric or parametric bootstrap to correctly centered test statistics. We prove that this bootstrap estimated null distribution provides asymptotic control of most type I error rates. We show that obtaining a test statistic null distribution from a data null distribution (e.g., by projecting the data generating distribution onto the space of all distributions satisfying the complete null) only provides the correct test statistic null distribution if the covariance of the vector influence curve is the same under the data null distribution as under the true data distribution. This condition is a weak version of the subset pivotality condition. We show that our multiple testing methodology controlling type I error is equivalent to constructing an error-specific confidence region for the true parameter and checking if it contains the hypothesized value. We also study the two sample problem and show that the permutation distribution produces an asymptotically correct null distribution if (i) the sample sizes are equal or (ii) the populations have the same covariance structure. We include a discussion of the application of multiple testing to gene expression data, where the dimension typically far exceeds the sample size. An analysis of a cancer gene expression data set illustrates the methodology.
Supervised Distance Matrices: Theory and Applications to Genomics
We propose a new approach to studying the relationship between a very high dimensional random variable and an outcome. Our method is based on a novel concept, the supervised distance matrix, which quantifies pairwise similarity between variables based on their association with the outcome. A supervised distance matrix is derived in two stages. The first stage involves a transformation based on a particular model for association. In particular, one might regress the outcome on each variable and then use the residuals or the influence curve from each regression as a data transformation. In the second stage, a choice of distance measure is used to compute all pairwise distances between variables in this transformed data. When the outcome is right-censored, we show that the supervised distance matrix can be consistently estimated using inverse probability of censoring weighted (IPCW) estimators based on the mean and covariance of the transformed data. The proposed methodology is illustrated with examples of gene expression data analysis with a survival outcome. This approach is widely applicable in genomics and other fields where high-dimensional data are collected on each subject.
