A Bayesian measurement error model for two-channel cell-based RNAi data with replicates
RNA interference (RNAi) is an endogenous cellular process in which small
double-stranded RNAs lead to the destruction of mRNAs with complementary
nucleotide sequence. With the production of RNAi libraries, large-scale RNAi
screening in human cells can be conducted to identify unknown genes involved in
a biological pathway. One challenge researchers face is how to deal with the
multiple testing issue and the related false discovery rate (FDR) and false
negative rate (FNR). This paper proposes a Bayesian hierarchical measurement
error model for the analysis of data from a two-channel RNAi high-throughput
experiment with replicates, in which both the activity of a particular
biological pathway and cell viability are monitored and the goal is to identify
short hair-pin RNAs (shRNAs) that affect the pathway activity without affecting
cell activity. Simulation studies demonstrate the flexibility and robustness of
the Bayesian method and the benefits of having replicates in the experiment.
This method is illustrated by analyzing the data from an RNAi
high-throughput screening that searches for cellular factors affecting HCV
replication without affecting cell viability; comparisons of the results from
this HCV study and some of those reported in the literature are included.
Comment: Published at http://dx.doi.org/10.1214/11-AOAS496 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Detecting Suspicious Behavior in Surveillance Images
We introduce a novel technique to detect anomalies in images. The notion of normalcy is given by a baseline of images, under the assumption that the majority of such images are normal. The key to our approach is a featureless probabilistic representation of images, based on the length of the codeword necessary to represent each image. These codeword lengths are then used for anomaly detection based on statistical testing. Our techniques were tested on synthetic and real data sets. The results show that our approach can achieve high true positive and low false positive rates.
MPRAnalyze: statistical framework for massively parallel reporter assays.
Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by the lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods.
Multiple Instance Learning for Detecting Anomalies over Sequential Real-World Datasets
Detecting anomalies over real-world datasets remains a challenging task. Data
annotation is labor-intensive, particularly in sequential
datasets, where the start and end time of anomalies are not known. As a result,
data collected from sequential real-world processes can be largely unlabeled or
contain inaccurate labels. These characteristics challenge the application of
anomaly detection techniques based on supervised learning. In contrast,
Multiple Instance Learning (MIL) has been shown to be effective on problems with
incomplete knowledge of labels in the training dataset, mainly due to the
notion of bags. While largely under-leveraged for anomaly detection, MIL
provides an appealing formulation for anomaly detection over real-world
datasets, and this formulation is the primary contribution of this paper. We
propose an MIL-based formulation and various algorithmic instantiations of this
framework based on different design decisions for key components of the
framework. We evaluate the resulting algorithms over four datasets that capture
different physical processes along different modalities. The experimental
evaluation draws out several observations. The MIL-based formulation performs
no worse than single-instance learning on easy-to-moderate datasets and
outperforms single-instance learning on more challenging datasets. Altogether,
the results show that the framework generalizes well over diverse datasets
resulting from different real-world application domains.
Comment: 9 pages, 5 figures, Anomaly and Novelty Detection, Explanation and
Accommodation (ANDEA 2022
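The notion of bags can be sketched as follows. This is only an illustration of the standard MIL assumption (a bag is anomalous if at least one of its instances is), not one of the paper's algorithmic instantiations, which the abstract does not describe. The fixed-size windowing, the max-pooling rule, and all names are assumptions.

```python
from typing import Callable, List, Sequence


def make_bags(series: Sequence[float], bag_size: int) -> List[Sequence[float]]:
    """Split a sequence into consecutive, non-overlapping bags of instances."""
    return [series[i:i + bag_size] for i in range(0, len(series), bag_size)]


def bag_score(bag: Sequence[float], instance_score: Callable[[float], float]) -> float:
    """Score a bag by its most anomalous instance (max pooling)."""
    return max(instance_score(x) for x in bag)


def flag_bags(series: Sequence[float], bag_size: int,
              instance_score: Callable[[float], float],
              threshold: float) -> List[bool]:
    """Label each bag as anomalous when its pooled score exceeds the threshold."""
    return [bag_score(bag, instance_score) > threshold
            for bag in make_bags(series, bag_size)]
```

Labels are needed only per bag, so the exact start and end time of an anomaly inside a bag does not have to be annotated.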
Genomic Selection Signatures In Sheep From The Western Pyrenees
Background: The current large spectrum of sheep phenotypic diversity results from the combination of selection for different production traits, such as wool, milk, and meat, and natural adaptation to new environments. In this study, we scanned the genome of 25 Sasi Ardi and 75 Latxa sheep from the Western Pyrenees for three types of regions under selection: (1) regions underlying local adaptation of Sasi Ardi semi-feral sheep, (2) regions related to long-standing traditional dairy selection pressure in Latxa sheep, and (3) regions experiencing the specific effect of the modern genetic improvement program established for the Latxa breed during the last three decades.
Results: Thirty-two selected candidate regions including 147 annotated genes were detected by using three statistical parameters: pooled heterozygosity H, Tajima's D, and Wright's fixation index F_ST. For Sasi Ardi sheep, chromosomes Ovis aries (OAR) 4, 6, and 22 showed the strongest signals and harbored several candidate genes related to energy metabolism and morphology (BBS9, ELOVL3 and LDB1), immunity (NFKB2), and reproduction (H2AFZ). The major genomic difference between Sasi Ardi and Latxa sheep was on OAR6, which is known to affect milk production, with highly selected regions around the ABCG2, SPP1, LAP3, NCAPG, LCORL, and MEPE genes in Latxa sheep. The effect of the modern genetic improvement program on Latxa sheep was also evident on OAR15, on which several olfactory genes are located. We also detected several genes involved in reproduction such as ESR1 and ZNF366 that were affected by this selection program.
Conclusions: Natural and artificial selection have shaped the genomes of both Sasi Ardi and Latxa sheep. Our results suggest that Sasi Ardi traits related to energy metabolism, morphology, reproduction, and immunity have been under positive selection to adapt this semi-feral sheep to its particular environment. Latxa sheep, which have been highly selected for dairy production, showed clear signatures of selection in genomic regions related to milk production. Furthermore, our data indicate that the selection criteria applied in the modern genetic improvement program affect immunity and reproduction traits.
The authors gratefully acknowledge support from the University of the Basque Country (UPV/EHU) and the Conservatoire des Races d'Aquitaine (US13/29
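Of the three statistics named in the Results, Wright's fixation index is the simplest to illustrate. The sketch below assumes a single biallelic locus, two populations with equal weights, and known allele frequencies. It is a textbook form, not the study's windowed genome-wide estimator, and the function name is hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's F_ST at one biallelic locus for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    # Mean expected heterozygosity within the two populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)
    if h_total == 0:
        # The locus is fixed for the same allele in both populations.
        return 0.0
    return (h_total - h_within) / h_total
```

Values near 0 indicate similar allele frequencies in the two populations, and values near 1 indicate that the populations are fixed for different alleles.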