Pooling designs with surprisingly high degree of error correction in a finite vector space
Pooling designs are standard experimental tools in many biotechnical
applications. It is well-known that all famous pooling designs are constructed
from mathematical structures by the "containment matrix" method. In particular,
Macula's designs (resp. Ngo and Du's designs) are constructed by the
containment relation of subsets (resp. subspaces) in a finite set (resp. vector
space). Recently, we generalized Macula's designs and obtained a family of
pooling designs with a higher degree of error correction by subsets in a
finite set. In this paper, as a generalization of Ngo and Du's designs, we
study the corresponding problems in a finite vector space and obtain a family
of pooling designs with surprisingly high degree of error correction. Our
designs and Ngo and Du's designs have the same number of items and pools,
respectively, but the error-tolerant property is much better than that of Ngo
and Du's designs, which was given by D'yachkov et al. \cite{DF}, when the
dimension of the space is large enough.
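The "containment matrix" method mentioned in this abstract can be illustrated with the subset version attributed to Macula: pools are indexed by d-subsets of an n-set, items by k-subsets, and an entry is 1 when the d-subset is contained in the k-subset. The sketch below is only that baseline construction, not the vector-space designs of the paper; the function names and the small parameters (n=5, d=2, k=3) are assumptions chosen for illustration.

```python
from itertools import combinations

def containment_matrix(n, d, k):
    """Rows are d-subsets of {0..n-1} (pools), columns are k-subsets (items).
    Entry is 1 when the row's d-subset is contained in the column's k-subset."""
    rows = list(combinations(range(n), d))
    cols = list(combinations(range(n), k))
    return [[1 if set(r) <= set(c) else 0 for c in cols] for r in rows]

def is_d_disjunct(matrix, d):
    """Brute-force check: no column is covered by the union of d other columns."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Support of a column = set of pools (rows) that contain that item.
    supports = [{i for i in range(n_rows) if matrix[i][j]} for j in range(n_cols)]
    for j in range(n_cols):
        others = [s for idx, s in enumerate(supports) if idx != j]
        for group in combinations(others, d):
            if supports[j] <= set().union(*group):
                return False
    return True

# Hypothetical small instance: 10 pools (2-subsets) and 10 items (3-subsets).
M = containment_matrix(5, 2, 3)
disjunct = is_d_disjunct(M, 2)
```

The brute-force check is exponential in d and is meant only for tiny instances like this one; it says nothing about the error-correction degree that the abstract compares.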
Methodological Issues in Multistage Genome-Wide Association Studies
Because of the high cost of commercial genotyping chip technologies, many
investigations have used a two-stage design for genome-wide association
studies, using part of the sample for an initial discovery of ``promising''
SNPs at a less stringent significance level and the remainder in a joint
analysis of just these SNPs using custom genotyping. Typical cost savings of
about 50% are possible with this design to obtain comparable levels of overall
type I error and power by using about half the sample for stage I and carrying
about 0.1% of SNPs forward to the second stage, the optimal design depending
primarily upon the ratio of costs per genotype for stages I and II. However,
with the rapidly declining costs of the commercial panels, the generally low
observed ORs of current studies, and many studies aiming to test multiple
hypotheses and multiple endpoints, many investigators are abandoning the
two-stage design in favor of simply genotyping all available subjects using a
standard high-density panel. Concern is sometimes raised about the absence of a
``replication'' panel in this approach, as required by some high-profile
journals, but it must be appreciated that the two-stage design is not a
discovery/replication design but simply a more efficient design for discovery
using a joint analysis of the data from both stages. Once a subset of
highly-significant associations has been discovered, a truly independent
``exact replication'' study is needed in a similar population of the same
promising SNPs using similar methods.
Comment: Published at http://dx.doi.org/10.1214/09-STS288 in the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
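The cost arithmetic behind the "about 50%" figure can be sketched as follows. This is a simplified accounting, assuming cost is proportional to the number of genotypes and ignoring the power and type I error constraints that the paper optimizes over; the function name and the stage II/stage I per-genotype cost ratio of 20 are hypothetical.

```python
def two_stage_relative_cost(stage1_fraction, snp_fraction_forward, cost_ratio):
    """Cost of a two-stage design relative to genotyping every subject
    on the full panel (relative cost 1.0).

    stage1_fraction      -- share of the sample genotyped on the full panel
    snp_fraction_forward -- share of SNPs carried forward to stage II
    cost_ratio           -- per-genotype cost in stage II divided by stage I
                            (hypothetical value, not taken from the paper)
    """
    stage1 = stage1_fraction
    stage2 = (1 - stage1_fraction) * snp_fraction_forward * cost_ratio
    return stage1 + stage2

# Half the sample in stage I, 0.1% of SNPs carried forward, as in the abstract.
rel = two_stage_relative_cost(0.5, 0.001, 20.0)
```

Under these assumptions the stage II term is small, so the total is dominated by the stage I fraction, which is consistent with savings of roughly half.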
A construction of pooling designs with surprisingly high degree of error correction
It is well-known that many famous pooling designs are constructed from
mathematical structures by the "containment matrix" method. In this paper, we
propose another method and obtain a family of pooling designs with surprisingly
high degree of error correction based on a finite set. Given the numbers of
items and pools, the error-tolerant property of our designs is much better than
that of Macula's designs when the size of the set is large enough.
Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks
While the use of bottom-up local operators in convolutional neural networks
(CNNs) matches well some of the statistics of natural images, it may also
prevent such models from capturing contextual long-range feature interactions.
In this work, we propose a simple, lightweight approach for better context
exploitation in CNNs. We do so by introducing a pair of operators: gather,
which efficiently aggregates feature responses from a large spatial extent, and
excite, which redistributes the pooled information to local features. The
operators are cheap, both in terms of number of added parameters and
computational complexity, and can be integrated directly in existing
architectures to improve their performance. Experiments on several datasets
show that gather-excite can bring benefits comparable to increasing the depth
of a CNN at a fraction of the cost. For example, we find ResNet-50 with
gather-excite operators is able to outperform its 101-layer counterpart on
ImageNet with no additional learnable parameters. We also propose a parametric
gather-excite operator pair which yields further performance gains, relate it
to the recently-introduced Squeeze-and-Excitation Networks, and analyse the
effects of these changes to the CNN feature activation statistics.
Comment: NeurIPS 201
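A minimal sketch of a parameter-free gather-excite pair, under the assumption that gather is a global average pool over each channel and excite rescales the channel by a sigmoid of the pooled value. The abstract does not specify these choices, so they should be read as one plausible instance rather than the authors' exact operators; the function name is hypothetical, and plain Python lists stand in for tensors.

```python
import math

def gather_excite_global(features):
    """features: list of channels, each an H x W list of lists of floats.
    Returns a feature map of the same shape, gated per channel."""
    out = []
    for channel in features:
        values = [v for row in channel for v in row]
        gathered = sum(values) / len(values)       # gather: global average pool
        gate = 1.0 / (1.0 + math.exp(-gathered))   # squash pooled context to (0, 1)
        # excite: redistribute the pooled information to every local feature
        out.append([[v * gate for v in row] for row in channel])
    return out

# Two hypothetical 2x2 channels.
x = [
    [[1.0, -1.0], [1.0, -1.0]],   # mean 0.0, so the gate is 0.5
    [[2.0, 2.0], [2.0, 2.0]],     # mean 2.0, so the gate is sigmoid(2)
]
y = gather_excite_global(x)
```

No learnable parameters are involved here, which matches the abstract's remark that gains are possible "with no additional learnable parameters".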
Effect of pooling samples on the efficiency of comparative studies using microarrays
Many biomedical experiments are carried out by pooling individual biological
samples. However, pooling samples can potentially hide biological variance and
give false confidence concerning the data significance. In the context of
microarray experiments for detecting differentially expressed genes, recent
publications have addressed the problem of the efficiency of sample-pooling,
and some approximate formulas were provided for the power and sample size
calculations. It is desirable to have exact formulas for these calculations and
have the approximate results checked against the exact ones. We show that the
difference between the approximate and exact results can be large. In this
study, we have characterized quantitatively the effect of pooling samples on
the efficiency of microarray experiments for the detection of differential gene
expression between two classes. We present exact formulas for calculating the
power of microarray experimental designs involving sample pooling and technical
replications. The formulas can be used to determine the total numbers of arrays
and biological subjects required in an experiment to achieve the desired power
at a given significance level. The conditions under which pooled design becomes
preferable to non-pooled design can then be derived given the unit cost
associated with a microarray and that with a biological subject. This paper
thus serves to provide guidance on sample pooling and cost effectiveness. The
formulation in this paper is outlined in the context of performing microarray
comparative studies, but its applicability is not limited to microarray
experiments. It is also applicable to a wide range of biomedical comparative
studies where sample pooling may be involved.
Comment: 8 pages, 1 figure, 2 tables; to appear in Bioinformatic
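The trade-off between arrays and subjects can be illustrated with a common additive variance model, assumed here and not taken from the paper: each array measures a pool of `pool_size` subjects, biological variance is divided by the pool size, and technical variance is added per array. The function names, variances, and unit costs are all hypothetical, and the sketch compares variance and cost only; it does not reproduce the paper's exact power formulas.

```python
def mean_variance(n_arrays, pool_size, var_bio, var_tech):
    """Variance of a class mean estimated from n_arrays arrays,
    each hybridized with a pool of pool_size subjects (assumed model)."""
    return (var_bio / pool_size + var_tech) / n_arrays

def design_cost(n_arrays, pool_size, cost_array, cost_subject):
    """Total cost per class: one array plus pool_size subjects per array."""
    return n_arrays * (cost_array + pool_size * cost_subject)

# Hypothetical inputs: biological variance 1.0, technical variance 0.25,
# 500 per array and 50 per subject (arbitrary currency units).
non_pooled = (mean_variance(10, 1, 1.0, 0.25), design_cost(10, 1, 500, 50))
pooled = (mean_variance(5, 4, 1.0, 0.25), design_cost(5, 4, 500, 50))
```

With these made-up numbers the pooled design has both lower variance and lower cost, because arrays are expensive relative to subjects; reversing the cost ratio can reverse the conclusion.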
Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions
In the past decade, Convolutional Neural Networks (CNNs) have demonstrated
state-of-the-art performance in various Artificial Intelligence tasks. To
accelerate the experimentation and development of CNNs, several software
frameworks have been released, primarily targeting power-hungry CPUs and GPUs.
In this context, reconfigurable hardware in the form of FPGAs constitutes a
potential alternative platform that can be integrated in the existing deep
learning ecosystem to provide a tunable balance between performance, power
consumption and programmability. In this paper, a survey of the existing
CNN-to-FPGA toolflows is presented, comprising a comparative study of their key
characteristics which include the supported applications, architectural
choices, design space exploration methods and achieved performance. Moreover,
major challenges and objectives introduced by the latest trends in CNN
algorithmic research are identified and presented. Finally, a uniform
evaluation methodology is proposed, aiming at the comprehensive, complete and
in-depth evaluation of CNN-to-FPGA toolflows.
Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal,
201
Efficient Two-Stage Group Testing Algorithms for Genetic Screening
Efficient two-stage group testing algorithms that are particularly suited for
rapid and less-expensive DNA library screening and other large scale biological
group testing efforts are investigated in this paper. The main focus is on
novel combinatorial constructions in order to minimize the number of individual
tests at the second stage of a two-stage disjunctive testing procedure.
Building on recent work by Levenshtein (2003) and Tonchev (2008), several new
infinite classes of such combinatorial designs are presented.
Comment: 14 pages; to appear in "Algorithmica". Part of this work has been
presented at the ICALP 2011 Group Testing Workshop; arXiv:1106.368
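A generic two-stage disjunctive testing procedure can be sketched as below: stage one tests pools, every item that appears in a negative pool is cleared, and stage two tests the remaining candidates individually. The pool layout, the function name, and the `individual_test` callback are hypothetical; the combinatorial designs of the paper, which aim to keep the stage-two candidate list short, are not reproduced here.

```python
def two_stage_screen(pools, pool_outcomes, individual_test):
    """pools: list of sets of item ids.
    pool_outcomes: list of bools, True when the pool tested positive.
    individual_test: callable(item) -> bool, used only in stage two.
    Returns (stage-two candidates, confirmed positives)."""
    all_items = set().union(*pools)
    cleared = set()
    for pool, positive in zip(pools, pool_outcomes):
        if not positive:
            cleared |= pool            # a negative pool clears all its items
    candidates = sorted(all_items - cleared)
    positives = [item for item in candidates if individual_test(item)]
    return candidates, positives

# Hypothetical example: six clones, clone 2 is the only true positive.
truth = {2}
pools = [{0, 1, 2}, {3, 4, 5}, {0, 3}, {1, 4}, {2, 5}]
outcomes = [bool(p & truth) for p in pools]   # error-free disjunctive outcomes
candidates, positives = two_stage_screen(pools, outcomes, lambda i: i in truth)
```

In this toy layout only one individual test is needed at the second stage; the sketch assumes error-free pool outcomes.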