Improved Dropout for Shallow and Deep Learning
Dropout has been used with great success in training deep neural
networks by independently zeroing out the outputs of neurons at random. It has
also received a surge of interest for shallow learning, e.g., logistic
regression. However, the independent sampling for dropout could be suboptimal
for the sake of convergence. In this paper, we propose to use multinomial
sampling for dropout, i.e., sampling features or neurons according to a
multinomial distribution with different probabilities for different
features/neurons. To exhibit the optimal dropout probabilities, we analyze the
shallow learning with multinomial dropout and establish the risk bound for
stochastic optimization. By minimizing a sampling dependent factor in the risk
bound, we obtain a distribution-dependent dropout with sampling probabilities
dependent on the second order statistics of the data distribution. To tackle
the issue of evolving distribution of neurons in deep learning, we propose an
efficient adaptive dropout (named evolutional dropout) that computes
the sampling probabilities on-the-fly from a mini-batch of examples. Empirical
studies on several benchmark datasets demonstrate that the proposed dropouts
achieve not only much faster convergence but also a smaller testing error
than the standard dropout. For example, on the CIFAR-100 data, the evolutional
dropout achieves relative improvements over 10% on the prediction performance
and over 50% on the convergence speed compared to the standard dropout.
Comment: In NIPS 201
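The multinomial dropout described in this abstract can be illustrated with a minimal Python sketch. The abstract only says that the sampling probabilities depend on second-order statistics of the data, so the specific choice below (probabilities proportional to the root mean square of each feature over a mini-batch) is an assumption made for illustration, as are the function names `evolutional_dropout_probs` and `multinomial_dropout` and the parameter `k` (the number of multinomial draws). The rescaling by `k * probs[i]` is chosen so that the masked input equals the original input in expectation.

```python
import math
import random


def evolutional_dropout_probs(batch):
    """Per-feature sampling probabilities from a mini-batch.

    Assumption (not stated in the abstract): p_i is proportional to the
    root mean square of feature i over the mini-batch.
    """
    n = len(batch)
    d = len(batch[0])
    rms = [math.sqrt(sum(row[i] ** 2 for row in batch) / n) for i in range(d)]
    total = sum(rms)
    if total == 0.0:
        # All-zero batch: fall back to a uniform distribution.
        return [1.0 / d] * d
    return [r / total for r in rms]


def multinomial_dropout(x, probs, k, rng=random):
    """Apply multinomial dropout to one example x.

    Draw k feature indices with replacement according to probs, count how
    often each index was drawn, and rescale so that E[output_i] = x_i.
    """
    d = len(x)
    counts = [0] * d
    for i in rng.choices(range(d), weights=probs, k=k):
        counts[i] += 1
    # Features with zero probability are never drawn and stay at zero.
    return [
        x[i] * counts[i] / (k * probs[i]) if probs[i] > 0 else 0.0
        for i in range(d)
    ]
```

Recomputing `evolutional_dropout_probs` on every mini-batch is what would make the scheme adaptive: the probabilities follow the layer's activations as they change during training.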
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
Rich and dense human labeled datasets are among the main enabling factors for
the recent advance on vision-language understanding. Many seemingly distant
annotations (e.g., semantic segmentation and visual question answering (VQA))
are inherently connected in that they reveal different levels and perspectives
of human understandings about the same visual scenes --- and even the same set
of images (e.g., of COCO). The popularity of COCO correlates those annotations
and tasks. Explicitly linking them up may significantly benefit both individual
tasks and the unified vision and language modeling. We present the preliminary
work of linking the instance segmentations provided by COCO to the questions
and answers (QAs) in the VQA dataset, and name the collected links visual
questions and segmentation answers (VQS). They transfer human supervision
between the previously separate tasks, offer more effective leverage to
existing problems, and also open the door for new research problems and models.
We study two applications of the VQS data in this paper: supervised attention
for VQA and a novel question-focused semantic segmentation task. For the
former, we obtain state-of-the-art results on the VQA real multiple-choice task
by simply augmenting the multilayer perceptrons with some attention features
that are learned using the segmentation-QA links as explicit supervision. To
put the latter in perspective, we study two plausible methods and compare them
to an oracle method assuming that the instance segmentations are given at the
test stage.
Comment: To appear at ICCV 201
Computational grid generation for the design of free-form shells with complex boundary conditions
Free-form grid structures have been widely used in various public buildings, and many are bounded by complex curves including internal voids. Modern computational design software enables the rapid creation and exploration of such complex surface geometries for architectural design, but the resulting shapes lack an obvious way for engineers to create a discrete structural grid to support the surface that manifests the architect's intent. This paper presents an efficient design approach for the synthesis of free-form grid structures based on guideline and surface-flattening methods, which consider complex features and internal boundaries. The method employs a fast and straightforward approach, which achieves fluent lines with bars of balanced length. The parametric domain of a complete nonuniform rational basis spline (NURBS) surface is first divided into a number of patches, and a discrete free-form surface is formed by mapping dividing points onto the surface. The free-form surface is then flattened based on the principle of equal area. Accordingly, the flattened rectangular lattices are then fit to the two-dimensional (2D) surface, with grids formed by applying a guideline method. Subsequently, the intersections of the guidelines and the complex boundary are obtained, and the guidelines are divided equally between boundaries to produce grids connected at the dividing points. Finally, the 2D grids are mapped back onto the three-dimensional (3D) surface and a spring-mass relaxation method is employed to further improve the smoothness of the resulting grids. The paper concludes by presenting realistic examples to demonstrate the practical effectiveness of the proposed method.
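The final relaxation step mentioned in this abstract can be sketched in a few lines of Python. The abstract does not specify the spring model, so this is only an illustration under stated assumptions: every bar is treated as a zero-rest-length spring of equal stiffness, boundary nodes are held fixed, and each free node moves a fraction `step` of the way toward the average of its neighbours on every pass. The function name `relax` and its parameters are hypothetical, and the sketch omits any reprojection of nodes onto the NURBS surface.

```python
def relax(points, edges, fixed, iterations=100, step=0.5):
    """Smooth a grid by simple spring-style relaxation.

    points: list of coordinate tuples (2D or 3D)
    edges:  list of (i, j) index pairs, one per bar
    fixed:  set of node indices that must not move (e.g. boundary nodes)

    Assumption: equal-stiffness, zero-rest-length springs, so a free
    node's equilibrium is the average of its neighbours.
    """
    neighbours = {i: [] for i in range(len(points))}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    pts = [list(p) for p in points]
    for _ in range(iterations):
        # Update from the previous pass only, so node order does not matter.
        new_pts = [p[:] for p in pts]
        for i, nbrs in neighbours.items():
            if i in fixed or not nbrs:
                continue
            for axis in range(len(pts[i])):
                target = sum(pts[j][axis] for j in nbrs) / len(nbrs)
                new_pts[i][axis] = pts[i][axis] + step * (target - pts[i][axis])
        pts = new_pts
    return pts
```

For example, a three-node chain with fixed ends at (0, 0) and (2, 0) pulls a displaced middle node toward (1, 0), which evens out the two bar lengths.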
