Playing for Data: Ground Truth from Computer Games
Recent progress in computer vision has been driven by high-capacity models
trained on large datasets. Unfortunately, creating large datasets with
pixel-level labels has been extremely costly due to the amount of human effort
required. In this paper, we present an approach to rapidly creating
pixel-accurate semantic label maps for images extracted from modern computer
games. Although the source code and the internal operation of commercial games
are inaccessible, we show that associations between image patches can be
reconstructed from the communication between the game and the graphics
hardware. This enables rapid propagation of semantic labels within and across
images synthesized by the game, with no access to the source code or the
content. We validate the presented approach by producing dense pixel-level
semantic annotations for 25 thousand images synthesized by a photorealistic
open-world computer game. Experiments on semantic segmentation datasets show
that using the acquired data to supplement real-world images significantly
increases accuracy and that the acquired data enables reducing the amount of
hand-labeled real-world data: models trained with game data and just 1/3 of the
CamVid training set outperform models trained on the complete CamVid training
set. Comment: Accepted to the 14th European Conference on Computer Vision (ECCV
2016).
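The propagation idea in this abstract can be illustrated with a minimal sketch. It assumes each image patch has already been associated with a resource signature recovered from the rendering calls; the `(mesh, texture, shader)` tuple, the patch identifiers, and the function name are hypothetical and are not taken from the paper. Once a single patch is labeled by hand, every patch sharing its signature receives the same label, within and across images.

```python
def propagate_labels(patch_signatures, seed_labels):
    """Spread hand-assigned labels to all patches with the same signature.

    patch_signatures: dict mapping patch id -> hashable resource signature
                      (hypothetical: a (mesh, texture, shader) tuple).
    seed_labels:      dict mapping a few patch ids -> semantic label.
    Returns a dict mapping every patch whose signature was labeled -> label.
    """
    # Record which label each labeled signature carries.
    signature_to_label = {}
    for patch_id, label in seed_labels.items():
        signature_to_label[patch_signatures[patch_id]] = label

    # Assign that label to every patch that shares the signature.
    return {
        patch_id: signature_to_label[signature]
        for patch_id, signature in patch_signatures.items()
        if signature in signature_to_label
    }


if __name__ == "__main__":
    patches = {
        "frame1/p0": ("mesh_car", "tex_paint", "shader_a"),
        "frame1/p1": ("mesh_road", "tex_asphalt", "shader_b"),
        "frame2/p0": ("mesh_car", "tex_paint", "shader_a"),
        "frame2/p1": ("mesh_tree", "tex_bark", "shader_c"),
    }
    # One manual annotation in frame 1 also labels the matching patch in frame 2.
    print(propagate_labels(patches, {"frame1/p0": "car"}))
```

Patches whose signature has never been labeled are simply left out of the result, so an annotator would still have to label each distinct signature once.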
ClassCut for Unsupervised Class Segmentation
We propose a novel method for unsupervised class segmentation on a set of images. It alternates between segmenting object instances and learning a class model. The method is based on a segmentation energy defined over all images at the same time, which can be optimized efficiently by techniques used before in interactive segmentation. Over iterations, our method progressively learns a class model by integrating observations over all images. In addition to appearance, this model captures the location and shape of the class with respect to an automatically determined coordinate frame common across images. This frame allows us to build stronger shape and location models, similar to those used in object class detection. Our method is inspired by interactive segmentation methods [1], but it is fully automatic and learns models characteristic of the object class rather than specific to one particular object/image. We experimentally demonstrate on the Caltech4, Caltech101, and Weizmann horses datasets that our method (a) transfers class knowledge across images, which improves results compared to segmenting every image independently; (b) outperforms GrabCut [1] for the task of unsupervised segmentation; (c) offers competitive performance compared to the state of the art in unsupervised segmentation, and in particular outperforms the topic model [2].
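The alternation described in this abstract (segment every image, then re-learn one class model from all of them) can be illustrated with a deliberately simplified sketch. It is not the paper's method: the segmentation energy is replaced here by nearest-mean assignment on scalar pixel intensities, and the class model is just a pair of foreground and background means shared by all images. All names are hypothetical.

```python
def alternate_segmentation(images, iterations=10):
    """Toy alternation between segmenting and learning a shared model.

    images: list of lists of pixel intensities in [0, 1].
    Returns (masks, fg_mean, bg_mean), where masks[i][j] is True when
    pixel j of image i is assigned to the foreground class.
    """
    all_pixels = [value for image in images for value in image]
    # Crude initialization (an assumption): brightest value is foreground.
    fg_mean, bg_mean = max(all_pixels), min(all_pixels)
    masks = []

    for _ in range(iterations):
        # Step 1: segment every image with the current shared model.
        masks = [
            [abs(value - fg_mean) < abs(value - bg_mean) for value in image]
            for image in images
        ]

        # Step 2: re-learn the shared model from all images at once.
        fg = [v for image, mask in zip(images, masks)
              for v, is_fg in zip(image, mask) if is_fg]
        bg = [v for image, mask in zip(images, masks)
              for v, is_fg in zip(image, mask) if not is_fg]
        new_fg = sum(fg) / len(fg) if fg else fg_mean
        new_bg = sum(bg) / len(bg) if bg else bg_mean

        if (new_fg, new_bg) == (fg_mean, bg_mean):
            break  # the model stopped changing, so the masks are stable
        fg_mean, bg_mean = new_fg, new_bg

    return masks, fg_mean, bg_mean


if __name__ == "__main__":
    masks, fg_mean, bg_mean = alternate_segmentation(
        [[0.1, 0.2, 0.9, 0.8], [0.15, 0.85, 0.95, 0.05]]
    )
    print(masks)
```

Because the two means are estimated from all images pooled together, evidence from one image influences the segmentation of the others, which is the sense in which class knowledge is transferred across images in this toy version.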
A Unified Nanopublication Model for Effective and User-Friendly Access to the Elements of Scientific Publishing
Scientific publishing is the means by which we communicate and share
scientific knowledge, but this process currently often lacks transparency and
machine-interpretable representations. Scientific articles are published in
long coarse-grained text with complicated structures, and they are optimized
for human readers and not for automated means of organization and access. Peer
reviewing is the main method of quality assessment, but these peer reviews are
nowadays rarely published and their own complicated structure and linking to
the respective articles is not accessible. In order to address these problems
and to better align scientific publishing with the principles of the Web and
Linked Data, we propose here an approach to use nanopublications as a unifying
model to represent in a semantic way the elements of publications, their
assessments, as well as the involved processes, actors, and provenance in
general. To evaluate our approach, we present a dataset of 627 nanopublications
representing an interlinked network of the elements of articles (such as
individual paragraphs) and their reviews (such as individual review comments).
Focusing on the specific scenario of editors performing a meta-review, we
introduce seven competency questions and show how they can be executed as
SPARQL queries. We then present a prototype of a user interface for that
scenario that shows different views on the set of review comments provided for
a given manuscript, and we show in a user study that editors find the interface
useful to answer their competency questions. In summary, we demonstrate that a
unified and semantic publication model based on nanopublications can make
scientific communication more effective and user-friendly.
Method of manufacturing a light emitting, photovoltaic or other electronic apparatus and system
The present invention provides a method of manufacturing an electronic apparatus, such as a lighting device having light emitting diodes (LEDs) or a power generating device having photovoltaic diodes. The exemplary method includes depositing a first conductive medium within a plurality of channels of a base to form a plurality of first conductors; depositing within the plurality of channels a plurality of semiconductor substrate particles suspended in a carrier medium; forming an ohmic contact between each semiconductor substrate particle and a first conductor; converting the semiconductor substrate particles into a plurality of semiconductor diodes; depositing a second conductive medium to form a plurality of second conductors coupled to the plurality of semiconductor diodes; and depositing or attaching a plurality of lenses suspended in a first polymer over the plurality of diodes. In various embodiments, the depositing, forming, coupling and converting steps are performed by or through a printing process.
Training CNNs with Low-Rank Filters for Efficient Image Classification
We propose a new method for creating computationally efficient convolutional neural networks (CNNs) by using low-rank representations of convolutional filters. Rather than approximating filters in previously trained networks with more efficient versions, we learn a set of small basis filters from scratch; during training, the network learns to combine these basis filters into more complex filters that are discriminative for image classification. To train such networks, a novel weight initialization scheme is used. This allows effective initialization of connection weights in convolutional layers composed of groups of differently shaped filters. We validate our approach by applying it to several existing CNN architectures and training these networks from scratch using the CIFAR, ILSVRC and MIT Places datasets. Our results show similar or higher accuracy than conventional CNNs with much less compute. Applying our method to an improved version of the VGG-11 network using global max-pooling, we achieve comparable validation accuracy using 41% less compute and only 24% of the original VGG-11 model parameters; another variant of our method gives a 1 percentage point increase in accuracy over our improved VGG-11 model, giving a top-5 center-crop validation accuracy of 89.7% while reducing computation by 16% relative to the original VGG-11 model. Applying our method to the GoogLeNet architecture for ILSVRC, we achieved comparable accuracy with 26% less compute and 41% fewer model parameters. Applying our method to a near state-of-the-art network for CIFAR, we achieved comparable accuracy with 46% less compute and 55% fewer parameters. Microsoft Research PhD Scholarship.
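To make the source of the savings concrete, the following sketch counts weights in a single convolutional layer. It compares a conventional layer of square k×k filters with a layer composed of two groups of differently shaped filters. The even split into 1×k and k×1 filters and the function names are assumptions for illustration; the abstract does not specify the layer layout, and this count does not reproduce the percentages reported above.

```python
def conv_weights(height, width, in_channels, out_channels):
    """Number of weights in a convolutional layer, ignoring biases."""
    return height * width * in_channels * out_channels


def full_layer(k, in_channels, out_channels):
    """Conventional layer: every filter is k x k."""
    return conv_weights(k, k, in_channels, out_channels)


def mixed_shape_layer(k, in_channels, out_channels):
    """Assumed layout: half the filters are 1 x k, the rest are k x 1."""
    half = out_channels // 2
    horizontal = conv_weights(1, k, in_channels, half)
    vertical = conv_weights(k, 1, in_channels, out_channels - half)
    return horizontal + vertical


if __name__ == "__main__":
    full = full_layer(3, 64, 64)
    mixed = mixed_shape_layer(3, 64, 64)
    print(f"full: {full}, mixed: {mixed}, ratio: {mixed / full:.3f}")
```

For k = 3 the mixed-shape layer holds one third of the weights of the square-filter layer, since each filter spans k positions instead of k². In a real network a later layer would have to combine the horizontal and vertical responses, which adds some cost not counted here.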
Light Emitting, Photovoltaic or Other Electronic Apparatus and System
The present invention provides an electronic apparatus, such as a lighting device comprised of light emitting diodes (LEDs) or a power generating apparatus comprising photovoltaic diodes, which may be created through a printing process, using a semiconductor or other substrate particle ink or suspension and using a lens particle ink or suspension. An exemplary apparatus comprises a base; at least one first conductor; a plurality of diodes coupled to the at least one first conductor; at least one second conductor coupled to the plurality of diodes; and a plurality of lenses suspended in a polymer deposited or attached over the diodes. The lenses and the suspending polymer have different indices of refraction. In some embodiments, the lenses and diodes are substantially spherical, and have a ratio of mean diameters or lengths between about 10:1 and 2:1. The diodes may be LEDs or photovoltaic diodes, and in some embodiments, have a junction formed at least partially as a hemispherical shell or cap.