4,960 research outputs found
Multi-task CNN Model for Attribute Prediction
This paper proposes a joint multi-task learning algorithm to better predict
attributes in images using deep convolutional neural networks (CNN). We
consider learning binary semantic attributes through a multi-task CNN model,
where each CNN will predict one binary attribute. The multi-task learning
allows CNN models to simultaneously share visual knowledge among different
attribute categories. Each CNN will generate attribute-specific feature
representations, and then we apply multi-task learning on the features to
predict their attributes. In our multi-task framework, we propose a method to
decompose the overall model's parameters into a latent task matrix and
combination matrix. Furthermore, under-sampled classifiers can leverage shared
statistics from other classifiers to improve their performance. Natural
grouping of attributes is applied such that attributes in the same group are
encouraged to share more knowledge. Meanwhile, attributes in different groups
will generally compete with each other, and consequently share less knowledge.
We show the effectiveness of our method on two popular attribute datasets.
Comment: 11 pages, 3 figures, IEEE Transactions paper
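The decomposition described in this abstract can be illustrated with a small sketch. It assumes the overall weight matrix is the product of a latent task matrix and a combination matrix, which is one common reading of the abstract and not confirmed by it. The names `latent`, `combination`, and `predict_attributes`, the dimensions, and the thresholding at zero are all hypothetical. The abstract's training loss and attribute grouping are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: feature dimension, number of latent tasks, number of attributes.
n_features, n_latent, n_attributes = 16, 4, 6

# Latent task matrix: each column is a shared "basis" classifier.
latent = rng.normal(size=(n_features, n_latent))
# Combination matrix: each column mixes the latent tasks for one attribute.
combination = rng.normal(size=(n_latent, n_attributes))

# Assumed form of the overall parameters: one weight column per attribute classifier.
weights = latent @ combination


def predict_attributes(features, latent, combination):
    """Return a 0/1 prediction per attribute for each row of `features`."""
    scores = features @ (latent @ combination)
    return (scores > 0).astype(int)


features = rng.normal(size=(5, n_features))  # stand-in for CNN feature vectors
predictions = predict_attributes(features, latent, combination)
```

Because every attribute classifier is built from the same latent columns, a classifier with few training samples can still draw on statistics learned through the others, which is the sharing effect the abstract describes.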
Annotating Object Instances with a Polygon-RNN
We propose an approach for semi-automatic annotation of object instances.
While most current methods treat object segmentation as a pixel-labeling
problem, we here cast it as a polygon prediction task, mimicking how most
current datasets have been annotated. In particular, our approach takes as
input an image crop and sequentially produces vertices of the polygon outlining
the object. This allows a human annotator to intervene at any time and correct
a vertex if needed, producing a segmentation as accurate as the annotator
desires. We show that our approach speeds up the annotation process by a
factor of 4.7 across all classes in Cityscapes, while achieving 78.4% agreement
in IoU with original ground-truth, matching the typical agreement between human
annotators. For cars, our speed-up factor is 7.3 for an agreement of 82.2%. We
further show generalization capabilities of our approach to unseen datasets
- …
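The agreement figures quoted in this abstract are given in IoU (intersection over union). As a minimal sketch of that metric, assuming both segmentations are available as binary masks of the same shape, it can be computed as below. The function name `mask_iou` and the toy masks are hypothetical and are not taken from the paper.

```python
import numpy as np


def mask_iou(pred_mask, gt_mask):
    """Intersection over union of two binary masks of identical shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty; treat them as being in perfect agreement.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


# Toy example: two overlapping 6x6 squares on a 10x10 grid.
pred = np.zeros((10, 10), dtype=bool)
pred[2:8, 2:8] = True
gt = np.zeros((10, 10), dtype=bool)
gt[4:10, 4:10] = True

iou = mask_iou(pred, gt)  # overlap of 16 pixels over a union of 56 pixels
```

In a polygon-based workflow the predicted vertices would first be rasterized into a mask before this comparison. That step is omitted here.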