Reduced pattern training based on task decomposition using pattern distributor
Task Decomposition with a Pattern Distributor (PD) is a new task decomposition method for multilayered feedforward neural networks. A pattern distributor network that implements this method is proposed, together with a theoretical model for analyzing its performance. A method named Reduced Pattern Training is also introduced, aiming to improve the performance of pattern distribution. Our analysis and experimental results show that Reduced Pattern Training significantly improves the performance of the pattern distributor network, whose overall accuracy is dominated by the classification accuracy of the distributor module. Two combination methods, Cross-talk-based combination and Genetic-Algorithm-based combination, are presented to find a suitable grouping for the distributor module. Experimental results show that this new method can reduce training time and improve network generalization accuracy compared with a conventional method such as constructive backpropagation or a task decomposition method such as Output Parallelism.
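The two-stage routing idea described above can be sketched as follows. This is an illustrative toy, not the paper's implementation: the linear modules, the class grouping, and all names (`GROUPS`, `predict`, etc.) are assumptions for demonstration.

```python
import numpy as np

# Toy sketch of distributor-style task decomposition: a "distributor"
# first predicts which group a pattern belongs to, then a small
# per-group module predicts the class within that group.
rng = np.random.default_rng(0)

GROUPS = {0: [0, 1], 1: [2, 3]}   # hypothetical grouping of 4 classes
D = 8                             # input feature dimension

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy linear "modules": one distributor plus one classifier per group.
W_dist = rng.normal(size=(D, len(GROUPS)))
W_grp = {g: rng.normal(size=(D, len(c))) for g, c in GROUPS.items()}

def predict(x):
    g = int(np.argmax(softmax(x @ W_dist)))        # stage 1: pick a group
    local = int(np.argmax(softmax(x @ W_grp[g])))  # stage 2: class within group
    return GROUPS[g][local]

pred = predict(rng.normal(size=D))
assert pred in range(4)
```

Because each group module only sees the classes in its group, each sub-task is smaller than the original K-class problem, which is where the reduced training cost comes from.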
Learning Graph Embeddings for Open World Compositional Zero-Shot Learning
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen compositions of the state and object visual primitives seen during training. A problem with standard CZSL is the assumption of knowing which unseen compositions will be available at test time. In this work, we relax this assumption by operating in the open-world setting, where no limit is imposed on the compositional space at test time and the search space contains a large number of unseen compositions. To address this problem, we propose a new approach, Compositional Cosine Graph Embeddings (Co-CGE), based on two principles. First, Co-CGE models the dependency between states, objects, and their compositions through a graph convolutional neural network. The graph propagates information from seen to unseen concepts, improving their representations. Second, since not all unseen compositions are equally feasible, and less feasible ones may damage the learned representations, Co-CGE estimates a feasibility score for each unseen composition, using the scores as margins in a cosine similarity-based loss and as weights in the adjacency matrix of the graph. Experiments show that our approach achieves state-of-the-art performance in standard CZSL while outperforming previous methods in the open-world scenario.
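The second principle, feasibility scores used as margins in a cosine loss, can be sketched as below. This is a minimal illustration under assumed shapes and names (`margin_cross_entropy`, the temperature value, the toy margins), not the authors' code.

```python
import numpy as np

# Sketch of a cosine-similarity cross-entropy where the target logit is
# reduced by a per-composition margin, so less feasible compositions
# must be separated from the image embedding by a larger gap.

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def margin_cross_entropy(img, comp_embs, target, margins, temp=0.05):
    logits = np.array([cosine(img, c) for c in comp_embs])
    logits[target] -= margins[target]     # feasibility score acts as a margin
    logits = logits / temp
    logits -= logits.max()                # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[target])

rng = np.random.default_rng(0)
img = rng.normal(size=16)                 # image embedding
comps = rng.normal(size=(5, 16))          # composition embeddings
margins = 1.0 - rng.random(5)             # toy feasibility-derived margins
loss = margin_cross_entropy(img, comps, target=2, margins=margins)
assert loss > 0
```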
Visual Compositional Learning for Human-Object Interaction Detection
Human-Object Interaction (HOI) detection aims to localize and infer
relationships between humans and objects in an image. It is challenging
because the enormous number of possible combinations of object and verb
types forms a long-tail distribution. We devise a deep Visual Compositional
Learning (VCL) framework, a simple yet efficient approach that effectively
addresses this problem. VCL first decomposes an HOI representation into
object- and verb-specific features, and then composes new interaction
samples in the feature space by stitching the decomposed features. The
integration of decomposition and composition enables VCL to share object
and verb features among different HOI samples and images and to generate
new interaction samples and new types of HOI, thus largely alleviating the
long-tail distribution problem and benefiting low-shot and zero-shot HOI
detection. Extensive experiments demonstrate that the proposed VCL
effectively improves the generalization of HOI detection on HICO-DET and
V-COCO and outperforms recent state-of-the-art methods on HICO-DET. Code is
available at https://github.com/zhihou7/VCL. Comment: Accepted at ECCV 2020
Learning Conditional Attributes for Compositional Zero-Shot Learning
Compositional Zero-Shot Learning (CZSL) aims to train models to recognize
novel compositional concepts, such as attribute-object combinations, based
on learned concepts. One challenge is modeling how attributes interact with
different objects, e.g., the attribute "wet" differs between "wet apple"
and "wet cat". As a solution, we provide analysis and argue that attributes
are conditioned on the recognized object and the input image, and we
explore learning conditional attribute embeddings with a proposed attribute
learning framework containing an attribute hyper learner and an attribute
base learner. By encoding conditional attributes, our model can generate
flexible attribute embeddings that generalize from seen to unseen
compositions. Experiments on CZSL benchmarks, including the more
challenging C-GQA dataset, demonstrate better performance than other
state-of-the-art approaches and validate the importance of learning
conditional attributes. Code is available at
https://github.com/wqshmzh/CANet-CZSL. Comment: 10 pages, 4 figures,
accepted at CVPR 2023
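The hyper-learner/base-learner conditioning can be sketched as a tiny hypernetwork. The layer sizes, single linear layer, and all names here are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

# Sketch of object-conditioned attribute embeddings: a "hyper learner"
# generates the weights of an attribute "base learner" from the object
# embedding, so the same attribute maps differently for different objects.

rng = np.random.default_rng(0)
D_OBJ, D_IMG, D_ATTR = 8, 16, 4

# Hyper learner: maps an object embedding to a base-learner weight matrix.
W_hyper = 0.1 * rng.normal(size=(D_OBJ, D_IMG * D_ATTR))

def attribute_embedding(img_feat, obj_emb):
    """Base learner whose weights are generated from the recognized object:
    'wet' conditioned on 'apple' differs from 'wet' conditioned on 'cat'."""
    W_base = (obj_emb @ W_hyper).reshape(D_IMG, D_ATTR)
    return img_feat @ W_base

img = rng.normal(size=D_IMG)
wet_apple = attribute_embedding(img, obj_emb=rng.normal(size=D_OBJ))
wet_cat = attribute_embedding(img, obj_emb=rng.normal(size=D_OBJ))
assert not np.allclose(wet_apple, wet_cat)  # same image, different conditioning
```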