Context-Patch Face Hallucination Based on Thresholding Locality-Constrained Representation and Reproducing Learning
Face hallucination is a technique that reconstructs high-resolution (HR) faces from low-resolution (LR) faces by using prior knowledge learned from HR/LR face pairs. Most state-of-the-art methods leverage position-patch prior knowledge of the human face to estimate the optimal representation coefficients for each image patch. However, they focus only on position information and usually ignore the context information of the image patch. In addition, when they are confronted with misalignment or the Small Sample Size (SSS) problem, the hallucination performance is very poor. To this end, this study incorporates the contextual information of the image patch and proposes a powerful and efficient context-patch based face hallucination approach, namely Thresholding Locality-constrained Representation and Reproducing Learning (TLcR-RL). Under the context-patch based framework, we advance a thresholding based representation method to enhance the reconstruction accuracy and reduce the computational complexity. To further improve the performance of the proposed algorithm, we propose a promotion strategy called reproducing learning: the estimated HR face is added to the training set, which simulates the case in which the HR version of the input LR face is present in the training set, thus iteratively enhancing the final hallucination result. Experiments demonstrate that the proposed TLcR-RL method achieves a substantial improvement in the hallucinated results, both subjectively and objectively. Additionally, the proposed framework is more robust to face misalignment and the SSS problem, and its hallucinated HR face is still very good when the LR test face comes from the real world. The MATLAB source code is available at https://github.com/junjun-jiang/TLcR-RL
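The abstract above does not spell out the representation step, so the following is only a minimal NumPy sketch of one plausible reading. It assumes that "thresholding" means keeping the k nearest LR training patches, and that the coefficients come from the standard closed-form solution of a locality-constrained least-squares problem with a sum-to-one constraint. The function names, the parameters `k` and `tau`, and the small ridge term are all hypothetical and are not taken from the authors' MATLAB code.

```python
import numpy as np

def tlcr_weights(x, Y_lr, k=3, tau=1e-2):
    """Sketch: coefficients for LR patch x (d,) over LR training patches Y_lr (n, d).

    Assumption: thresholding = keep only the k nearest training patches.
    """
    dist = np.linalg.norm(Y_lr - x, axis=1)
    idx = np.argsort(dist)[:k]                 # indices of the k closest patches
    Z = Y_lr[idx] - x                          # patches shifted to the input patch
    G = Z @ Z.T                                # local Gram matrix, shape (k, k)
    A = G + tau * np.diag(dist[idx] ** 2)      # locality penalty grows with distance
    A = A + 1e-8 * np.eye(k)                   # small ridge for stability (assumption)
    w = np.linalg.solve(A, np.ones(k))
    return idx, w / w.sum()                    # enforce the sum-to-one constraint

def hallucinate_patch(x, Y_lr, Y_hr, k=3, tau=1e-2):
    """Apply the LR coefficients to the paired HR training patches."""
    idx, w = tlcr_weights(x, Y_lr, k, tau)
    return w @ Y_hr[idx]

rng = np.random.default_rng(0)
Y_lr = rng.normal(size=(20, 16))               # 20 toy LR patches
Y_hr = rng.normal(size=(20, 64))               # paired toy HR patches
out = hallucinate_patch(Y_lr[7], Y_lr, Y_hr)   # input patch is present in the training set
```

In this toy setup the input patch already has its HR counterpart in the training set, so nearly all the weight falls on that pair and `out` is numerically close to `Y_hr[7]`. That is the situation the abstract says reproducing learning tries to simulate by adding the estimated HR face to the training set.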
DEFINE: friendship detection based on node enhancement
Network representation learning (NRL) is important to a variety of tasks such as link prediction. Learning low-dimensional vector representations for node enhancement based on node attributes and network structures can improve link prediction performance. Node attributes are important factors in forming networks, such as the psychological factors and appearance features that affect friendship networks. However, little to no work has detected friendship using an NRL technique that combines students' psychological features and perceived traits based on facial appearance. In this paper, we propose a framework named DEFINE (Node Enhancement based Friendship Detection) to detect students' friend relationships, which combines students' psychological factors and facial perception information. To detect friend relationships accurately, DEFINE uses the NRL technique, which considers network structure and the additional attribute information of nodes. DEFINE transforms them into low-dimensional vector spaces while preserving the inherent properties of the friendship network. Experimental results on real-world friendship network datasets illustrate that DEFINE outperforms other state-of-the-art methods. © 2020, Springer Nature Switzerland AG.
Attend to You: Personalized Image Captioning with Context Sequence Memory Networks
We address personalization issues of image captioning, which have not been
discussed yet in previous research. For a query image, we aim to generate a
descriptive sentence, accounting for prior knowledge such as the user's active
vocabularies in previous documents. As applications of personalized image
captioning, we tackle two post automation tasks: hashtag prediction and post
generation, on our newly collected Instagram dataset, consisting of 1.1M posts
from 6.3K users. We propose a novel captioning model named Context Sequence
Memory Network (CSMN). Its unique updates over previous memory network models
include (i) exploiting memory as a repository for multiple types of context
information, (ii) appending previously generated words into memory to capture
long-term information without suffering from the vanishing gradient problem,
and (iii) adopting CNN memory structure to jointly represent nearby ordered
memory slots for better context understanding. With quantitative evaluation and
user studies via Amazon Mechanical Turk, we show the effectiveness of the three
novel features of CSMN and its performance enhancement for personalized image
captioning over state-of-the-art captioning models. Comment: Accepted paper at CVPR 201
End-to-End Photo-Sketch Generation via Fully Convolutional Representation Learning
Sketch-based face recognition is an interesting task in vision and multimedia
research, yet it is quite challenging due to the great difference between face
photos and sketches. In this paper, we propose a novel approach for
photo-sketch generation, aiming to automatically transform face photos into
detail-preserving personal sketches. Unlike the traditional models synthesizing
sketches based on a dictionary of exemplars, we develop a fully convolutional
network to learn the end-to-end photo-sketch mapping. Our approach takes whole
face photos as inputs and directly generates the corresponding sketch images
with efficient inference and learning; the architecture is a stack of
only convolutional kernels of very small sizes. To better capture the person
identity during the photo-sketch transformation, we define our optimization
objective in the form of joint generative-discriminative minimization. In
particular, a discriminative regularization term is incorporated into the
photo-sketch generation, enhancing the discriminability of the generated person
sketches against other individuals. Extensive experiments on several standard
benchmarks suggest that our approach outperforms other state-of-the-art methods
in both photo-sketch generation and face sketch verification. Comment: 8 pages, 6 figures. Proceedings of the ACM International Conference on
Multimedia Retrieval (ICMR), 201