7 research outputs found
Heterogeneous information network embedding based personalized query-focused astronomy reference paper recommendation
© 2018, the Authors. The fast-growing number of scientific papers makes it difficult to rapidly and accurately find a list of reference papers for a given manuscript. Reference paper recommendation is an essential technology for overcoming this obstacle. In this paper, we study the problem of personalized query-focused astronomy reference paper recommendation and propose a heterogeneous information network embedding based recommendation approach. In particular, we treat query researchers, query text, papers and the authors of those papers as vertices and construct a heterogeneous information network over these vertices. We then propose a heterogeneous information network embedding (HINE) approach, which simultaneously captures intra-relationships among homogeneous vertices, inter-relationships among heterogeneous vertices and correlations between vertices and text contents, to model the different types of vertices as vectors in a unified vector space. The relevance between the query and both the papers and their authors is then measured using these distributed representations. Finally, the papers with the highest relevance scores are presented to the researcher as a recommendation list. The effectiveness of the proposed HINE based recommendation approach is demonstrated by a recommendation evaluation conducted on the IOP astronomy journal database.
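Once all vertex types live in one vector space, the final ranking step can be sketched roughly as below. This is a minimal illustration, not the paper's method: the embeddings are assumed to be given, cosine similarity is used as the relevance measure, and the weight `alpha` combining paper and author relevance is a hypothetical parameter.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def recommend(query_vec, paper_vecs, author_vecs, paper_authors, alpha=0.7, k=3):
    """Rank papers by a weighted mix of query-paper and query-author relevance.

    paper_authors[i] lists the author indices of paper i; `alpha` (assumed)
    balances the two relevance signals. Returns the top-k paper indices.
    """
    scores = []
    for i, p in enumerate(paper_vecs):
        s_paper = cosine(query_vec, p)
        # Take the best-matching author of the paper as the author signal.
        s_author = max(cosine(query_vec, author_vecs[a]) for a in paper_authors[i])
        scores.append(alpha * s_paper + (1 - alpha) * s_author)
    return sorted(range(len(paper_vecs)), key=lambda i: -scores[i])[:k]
```

In this sketch a paper scores highly when either its own embedding or one of its authors' embeddings lies close to the query in the unified space.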
Deep Architectures and Ensembles for Semantic Video Classification
This work addresses the problem of accurate semantic labelling of short videos. To this end, a multitude of different deep nets is evaluated, ranging from traditional recurrent neural networks (LSTM, GRU) and temporally agnostic networks (FV, VLAD, BoW) to fully connected neural networks with mid-stage audio-visual (AV) fusion, among others. Additionally, we propose a residual architecture-based DNN for video classification, with state-of-the-art classification performance at significantly reduced complexity. Furthermore, we propose four new approaches to diversity-driven multi-net ensembling, one based on a fast correlation measure and three incorporating a DNN-based combiner. We show that significant performance gains can be achieved by ensembling diverse nets, and we investigate the factors contributing to high diversity. Based on the extensive YouTube-8M dataset, we provide an in-depth evaluation and analysis of the nets' behaviour. We show that the performance of the ensemble is state-of-the-art, achieving the highest accuracy on the YouTube-8M Kaggle test data. The performance of the ensemble of classifiers was also evaluated on the HMDB51 and UCF101 datasets, and the results show that it achieves accuracy comparable to state-of-the-art methods using similar input features.
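The correlation-based flavour of diversity-driven ensembling described above could look roughly like the following greedy sketch. This is an assumption-laden illustration, not the paper's algorithm: the threshold `max_corr`, the greedy selection order, and the simple averaging combiner are all hypothetical choices.

```python
import numpy as np

def select_diverse(preds, base, max_corr=0.9):
    """Greedy diversity-driven selection (illustrative, not the paper's method).

    Start from the strongest net (`base`) and add a candidate net only if its
    validation predictions correlate below `max_corr` (assumed threshold) with
    every already-selected net.
    """
    selected = [base]
    for i in range(len(preds)):
        if i == base:
            continue
        flat = preds[i].ravel()
        if all(abs(np.corrcoef(flat, preds[j].ravel())[0, 1]) < max_corr
               for j in selected):
            selected.append(i)
    return selected

def ensemble(preds, members):
    """Simple averaging combiner over the selected member nets."""
    return np.mean([preds[i] for i in members], axis=0)
```

A near-duplicate net contributes little new information and is skipped, while a weakly correlated net is kept, which is the intuition behind rewarding diversity.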
Precise measurement of position and attitude based on convolutional neural network and visual correspondence relationship
Accurate measurement of position and attitude information is particularly important. Traditional measurement methods generally require high-precision measurement equipment for analysis, leading to high costs and limited applicability, while vision-based measurement schemes need to solve complex visual correspondence relationships. With the extensive development of neural networks in related fields, it has become possible to apply them to estimating object position and attitude. In this paper, we propose an object pose measurement scheme based on a convolutional neural network and successfully implement end-to-end position and attitude detection. Furthermore, to effectively expand the measurement range and reduce the number of training samples, we demonstrate the independence of object pose in each dimension and propose corresponding per-dimension training schemes. At the same time, we build a generative image encoder to guarantee the detection performance of the trained model in practical applications.
Deep Metric Learning Based on Scalable Neighborhood Components for Remote Sensing Scene Characterization
With the development of convolutional neural networks (CNNs), the semantic understanding of remote sensing (RS) scenes has been significantly improved based on their prominent feature encoding capabilities. While many existing deep-learning models focus on designing different architectures, only a few works in the RS field have investigated the performance of the learned feature embeddings and the associated metric space. In particular, two main loss functions have been exploited: the contrastive and the triplet loss. However, the straightforward application of these techniques to RS images may not be optimal for capturing their neighborhood structures in the metric space, due to the insufficient sampling of image pairs or triplets during the training stage and to the inherent semantic complexity of remotely sensed data. To solve these problems, we propose a new deep metric learning approach, which overcomes limitations in class discrimination by means of two different components: 1) scalable neighborhood component analysis (SNCA), which aims at discovering the neighborhood structure in the metric space, and 2) the cross-entropy loss, which aims at preserving the class discrimination capability based on the learned class prototypes. Moreover, in order to preserve feature consistency among all the minibatches during training, a novel optimization mechanism based on momentum update is introduced for minimizing the proposed loss. An extensive experimental comparison (using several state-of-the-art models and two different benchmark data sets) has been conducted to validate the effectiveness of the proposed method from different perspectives, including: 1) classification; 2) clustering; and 3) image retrieval. The related codes of this article will be made publicly available for reproducible research by the community.
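The SNCA component and the momentum update can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' released code: the temperature `tau` and momentum `mu` values are placeholders, and the memory bank is a plain array of L2-normalized feature vectors, one per training sample.

```python
import numpy as np

def snca_loss(f, i, memory, labels, tau=0.05):
    """SNCA-style loss for one sample (illustrative sketch).

    `f` is the current embedding of sample i; `memory` holds one normalized
    feature vector per training sample. The loss is the negative log of the
    probability mass assigned to memory entries sharing sample i's label,
    with sample i itself excluded from its own neighborhood.
    """
    sims = memory @ f / tau
    sims[i] = -np.inf                    # a sample is never its own neighbor
    p = np.exp(sims - sims.max())        # softmax over all other samples
    p /= p.sum()
    same = (labels == labels[i])
    same[i] = False
    return -np.log(p[same].sum())

def momentum_update(memory, i, f, mu=0.5):
    """Momentum update of the memory bank, keeping features consistent
    across mini-batches instead of overwriting entries outright."""
    m = mu * memory[i] + (1 - mu) * f
    memory[i] = m / np.linalg.norm(m)
    return memory
```

The loss is small when a sample's embedding sits close to same-class entries in the memory and large otherwise, while the momentum update lets each memory slot drift slowly toward the latest feature rather than jump to it.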