
    LRMM: Learning to Recommend with Missing Modalities

    Multimodal learning has shown promising performance in content-based recommendation due to auxiliary user and item information from multiple modalities such as text and images. However, the problem of incomplete and missing modalities is rarely explored, and most existing methods fail to learn a recommendation model when modalities are missing or corrupted. In this paper, we propose LRMM, a novel framework that mitigates not only the problem of missing modalities but also, more generally, the cold-start problem of recommender systems. We propose modality dropout (m-drop) and a multimodal sequential autoencoder (m-auto) to learn multimodal representations for complementing and imputing missing modalities. Extensive experiments on real-world Amazon data show that LRMM achieves state-of-the-art performance on rating prediction tasks. More importantly, LRMM is more robust than previous methods in alleviating data sparsity and the cold-start problem. Comment: 11 pages, EMNLP 2018
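
    The abstract gives no implementation details, but the modality-dropout idea it names can be sketched. Below is a minimal, hypothetical PyTorch illustration (the function name, feature shapes, and drop rate are assumptions, not the authors' code): for each training sample, one modality's features are randomly zeroed out so the model learns to predict even when a modality is missing.

```python
import torch

def modality_dropout(text_feat: torch.Tensor,
                     image_feat: torch.Tensor,
                     p_drop: float = 0.3):
    """Hypothetical sketch of modality dropout (m-drop).

    With probability p_drop, zero out exactly one randomly chosen
    modality per sample, simulating missing inputs at training time.
    """
    batch_size = text_feat.size(0)
    drop = torch.rand(batch_size) < p_drop       # drop a modality at all?
    which = torch.randint(0, 2, (batch_size,))   # 0 = drop text, 1 = drop image
    keep_text = (~(drop & (which == 0))).float().unsqueeze(1)
    keep_image = (~(drop & (which == 1))).float().unsqueeze(1)
    return text_feat * keep_text, image_feat * keep_image
```

    Zeroing rather than deleting keeps tensor shapes fixed, so one batch can mix complete and incomplete samples through the same network.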

    Enabling viewpoint learning through dynamic label generation

    Optimal viewpoint prediction is an essential task in many computer graphics applications. Unfortunately, common viewpoint qualities suffer from two major drawbacks: dependency on clean surface meshes, which are not always available, and the lack of closed-form expressions, which requires a costly search involving rendering. To overcome these limitations we propose to separate viewpoint selection from rendering through an end-to-end learning approach, whereby we reduce the influence of the mesh quality by predicting viewpoints from unstructured point clouds instead of polygonal meshes. While this makes our approach insensitive to the mesh discretization during evaluation, it only becomes possible when resolving label ambiguities that arise in this context. Therefore, we additionally propose to incorporate the label generation into the training procedure, making the label decision adaptive to the current network predictions. We show how our proposed approach allows for learning viewpoint predictions for models from different object categories and for different viewpoint qualities. Additionally, we show that prediction times are reduced from several minutes to a fraction of a second, as compared to state-of-the-art (SOTA) viewpoint quality evaluation. Code and training data are available at https://github.com/schellmi42/viewpoint_learning, which is to our knowledge the biggest viewpoint quality dataset available. This work was supported in part by project TIN2017-88515-C2-1-R (GEN3DLIVE) from the Spanish Ministerio de Economía y Competitividad and by FEDER (EU) funds. Peer reviewed. Postprint (published version)
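
    The adaptive label generation described above can be illustrated with a hypothetical sketch (function name, shapes, and the tolerance are assumptions, not the paper's actual code): when several viewpoints are nearly equally good, the training target for each sample is the near-optimal candidate closest to the network's current prediction.

```python
import torch

def dynamic_labels(pred_views: torch.Tensor,       # (B, 3) current predictions, unit vectors
                   candidate_views: torch.Tensor,  # (C, 3) candidate viewpoints on the sphere
                   qualities: torch.Tensor,        # (B, C) quality per sample and candidate
                   tol: float = 0.01) -> torch.Tensor:
    """Hypothetical adaptive label selection: among candidates whose
    quality is within `tol` of the per-sample optimum, return the one
    most similar to the current prediction as the regression target."""
    best = qualities.max(dim=1, keepdim=True).values      # (B, 1) best quality per sample
    admissible = qualities >= best - tol                  # (B, C) near-optimal candidates
    sims = pred_views @ candidate_views.T                 # (B, C) cosine similarity
    sims = sims.masked_fill(~admissible, float('-inf'))   # rule out poor viewpoints
    return candidate_views[sims.argmax(dim=1)]            # (B, 3) chosen labels
```

    Letting the label follow the prediction avoids penalizing the network for choosing one of several equally valid optima, which is the ambiguity the abstract refers to.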

    Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation

    Peer reviewed

    Multi-task super resolution method for vector field critical points enhancement

    Handling vector field visualization at local critical points is a challenging task. Generally, topology-based methods first divide critical regions into different categories and then process each type of critical region separately to improve the result, a pipeline that is complex. In this paper, a learning-based multi-task super-resolution (SR) method is proposed to refine the vector field and enhance the visualization, especially in critical regions. In detail, the multi-task model consists of two task branches: one simulates the interpolation of discrete vector fields with an improved super-resolution network, and the other is a classification task that identifies the types of critical vector fields. The result is an efficient end-to-end architecture for both training and inference that simplifies the pipeline of critical vector field visualization and improves the visualization quality. In experiments, we compare our method with both traditional interpolation and a pure SR network on simulated and real data; the reported results indicate that our method lowers the error and improves PSNR significantly
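
    The two-branch design can be sketched as follows. This is a hypothetical PyTorch illustration (the layer sizes, the 2-channel (u, v) field encoding, and the number of critical-point classes are assumptions, not the paper's architecture): a shared backbone feeds an SR head that upsamples the vector field and a classification head that predicts the critical-point type.

```python
import torch
import torch.nn as nn

class MultiTaskSR(nn.Module):
    """Hypothetical two-branch model: one head super-resolves the
    vector field, the other classifies the critical-point type."""

    def __init__(self, num_classes: int = 4, scale: int = 2):
        super().__init__()
        # Shared feature extractor over 2-channel (u, v) vector fields.
        self.backbone = nn.Sequential(
            nn.Conv2d(2, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # SR branch: sub-pixel upsampling back to 2 channels.
        self.sr_head = nn.Sequential(
            nn.Conv2d(64, 2 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )
        # Classification branch: pooled features -> critical-point type.
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, field: torch.Tensor):
        feats = self.backbone(field)
        return self.sr_head(feats), self.cls_head(feats)
```

    Sharing the backbone lets the classification signal shape features that the SR branch reuses around critical points, which is one plausible reading of why the joint model outperforms a pure SR network.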

    SciTech News Volume 71, No. 1 (2017)

    Columns and Reports
        From the Editor 3
    Division News
        Science-Technology Division 5
        Chemistry Division 8
        Engineering Division
            Aerospace Section of the Engineering Division 9
            Architecture, Building Engineering, Construction and Design Section of the Engineering Division 11
    Reviews
        Sci-Tech Book News Reviews 12
    Advertisements
        IEEE
