
    Learning List-wise Representation in Reinforcement Learning for Ads Allocation with Multiple Auxiliary Tasks

    With the recent prevalence of reinforcement learning (RL), there has been tremendous interest in utilizing RL for ads allocation on recommendation platforms (e.g., e-commerce and news feed sites). For better performance, recent RL-based ads allocation agents make decisions based on representations of the list-wise item arrangement. This results in a high-dimensional state-action space, which makes it difficult to learn an efficient and generalizable list-wise representation. To address this problem, we propose a novel algorithm that learns a better representation by leveraging task-specific signals on the Meituan food delivery platform. Specifically, we propose three types of auxiliary tasks based on reconstruction, prediction, and contrastive learning, respectively. We conduct extensive offline experiments on the effectiveness of these auxiliary tasks and test our method on a real-world food delivery platform. The experimental results show that our method can learn better list-wise representations and achieve higher revenue for the platform.
    Comment: arXiv admin note: text overlap with arXiv:2109.04353, arXiv:2204.0037
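
A minimal sketch of what one such auxiliary objective could look like: a contrastive (InfoNCE-style) loss applied to a list-wise state encoder, added alongside the usual RL loss. The module and variable names below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: list-wise state encoder with an auxiliary contrastive loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ListEncoder(nn.Module):
    """Encodes a list of item feature vectors into one state representation."""
    def __init__(self, item_dim, hidden_dim):
        super().__init__()
        self.item_proj = nn.Linear(item_dim, hidden_dim)
        self.agg = nn.GRU(hidden_dim, hidden_dim, batch_first=True)

    def forward(self, items):                  # items: (batch, list_len, item_dim)
        h = torch.relu(self.item_proj(items))
        _, last = self.agg(h)                  # last: (1, batch, hidden_dim)
        return last.squeeze(0)                 # (batch, hidden_dim)

def contrastive_aux_loss(encoder, items, temperature=0.1):
    """Two augmented 'views' of the same list (here, simple feature dropout)
    should map to nearby representations; other lists in the batch are negatives."""
    z1 = F.normalize(encoder(F.dropout(items, p=0.1, training=True)), dim=-1)
    z2 = F.normalize(encoder(F.dropout(items, p=0.1, training=True)), dim=-1)
    logits = z1 @ z2.t() / temperature         # (batch, batch) similarity matrix
    labels = torch.arange(z1.size(0))          # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

# The auxiliary term would be added to the RL objective, e.g.:
# total_loss = rl_loss + aux_weight * contrastive_aux_loss(encoder, items)
```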

    Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition

    A human-computer interface is developed to provide services for computer-assisted translation (CAT) and computer-assisted transcription of handwritten text images (CATTI). The back-end machine translation (MT) and handwritten text recognition (HTR) systems are provided by the Pattern Recognition and Human Language Technology (PRHLT) research group. The idea is to provide users with easy-to-use tools that make interactive translation and transcription feasible tasks. The assisted service is provided by remote servers with CAT or CATTI capabilities. The interface supplies the user with tools for efficient local editing: deletion, insertion, and substitution.
    Ocampo Sepúlveda, JC. (2009). Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition. http://hdl.handle.net/10251/14318
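
A minimal sketch of the client-side pattern described above, assuming a hypothetical JSON endpoint on the remote CAT/CATTI server; the endpoint URL, field names, and helper functions are illustrative assumptions, not the PRHLT system's actual protocol.

```python
# Hypothetical client: send the user's validated prefix to a remote assisted-translation
# server, then apply the three local edit operations the interface exposes.
import json
import urllib.request

def request_completion(server_url, source_text, user_prefix):
    """Ask the server for a new hypothesis given the source sentence and the
    prefix the user has already validated (illustrative request format)."""
    payload = json.dumps({"source": source_text, "prefix": user_prefix}).encode()
    req = urllib.request.Request(server_url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["hypothesis"]

# Local edit operations on a tokenized hypothesis: deletion, insertion, substitution.
def delete_word(tokens, i):        return tokens[:i] + tokens[i + 1:]
def insert_word(tokens, i, w):     return tokens[:i] + [w] + tokens[i:]
def substitute_word(tokens, i, w): return tokens[:i] + [w] + tokens[i + 1:]
```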

    Learning Semantic Information from Multimodal Data using Deep Neural Networks

    Over the last few decades, most collective information has been digitized to form an immense database distributed across the Internet. This can also be referred to as Big data, a collection of data that is vast in volume and still growing with time. Nowadays, Big data is everywhere. We might not even realize how much it affects our daily life, as it is applied in many ways, ranging from online shopping, music streaming, TV streaming, travel and transportation, energy, and fighting crime, to health care. Many organizations and companies have been collecting and analyzing large volumes of data to solve domain-specific problems or make business decisions. One of the powerful tools that can be used to extract value from Big data is Deep learning, a type of machine learning algorithm inspired by the structure and function of the human brain (artificial neural networks) that learns from large amounts of data. Deep learning has been widely used and applied in many research fields such as natural language processing, IoT applications, and computer vision.

    In this thesis, we introduce three Deep Neural Networks used to learn semantic information from different types of data, together with a design guideline for accelerating a Neural Network layer on a general-purpose computing platform.

    First, we focus on text data. We propose a new feature extraction technique to preprocess the dataset and optimize the original Restricted Boltzmann Machine (RBM) model to generate more meaningful topics that better represent the given documents. Our proposed method improves the generated topic accuracy by up to 12.99% on the Open Movie, Reuters, and 20NewsGroup datasets.

    Moving from text to image data, with additional click locations, we propose a human-in-the-loop automatic image labeling framework focusing on aerial images with fewer features for detection. The proposed model consists of two main parts, a prediction model and an adjustment model. The user first provides click locations to the prediction model to generate a bounding box of a specific object. The bounding box is then fine-tuned by the adjustment model for a more accurate size and location. A feedback and retrain mechanism allows the user to manually adjust the generated bounding box and provide feedback that incrementally trains the adjustment network during runtime. This unique online learning feature enables the user to generalize the existing model to target classes not initially present in the training set, and gradually improves the specificity of the model to those new targets during online learning.

    Combining text and image data, we propose a Multi-region Attention-assisted Grounding network (MAGNet) framework that utilizes spatial attention networks for image-level visual-textual fusion, preserving local (word) and global (phrase) information to refine region proposals with an in-network Region Proposal Network (RPN) and detect single or multiple regions for a phrase query. Our framework is independent of external proposal generation systems, and without additional information it can develop an understanding of the query phrase in relation to the image, achieving respectable results on Flickr30k Entities and a 12% improvement over the state of the art on the ReferIt game. Additionally, our model is capable of grounding multiple regions for a query phrase, which is more suitable for real-life applications.
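
A minimal sketch of the kind of image-level visual-textual fusion described above: each word of a query phrase attends over the spatial locations of a convolutional feature map, and the attended features are kept at both the word (local) and pooled phrase (global) level. All module names, dimensions, and the pooling choice are illustrative assumptions, not the MAGNet code.

```python
# Hypothetical spatial attention for visual-textual fusion (word- and phrase-level).
import torch
import torch.nn as nn

class VisualTextualAttention(nn.Module):
    def __init__(self, vis_dim=256, txt_dim=300, dim=256):
        super().__init__()
        self.q = nn.Linear(txt_dim, dim)   # word queries
        self.k = nn.Linear(vis_dim, dim)   # visual keys
        self.v = nn.Linear(vis_dim, dim)   # visual values

    def forward(self, words, feat_map):
        # words: (batch, n_words, txt_dim); feat_map: (batch, vis_dim, H, W)
        vis = feat_map.flatten(2).transpose(1, 2)            # (batch, H*W, vis_dim)
        scores = self.q(words) @ self.k(vis).transpose(1, 2)  # (batch, n_words, H*W)
        attn = torch.softmax(scores / (self.q.out_features ** 0.5), dim=-1)
        word_ctx = attn @ self.v(vis)      # per-word attended visual features (local)
        phrase_ctx = word_ctx.mean(dim=1)  # pooled phrase-level context (global)
        return word_ctx, phrase_ctx

# Example with random stand-ins for word embeddings and a backbone feature map:
fusion = VisualTextualAttention()
words = torch.randn(2, 5, 300)             # 5-word query phrase
feat_map = torch.randn(2, 256, 13, 13)     # CNN feature map
word_ctx, phrase_ctx = fusion(words, feat_map)
```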
    Although Deep Neural Networks (DNNs) have become a powerful tool, they are highly expensive in both computational time and storage cost. To optimize and improve the performance of a network while maintaining its accuracy, the block-circulant matrix-based (BCM) algorithm has been introduced. It has proven to be highly effective when implemented on customized hardware, such as FPGAs. However, its performance suffers on general-purpose computing platforms; in certain cases, using the BCM does not improve the total computation time of the network at all. To address this problem, we propose a parallel implementation of the BCM layer and provide guidelines that generally lead to better implementation practice. The guidelines span popular implementation languages and packages, including Python, numpy, intel-numpy, tensorflow, and nGraph.
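
For context, a minimal NumPy sketch of a block-circulant fully connected layer: each k x k weight block is circulant, so it is stored as a single length-k vector, and every block-vector product reduces to an FFT, an elementwise multiply, and an inverse FFT. Shapes and variable names are illustrative assumptions, not the thesis code.

```python
# Hypothetical BCM fully connected layer computed in the frequency domain.
import numpy as np

def bcm_layer(x, w_blocks, block_size):
    """x: input of length q*block_size; w_blocks: (p, q, block_size) array of
    circulant defining vectors; returns an output of length p*block_size."""
    q = x.size // block_size
    p = w_blocks.shape[0]
    x_f = np.fft.fft(x.reshape(q, block_size), axis=1)   # FFT of each input block
    w_f = np.fft.fft(w_blocks, axis=2)                    # FFT of each circulant's defining vector
    # Each output block is a sum over input blocks of circular convolutions,
    # accumulated directly in the frequency domain.
    y_f = np.einsum('pqk,qk->pk', w_f, x_f)
    return np.real(np.fft.ifft(y_f, axis=1)).reshape(p * block_size)

# Example: a 512 -> 256 layer with 64x64 circulant blocks stores only
# 4 * 8 * 64 = 2,048 weights instead of the 131,072 of a dense layer.
k = 64
x = np.random.randn(512)
w = np.random.randn(256 // k, 512 // k, k)
y = bcm_layer(x, w, k)   # y has length 256
```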