723 research outputs found

Belgian Antarctic Expeditions: bibliography

Author: De Broyer C.
Doyen P.
Van Autenboer T.
Publication date: 01/01/1984

This bibliography is based upon earlier working documents by T. Van Autenboer and P. Doyen. It was completed with data contained in an internal report on marine biology by C. de Broyer (1982). The compilers attempt to present a list as complete as possible of the scientific papers and data reports related to Belgian Antarctic Research starting with the I.G.Y. A selection of articles of general interest or of narrative value is included.

Open Marine Archive

Recent advances in deep learning for object detection

Author: HOI Steven C. H.
SAHOO Doyen
WU Xiongwei
Publication venue: 'Elsevier BV'
Publication date: 09/08/2019

Object detection is a fundamental visual recognition problem in computer vision and has been widely studied in the past decades. Visual object detection aims to find objects of certain target classes with precise localization in a given image and assign each object instance a corresponding class label. Due to the tremendous successes of deep learning based image classification, object detection techniques using deep learning have been actively studied in recent years. In this paper, we give a comprehensive survey of recent advances in visual object detection with deep learning. By reviewing a large body of recent related work in literature, we systematically analyze the existing object detection frameworks and organize the survey into three major parts: (i) detection components, (ii) learning strategies, and (iii) applications & benchmarks. In the survey, we cover a variety of factors affecting the detection performance in detail, such as detector architectures, feature learning, proposal generation, sampling strategies, etc. Finally, we discuss several future directions to facilitate and spur future research for visual object detection with deep learning. Keywords: Object Detection, Deep Learning, Deep Convolutional Neural Network

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Limit Synchronization in Markov Decision Processes

Author: C. Baier
H. Gimbert
J. Aspnes
K. Chatterjee
L. de Alfaro
L. Doyen
M.V. Volkov
P. Jancar
R. Baldoni
T.A. Henzinger
W. Fokkink
Publication date: 31/10/2013

Markov decision processes (MDPs) are finite-state systems with both strategic and probabilistic choices. After fixing a strategy, an MDP produces a sequence of probability distributions over states. The sequence is eventually synchronizing if the probability mass accumulates in a single state, possibly in the limit. Precisely, for 0 <= p <= 1 the sequence is p-synchronizing if a probability distribution in the sequence assigns probability at least p to some state, and we distinguish three synchronization modes: (i) sure winning if there exists a strategy that produces a 1-synchronizing sequence; (ii) almost-sure winning if there exists a strategy that produces a sequence that is, for all epsilon > 0, a (1-epsilon)-synchronizing sequence; (iii) limit-sure winning if for all epsilon > 0, there exists a strategy that produces a (1-epsilon)-synchronizing sequence. We consider the problem of deciding whether an MDP is sure, almost-sure, or limit-sure winning, and we establish the decidability and optimal complexity for all modes, as well as the memory requirements for winning strategies. Our main contributions are as follows: (a) for each winning mode we present characterizations that give a PSPACE complexity for the decision problems, and we establish matching PSPACE lower bounds; (b) we show that for sure winning strategies, exponential memory is sufficient and may be necessary, and that in general infinite memory is necessary for almost-sure winning, and unbounded memory is necessary for limit-sure winning; (c) along with our results, we establish new complexity results for alternating finite automata over a one-letter alphabet.
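
The definition of a p-synchronizing sequence in the abstract above can be illustrated with a minimal sketch. It assumes the strategy is already fixed and memoryless, so the MDP reduces to a Markov chain with a row-stochastic transition matrix; the function names, the matrix, and the finite `horizon` are hypothetical and chosen for illustration only. This is not the decision procedure of the paper, only a check of the definition on a finite prefix of the sequence.

```python
from typing import List, Optional


def step(dist: List[float], P: List[List[float]]) -> List[float]:
    """Push a distribution one step through a row-stochastic matrix P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def first_p_synchronizing_step(
    init: List[float], P: List[List[float]], p: float, horizon: int
) -> Optional[int]:
    """Return the first index k <= horizon at which some state holds
    probability at least p, or None if no such index exists in the prefix.

    Assumption: the strategy is fixed and memoryless, so P is a Markov chain.
    """
    dist = list(init)
    for k in range(horizon + 1):
        if max(dist) >= p:
            return k
        dist = step(dist, P)
    return None


# Hypothetical two-state chain: state 1 is absorbing, state 0 leaks into it.
P = [[0.5, 0.5],
     [0.0, 1.0]]
init = [0.5, 0.5]

print(first_p_synchronizing_step(init, P, 0.9, 20))  # 0.9-synchronizing prefix
print(first_p_synchronizing_step(init, P, 1.0, 20))  # never reaches mass 1
```

In this toy chain the mass in state 1 approaches 1 without reaching it within the horizon, so the sequence is (1-epsilon)-synchronizing for every epsilon > 0 yet no distribution in the checked prefix is 1-synchronizing, which is the gap between modes (i) and (ii) in the abstract.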

arXiv.org e-Print Archive

CiteSeerX

Crossref

DI-fusion

OTW: Optimal Transport Warping for Time Series

Author: Hoi Steven C. H.
Latorre Fabian
Liu Chenghao
Sahoo Doyen
Publication date: 01/06/2023

Dynamic Time Warping (DTW) has become the pragmatic choice for measuring distance between time series. However, it suffers from unavoidable quadratic time complexity when the optimal alignment matrix needs to be computed exactly. This hinders its use in deep learning architectures, where layers involving DTW computations cause severe bottlenecks. To alleviate these issues, we introduce a new metric for time series data based on the Optimal Transport (OT) framework, called Optimal Transport Warping (OTW). OTW enjoys linear time/space complexity, is differentiable and can be parallelized. OTW enjoys a moderate sensitivity to time and shape distortions, making it ideal for time series. We show the efficacy and efficiency of OTW on 1-Nearest Neighbor Classification and Hierarchical Clustering, as well as in the case of using OTW instead of DTW in Deep Learning architectures. Comment: This is an extended version of an ICASSP 2023 accepted paper https://ieeexplore.ieee.org/document/1009591
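
The quadratic cost of exact DTW mentioned in the abstract above can be seen in a minimal sketch of the textbook dynamic program. This is standard DTW with an absolute-difference local cost, not the OTW metric the paper proposes; the function name `dtw_distance` is hypothetical.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Textbook DTW with |x - y| as local cost.

    The nested loops visit every (i, j) pair, hence O(len(a) * len(b)) time.
    Only two rows of the alignment table are kept, so memory is O(len(b)).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# A series and a time-stretched copy of it align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Each cell depends on its left neighbor in the same row, which is why this recurrence is hard to parallelize along a row and becomes a bottleneck when used as a layer.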

arXiv.org e-Print Archive

Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems

Author: Chen Nancy F.
Hoi Steven C. H.
Le Hung
Sahoo Doyen
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

Developing Video-Grounded Dialogue Systems (VGDS), where a dialogue is conducted based on visual and audio aspects of a given video, is significantly more challenging than traditional image or text-grounded dialogue systems because (1) the feature space of videos spans multiple picture frames, making it difficult to obtain semantic information; and (2) a dialogue agent must perceive and process information from different modalities (audio, video, caption, etc.) to obtain a comprehensive understanding. Most existing work is based on RNNs and sequence-to-sequence architectures, which are not very effective for capturing complex long-term dependencies (like in videos). To overcome this, we propose Multimodal Transformer Networks (MTN) to encode videos and incorporate information from different modalities. We also propose query-aware attention through an auto-encoder to extract query-aware features from non-text modalities. We develop a training procedure to simulate token-level decoding to improve the quality of generated responses during inference. We get state-of-the-art performance on Dialogue System Technology Challenge 7 (DSTC7). Our model also generalizes to another multimodal visual-grounded dialogue task, and obtains promising performance. We implemented our models using PyTorch and the code is released at https://github.com/henryhungle/MTN. Comment: Accepted at ACL 2019 (Long Paper)

arXiv.org e-Print Archive

Crossref