3D Tracking Using Multi-view Based Particle Filters
Visual surveillance and monitoring of indoor environments using multiple cameras has become a field of great activity in computer vision. Usual 3D tracking and positioning systems rely on several independent 2D tracking modules applied over individual camera streams, fused using geometrical relationships across cameras. As 2D tracking systems suffer inherent difficulties due to point-of-view limitations (perceptually similar foreground and background regions causing fragmentation of moving objects, occlusions), 3D tracking based on partially erroneous 2D tracks is likely to fail when handling multiple-people interaction. To overcome this problem, this paper proposes a Bayesian framework for combining 2D low-level cues from multiple cameras directly into the 3D world through 3D particle filters. This method makes it possible to estimate the probability of a certain volume being occupied by a moving object, and thus to segment and track multiple people across the monitored area. The proposed method is developed on the basis of simple, binary 2D moving-region segmentations on each camera, considered as different state observations. In addition, the method proves well suited for integrating additional 2D low-level cues to increase system robustness to occlusions: along these lines, a naïve color-based (HSI) appearance model has been integrated, resulting in clear performance improvements when dealing with complex scenarios.
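The multi-camera occupancy idea can be sketched with a toy 3D particle filter: particles live in a unit volume, are weighted by the product of binary foreground-mask lookups across cameras, and are resampled. Everything below (the degenerate axis-dropping "projection", the mask size, the noise scales) is invented for illustration and is not the paper's actual calibration or setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two "cameras", each providing a binary foreground mask.
# Projection is reduced to dropping one coordinate (camera 0 sees the x-y
# plane, camera 1 the x-z plane) purely for illustration.
MASK_SIZE = 32

def project(points, cam):
    """Toy projection of 3D points (in [0,1]^3) to integer pixel coordinates."""
    uv = points[:, [0, 1]] if cam == 0 else points[:, [0, 2]]
    return np.clip((uv * MASK_SIZE).astype(int), 0, MASK_SIZE - 1)

def likelihood(points, masks):
    """Weight = product over cameras of the mask value at the projection."""
    w = np.ones(len(points))
    for cam, mask in enumerate(masks):
        px = project(points, cam)
        w *= mask[px[:, 0], px[:, 1]] + 1e-3  # small floor avoids zero weights
    return w

# Ground-truth target at (0.5, 0.5, 0.5); mark it foreground in both masks.
masks = [np.zeros((MASK_SIZE, MASK_SIZE)) for _ in range(2)]
for cam in range(2):
    px = project(np.array([[0.5, 0.5, 0.5]]), cam)[0]
    masks[cam][px[0] - 2:px[0] + 3, px[1] - 2:px[1] + 3] = 1.0

# Particle filter loop: diffuse, weight by the 2D observations, resample.
particles = rng.uniform(0, 1, size=(2000, 3))
for _ in range(20):
    particles = np.clip(particles + rng.normal(0, 0.02, particles.shape), 0, 1)
    w = likelihood(particles, masks)
    w /= w.sum()
    particles = particles[rng.choice(len(particles), len(particles), p=w)]

estimate = particles.mean(axis=0)  # concentrates near the occupied volume
```

The weighting step is where the fusion happens: a particle survives resampling only if it back-projects into foreground in (most of) the cameras, which is exactly the occupancy intuition described above.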
Seeing Tree Structure from Vibration
Humans recognize object structure from both their appearance and motion;
often, motion helps to resolve ambiguities in object structure that arise when
we observe object appearance only. There are particular scenarios, however,
where neither appearance nor spatial-temporal motion signals are informative:
occluding twigs may look connected and have almost identical movements, though
they belong to different, possibly disconnected branches. We propose to tackle
this problem through spectrum analysis of motion signals, because vibrations of
disconnected branches, though visually similar, often have distinctive natural
frequencies. We propose a novel formulation of tree structure based on a
physics-based link model, and validate its effectiveness by theoretical
analysis, numerical simulation, and empirical experiments. With this
formulation, we use nonparametric Bayesian inference to reconstruct tree
structure from both spectral vibration signals and appearance cues. Our model
performs well in recognizing hierarchical tree structure from real-world videos
of trees and vessels. Comment: ECCV 2018. The first two authors contributed equally to this work.
Project page: http://tree.csail.mit.edu
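The spectral cue the abstract relies on can be illustrated in a few lines: two motion signals that look alike in space-time can still be separated by the peaks of their amplitude spectra. The sampling rate, frequencies, and noise level below are made up for illustration.

```python
import numpy as np

fs = 100.0                      # hypothetical video frame rate (frames/s)
t = np.arange(0, 10, 1 / fs)

# Two twigs with similar-looking small oscillations but different natural
# frequencies (2.0 Hz vs 3.5 Hz, chosen arbitrarily), plus a little noise.
twig_a = 0.5 * np.sin(2 * np.pi * 2.0 * t) \
    + 0.05 * np.random.default_rng(1).normal(size=t.size)
twig_b = 0.5 * np.sin(2 * np.pi * 3.5 * t) \
    + 0.05 * np.random.default_rng(2).normal(size=t.size)

def dominant_freq(signal, fs):
    """Peak of the one-sided amplitude spectrum, ignoring the DC bin."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
    return freqs[1:][np.argmax(spectrum[1:])]

fa = dominant_freq(twig_a, fs)
fb = dominant_freq(twig_b, fs)

# Spectral cue: twigs on the same branch share a natural frequency; a gap
# between the dominant frequencies suggests disconnected branches.
same_branch = abs(fa - fb) < 0.5
```

With a 10 s window the frequency resolution is 0.1 Hz, comfortably finer than the 1.5 Hz gap between the two (invented) natural frequencies.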
Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals
Human infants can discover words directly from unsegmented speech signals
without any explicitly labeled data. In this paper, we develop a novel machine
learning method called nonparametric Bayesian double articulation analyzer
(NPB-DAA) that can directly acquire language and acoustic models from observed
continuous speech signals. For this purpose, we propose an integrative
generative model that combines a language model and an acoustic model into a
single generative model called the "hierarchical Dirichlet process hidden
language model" (HDP-HLM). The HDP-HLM is obtained by extending the
hierarchical Dirichlet process hidden semi-Markov model (HDP-HSMM) proposed by
Johnson et al. An inference procedure for the HDP-HLM is derived using the
blocked Gibbs sampler originally proposed for the HDP-HSMM. This procedure
enables the simultaneous and direct inference of language and acoustic models
from continuous speech signals. Based on the HDP-HLM and its inference
procedure, we developed a novel double articulation analyzer. By assuming
HDP-HLM as a generative model of observed time series data, and by inferring
latent variables of the model, the method can analyze latent double
articulation structure, i.e., hierarchically organized latent words and
phonemes, of the data in an unsupervised manner. The novel unsupervised double
articulation analyzer is called NPB-DAA.
The NPB-DAA can automatically estimate double articulation structure embedded
in speech signals. We also carried out two evaluation experiments using
synthetic data and actual human continuous speech signals representing Japanese
vowel sequences. In the word acquisition and phoneme categorization tasks, the
NPB-DAA outperformed a conventional double articulation analyzer (DAA) and
baseline automatic speech recognition system whose acoustic model was trained
in a supervised manner. Comment: 15 pages, 7 figures, draft submitted to IEEE Transactions on
Autonomous Mental Development (TAMD).
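Inference in the full HDP-HLM is involved, but the nonparametric Bayesian machinery it builds on can be shown at small scale: a collapsed Gibbs sampler for a Dirichlet-process mixture of 1D Gaussians, which infers the number of categories from the data instead of fixing it in advance. All data and hyperparameters below are invented for illustration; this is not the NPB-DAA itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "acoustic features": three well-separated 1D clusters standing in for
# phoneme categories (50 points each, invented values).
data = np.concatenate([rng.normal(m, 0.3, 50) for m in (-4.0, 0.0, 4.0)])

alpha, sigma2, tau2 = 1.0, 0.3 ** 2, 10.0 ** 2  # CRP conc., obs. var, prior var

def predictive(x, members):
    """Posterior predictive N(x | mu_n, sigma2 + var_n) with a N(0, tau2)
    prior on the cluster mean and known observation variance sigma2."""
    n = len(members)
    var_n = 1.0 / (n / sigma2 + 1.0 / tau2)
    mu_n = var_n * (np.sum(members) / sigma2)
    v = sigma2 + var_n
    return np.exp(-(x - mu_n) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

z = np.zeros(len(data), dtype=int)               # start with one cluster
for _ in range(20):                               # collapsed Gibbs sweeps
    for i in range(len(data)):
        z[i] = -1                                 # remove point i
        labels = [k for k in set(z) if k >= 0]
        probs = [len(data[z == k]) * predictive(data[i], data[z == k])
                 for k in labels]
        probs.append(alpha * predictive(data[i], np.array([])))  # new table
        probs = np.array(probs) / np.sum(probs)
        choice = rng.choice(len(probs), p=probs)
        z[i] = labels[choice] if choice < len(labels) \
            else max(labels, default=-1) + 1

n_clusters = len(set(z))  # discovered, not specified in advance
```

The "new table" term is the Chinese-restaurant-process prior at work: the sampler can open a fresh cluster whenever no existing one explains a point well, which is the same property that lets the HDP-HLM grow its word and phoneme inventories.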
Modeling Human Performance on Statistical Word Segmentation Tasks
Harnessing the orbital angular momentum (OAM) of light is an appealing approach to developing photonic technologies for future applications in optical communications and high-dimensional quantum key distribution (QKD) systems. An outstanding challenge to the widespread uptake of the OAM resource is its efficient generation. In this work, we design a new device that can directly emit an OAM-carrying light beam from a low-cost semiconductor laser. Fabricating micro-scale spiral phase plates within the aperture of a vertical-cavity surface-emitting laser (VCSEL) converts the linearly polarized Gaussian beam emitted by the VCSEL into a beam carrying specific OAM modes and their superposition states, with high efficiency and high beam quality. This new approach to OAM generation may be particularly useful in the field of OAM-based optical and quantum communications, especially for short-reach data interconnects and QKD.
Addressee Identification In Face-to-Face Meetings
We present results on addressee identification in four-participant face-to-face meetings using Bayesian network and naive Bayes classifiers. First, we investigate how well the addressee of a dialogue act can be predicted based on gaze, utterance, and conversational context features. Then, we explore whether information about meeting context can aid classifier performance. Both classifiers perform best when conversational context and utterance features are combined with the speaker's gaze information. The classifiers show little gain from information about meeting context.
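The feature combination described above can be illustrated with a hand-rolled naive Bayes classifier over categorical features. The training tuples (gaze target, utterance type → addressee) are invented toy data, not the meeting corpus used in the paper.

```python
from collections import Counter, defaultdict
import math

# Invented toy data: each utterance is described by (gaze_target,
# utterance_type) and labeled with its addressee.
train = [
    (("P2", "question"), "P2"),
    (("P2", "statement"), "P2"),
    (("P3", "question"), "P3"),
    (("P3", "statement"), "P3"),
    (("group", "statement"), "group"),
    (("group", "question"), "group"),
]

label_counts = Counter(y for _, y in train)
feat_counts = defaultdict(Counter)   # (feature_index, label) -> value counts
for x, y in train:
    for j, v in enumerate(x):
        feat_counts[(j, y)][v] += 1

def predict(x, smoothing=1.0):
    """Naive Bayes with Laplace smoothing over categorical features."""
    vocab = [len({xx[j] for xx, _ in train}) for j in range(len(x))]
    best, best_lp = None, -math.inf
    for y, ny in label_counts.items():
        lp = math.log(ny / len(train))          # class prior
        for j, v in enumerate(x):
            c = feat_counts[(j, y)][v]          # conditional count
            lp += math.log((c + smoothing) / (ny + smoothing * vocab[j]))
        if lp > best_lp:
            best, best_lp = y, lp
    return best

pred = predict(("P2", "question"))
```

In this toy setup the gaze feature dominates, mirroring the finding that the speaker's gaze is the strongest single cue for the addressee.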
Object-based 2D-to-3D video conversion for effective stereoscopic content generation in 3D-TV applications
Three-dimensional television (3D-TV) has gained increasing popularity in the broadcasting domain, as it enables enhanced viewing experiences in comparison to conventional two-dimensional (2D) TV. However, its application has been constrained by the lack of essential content, i.e., stereoscopic videos. To alleviate this content shortage, an economical and practical solution is to reuse the huge media resources that are available in monoscopic 2D and convert them to stereoscopic 3D. Although stereoscopic video can be generated from monoscopic sequences using depth measurements extracted from cues such as focus blur, motion, and size, the quality of the resulting video may be poor, as such measurements are usually arbitrarily defined and appear inconsistent with the real scenes. To help solve this problem, a novel method for object-based stereoscopic video generation is proposed which features i) optical-flow-based occlusion reasoning for determining depth order, ii) object segmentation using improved region growing from masks of determined depth layers, and iii) a hybrid depth estimation scheme using content-based matching (inside a small library of true stereo image pairs) and depth-order-based regularization. Comprehensive experiments have validated the effectiveness of the proposed 2D-to-3D conversion method in generating stereoscopic videos with consistent depth measurements for 3D-TV applications.
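The final rendering step implied by such a pipeline, turning a frame plus per-pixel depth layers into a stereo pair, can be sketched with a painter's-algorithm disparity shift. The frame, depth map, and disparity scale below are invented toy values; real depth-image-based rendering also handles disocclusion filling, which is left as zero holes here.

```python
import numpy as np

# Toy monoscopic frame and per-pixel depth layers (invented):
# depth 0 = far background, larger values = closer to the viewer.
H, W = 4, 8
frame = np.arange(H * W, dtype=float).reshape(H, W)
depth = np.zeros((H, W), dtype=int)
depth[:, 3:5] = 2                         # a "foreground object" layer

def render_view(frame, depth, max_disparity=2, sign=+1):
    """Shift each pixel horizontally by a disparity proportional to its
    depth, painting far layers first so near layers occlude them."""
    out = np.zeros_like(frame)            # unfilled pixels stay 0 (holes)
    for d in sorted(np.unique(depth)):    # far-to-near painter's algorithm
        shift = sign * (d * max_disparity // max(depth.max(), 1))
        ys, xs = np.nonzero(depth == d)
        nx = np.clip(xs + shift, 0, frame.shape[1] - 1)
        out[ys, nx] = frame[ys, xs]
    return out

# Opposite shift directions give the left/right views of the stereo pair.
left = render_view(frame, depth, sign=+1)
right = render_view(frame, depth, sign=-1)
```

This makes the abstract's point concrete: if the depth (and hence disparity) assigned to an object is inconsistent across frames, the object visibly jitters between the two views, which is why depth-order regularization matters.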
Computational and Robotic Models of Early Language Development: A Review
We review computational and robotics models of early language learning and
development. We first explain why and how these models are used to understand
better how children learn language. We argue that they provide concrete
theories of language learning as a complex dynamic system, complementing
traditional methods in psychology and linguistics. We review different modeling
formalisms, grounded in techniques from machine learning and artificial
intelligence such as Bayesian and neural network approaches. We then discuss
their role in understanding several key mechanisms of language development:
cross-situational statistical learning, embodiment, situated social
interaction, intrinsically motivated learning, and cultural evolution. We
conclude by discussing future challenges for research, including modeling of
large-scale empirical data about language acquisition in real-world
environments.
Keywords: Early language learning, Computational and robotic models, machine
learning, development, embodiment, social interaction, intrinsic motivation,
self-organization, dynamical systems, complexity. Comment: to appear in International Handbook on Language Development, ed. J. Horst and J. von Koss Torkildsen, Routledge.