1,098 research outputs found
A mosaic of eyes
Autonomous navigation is a traditional research topic in intelligent robotics and vehicles: a robot must perceive its environment through onboard sensors such as cameras or laser scanners in order to drive to its goal. Most research to date has focused on building a large, smart centralized "brain" to give robots autonomous capability. An autonomous mobile robot must answer three fundamental questions: 1) Where am I going? 2) Where am I? and 3) How do I get there? To answer them, a robot requires a large spatial memory and considerable computational resources to accomplish perception, localization, path planning, and control. It is not yet possible to deliver the centralized intelligence required for real-life applications such as autonomous ground vehicles and wheelchairs in care centers. In fact, most autonomous robots try to mimic how humans navigate, interpreting images taken by cameras and then making decisions accordingly. They may encounter the following difficulties.
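Of the three questions, the third (path planning) is the most self-contained, and a toy example makes it concrete. Below is a minimal sketch, assuming an occupancy-grid map and breadth-first search; the grid, start, and goal are illustrative values for this sketch, not data from the paper.

# Minimal, self-contained illustration of the "How do I get there?" step:
# breadth-first path planning on a small occupancy grid (0 = free, 1 = obstacle).
from collections import deque

def plan_path(grid, start, goal):
    """Return a list of cells from start to goal, avoiding obstacles."""
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:      # walk parent pointers back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = (r, c)
                frontier.append((nr, nc))
    return None  # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(plan_path(grid, (0, 0), (2, 0)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]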
Supervised coordinate descent method with a 3D bilinear model for face alignment and tracking
Face alignment and tracking play important roles in facial performance capture. Existing data-driven methods for monocular videos suffer from large variations in pose and expression. In this paper, we propose an efficient and robust method for this task by introducing a novel supervised coordinate descent method with a 3D bilinear representation. Instead of learning the mapping between the whole parameter set and image features directly with a cascaded regression framework, as current methods do, we learn separate mappings for individual sets of parameters, step by step, in a coordinate descent manner. Because different parameters make different contributions to the displacement of facial landmarks, our method is more discriminative than current whole-parameter cascaded regression methods. Benefiting from a 3D bilinear model learned from public databases, the proposed method handles out-of-plane head pose changes and extreme expressions better than other 2D-based methods. We present reliable face tracking results under various head poses and facial expressions on challenging video sequences collected online. The experimental results show that our method outperforms state-of-the-art data-driven methods.
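To illustrate the core idea of updating parameter blocks separately rather than regressing all parameters at once, here is a schematic sketch of block-wise cascaded regression in the coordinate-descent style the abstract describes. The feature extractor, the block names ('pose', 'expression'), and the learned regression matrices are stand-ins, not the paper's actual model or training procedure.

# Schematic sketch of coordinate-descent cascaded regression over parameter
# blocks. regressors[stage][block] is a learned linear map from shape-indexed
# features to an update for that block alone.
import numpy as np

def fit_parameters(image, params, regressors, extract_features, n_stages=4):
    """params: dict of parameter blocks, e.g. {'pose': ..., 'expression': ...}."""
    for stage in range(n_stages):
        # Coordinate descent: update one block at a time, so each regressor
        # only explains the landmark displacement its own block causes.
        for block in ('pose', 'expression'):
            phi = extract_features(image, params)   # shape-indexed features
            params[block] = params[block] + regressors[stage][block] @ phi
    return params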
The sky brightness and transparency in i-band at Dome A, Antarctica
The i-band observing conditions at Dome A on the Antarctic plateau have been
investigated using data acquired during 2008 with the Chinese Small Telescope
ARray. The sky brightness, variations in atmospheric transparency, cloud cover,
and the presence of aurorae are obtained from these images. The median sky
brightness of moonless clear nights is 20.5 mag arcsec^{-2} in the SDSS i
band at the South Celestial Pole (which includes a contribution of about 0.06
mag from diffuse Galactic light). The median over all Moon phases in the
Antarctic winter is about 19.8 mag arcsec^{-2}. There were no thick clouds in
2008. We model contributions of the Sun and the Moon to the sky background to
obtain the relationship between the sky brightness and transparency. Aurorae
are identified by comparing the observed sky brightness to the sky brightness
expected from this model. About 2% of the images are affected by relatively
strong aurorae.
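Because surface brightness is logarithmic, the quoted contributions combine in flux rather than in magnitudes, and aurorae can be flagged wherever the observed sky is notably brighter than the Sun-plus-Moon model. The short sketch below works through the 0.06 mag diffuse-Galactic-light example; the 0.2 mag aurora threshold is an illustrative assumption, not the paper's criterion.

# Surface-brightness components add in flux, not in magnitudes.
import math

def mag_to_flux(m):
    return 10 ** (-0.4 * m)

def flux_to_mag(f):
    return -2.5 * math.log10(f)

def combine_mags(mags):
    """Total surface brightness of several components, in mag arcsec^-2."""
    return flux_to_mag(sum(mag_to_flux(m) for m in mags))

def aurora_affected(observed_mag, model_mag, threshold=0.2):
    """Flag a frame whose sky is notably brighter than the Sun+Moon model.
    (0.2 mag is an illustrative threshold, not the paper's value.)"""
    return observed_mag < model_mag - threshold

# A faint component near 23.7 mag arcsec^-2 brightens a 20.56 mag arcsec^-2
# sky by only ~0.06 mag, reproducing the quoted diffuse-Galactic-light effect:
print(round(combine_mags([20.56, 23.67]), 2))  # ~20.5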
Unfalsified visual servoing for simultaneous object recognition and pose tracking
In a complex environment, simultaneous object recognition and tracking has been one of the challenging topics in computer vision and robotics. Current approaches are usually fragile due to spurious feature matching and local convergence in pose determination, and once a failure happens, they lack a mechanism to recover automatically. In this paper, data-driven unfalsified control is proposed for solving this problem in visual servoing. It recognizes a target through matching image features with a 3-D model and then tracks the target through dynamic visual servoing. The features can be falsified or unfalsified by a supervisory mechanism according to their tracking performance. Supervisory visual servoing is repeated until a consensus between the model and the selected features is reached, so that model recognition and object tracking are accomplished. Experiments show the effectiveness and robustness of the proposed algorithm in dealing with matching and tracking failures caused by various disturbances, such as fast motion, occlusions, and illumination variation.
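The supervisory loop the abstract describes can be sketched as follows: features that track poorly are falsified and dropped, and servoing repeats until the surviving features agree with the model. All helper functions, the feature objects, and the error tolerance below are placeholders standing in for the paper's components, not a real API.

# Schematic sketch of the supervisory falsification loop.
def supervised_servoing(model, camera, error_tol=1.0, max_rounds=10):
    # Initial 2D-3D matches between image features and the 3-D model
    candidates = match_features(camera.capture(), model)
    for _ in range(max_rounds):
        active = [f for f in candidates if not f.falsified]
        if not active:
            break                          # every feature has been falsified
        pose = servo_step(camera, active)  # dynamic visual servoing update
        bad = [f for f in active if tracking_error(f, pose) > error_tol]
        for f in bad:
            f.falsified = True             # supervisor falsifies this feature
        if not bad:
            return pose                    # consensus: recognized and tracked
    return None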
Enhancing Subtask Performance of Multi-modal Large Language Model
A Multi-modal Large Language Model (MLLM) is a model extended from a Large Language Model (LLM) with the capability to handle and reason over multi-modal data. Current MLLMs typically begin by using LLMs to decompose tasks into multiple subtasks, then employ individual pre-trained models to complete specific subtasks, and finally use LLMs to integrate the results of each subtask to obtain the result of the overall task. In real-world scenarios, when dealing with large projects, it is common practice to break the project down into smaller sub-projects, with different teams providing corresponding solutions or results. The project owner then decides which solution or result to use, ensuring the best possible outcome for each subtask and, consequently, for the entire project. Inspired by this, this study considers selecting multiple pre-trained models to complete the same subtask. By combining the results from multiple pre-trained models, the optimal subtask result is obtained, enhancing the performance of the MLLM. Specifically, this study first selects multiple pre-trained models focused on the same subtask based on distinct evaluation approaches, and then invokes these models in parallel to process input data and generate the corresponding subtask results. Finally, the results from the multiple pre-trained models for the same subtask are compared using the LLM, and the best result is chosen as the outcome for that subtask. Extensive experiments are conducted in this study using GPT-4 annotated datasets and human-annotated datasets. The results across various evaluation metrics demonstrate the effectiveness of the proposed approach.
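A minimal sketch of the selection scheme follows, assuming a list of candidate models callable on the same input and an LLM judge reachable through a hypothetical call_llm function; none of these names come from the paper.

# Run several pre-trained models on the same subtask in parallel, then let
# the LLM act as the "project owner" and pick the best candidate output.
from concurrent.futures import ThreadPoolExecutor

def best_subtask_result(subtask, inputs, models, call_llm):
    # Invoke every candidate model on the same subtask input, in parallel
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda m: m(inputs), models))

    # Ask the LLM to compare the candidates and return the winning index
    numbered = "\n".join(f"[{i}] {r}" for i, r in enumerate(results))
    prompt = (f"Subtask: {subtask}\nCandidate results:\n{numbered}\n"
              f"Reply with the index of the best result only.")
    choice = int(call_llm(prompt).strip())
    return results[choice]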
