DramaQA: Character-Centered Video Story Understanding with Hierarchical QA
Despite recent progress in computer vision and natural language processing,
video understanding intelligence remains hard to achieve due to the intrinsic
difficulty of stories in video. Moreover, there is no theoretical metric for
evaluating the degree of video understanding. In this paper, we propose a novel
video question answering (Video QA) task, DramaQA, for comprehensive
understanding of video stories. DramaQA focuses on two perspectives:
1) hierarchical QAs as an evaluation metric based on the cognitive
developmental stages of human intelligence, and 2) character-centered video
annotations to model the local coherence of the story. Our dataset is built
upon the TV drama "Another Miss Oh" and contains 16,191 QA pairs from 23,928
video clips of varying length, with each QA pair belonging to one of four
difficulty levels. We provide 217,308 annotated images with rich
character-centered annotations, including visual bounding boxes, behaviors and
emotions of main characters, and coreference-resolved scripts. Additionally, we
provide analyses of the dataset as well as a Dual Matching Multistream model,
which effectively learns character-centered representations of the video to
answer questions about it. We plan to release our dataset and model publicly
for research purposes and expect that our work will provide a new perspective
on video story understanding research.
Comment: 21 pages, 10 figures, submitted to ECCV 202
Gaussian Processes with Context-Supported Priors for Active Object Localization
We devise an algorithm using a Bayesian optimization framework in conjunction
with contextual visual data for the efficient localization of objects in still
images. Recent research has demonstrated substantial progress in object
localization and related tasks for computer vision. However, many current
state-of-the-art object localization procedures still suffer from inaccuracy
and inefficiency, in addition to failing to provide a principled and
interpretable system amenable to high-level vision tasks. We address these
issues in the present work.
Our method encompasses an active search procedure that uses contextual data
to generate initial bounding-box proposals for a target object. We train a
convolutional neural network to approximate an offset distance from the target
object. Next, we use a Gaussian Process to model this offset response signal
over the search space of the target. We then employ a Bayesian active search
for accurate localization of the target.
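The loop the abstract describes (offset predictions from a learned model, a Gaussian Process surrogate over the search space, and an acquisition-driven query step) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `offset_response` stand-in replaces the trained CNN, and the kernel, noise level, and candidate count are assumptions made for the example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

# Hypothetical target location; offset_response stands in for the CNN,
# returning a noisy estimate of the distance from a proposed bounding-box
# center to the target object.
target = np.array([70.0, 40.0])

def offset_response(center):
    return np.linalg.norm(center - target) + rng.normal(0.0, 1.0)

# Initial bounding-box center proposals (in the paper these come from
# contextual data; here they are random points in a 100x100 image).
X = rng.uniform(0, 100, size=(5, 2))
y = np.array([offset_response(c) for c in X])

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(length_scale=20.0),
                              alpha=1.0, normalize_y=True)

# Bayesian active search: repeatedly fit the GP to observed offsets and
# query the candidate minimizing a lower confidence bound (we seek the
# smallest offset, i.e. the point closest to the target).
candidates = rng.uniform(0, 100, size=(500, 2))
for _ in range(15):
    gp.fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    nxt = candidates[np.argmin(mu - 1.96 * sigma)]
    X = np.vstack([X, nxt])
    y = np.append(y, offset_response(nxt))

best = X[np.argmin(y)]
print(best)  # estimated target center
```

The lower-confidence-bound acquisition trades off exploiting regions the GP already believes are near the target against exploring regions of high posterior uncertainty.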
In experiments, we compare our approach to a state-of-the-art bounding-box
regression method on a challenging pedestrian localization task. Our method
exhibits a substantial improvement over this baseline regression method.
Comment: 10 pages, 4 figures
A Review of Verbal and Non-Verbal Human-Robot Interactive Communication
In this paper, an overview of human-robot interactive communication is
presented, covering verbal as well as non-verbal aspects of human-robot
interaction. Following a historical introduction, and motivation towards fluid
human-robot communication, ten desiderata are proposed, which provide an
organizational axis for both recent and future research on human-robot
communication. Then, the ten desiderata are examined in detail, culminating in
a unifying discussion and a forward-looking conclusion.
Internet of Robotic Things: Converging Sensing/Actuating, Hyperconnectivity, Artificial Intelligence and IoT Platforms
The Internet of Things (IoT) concept is evolving rapidly and influencing new developments in various application domains, such as the Internet of Mobile Things (IoMT), Autonomous Internet of Things (A-IoT), Autonomous System of Things (ASoT), Internet of Autonomous Things (IoAT), Internet of Things Clouds (IoT-C) and the Internet of Robotic Things (IoRT), all of which are advancing by using IoT technology. The IoT influence presents new development and deployment challenges in different areas, such as seamless platform integration, context-based cognitive network integration, new mobile sensor/actuator network paradigms, things identification (addressing and naming in IoT), dynamic things discoverability and many others. The IoRT presents new convergence challenges that need to be addressed: on one side, the programmability and communication of multiple heterogeneous mobile/autonomous/robotic things for cooperation, along with their coordination, configuration, exchange of information, security, safety and protection. Developments in IoT heterogeneous parallel processing/communication and dynamic systems based on parallelism and concurrency require new ideas for integrating the intelligent "devices", collaborative robots (COBOTS), into IoT applications. Dynamic maintainability, self-healing, self-repair of resources, changing resource state, (re-)configuration and context-based IoT systems for service implementation and integration with IoT network service composition are of paramount importance when new "cognitive devices" become active participants in IoT applications. This chapter aims to provide an overview of the IoRT concept, technologies, architectures and applications, and to provide comprehensive coverage of future challenges, developments and applications.
Learning To Grasp
Providing robots with the ability to grasp objects has, despite decades of research, remained a challenging problem. The problem is approachable in constrained environments where there is ample prior knowledge of the scene and objects that will be manipulated. The challenge is in building systems that scale beyond specific situational instances and gracefully operate in novel conditions. In the past, heuristic and simple rule-based strategies were used to accomplish tasks such as scene segmentation or reasoning about occlusion. These heuristic strategies work in constrained environments where a roboticist can make simplifying assumptions about everything from the geometries of the objects to be interacted with, level of clutter, camera position, lighting, and a myriad of other relevant variables. With these assumptions in place, it becomes tractable for a roboticist to hardcode desired behavior and build a robotic system capable of completing repetitive tasks. These hardcoded behaviors will quickly fail if the assumptions about the environment are invalidated. In this thesis we will demonstrate how a robust grasping system can be built that is capable of operating under a more variable set of conditions without requiring significant engineering of behavior by a roboticist.
This robustness is enabled by a newfound ability to empower novel machine learning techniques with massive amounts of synthetic training data. The ability of simulators to create realistic sensory data enables the generation of massive corpora of labeled training data for various grasping-related tasks. The use of simulation allows for the creation of a wide variety of environments and experiences, exposing the robotic system to a large number of scenarios before it ever operates in the real world. This thesis demonstrates that it is now possible to build systems that work in the real world trained using deep learning on synthetic data. The sheer volume of data that can be produced via simulation enables the use of powerful deep learning techniques whose performance scales with the amount of data available. This thesis will explore how deep learning and other techniques can be used to encode these massive datasets for efficient runtime use. The ability to train and test on synthetic data allows for quick iterative development of new perception, planning and grasp execution algorithms that work in a large number of environments. Creative applications of machine learning and massive synthetic datasets are allowing robotic systems to learn skills and move beyond repetitive hardcoded tasks.
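The pipeline the abstract describes, generating labeled training data from a simulator and fitting a learner whose accuracy grows with the amount of data, can be caricatured in a few lines. Everything here is an illustrative assumption rather than the thesis's setup: the "simulator" is a toy geometric rule for grasp success, and the learner is a small off-the-shelf classifier.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def simulate_grasp(n):
    """Toy 'simulator': a grasp on a unit-width box succeeds when the
    gripper opens wide enough and is roughly centred and aligned."""
    width = rng.uniform(0.5, 2.0, n)            # gripper opening
    offset = rng.uniform(-1.0, 1.0, n)          # lateral offset from centre
    angle = rng.uniform(-np.pi / 2, np.pi / 2, n)  # approach angle
    success = (width > 1.0) & (np.abs(offset) < 0.3) & (np.abs(angle) < 0.4)
    return np.column_stack([width, offset, angle]), success.astype(int)

# Held-out synthetic test set, plus training sets of increasing size:
X_test, y_test = simulate_grasp(2000)
scores = {}
for n in (100, 10000):
    X, y = simulate_grasp(n)
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                        random_state=0).fit(X, y)
    scores[n] = clf.score(X_test, y_test)
    print(n, scores[n])
```

Because the simulator is cheap to query, the training set can be made arbitrarily large, which is exactly the scaling property the thesis exploits with deep networks on realistic rendered data.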
Virtual Reality applied to biomedical engineering
Virtual reality is currently trending and expanding into the medical field, enabling numerous applications designed to train physicians and treat patients more efficiently, as well as to optimize surgical planning processes. The medical need and objective of this project is to optimize the surgical planning process for congenital heart disease, which comprises the 3D reconstruction of the patient's heart and its integration into a virtual reality application. Along these lines, a 3D modelling process for heart images obtained thanks to Hospital Sant Joan de Déu was combined with the design of the application in the Unity 3D software, thanks to the company VISYON. Improvements were achieved in the software used for segmentation and reconstruction, and basic functionalities were implemented in the application, such as importing, moving, rotating and taking 3D screenshots of the cardiac organ, in order to better understand the heart disease to be treated. The result is an optimized process in which the 3D reconstruction is fast and accurate, the method for importing into the designed app is very simple, and the application allows attractive and intuitive interaction through an immersive and realistic experience, meeting the efficiency and precision requirements demanded in the medical field.
Categorization of indoor places by combining local binary pattern histograms of range and reflectance data from laser range finders
This paper presents an approach to categorize typical places in indoor environments using 3D scans provided by a laser range finder. Examples of such places are offices, laboratories, or kitchens. In our method, we combine the range and reflectance data from the laser scan for the final categorization of places. Range and reflectance images are transformed into histograms of local binary patterns and combined into a single feature vector. This vector is later classified using support vector machines. The results of the presented experiments demonstrate the capability of our technique to categorize indoor places with high accuracy. We also show that the combination of range and reflectance information improves the final categorization results in comparison with a single modality.
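A minimal sketch of the feature pipeline the abstract describes: compute a local binary pattern histogram for a range image and for a reflectance image, concatenate the two histograms into one feature vector, and classify with an SVM. The toy textures, image sizes, and hand-rolled 8-neighbour LBP here are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.svm import SVC

def lbp_histogram(img, bins=256):
    """8-neighbour local binary pattern histogram of a 2D image.
    Each interior pixel gets an 8-bit code: one bit per neighbour,
    set when the neighbour is >= the centre pixel."""
    c = img[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((nb >= c) << bit).astype(np.uint8)
    hist, _ = np.histogram(code, bins=bins, range=(0, bins))
    return hist / hist.sum()

# Toy stand-ins for range and reflectance images of two place classes:
# high-frequency texture vs smooth (integrated) texture.
rng = np.random.default_rng(1)

def sample(label):
    base = rng.normal(size=(32, 32))
    img = base if label else np.cumsum(np.cumsum(base, 0), 1)  # smooth class
    range_img = img
    refl_img = img * 0.5 + rng.normal(size=img.shape)
    # Combine both modalities into a single feature vector, as in the paper.
    return np.concatenate([lbp_histogram(range_img), lbp_histogram(refl_img)])

X = np.array([sample(i % 2) for i in range(40)])
y = np.array([i % 2 for i in range(40)])
clf = SVC(kernel="linear").fit(X[:30], y[:30])
acc = clf.score(X[30:], y[30:])
print(acc)
```

Since LBP histograms describe micro-texture regardless of absolute intensity, the same descriptor works on both range and reflectance channels, and concatenation lets the SVM weight whichever modality separates the classes better.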