
1,448 research outputs found

Leveraging Image based Prior for Visual Place Recognition

Authors: Kanji Tanaka; Taisho Tsukamoto
Publication date: 14/05/2015

In this study, we propose a novel scene descriptor for visual place recognition. Unlike popular bag-of-words scene descriptors, which rely on a library of vector-quantized visual features, our proposed descriptor is based on a library of raw image data, such as publicly available photo collections from Google StreetView and Flickr. The library images need not be associated with spatial information regarding the viewpoint and orientation of the scene. As a result, these images are cheaper than the database images; in addition, they are readily available. Our proposed descriptor directly mines the image library to discover landmarks (i.e., image patches) that suitably match an input query/database image. The discovered landmarks are then compactly described by their pose and shape (i.e., library image ID, bounding boxes) and used as a compact discriminative scene descriptor for the input image. We evaluate the effectiveness of our scene description framework by comparing its performance to that of previous approaches.
Comment: 8 pages, 6 figures, preprint. Accepted for publication in MVA2015 (oral presentation)
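
The abstract above describes each landmark by a library image ID and a bounding box. As a rough illustration only, such a descriptor could be stored and compared as sketched below. The `Landmark` class, the `box_iou` helper, the `descriptor_similarity` function, and the overlap threshold `min_iou` are all hypothetical names and choices made for this sketch; the abstract does not state how descriptors are matched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Landmark:
    """One discovered landmark: which library image it came from, and where."""
    library_image_id: int
    box: tuple  # (x_min, y_min, x_max, y_max) in library-image pixels


def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def descriptor_similarity(query, database, min_iou=0.5):
    """Fraction of query landmarks that reappear in the database descriptor.

    Assumption for this sketch: two landmarks agree when they point at the
    same library image and their boxes overlap with IoU >= min_iou.
    """
    if not query:
        return 0.0
    matched = sum(
        1
        for q in query
        if any(
            q.library_image_id == d.library_image_id
            and box_iou(q.box, d.box) >= min_iou
            for d in database
        )
    )
    return matched / len(query)


query = [Landmark(7, (0, 0, 10, 10)), Landmark(3, (5, 5, 15, 15))]
database = [Landmark(7, (0, 0, 10, 8)), Landmark(3, (50, 50, 60, 60))]
print(descriptor_similarity(query, database))
```

Because each landmark is just an integer ID plus four coordinates, a descriptor of this kind stays small regardless of the image resolution, which is consistent with the compactness the abstract emphasizes.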

arXiv.org e-Print Archive

Crossref

Data-Driven Shape Analysis and Processing

Authors: Huang Qixing; Kalogerakis Evangelos; Kim Vladimir G.; Xu Kai
Publication date: 23/02/2015

Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing.
Comment: 10 pages, 19 figures

arXiv.org e-Print Archive

CiteSeerX

Augmented Reality and Health Informatics: A Study based on Bibliometric and Content Analysis of Scholarly Communication and Social Media

Author: Gupte Nilish
Publication venue: Digital Commons @ LIU
Publication date: 01/01/2019

Healthcare outcomes have been shown to improve when technology is used as part of patient care. Health Informatics (HI) is a multidisciplinary study of the design, development, adoption, and application of IT-based innovations in healthcare services delivery, management, and planning. Augmented Reality (AR) is an emerging technology that enhances the user’s perception of and interaction with the real world. This study aims to illuminate the intersection of the fields of AR and HI. The domains of AR and HI are each areas of significant research. However, there is a scarcity of research on augmented reality as it applies to health informatics. Because both scholarly research and social media communication have contributed to the domains of AR and HI, bibliometric and content analysis were applied to both in order to investigate the salient features and research fronts of the field. The study used Scopus data (7360 scholarly publications) to identify the bibliometric features and to perform content analysis of the identified research. The Altmetric database (an aggregator of data sources) was used to determine the social media communication for this field. The findings from this study included publication volumes, top authors, affiliations, subject areas, and geographical locations from scholarly publications as well as from a social media perspective. The 200 most highly cited documents were used to determine the research fronts in scholarly publications. Content analysis techniques were employed on the publication abstracts as a secondary technique to determine the research themes of the field. The study found that the research frontiers in scholarly communication included emerging AR technologies such as tracking and computer vision, along with surgical and learning applications. There was a commonality between social media and scholarly communication themes from an applications perspective. In addition, social media themes included applications of AR in healthcare delivery, clinical studies, and mental disorders. Europe as a geographic region dominates the research field with 50% of the articles, and North America and Asia tie for second with 20% each. Publication volumes show a steep upward slope, indicating continued research. Social media communication is still in its infancy in terms of data extraction; however, aggregators like Altmetric are helping to enhance the outcomes. The findings from the study revealed that frontier research in AR has made an impact on the surgical and learning applications of HI and has the potential for other applications as new technologies are adopted.

Long Island University

Human-Robot Cooperation Based on Semantic Scene Understanding

Author: 문지윤
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2020

Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, February 2020. Advisor: 이범희.
Human-robot cooperation is unavoidable in various applications ranging from manufacturing to field robotics owing to the advantages of adaptability and high flexibility. In particular, complex task planning in large, unstructured, and uncertain environments can employ the complementary capabilities of humans and diverse robots. For a team to be effective, knowledge regarding team goals and the current situation needs to be shared effectively, as it affects decision making. In this respect, semantic scene understanding in natural language is one of the most fundamental components for information sharing between humans and heterogeneous robots, as robots can perceive the surrounding environment in a form that both humans and other robots can understand. Moreover, natural-language-based scene understanding can reduce network congestion and improve the reliability of acquired data. In field robotics especially, transmission of raw sensor data increases network bandwidth usage and decreases quality of service. We can resolve this problem by transmitting information in the form of natural language that encodes semantic representations of environments. In this dissertation, I introduce a human and heterogeneous robot cooperation scheme based on semantic scene understanding. I generate sentences and scene graphs, which are natural-language-grounded graphs over the detected objects and their relationships, using the graph map generated by a robot mapping algorithm. Subsequently, a framework that can utilize the results for cooperative mission planning of humans and robots is proposed. Experiments were performed to verify the effectiveness of the proposed methods. This dissertation comprises two parts: graph-based scene understanding and scene understanding based on the cooperation between humans and heterogeneous robots. For the former, I introduce a novel natural language processing method using a semantic graph map.
Although semantic graph maps have been widely applied to study the perceptual aspects of the environment, such maps have not found extensive application in natural language processing tasks. Several studies have been conducted on the understanding of workspace images in the field of computer vision; in these studies, sentences were generated automatically, but multiple scenes have not yet been utilized for sentence generation. A graph-based convolutional neural network, which comprises spectral graph convolution and graph coarsening, and a recurrent neural network are employed to generate sentences with attention over graphs. The proposed method outperforms the conventional methods on a publicly available dataset for single scenes and can be utilized for sequential scenes. Recently, deep learning has demonstrated impressive developments in scene understanding using natural language. However, it has not been extensively applied to high-level processes such as causal reasoning, analogical reasoning, or planning. The symbolic approach, which calculates the sequence of appropriate actions by combining the available skills of agents, excels at reasoning and planning; however, it does not fully consider semantic knowledge acquisition for human-robot information sharing. For scene understanding based on human-robot cooperation, an architecture is proposed that combines deep learning techniques and a symbolic planner so that humans and heterogeneous robots can achieve a shared goal based on semantic scene understanding. In this study, graph-based perception is used for scene understanding. A planning domain definition language (PDDL) planner and JENA-TDB are utilized for mission planning and for storage of acquired data, respectively.
The effectiveness of the proposed method is verified in simulation in two situations: a mission failure in a dynamically changing environment, and finding objects in a large and unseen environment.

Contents:
1 Introduction
  1.1 Background and Motivation
  1.2 Literature Review
    1.2.1 Natural Language-Based Human-Robot Cooperation
    1.2.2 Artificial Intelligence Planning
  1.3 The Problem Statement
  1.4 Contributions
  1.5 Dissertation Outline
2 Natural Language-Based Scene Graph Generation
  2.1 Introduction
  2.2 Related Work
  2.3 Scene Graph Generation
    2.3.1 Graph Construction
    2.3.2 Graph Inference
  2.4 Experiments
  2.5 Summary
3 Language Description with 3D Semantic Graph
  3.1 Introduction
  3.2 Related Work
  3.3 Natural Language Description
    3.3.1 Preprocess
    3.3.2 Graph Feature Extraction
    3.3.3 Natural Language Description with Graph Features
  3.4 Experiments
  3.5 Summary
4 Natural Question with Semantic Graph
  4.1 Introduction
  4.2 Related Work
  4.3 Natural Question Generation
    4.3.1 Preprocess
    4.3.2 Graph Feature Extraction
    4.3.3 Natural Question with Graph Features
  4.4 Experiments
  4.5 Summary
5 PDDL Planning with Natural Language
  5.1 Introduction
  5.2 Related Work
  5.3 PDDL Planning with Incomplete World Knowledge
    5.3.1 Natural Language Process for PDDL Planning
    5.3.2 PDDL Planning System
  5.4 Experiments
  5.5 Summary
6 PDDL Planning with Natural Language-Based Scene Understanding
  6.1 Introduction
  6.2 Related Work
  6.3 A Framework for Heterogeneous Multi-Agent Cooperation
    6.3.1 Natural Language-Based Cognition
    6.3.2 Knowledge Engine
    6.3.3 PDDL Planning Agent
  6.4 Experiments
    6.4.1 Experiment Setting
    6.4.2 Scenario
    6.4.3 Results
  6.5 Summary
7 Conclusion

SNU Open Repository and Archive

CurveFusion: Reconstructing Thin Structures from RGBD Sequences

Authors: Ceylan D.; Chen N.; Liu L.; Mitra N.; Theobalt C.; Wang W.
Publication date: 01/01/2021

We introduce CurveFusion, the first approach for high-quality scanning of thin structures at interactive rates using a handheld RGBD camera. Thin filament-like structures are mathematically just 1D curves embedded in R^3, and integration-based reconstruction works best when depth sequences (from the thin structure parts) are fused using the object's (unknown) curve skeleton. Thus, using the complementary but noisy color and depth channels, CurveFusion first automatically identifies point samples on potential thin structures and groups them into bundles, each being a group of a fixed number of aligned consecutive frames. Then, the algorithm extracts per-bundle skeleton curves using L1 axes, and aligns and iteratively merges the L1 segments from all the bundles to form the final complete curve skeleton. Thus, unlike previous methods, reconstruction happens via integration along a data-dependent fusion primitive, i.e., the extracted curve skeleton. We extensively evaluate CurveFusion on a range of challenging examples and different scanner and calibration settings, and present high-fidelity thin structure reconstructions that were previously not possible from raw RGBD sequences.

MPG.PuRe

Computational time analysis in extended Kalman filter-based simultaneous localization and mapping

Author: Maziatun Mohamad Mazlan
Publication date: 01/01/2023

Simultaneous localization and mapping (SLAM) for a mobile robot is one application of estimation techniques. SLAM is a navigation technique that allows a mobile robot to navigate autonomously while observing its surroundings in an unfamiliar environment. SLAM does not require an a priori map; instead, the mobile robot builds a map of the area incrementally with the help of onboard sensors and uses this map to localize itself. The extended Kalman filter (EKF) has become one of the most preferred estimators in mobile robot SLAM because its algorithm is relatively simple and it estimates efficiently by representing the belief as a unimodal multivariate Gaussian distribution, with a single mean and a corresponding covariance that describes the uncertainty. However, because of the covariance matrix update, EKF-based SLAM has a high computational time. As the mobile robot makes more observations, the state covariance grows in size, which eventually requires more memory and processing time. There is therefore a need to improve estimation performance by reducing the computational time of SLAM. The research methodology involves three phases: first, the theoretical formulation of the mobile robot model; second, the environment and the estimation method used to solve SLAM for the mobile robot; and third, simulation analysis to verify the findings. This research introduces a new approach that simplifies the structure of the covariance matrix using eigenvalue-based matrix diagonalization. Simulation results show that the time taken to complete the SLAM process with the diagonalized covariance is shorter than with the normal covariance. However, the method has one limitation: the covariance values become too small, which indicates an optimistic estimate.
For this reason, the second objective is to mitigate this optimistic estimation. The addition of a new element to the diagonal matrix, known as a pseudo element, is also investigated in this study. These problems are discussed and explored mathematically from an estimation-theoretic point of view. Adding the pseudo noise element to the diagonalized covariance improves the optimistic condition of the covariance matrix, as shown by the increased size of the covariance ellipses at the end of the simulation. Based on the findings, it can be concluded that adding the pseudo matrix to the updated state covariance can further improve the computational time for mobile robot estimation.
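
The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea for a single 2x2 covariance block: replace the block by the diagonal matrix of its eigenvalues, then optionally add a small pseudo noise term to each diagonal entry so the covariance ellipse does not shrink too far. The function names, the closed-form 2x2 eigenvalue computation, and the choice to add the same `pseudo_noise` value to every diagonal entry are assumptions made for illustration, not the thesis's method.

```python
import math


def eigenvalues_2x2_symmetric(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean + radius, mean - radius


def diagonalize_covariance(cov, pseudo_noise=0.0):
    """Return diag(eigenvalues) + pseudo_noise * I for a 2x2 covariance.

    Assumption for this sketch: the pseudo noise is a scalar added equally
    to both diagonal entries.
    """
    (a, b), (b2, c) = cov
    if not math.isclose(b, b2):
        raise ValueError("covariance must be symmetric")
    lam1, lam2 = eigenvalues_2x2_symmetric(a, b, c)
    return [[lam1 + pseudo_noise, 0.0], [0.0, lam2 + pseudo_noise]]


def ellipse_semi_axes(diag_cov):
    """One-sigma semi-axis lengths of the ellipse for a diagonal covariance."""
    return math.sqrt(diag_cov[0][0]), math.sqrt(diag_cov[1][1])


cov = [[2.0, 1.0], [1.0, 2.0]]
plain = diagonalize_covariance(cov)
inflated = diagonalize_covariance(cov, pseudo_noise=0.5)
print(plain, ellipse_semi_axes(plain))
print(inflated, ellipse_semi_axes(inflated))
```

A diagonal covariance removes the off-diagonal terms from later matrix products, which is the kind of saving the abstract attributes to diagonalization. The larger semi-axes of `inflated` correspond to the increased ellipse size the abstract reports after adding the pseudo noise element.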

UMP Institutional Repository