
    An intelligent modular real-time vision-based system for environment perception

    Full text link
    A significant portion of driving hazards is caused by human error and disregard for local driving regulations; consequently, an intelligent assistance system can be beneficial. This paper proposes a novel vision-based modular package to ensure drivers' safety by perceiving the environment. Each module is designed with accuracy and inference time in mind to deliver real-time performance. As a result, the proposed system can be implemented on a wide range of vehicles with minimal hardware requirements. Our modular package comprises four main sections: lane detection, object detection, segmentation, and monocular depth estimation. Each section is accompanied by novel techniques that improve the accuracy of the other sections as well as the system as a whole. Furthermore, a GUI is developed to display the perceived information to the driver. In addition to using public datasets such as BDD100K, we have also collected and annotated a local dataset that we use to fine-tune and evaluate our system. We show that the accuracy of our system is above 80% in all sections. Our code and data are available at https://github.com/Pandas-Team/Autonomous-Vehicle-Environment-Perception
    Comment: Accepted in NeurIPS 2022 Workshop on Machine Learning for Autonomous Driving
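    The abstract describes a modular architecture in which lane detection, object detection, segmentation, and monocular depth estimation run on each frame and feed a driver-facing GUI. Below is a minimal sketch of such a modular pipeline; the class and module names are illustrative assumptions, not the authors' implementation, which is available in their linked repository.

```python
# Minimal sketch of a modular perception pipeline (illustrative only).
from dataclasses import dataclass, field
from typing import Any, Dict
import numpy as np


@dataclass
class PerceptionPipeline:
    """Runs independently registered perception modules on each camera frame."""
    modules: Dict[str, Any] = field(default_factory=dict)

    def register(self, name: str, module: Any) -> None:
        # Each module is assumed to expose a `run(frame) -> result` method.
        self.modules[name] = module

    def perceive(self, frame: np.ndarray) -> Dict[str, Any]:
        # Collect per-module outputs (lanes, detections, masks, depth map)
        # into one dictionary that a GUI layer could render for the driver.
        return {name: module.run(frame) for name, module in self.modules.items()}


# Hypothetical usage with stand-in module classes:
# pipeline = PerceptionPipeline()
# pipeline.register("lane_detection", LaneDetector())
# pipeline.register("object_detection", ObjectDetector())
# pipeline.register("segmentation", Segmenter())
# pipeline.register("depth", MonocularDepthEstimator())
# outputs = pipeline.perceive(frame)  # frame: HxWx3 uint8 image
```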

    RGB-D datasets using microsoft kinect or similar sensors: a survey

    Get PDF
    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle, and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, and they are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications, including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset and compare the popularity and difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in selecting suitable datasets for evaluating their algorithms.
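    As a concrete illustration of what an RGB-D frame provides, the sketch below pairs a color image with a Kinect-style depth map and back-projects valid pixels to 3D points with a standard pinhole model. The intrinsic parameters are placeholder values, not those of any particular sensor or dataset.

```python
# Sketch: turn a registered RGB image and metric depth map into colored 3D points.
import numpy as np

def rgbd_to_points(rgb: np.ndarray, depth_m: np.ndarray,
                   fx: float = 525.0, fy: float = 525.0,
                   cx: float = 319.5, cy: float = 239.5):
    """rgb: HxWx3 uint8, depth_m: HxW depth in metres (0 = missing)."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth_m > 0                   # Kinect-style sensors report 0 where depth is unknown
    z = depth_m[valid]
    x = (u[valid] - cx) * z / fx          # standard pinhole back-projection
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)  # Nx3 metric coordinates
    colors = rgb[valid]                   # Nx3 per-point color
    return points, colors
```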

    Indoor place classification for intelligent mobile systems

    Full text link
    University of Technology Sydney, Faculty of Engineering and Information Technology.
    Place classification is an emerging theme in the study of human-robot interaction that requires a common understanding of human-defined concepts between humans and machines. This requirement poses a significant challenge to current intelligent mobile systems, which typically operate in absolute coordinate systems and are hence unaware of semantic labels. Aimed at filling this gap, the objective of this research is to develop an approach for intelligent mobile systems to understand and label indoor environments in a holistic way based on sensory observations. Focusing on commonly available sensors and machine learning based solutions, which play a significant role in place classification research, this work proposes methods to train a machine to assign unknown instances to concepts understandable to human beings, such as room, office, and corridor, using both independent and structured prediction. The solution that models dependencies between random variables, taking the spatial relationship between observations into consideration, is further extended by integrating the logical coexistence of objects and places to provide the machine with an additional object detection capability. The main techniques are logistic regression, support vector machines, and conditional random fields, in both supervised and semi-supervised learning frameworks. Experiments in a variety of environments show convincing place classification results from machine learning based approaches on data collected with either single or multiple sensory modalities; modelling spatial dependencies and introducing a semi-supervised learning paradigm further improve the accuracy of the predictions and the generalisation ability of the system; and vision-based object detection can be seamlessly integrated into the learning framework to enhance the discrimination ability and flexibility of the system. The contributions of this research lie in the in-depth studies of place classification solutions with independent predictions, the improvements to the generalisation ability of the system through a semi-supervised learning paradigm, the formulation of training a conditional random field with partially labelled data, and the integration of multiple cues from two sensory modalities to improve the system's functionality. It is anticipated that the findings of this research will significantly enhance the current capabilities of human-robot interaction and robot-environment interaction.
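    The independent-prediction setting mentioned above amounts to training a classifier that maps per-observation feature vectors to place labels such as "room", "office", or "corridor". The sketch below shows that setting with logistic regression on synthetic stand-in features; it is not the thesis pipeline, which additionally covers SVMs, conditional random fields for structured prediction, and semi-supervised training.

```python
# Sketch: independent place classification with logistic regression (synthetic data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
labels = ["room", "office", "corridor"]

# Stand-in features, e.g. statistics of a laser scan or an image descriptor.
X = rng.normal(size=(300, 8)) + np.repeat(np.arange(3), 100)[:, None]
y = np.repeat(labels, 100)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```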

    RGBD Datasets: Past, Present and Future

    Full text link
    Since the launch of the Microsoft Kinect, scores of RGBD datasets have been released. These have propelled advances in areas from reconstruction to gesture recognition. In this paper we explore the field, reviewing datasets across eight categories: semantics, object pose estimation, camera tracking, scene reconstruction, object tracking, human actions, faces, and identification. By extracting relevant information in each category we help researchers to find appropriate data for their needs, and we consider which datasets have succeeded in driving computer vision forward and why. Finally, we examine the future of RGBD datasets. We identify key areas which are currently underexplored, and suggest that future directions may include synthetic data and dense reconstructions of static and dynamic scenes.
    Comment: 8 pages excluding references (CVPR style)