
    Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network

    In recent years, various methods for detecting shadows from a single image have been proposed and used in vision systems; however, most of them are unsuitable for robotic applications due to their high time complexity. This paper introduces a fast shadow detection method using a deep learning framework, with a time cost appropriate for robotic applications. In our solution, we first obtain a shadow prior map using a multi-class support vector machine on statistical features. Then, we use a semantic-aware patch-level Convolutional Neural Network that trains efficiently on shadow examples by combining the original image and the shadow prior map. Experiments on benchmark datasets demonstrate that the proposed method decreases the time complexity of shadow detection by one or two orders of magnitude compared with state-of-the-art methods, without losing accuracy.
    Comment: 6 pages, 5 figures, Submitted to IROS 201
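
    The two-stage pipeline above lends itself to a short illustration. The sketch below is a hypothetical reconstruction, not the authors' code: a binary SVM (the paper uses a multi-class SVM and richer statistical features) produces a per-pixel shadow prior, which is stacked with the RGB channels as a fourth input channel to a small patch-level CNN. The feature choices, network shape, and patch size are all assumptions.

```python
# Hypothetical sketch of the two-stage pipeline: SVM shadow prior + patch CNN.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

def shadow_prior_map(image: np.ndarray, svm: SVC) -> np.ndarray:
    """Per-pixel shadow probability from simple statistical features.

    `svm` must have been trained with probability=True on the same two
    features used here (pixel intensity and 4-neighbour local mean); the
    paper's actual statistical features may differ.
    """
    h, w, _ = image.shape
    gray = image.mean(axis=2)
    padded = np.pad(gray, 1, mode="edge")
    local_mean = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                  padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    feats = np.stack([gray.ravel(), local_mean.ravel()], axis=1)
    probs = svm.predict_proba(feats)[:, 1]        # P(shadow) per pixel
    return probs.reshape(h, w).astype(np.float32)

class PatchShadowCNN(nn.Module):
    """Patch-level classifier over 4 channels: RGB + shadow prior map."""
    def __init__(self, patch: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * (patch // 4) ** 2, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 4, patch, patch) -> shadow / non-shadow logits
        return self.classifier(self.features(x).flatten(1))
```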

    Exploring the usage of edge gradients within images to perform coarse localisation

    With the rise of autonomous systems being integrated into the world around us, it has become increasingly important that these systems have functions that allow them to navigate environments. One of the key functions is recognition of the environment in which the system resides. This thesis seeks to contribute to methods that a given system can use to recognise an environment. To do this, an omni-directional camera is used to produce images of locations containing sharp edges that lie at certain angles. By counting the pixels on these sharp edges and placing them into histograms based on the corresponding angles, a data structure can be formed that describes the location depicted in the image. This data is taken from multiple images over two locations and then compared; these comparisons show that a system can differentiate between images of locations using this data structure, with a significant difference between the two locations. Knowing this, it was then analysed how the differentiating ability of this kind of system developed as the number of locations increased. This was done by increasing the number of locations and having the system decide whether two images belong to the same location; its performance was then compared to that of a human participant given the exact same image set. The experiment needs to be repeated on a larger data set for statistical significance; however, these initial results show a steady decline in the system's ability to differentiate between images as locations are added. The system also had a very high false-positive rate, which should be studied in more detail.
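
    As a rough illustration of the descriptor described above, the following sketch bins strong-edge pixels of a grayscale image into an orientation histogram and compares two such histograms with an L1 distance. The gradient operator, magnitude threshold, and bin count are illustrative assumptions, not the thesis's actual choices.

```python
# Edge-orientation histogram descriptor for coarse localisation (sketch).
import numpy as np

def edge_angle_histogram(img: np.ndarray, n_bins: int = 36,
                         mag_thresh: float = 50.0) -> np.ndarray:
    """Histogram of gradient orientations over strong-edge pixels only."""
    gy, gx = np.gradient(img.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # orientation, not direction
    strong = mag > mag_thresh                      # keep only sharp edges
    hist, _ = np.histogram(ang[strong], bins=n_bins, range=(0.0, 180.0))
    return hist / max(hist.sum(), 1)               # normalise for comparison

def histogram_distance(h1: np.ndarray, h2: np.ndarray) -> float:
    """L1 distance between normalised histograms; small = likely same place."""
    return float(np.abs(h1 - h2).sum())
```

    In use, a query image's histogram would be compared against stored histograms for each known location, with the smallest distance taken as the match.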

    Human robot interaction in a crowded environment

    Human Robot Interaction (HRI) is the primary means of establishing natural and affective communication between humans and robots. HRI enables robots to act in a way similar to humans in order to assist in activities that are considered laborious, unsafe, or repetitive. Vision-based human robot interaction is a major component of HRI, in which visual information is used to interpret how human interaction takes place. Common tasks of HRI include finding pre-trained static or dynamic gestures in an image, which involves localising different key parts of the human body, such as the face and hands. This information is subsequently used to extract different gestures. After the initial detection process, the robot is required to comprehend the underlying meaning of these gestures [3]. Thus far, most gesture recognition systems can only detect gestures and identify a person in relatively static environments. This is not realistic for practical applications, as difficulties may arise from people's movements and changing illumination conditions. Another issue to consider is that of identifying the commanding person in a crowded scene, which is important for interpreting navigation commands. To this end, it is necessary to associate the gesture with the correct person, and automatic reasoning is required to extract the most probable location of the person who initiated the gesture. In this thesis, we propose a practical framework for addressing the above issues. It attempts to achieve a coarse-level understanding of a given environment before engaging in active communication. This includes recognising human robot interaction, where a person has the intention to communicate with the robot. In this regard, it is necessary to determine whether the people present are engaged with each other or with their surrounding environment. The basic task is to detect and reason about the environmental context and different interactions so as to respond accordingly. For example, if individuals are engaged in conversation, the robot should realise it is best not to disturb them; if an individual is receptive to the robot's interaction, it may approach the person; and if the user is moving in the environment, the robot can analyse further to understand whether any help can be offered in assisting this user. The method proposed in this thesis combines multiple visual cues in a Bayesian framework to identify people in a scene and determine their potential intentions. To improve system performance, contextual feedback is used, which allows the Bayesian network to evolve and adjust itself according to the surrounding environment. The results achieved demonstrate the effectiveness of the technique in dealing with human-robot interaction in a relatively crowded environment [7].
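
    A toy sketch of Bayesian cue fusion in the spirit of the framework described above (not the thesis's actual network): independent visual cues vote on a person's engagement state via naive-Bayes fusion, and a contextual-feedback step drifts the prior towards recent posteriors. The state set, cue likelihoods, and update rate are invented placeholders.

```python
# Illustrative Bayesian fusion of visual cues for engagement estimation.
import numpy as np

STATES = ["engaged_with_robot", "engaged_with_others", "moving_through"]

# P(cue observed | state) per state; all values are made-up placeholders.
LIKELIHOOD = {
    "face_towards_robot": np.array([0.85, 0.20, 0.30]),
    "waving_gesture":     np.array([0.60, 0.10, 0.05]),
    "walking":            np.array([0.15, 0.25, 0.90]),
}

def posterior(observed_cues: list[str], prior: np.ndarray) -> np.ndarray:
    """Naive-Bayes fusion of independent cues over the state set."""
    p = prior.copy()
    for cue in observed_cues:
        p *= LIKELIHOOD[cue]
    return p / p.sum()

def contextual_update(prior: np.ndarray, post: np.ndarray,
                      rate: float = 0.1) -> np.ndarray:
    """Drift the prior towards recent posteriors (contextual feedback)."""
    return (1.0 - rate) * prior + rate * post

prior = np.full(len(STATES), 1.0 / len(STATES))     # uniform initial belief
post = posterior(["face_towards_robot", "waving_gesture"], prior)
print(dict(zip(STATES, post.round(3))))             # most mass on engagement
prior = contextual_update(prior, post)              # feed context back in
```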