
    Vision-based localization methods under GPS-denied conditions

    This paper reviews vision-based localization methods in GPS-denied environments and classifies the mainstream methods into Relative Vision Localization (RVL) and Absolute Vision Localization (AVL). For RVL, we discuss the broad application of optical flow in feature extraction-based Visual Odometry (VO) solutions and introduce advanced optical flow estimation methods. For AVL, we review recent advances in Visual Simultaneous Localization and Mapping (VSLAM) techniques, from optimization-based methods to Extended Kalman Filter (EKF) based methods. We also introduce the application of offline map registration and lane vision detection schemes to achieve Absolute Vision Localization. This paper compares the performance and applications of mainstream methods for visual localization and provides suggestions for future studies. Comment: 32 pages, 15 figures.

    Designing a fruit identification algorithm in orchard conditions to develop robots using video processing and majority voting based on hybrid artificial neural network

    The first step in developing orchard robots for purposes such as fruit harvesting and site-specific spraying is to identify fruits on trees. Owing to the natural conditions of fruit orchards and the unevenness of the various objects throughout them, using controlled conditions is very difficult. As a result, these operations should be performed under natural conditions, in terms of both lighting and background. Because the other operations of an orchard robot depend on the fruit identification stage, this step must be performed precisely. The purpose of this paper was therefore to design an identification algorithm for orchard conditions using a combination of video processing and majority voting based on different hybrid artificial neural networks. The steps of designing this algorithm were: (1) recording video of different plum orchards at different light intensities; (2) converting the videos into their individual frames; (3) extracting different color properties from the pixels; (4) selecting effective features from the extracted color properties using a hybrid artificial neural network-harmony search (ANN-HS); and (5) classification using majority voting based on three classifiers: artificial neural network-bees algorithm (ANN-BA), artificial neural network-biogeography-based optimization (ANN-BBO), and artificial neural network-firefly algorithm (ANN-FA). The most effective features selected by the hybrid ANN-HS were the third channel of the hue saturation lightness (HSL) color space, the second channel of the lightness chroma hue (LCH) color space, the first channel of the L*a*b* color space, and the first channel of the hue saturation intensity (HSI) color space. The results showed that the accuracy of the majority voting method in the best execution and over 500 executions was 98.01% and 97.20%, respectively.
Based on the different performance evaluation criteria of the classifiers, it was found that the majority voting method had higher performance. This work was supported by the European Union (EU) under the Erasmus+ project "Fostering Internationalization in Agricultural Engineering in Iran and Russia" (FARmER), grant number 585596-EPP-1-2017-1-DE-EPPKA2-CBHE-JP.
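The final classification step described above, majority voting over the outputs of the three hybrid classifiers, can be sketched in a few lines; the class labels and votes below are hypothetical illustrations, not the authors' data:

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    `predictions` is a list of per-sample labels, one from each
    classifier (e.g. ANN-BA, ANN-BBO and ANN-FA). Ties are broken
    by the order the labels first appear.
    """
    counts = Counter(predictions)
    label, _ = counts.most_common(1)[0]
    return label

# Hypothetical votes from the three classifiers for one pixel:
votes = ["fruit", "fruit", "background"]
print(majority_vote(votes))  # -> fruit
```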

    Regression Based Gaze Estimation with Natural Head Movement

    This thesis presents a non-contact, video-based gaze tracking system using novel eye detection and gaze estimation techniques. The objective of the work is to develop a real-time gaze tracking system that is capable of estimating the gaze accurately under natural head movement. The system contains both hardware and software components. The hardware is responsible for illuminating the scene and capturing facial images for further computer analysis, while the software implements the core gaze tracking technique, which consists of two main modules: an eye detection subsystem and a gaze estimation subsystem. The proposed gaze tracking technique uses image-plane features, namely the inter-pupil vector (IPV) and the image center-inter pupil center vector (IC-IPCV), to improve gaze estimation precision under natural head movement. A support vector regression (SVR) based estimation method using the image-plane features along with the traditional pupil center-cornea reflection (PC-CR) vector is also proposed to estimate the gaze. The designed gaze tracking system works in real time and achieves an overall estimation accuracy of 0.84° with a still head and 2.26° under natural head movement. By using the SVR method for off-line processing, the estimation accuracy with head movement can be improved to 1.12° while tolerating 10 cm × 8 cm × 5 cm of head movement.
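As a rough sketch of how the image-plane features named in the abstract might be assembled from detected pupil and glint centers, consider the following; the coordinates and helper names are hypothetical, not taken from the thesis:

```python
def vector(p, q):
    """2-D vector from point p to point q in image coordinates."""
    return (q[0] - p[0], q[1] - p[1])

def gaze_features(left_pupil, right_pupil, glint, image_center):
    """Assemble the three feature vectors used for regression:
    PC-CR  : pupil center -> corneal reflection (glint),
    IPV    : inter-pupil vector,
    IC-IPCV: image center -> inter-pupil center vector."""
    ipv = vector(left_pupil, right_pupil)
    inter_pupil_center = ((left_pupil[0] + right_pupil[0]) / 2,
                          (left_pupil[1] + right_pupil[1]) / 2)
    ic_ipcv = vector(image_center, inter_pupil_center)
    pc_cr = vector(left_pupil, glint)
    return {"PC-CR": pc_cr, "IPV": ipv, "IC-IPCV": ic_ipcv}

# Hypothetical pixel coordinates for one frame:
features = gaze_features((300, 240), (360, 242), (305, 238), (320, 240))
```

A regressor (such as SVR) would then map these feature vectors to screen coordinates after a calibration session.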

    Human-Centric Machine Vision

    Recently, algorithms for processing visual information have evolved greatly, providing efficient and effective solutions that cope with the variability and complexity of real-world environments. These achievements have led to Machine Vision systems that go beyond typical industrial applications, where environments are controlled and tasks are very specific, towards innovative solutions that address the everyday needs of people. Human-Centric Machine Vision can help solve problems raised by the needs of our society, e.g. security and safety, health care, medical imaging, and human-machine interfaces. Such applications must handle changing, unpredictable, and complex situations, and account for the presence of humans.

    Towards Autonomous Selective Harvesting: A Review of Robot Perception, Robot Design, Motion Planning and Control

    This paper provides an overview of the current state of the art in selective harvesting robots (SHRs) and their potential for addressing the challenges of global food production. SHRs have the potential to increase productivity, reduce labour costs, and minimise food waste by selectively harvesting only ripe fruits and vegetables. The paper discusses the main components of SHRs, including perception, grasping, cutting, motion planning, and control. It also highlights the challenges in developing SHR technologies, particularly in the areas of robot design, motion planning, and control, and discusses the potential benefits of integrating AI, soft robotics, and data-driven methods to enhance the performance and robustness of SHR systems. Finally, the paper identifies several open research questions and highlights the need for further research and development efforts to advance SHR technologies to meet the challenges of global food production. Overall, this paper provides a starting point for researchers and practitioners interested in developing SHRs. Comment: Preprint, to appear in the Journal of Field Robotics.

    Automatic Screening and Classification of Diabetic Retinopathy Eye Fundus Image

    Diabetic Retinopathy (DR) is a disorder of the retinal vasculature. It develops to some degree in nearly all patients with long-standing diabetes mellitus and can result in blindness. Screening for DR is essential for both early detection and early treatment. This thesis aims to investigate automatic methods for diabetic retinopathy detection and subsequently develop an effective system for the detection and screening of diabetic retinopathy. The presented research involves three development stages. Firstly, the thesis presents the development of a preliminary classification and screening system for diabetic retinopathy using eye fundus images. The research then focuses on the detection of the earliest signs of diabetic retinopathy, the microaneurysms. Detecting microaneurysms at an early stage is vital and is the first step in preventing diabetic retinopathy. Finally, the thesis presents decision support systems for the detection of diabetic retinopathy and maculopathy in eye fundus images. The detection of maculopathy, i.e. yellow lesions near the macula, is essential, as an affected macula that is not treated in time will eventually lead to loss of vision. Accurate retinal screening is therefore required to assist retinal screeners in classifying retinal images effectively, and highly efficient and accurate image processing techniques must be used to produce an effective screening of diabetic retinopathy. In addition to the proposed detection systems, this thesis presents a new dataset and highlights its collection, the expert diagnosis process, and its advantages compared to other publicly available eye fundus image datasets. The new dataset will be useful to researchers and practitioners working in retinal imaging and should encourage comparative studies in the field of diabetic retinopathy research.
It is envisaged that the proposed decision support system for clinical screening would greatly contribute to and assist the management and detection of diabetic retinopathy. It is also hoped that the developed automatic detection techniques will assist clinicians in diagnosing diabetic retinopathy at an early stage.

    Internal visuomotor models for cognitive simulation processes

    Kaiser A. Internal visuomotor models for cognitive simulation processes. Bielefeld: Bielefeld University; 2014.
    Recent theories in cognitive science step back from the strict separation of perception, cognition, and the generation of behavior. Instead, cognition is viewed as a distributed process that relies on sensory, motor, and affective states. In this view, internal simulations, i.e. the mental reenactment of actions and their corresponding perceptual consequences, replace the application of logical rules to a set of abstract representations. These internal simulations are directly related to the physical body of an agent with its particular senses and motor repertoire. Correspondingly, the environment and the objects that reside therein are not viewed as a collection of symbols with abstract properties, but are described in terms of their action possibilities, and thus as reciprocally coupled to the agent. In this thesis we investigate a hypothetical computational model that enables an agent to infer information about specific objects based on internal sensorimotor simulations, and that eventually enables the agent to reveal the behavioral meaning of objects. We claim that such a model is more powerful than classical approaches that classify objects based on visual features alone. However, the internal sensorimotor simulation needs to be driven by a number of modules that model certain aspects of the agent's senses, which is, especially for the visual sense, demanding in many respects. The main part of this thesis deals with the learning and modeling of sensorimotor patterns, an essential prerequisite for internal simulation. We present an efficient adaptive model for the prediction of optical flow patterns that occur during eye movements: this model enables the agent to transform its current view according to a covert motor command, virtually fixating a given point within its visual field.
The model is further simplified based on a geometric analysis of the problem. This geometric model also serves as a solution to the problem of eye control: the resulting controller generates a kinematic motor command that moves the eye to a specific location within the visual field. We investigate a neurally inspired extension of the eye control scheme that yields higher controller accuracy. We also address the problem of generating distal stimuli, i.e. views of the agent's gripper that are not present in its current view; the model we describe associates arm postures with pictorial views of the gripper. Finally, the problem of stereoptic depth perception is addressed. Here, we employ visual prediction in combination with an eye controller to generate virtually fixated views of objects in the left and right camera images. These virtually fixated views can be easily matched in order to establish correspondences, and the motor information of the virtual fixation movement can be used to infer depth information.
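The final idea, inferring depth from the motor information of a fixation movement, can be illustrated with a simple vergence-geometry calculation; the symmetric-fixation assumption and the numbers below are illustrative and not taken from the thesis:

```python
import math

def depth_from_vergence(baseline, half_vergence_rad):
    """Depth of a symmetrically fixated target point.

    With two cameras separated by `baseline` and each rotated inward
    by `half_vergence_rad` so that their optical axes intersect at the
    target, simple triangulation places the target at depth
    z = (baseline / 2) / tan(half_vergence_rad).
    """
    return (baseline / 2.0) / math.tan(half_vergence_rad)

# Illustrative: 10 cm camera baseline, each camera rotated 5 deg inward.
z = depth_from_vergence(0.10, math.radians(5.0))
print(round(z, 3))  # depth in metres
```

The same relation explains why vergence-based depth estimates degrade for distant targets: as the half-vergence angle approaches zero, small angular errors cause large depth errors.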

    A Survey of Facial Capture for Virtual Reality
