14 research outputs found

    Reconstructing partially visible models using stereo vision, structured light, and the g2o framework

    This paper describes a framework for model-based 3D reconstruction of vines and trellising for a robot equipped with stereo cameras and structured light. In each frame, high-level 2D features and a sparse set of 3D structured light points are found. Detected features are matched to 3D model components, and the g2o optimisation framework is used to estimate both the model's structure and the camera's trajectory. The system is demonstrated reconstructing the trellising present in images of vines, together with the camera's trajectory, over a 12 m track consisting of 360 sets of frames. The estimated model is structurally correct and almost complete, and the estimated trajectory drifts by just 4%. Future work will extend the framework to reconstruct the more complex structure of the vines.
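
    To illustrate the joint-estimation idea in miniature (this is not the paper's g2o implementation), the sketch below optimises a toy one-dimensional problem in which camera positions along a row and the position of a single trellis post are estimated together from feature observations and odometry, the same residual structure a g2o pose graph encodes. All names and values are made up.

    ```python
    import numpy as np
    from scipy.optimize import least_squares

    # Toy 1D version of joint structure-and-trajectory estimation: four
    # camera positions xs along the vine row and one trellis-post position p
    # are estimated together, as a g2o graph couples pose and model vertices
    # through error edges. All measurements here are invented.
    meas_offsets = np.array([4.9, 3.1, 1.05, -0.9])  # observed post-minus-camera offsets
    odom_steps = np.array([2.0, 2.0, 2.0])           # noisy inter-frame motion estimates

    def residuals(params):
        xs, p = params[:4], params[4]
        r_prior = xs[:1]                     # anchor the first camera at x = 0
        r_obs = (p - xs) - meas_offsets      # feature-to-model-component edges
        r_odo = np.diff(xs) - odom_steps     # camera-trajectory (odometry) edges
        return np.concatenate([r_prior, r_obs, r_odo])

    sol = least_squares(residuals, np.zeros(5))
    print("cameras:", sol.x[:4].round(2), "post:", round(sol.x[4], 2))
    ```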

    Automated Archaeological Feature Detection Using Deep Learning on Optical UAV Imagery: Preliminary Results

    This communication article is a call for unmanned aerial vehicle (UAV) users in archaeology to make imagery data more publicly available, accompanying a new application that facilitates the use of a common deep learning algorithm, the mask region-based convolutional neural network (Mask R-CNN), for instance segmentation. The intent is to provide specialists with a GUI-based tool that can apply annotations used for training neural network models, enable the training and development of segmentation models, and classify imagery data to facilitate the auto-discovery of features. The tool is generic and can be used in a variety of settings, although it was tested using datasets from the United Arab Emirates (UAE), Oman, Iran, Iraq, and Jordan. Current outputs suggest that the trained models can help identify ruined structures, that is, burials, exposed building ruins, and other surface features in some degraded state. Additionally, qanats (ancient underground channels with surface access holes) and mounded sites, which have distinctive hill-shaped features, are also identified. Other classes are possible, and the tool helps users build their own training-based approach and feature identification classes. To improve accuracy, we strongly urge greater publication of UAV imagery data by projects, using open journal publications and public repositories. This is already done in other fields working with UAV data and is now needed in heritage and archaeology. Our tool is provided as part of the outputs.
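
    For readers unfamiliar with the underlying recipe, the sketch below shows the generic torchvision route to fine-tuning a Mask R-CNN on custom feature classes. It is not the paper's GUI tool; the class list and hyperparameters are assumptions.

    ```python
    import torch
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    NUM_CLASSES = 4  # background + ruined structure + qanat shaft + mound (assumed)

    # Start from a COCO-pretrained Mask R-CNN backbone.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

    # Replace the box and mask heads so the network predicts our classes.
    in_feats = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, NUM_CLASSES)
    in_feats_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feats_mask, 256, NUM_CLASSES)

    optimiser = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
    model.train()
    # for images, targets in uav_loader:        # user-supplied annotated UAV tiles
    #     losses = model(images, targets)       # dict of RPN/ROI/mask losses
    #     loss = sum(losses.values())
    #     optimiser.zero_grad(); loss.backward(); optimiser.step()
    ```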

    Enhanced device-based 3D object manipulation technique for handheld mobile augmented reality

    3D object manipulation is one of the most important tasks for handheld mobile Augmented Reality (AR) to realise its practical potential, especially for real-world assembly support. In this context, the techniques used to manipulate 3D objects are an important research area. This study therefore developed an improved device-based interaction technique within handheld mobile AR interfaces to solve the large-range 3D object rotation problem, as well as issues related to 3D object position and orientation deviations during manipulation.

    The research first enhanced the existing device-based 3D object rotation technique with an innovative control structure that uses the handheld device's tilting and skewing amplitudes to determine the rotation axes and directions of the 3D object. Whenever the device is tilted or skewed beyond the threshold amplitudes, the 3D object rotates continuously at a pre-defined angular speed, preventing over-rotation of the handheld device. Such over-rotation is common when using the existing technique to perform large-range 3D object rotations, and it needs to be avoided because it causes 3D object registration errors and a display issue in which the 3D object no longer appears consistently within the user's view. Secondly, the existing device-based manipulation technique was restructured by separating the degrees of freedom (DOF) of 3D object translation and rotation, preventing the position and orientation deviations caused by integrating both tasks into the same control structure. The result is an improved device-based interaction technique with better task completion times for 3D object rotation specifically and 3D object manipulation overall within handheld mobile AR interfaces (see the sketch after this abstract).

    A pilot test was carried out before the main experiments to determine several pre-defined values used in the control structure of the proposed rotation technique. A series of 3D object rotation and manipulation tasks was then designed as separate experiments to benchmark the proposed techniques against existing ones on task completion time (s). Two groups of participants aged 19-24 years, each consisting of sixteen participants, took part, with each participant completing twelve trials, for a total of 192 trials per experiment. Repeated-measures analysis showed that the developed rotation technique markedly outpaced the existing technique, with significantly shorter task completion times: 2.04 s shorter on easy tasks and 3.09 s shorter on hard tasks, comparing mean times over all successful trials. For the failed trials, the proposed rotation technique was 4.99% more accurate on easy tasks and 1.78% more accurate on hard tasks than the existing technique. Similar results extended to the manipulation tasks, where the proposed technique achieved an overall task completion time 9.529 s shorter than the existing technique. Based on these findings, an improved device-based interaction technique was successfully developed to address the insufficient functionalities of the current technique.
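
    The control structure described above can be sketched in a few lines: device tilt and skew only select the rotation axis and direction, and once a threshold is exceeded the object rotates at a fixed angular speed, so large object rotations never require large device rotations. The threshold and speed values below are assumptions, and this is not the study's implementation.

    ```python
    TILT_THRESHOLD_DEG = 15.0    # hypothetical activation threshold
    ANGULAR_SPEED_DPS = 45.0     # pre-defined angular speed (degrees/second)

    def rotation_step(tilt_deg, skew_deg, rotation, dt):
        """Advance the object's (pitch, roll) by one frame of duration dt.

        Device attitude only *selects* the rotation axis and direction;
        the object then spins at a constant pre-defined speed, so the user
        never has to over-rotate the device for large rotations.
        """
        pitch, roll = rotation
        if abs(tilt_deg) > TILT_THRESHOLD_DEG:
            pitch += (1 if tilt_deg > 0 else -1) * ANGULAR_SPEED_DPS * dt
        if abs(skew_deg) > TILT_THRESHOLD_DEG:
            roll += (1 if skew_deg > 0 else -1) * ANGULAR_SPEED_DPS * dt
        return (pitch % 360.0, roll % 360.0)

    # e.g. one 60 fps update while the device is tilted past the threshold:
    # rotation = rotation_step(tilt_deg=22.0, skew_deg=3.0, rotation=(0, 0), dt=1/60)
    ```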

    Improving land cover classification using genetic programming for feature construction

    Batista, J. E., Cabral, A. I. R., Vasconcelos, M. J. P., Vanneschi, L., & Silva, S. (2021). Improving land cover classification using genetic programming for feature construction. Remote Sensing, 13(9), 1623. https://doi.org/10.3390/rs13091623

    Genetic programming (GP) is a powerful machine learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in the field of remote sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs feature construction by evolving hyperfeatures from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyperfeatures from satellite bands to improve the classification of land cover types. We add the evolved hyperfeatures to the reference datasets and observe a significant improvement in the performance of three state-of-the-art ML algorithms (decision trees, random forests, and XGBoost) on multiclass classifications, and no significant effect on the binary classifications. We show that adding the M3GP hyperfeatures to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI, and NBR. We also compare the performance of the M3GP hyperfeatures on the binary classification problems with those created by other feature construction methods, such as FFX and EFS.
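
    To make the idea concrete, here is a minimal sketch (not from the paper) of what adding hyperfeatures to a reference dataset looks like in practice: stand-in constructed features are appended to the original band values before training a standard classifier. The band layout and the example expressions are assumptions; M3GP evolves its own trees.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def add_hyperfeatures(X):
        """X columns are assumed to be [blue, green, red, nir] reflectances."""
        blue, green, red, nir = X.T
        # M3GP would evolve such expressions automatically; these are made up.
        h1 = (nir - red) / (nir + red + 1e-9)            # NDVI-like ratio
        h2 = np.log1p(np.abs(green * nir - red * blue))  # arbitrary nonlinear combo
        return np.column_stack([X, h1, h2])

    # X_train, y_train: original band values and land-cover labels (dummy data here)
    X_train = np.random.rand(100, 4)
    y_train = np.random.randint(0, 3, 100)
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(add_hyperfeatures(X_train), y_train)   # classifier sees bands + hyperfeatures
    ```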

    Localisation and tracking of stationary users for extended reality

    In this thesis, we investigate the topics of localisation and tracking in the context of Extended Reality. In many on-site or outdoor Augmented Reality (AR) applications, users stand or sit in one place and perform mostly rotational movements, i.e. they are stationary. The same type of stationary motion occurs in Virtual Reality (VR) applications such as panorama capture, where a camera is moved in a circle. Both applications require tracking the motion of a camera in potentially very large and open environments. State-of-the-art methods such as Structure-from-Motion (SfM) and Simultaneous Localisation and Mapping (SLAM) tend to rely on scene reconstruction from significant translational motion to compute camera positions. This often leads to failure in scenarios such as tracking for seated sport spectators, or stereo panorama capture, where the translational movement is small compared to the scale of the environment.

    We first investigate localisation, as it is key to providing global context for many stationary applications. To this end, we capture our own datasets in a variety of large open spaces, including two sports stadia, and evaluate a variety of state-of-the-art localisation approaches in these environments. We cover geometry-based methods to handle the dynamic aspects of a stadium environment, as well as appearance-based methods, and compare them to a state-of-the-art SfM system to identify the most applicable methods for server-based and on-device localisation.

    Recent work in SfM has shown that the type of stationary motion we target can be reliably estimated by applying spherical constraints to the pose estimation. In this thesis, we extend these concepts into a real-time keyframe-based SLAM system for AR and develop a unique data structure for simplifying keyframe selection. Through both synthetic and real-data tests, we show that our constrained approach tracks more robustly in these challenging stationary scenarios than state-of-the-art SLAM.

    For capturing stereo panoramas for VR, this thesis demonstrates the unsuitability of standard SfM techniques for reconstructing these circular videos. We apply and extend recent research in spherically constrained SfM to creating stereo panoramas and compare it with state-of-the-art general SfM in a technical evaluation. With a user study, we show that the motion requirements of our SfM approach are similar to the natural motion of users, and that a constrained SfM approach is sufficient to provide stereoscopic effects when viewing the panoramas in VR.
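
    The spherical constraint at the heart of this line of work can be sketched as follows: instead of estimating a free 6-DOF pose, the camera centre is restricted to a sphere around a fixed point, so each pose needs only two angles plus the fixed radius. This toy parameterisation is for illustration only and is not the thesis's estimator.

    ```python
    import numpy as np

    def spherical_pose(theta, phi, r=1.0):
        """Camera centre on a sphere of radius r, looking outward from the centre.

        Two angles (theta, phi) replace a free 3D translation, which is what
        makes stationary/rotational motion well constrained.
        """
        centre = r * np.array([np.cos(phi) * np.sin(theta),
                               np.sin(phi),
                               np.cos(phi) * np.cos(theta)])
        forward = centre / np.linalg.norm(centre)   # view direction (outward)
        up = np.array([0.0, 1.0, 0.0])
        right = np.cross(up, forward)
        right /= np.linalg.norm(right)
        true_up = np.cross(forward, right)
        R = np.stack([right, true_up, forward])     # world-to-camera rotation
        t = -R @ centre                             # so x_cam = R @ x_world + t
        return R, t
    ```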

    A Flexible, Low-Power, Programmable Unsupervised Neural Network Based on Microcontrollers for Medical Applications

    We present an implementation and laboratory tests of a winner-takes-all (WTA) artificial neural network (NN) on two microcontrollers (μC) with the ARM Cortex M3 and the AVR cores. The prospective application of this device is in a wireless body sensor network (WBSN) for on-line analysis of electrocardiogram (ECG) and electromyogram (EMG) biomedical signals. The proposed device will be used as a base station in the WBSN, acquiring and analysing the signals from the sensors placed on the human body. The proposed system is equipped with an analog-to-digital converter (ADC) and allows for multi-channel acquisition of analog signals, preprocessing (filtering), and further analysis.
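
    As a rough sketch of the underlying algorithm (written in NumPy for readability; a μC port would use fixed-point C), winner-takes-all competitive learning keeps one prototype vector per neuron, and only the neuron closest to each input updates its weights. All names and parameter values below are assumptions.

    ```python
    import numpy as np

    def train_wta(signals, n_neurons=8, lr=0.05, epochs=20, seed=0):
        """signals: (n_samples, n_features) feature windows, e.g. from ECG."""
        rng = np.random.default_rng(seed)
        # one prototype weight vector per neuron
        W = rng.normal(size=(n_neurons, signals.shape[1]))
        for _ in range(epochs):
            for x in signals:
                # competition: the neuron whose prototype is closest wins...
                winner = np.argmin(np.linalg.norm(W - x, axis=1))
                # ...and only the winner's weights move toward the input
                W[winner] += lr * (x - W[winner])
        return W

    def classify(W, x):
        """Assign a new sample to the cluster of the winning neuron."""
        return int(np.argmin(np.linalg.norm(W - x, axis=1)))
    ```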

    Motion measurement algorithms for MARS imaging

    The goal of the MARS molecular imaging team is to advance medicine by researching, developing, and commercialising spectral CT systems. This thesis presents the work I performed to facilitate live imaging with MARS scanners. This aim was achieved by developing a gating algorithm, designing and developing a mouse holder, and creating a motorised motion phantom. My gating algorithms will contribute to improving the image quality of human data obtained by human-scale MARS scanners. I contributed to the design and development of a mouse holder with a temperature-regulating system that is compatible with MARS scanners for the purpose of live animal imaging. This holder design provides simple animal handling, secure positioning, anaesthesia delivery, regulated temperature control, and physiological monitoring.

    I developed a post-acquisition automatic gating method based on the acquired scan data over time. This method can identify the various movement phases and sort the acquired exposure images into temporal bins. To reduce the undersampling noise introduced by gating, a weight-based reconstruction algorithm was introduced and implemented. Instead of binning the data, this method employs all images for the reconstruction of a specific time point by assigning a weight to each. Applying this method showed that it can reduce the undersampling artefacts compared to the temporal binning method.

    To evaluate the gating method, a motorised motion phantom was designed and manufactured. The phantom can be programmed to produce periodic signals with a frequency and amplitude similar to those of mouse or human breathing. Quantitative measurements showed that gating can reduce motion artefacts and blurring by 50% for a 1 mm amplitude and 26% for a 5 mm amplitude. The effect of motion on the material decomposition process in MARS imaging systems was also investigated: known contrast agents were added to the motion phantom, which was then scanned with movements of 1 to 5 mm amplitude. No clear trend between motion amplitude and material decomposition accuracy was observed. The gated images had lower SNR than the non-gated data, resulting in more misidentified voxels, which suggests that noise properties matter more than motion blur. In summary, the research documented in this thesis facilitates future live imaging with MARS scanners.
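
    The weight-based alternative to hard binning can be sketched as follows: rather than assigning each exposure to one temporal bin, every exposure contributes to every reconstructed phase, weighted by how close its respiratory phase is to the target phase. This is a simplified image-domain stand-in, not the MARS reconstruction chain; the Gaussian weighting and sigma value are assumptions.

    ```python
    import numpy as np

    def phase_weights(frame_phases, target_phase, sigma=0.05):
        """frame_phases and target_phase in [0, 1); distance wraps around the cycle."""
        d = np.abs(frame_phases - target_phase)
        d = np.minimum(d, 1.0 - d)                # circular phase distance
        w = np.exp(-0.5 * (d / sigma) ** 2)       # soft weighting instead of binning
        return w / w.sum()

    def weighted_phase_image(frames, frame_phases, target_phase):
        """frames: (n_frames, H, W). All frames contribute, so no bin is undersampled."""
        w = phase_weights(np.asarray(frame_phases), target_phase)
        return np.tensordot(w, frames, axes=1)
    ```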

    Image quality characterisation in computed tomography to assess the relevant diagnostic information


    Removing spatial boundaries in immersive mobile communications

    Despite a worldwide trend towards mobile computing, current telepresence experiences focus on stationary desktop computers, limiting how, when, and where researched solutions can be used. In this thesis I demonstrate that mobile phones are a capable platform for future research, showing the effectiveness of the communications made possible by their inherent portability and ubiquity. I first describe a framework upon which future systems can be built, which allows two distant users to explore one of several panoramic representations of the local environment by reorienting their devices. User experiments demonstrate this framework's ability to induce a sense of presence within the space and between users, and show that capturing this environment live provides no significant benefit over constructing it incrementally. This discovery enables a second application that allows users to explore a three-dimensional representation of their environment. Each user's position is shown as an avatar, with live facial capture to facilitate natural communication; either user may also see the full environment by occupying the same virtual space. This application is also evaluated and shown to provide efficient communications to its users, providing a novel untethered experience not possible on stationary hardware despite the limited computational power of mobile devices.
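
    A small sketch of the core interaction, assuming an equirectangular panorama and a made-up field of view (neither is specified by the thesis): the remote user's device orientation selects a viewport into the shared panorama, so looking around the space is simply re-cropping the image.

    ```python
    import numpy as np

    def viewport_from_orientation(pano, yaw_deg, pitch_deg, fov_deg=60):
        """pano: (H, W, 3) equirectangular image; returns the cropped view."""
        H, W, _ = pano.shape
        # horizontal: yaw maps linearly onto panorama longitude
        cx = int(((yaw_deg % 360) / 360.0) * W)
        # vertical: pitch maps onto latitude, clamped to the image
        cy = int(np.clip((90 - pitch_deg) / 180.0, 0, 1) * H)
        half_w = int(W * fov_deg / 360.0 / 2)
        half_h = int(H * fov_deg / 180.0 / 2)
        cols = np.arange(cx - half_w, cx + half_w) % W              # wrap horizontally
        rows = np.clip(np.arange(cy - half_h, cy + half_h), 0, H - 1)
        return pano[np.ix_(rows, cols)]
    ```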