3,974 research outputs found

    Impact of Imaging and Distance Perception in VR Immersive Visual Experience

    Virtual reality (VR) headsets have evolved to offer unprecedented viewing quality. Meanwhile, they have become lightweight, wireless, and low-cost, which has opened them up to new applications and a much wider audience. VR headsets can now provide users with greater understanding of events and accuracy of observation, making decision-making faster and more effective. However, the take-up of immersive technologies has been slow, with the adoption of virtual reality limited to a few applications, typically related to entertainment. This reluctance appears to be due to the often-necessary change of operating paradigm and some scepticism towards the "VR advantage". The need therefore arises to evaluate the contribution that a VR system can make to user performance, for example in monitoring and decision-making. This will help system designers understand when immersive technologies can be proposed to replace or complement standard display systems such as a desktop monitor. In parallel with the evolution of VR headsets, 360° cameras have also advanced and can now instantly acquire photographs and videos in stereoscopic 3D (S3D) modality at very high resolutions. 360° images are innately suited to VR headsets, where the captured view can be observed and explored through the natural rotation of the head. Acquired views can even be experienced and navigated from the inside as they are captured. The combination of omnidirectional images and VR headsets has opened up a new way of creating immersive visual representations, which we call photo-based VR. This represents a new methodology that combines traditional model-based rendering with high-quality omnidirectional texture-mapping. Photo-based VR is particularly suitable for applications related to remote visits and realistic scene reconstruction, useful for monitoring and surveillance systems, control panels and operator training.
The presented PhD study investigates the potential of photo-based VR representations. It starts by evaluating the role of immersion and user performance in today's graphical visual experience, and then uses this as a reference to develop and evaluate new photo-based VR solutions. With the current literature on photo-based VR experience and associated user performance being very limited, this study builds new knowledge from the proposed assessments. We conduct five user studies on a few representative applications, examining how visual representations can be affected by system factors (camera and display related) and how they can influence human factors (such as realism, presence, and emotions). Particular attention is paid to realistic depth perception, in support of which we develop targeted solutions for photo-based VR. They are intended to provide users with a correct perception of spatial dimensions and object sizes. We call this true-dimensional visualization. The presented work contributes to unexplored fields including photo-based VR and true-dimensional visualization, offering immersive system designers a thorough comprehension of the benefits, potential, and type of applications in which these new methods can make a difference. This thesis manuscript and its findings have been partly presented in scientific publications: five conference papers at Springer and IEEE symposia [1], [2], [3], [4], [5], and one journal article in an IEEE periodical [6].

    Self-supervised learning for transferable representations

    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that solely leverage raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Our focus thenceforth is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models on many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation.
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.

    UMSL Bulletin 2023-2024

    The 2023-2024 Bulletin and Course Catalog for the University of Missouri St. Louis. https://irl.umsl.edu/bulletin/1088/thumbnail.jp

    Dermoscopic dark corner artifacts removal: Friend or foe?

    Background and Objectives: One of the more significant obstacles in the classification of skin cancer is the presence of artifacts. This paper investigates the effect of dark corner artifacts, which result from the use of dermoscopes, on the performance of a deep learning binary classification task. Previous research attempted to remove and inpaint dark corner artifacts, with the intention of creating an ideal condition for models. However, such research has been shown to be inconclusive due to a lack of available datasets with corresponding labels for dark corner artifact cases. Methods: To address these issues, we label 10,250 skin lesion images from publicly available datasets and introduce a balanced dataset with an equal number of melanoma and non-melanoma cases. The training set comprises 6126 images without artifacts, and the testing set comprises 4124 images with dark corner artifacts. We conduct three experiments to provide new understanding of the effects of dark corner artifacts, including inpainted and synthetically generated examples, on a deep learning method. Results: Our results suggest that superimposing synthetic dark corner artifacts onto the training set improved model performance, particularly in terms of the true negative rate. This indicates that the deep learning model learnt to ignore dark corner artifacts, rather than treating them as melanoma, when dark corner artifacts were introduced into the training set. Further, we propose a new approach to quantifying heatmaps indicating network focus, using a root mean square measure of the brightness intensity in the different regions of the heatmaps. Conclusions: The proposed artifact methods can be used in future experiments to help alleviate possible impacts on model performance. Additionally, the newly proposed heatmap quantification analysis will help to better understand the relationships between heatmap results and other model performance metrics.
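
    The root mean square heatmap measure mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the regions are defined or how brightness is scaled, so the function name `rms_intensity`, the boolean region masks, and the toy 4x4 heatmap below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def rms_intensity(heatmap, mask):
    """Root mean square of the heatmap values selected by a boolean mask.

    Both arguments are lists of rows with the same shape (hypothetical layout).
    """
    values = [
        h
        for row_h, row_m in zip(heatmap, mask)
        for h, m in zip(row_h, row_m)
        if m
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical 4x4 heatmap with brightness in [0, 1]: bright centre, dark border.
heatmap = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# Assumed regions: the central 2x2 block and everything outside it.
centre = [[1 <= r <= 2 and 1 <= c <= 2 for c in range(4)] for r in range(4)]
border = [[not m for m in row] for row in centre]

centre_rms = rms_intensity(heatmap, centre)
border_rms = rms_intensity(heatmap, border)
print(centre_rms, border_rms)
```

    Comparing the per-region values gives a single number per region for how strongly the network attends to it, for example the image corners versus the lesion area.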

    Current Challenges and Advances in Cataract Surgery

    This reprint focuses on new trials related to cataract surgery, intraocular lens power calculations for cataracts after refractive surgery, problems related to high myopia, toric IOL power calculations, etc. Intraoperative use of the 3D Viewing System and OCT, studies on the spectacle dependence of EDOF, IOL fixation status and visual function, and dry eye after FLAC are also discussed, as is proteomic analysis of aqueous humor proteins.

    UMSL Bulletin 2022-2023

    The 2022-2023 Bulletin and Course Catalog for the University of Missouri St. Louis. https://irl.umsl.edu/bulletin/1087/thumbnail.jp

    Medical Image Analysis using Deep Relational Learning

    In the past ten years, with the help of deep learning, especially the rapid development of deep neural networks, medical image analysis has made remarkable progress. However, how to effectively use the relational information between various tissues or organs in medical images is still a very challenging problem, and it has not been fully studied. In this thesis, we propose two novel solutions to this problem based on deep relational learning. First, we propose a context-aware fully convolutional network that effectively models implicit relation information between features to perform medical image segmentation. The network achieves state-of-the-art segmentation results on the Multi Modal Brain Tumor Segmentation 2017 (BraTS2017) and Multi Modal Brain Tumor Segmentation 2018 (BraTS2018) data sets. Subsequently, we propose a new hierarchical homography estimation network to achieve accurate medical image mosaicing by learning the explicit spatial relationship between adjacent frames. We use the UCL Fetoscopy Placenta dataset to conduct experiments, and our hierarchical homography estimation network outperforms the other state-of-the-art mosaicing methods while generating robust and meaningful mosaicing results on unseen frames. Comment: arXiv admin note: substantial text overlap with arXiv:2007.0778

    A Benchmark Comparison of Visual Place Recognition Techniques for Resource-Constrained Embedded Platforms

    Autonomous navigation has become a widely researched area over the past few years, gaining a massive following due to its necessity in creating a fully autonomous robotic system. Autonomous navigation is an exceedingly difficult task to accomplish in and of itself. Successful navigation relies heavily on the ability to self-localise within a given environment. Without this awareness of one’s own location, it is impossible to navigate autonomously. Since its inception, Simultaneous Localization and Mapping (SLAM) has become one of the most widely researched areas of autonomous navigation. SLAM focuses on self-localization within a mapped or un-mapped environment, and on constructing or updating the map of one’s surroundings. Visual Place Recognition (VPR) is an essential part of any SLAM system. VPR relies on visual cues to determine one’s location within a mapped environment. This thesis presents two main topics within the field of VPR. First, this thesis presents a benchmark analysis of several popular embedded platforms when performing VPR. The presented benchmark analyses six different VPR techniques across three different datasets, and investigates accuracy, CPU usage, memory usage, processing time and power consumption. The benchmark demonstrated a clear relationship between platform architecture and the metrics measured, with platforms of the same architecture achieving comparable accuracy and algorithm efficiency. Additionally, the Raspberry Pi platform was noted as a standout in terms of algorithm efficiency and power consumption. Secondly, this thesis proposes an evaluation framework intended to provide information about a VPR technique’s usability within a real-time application. The approach makes use of the incoming frame rate of an image stream and the VPR frame rate, the rate at which the technique can perform VPR, to determine how efficient VPR techniques would be in a real-time environment.
This evaluation framework determined that CoHOG would be the most effective algorithm to be deployed in a real-time environment, as it had the best ratio between computation time and accuracy.
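
    As a rough illustration of how an incoming frame rate and a VPR frame rate could be combined, the sketch below computes the share of incoming frames a technique could keep up with. The abstract does not give the framework's actual formula, so the function `processed_fraction`, the capped ratio it uses, and the example frame rates are all assumptions rather than the thesis's method.

```python
def processed_fraction(incoming_fps, vpr_fps):
    """Assumed metric: fraction of incoming frames a VPR technique can process.

    incoming_fps: frames per second delivered by the image stream.
    vpr_fps: frames per second the technique can match (its VPR frame rate).
    """
    if incoming_fps <= 0 or vpr_fps <= 0:
        raise ValueError("frame rates must be positive")
    # A technique faster than the stream cannot process more than every frame.
    return min(1.0, vpr_fps / incoming_fps)


# Hypothetical numbers: a 30 fps camera with a slow and a fast technique.
slow = processed_fraction(30.0, 10.0)  # keeps up with about a third of frames
fast = processed_fraction(30.0, 60.0)  # keeps up with every frame
print(slow, fast)
```

    A fuller comparison would weigh this fraction against each technique's accuracy, since the abstract ranks algorithms by the ratio between computation time and accuracy.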

    Determining Relationships Between Kinematic Sequencing and Baseball Pitch Velocity Using pitchAI™

    Professional baseball pitchers have consistently been increasing pitch velocity since 2008 (the first year of automated pitch tracking and classification at all 30 MLB stadiums) and increasing the number of pitches thrown over 95 mph (Sullivan, 2019). Fastball velocity is a primary risk factor for elbow injuries as there is a general linear relationship with increased elbow torques (Aguinaldo & Chambers, 2009; Chalmers et al., 2016; Slowik et al., 2019). The kinematic sequence has been referred to as the order and magnitude of joint angular velocities during the pitch delivery and has been associated with pitch velocity and elbow torque (Nicholson et al., 2022a, 2022b; Scarborough, Leonard, et al., 2021). The purpose of the research was to identify kinematic sequence metrics associated with pitch velocity and use them to predict pitch velocity using pitchAI™ (Dobos et al., 2022). A total of 80 pitchers (height 187.2 ± 8.2 cm, age 20.1 ± 3.3 years) ranging in skill level from high school to professional baseball participated in this study. Video for pitchAI™ and player height and weight were collected at 2 baseball training facilities. Extracted pitchAI™ data included the peak magnitudes and relative timings of pelvis rotation velocity, trunk rotation velocity, elbow extension velocity, and shoulder internal rotation velocity. Average pitch velocity in the data set was 85.3 ± 5.7 mph or 38.1 ± 2.5 m/s. Pitch velocity was predicted using both a multilinear regression and a custom neural network model. The multilinear regression generated a significant prediction for pitch velocity with R² = 0.368 and p < 0.01. Pitcher weight (β = 0.535, p < 0.001), peak pelvis rotational velocity timing (β = -0.157, p = 0.001), peak elbow extension timing (β = 0.122, p = 0.006), and peak shoulder internal rotation timing (β = -0.113, p = 0.018) were significant contributors to the multilinear model.
The neural network model significantly predicted velocity with R² = 0.372 and p < 0.01. Actual and predicted velocity were not significantly different (p = 0.353). In conclusion, pitchAI™ kinematic sequencing can predict pitch velocity using both a multilinear regression and a custom neural network.