2,869 research outputs found

    Augmented Reality Meets Computer Vision : Efficient Data Generation for Urban Driving Scenes

    Full text link
    The success of deep learning in computer vision is based on the availability of large annotated datasets. To lower the need for hand-labeled images, virtually rendered 3D worlds have recently gained popularity. Creating realistic 3D content is challenging on its own and requires significant human effort. In this work, we propose an alternative paradigm which combines real and synthetic data for learning semantic instance segmentation and object detection models. Exploiting the fact that not all aspects of the scene are equally important for this task, we propose to augment real-world imagery with virtual objects of the target category. Capturing real-world images at large scale is easy and cheap, and directly provides real background appearances without the need for creating complex 3D models of the environment. We present an efficient procedure to augment real images with virtual objects. This allows us to create realistic composite images which exhibit both realistic background appearance and a large number of complex object arrangements. In contrast to modeling complete 3D environments, our augmentation approach requires only a few user interactions in combination with 3D shapes of the target object. Through extensive experimentation, we determine the set of parameters that produces augmented data which maximally enhances the performance of instance segmentation models. Further, we demonstrate the utility of our approach by training standard deep models for semantic instance segmentation and object detection of cars in outdoor driving scenes. We test the models trained on our augmented data on the KITTI 2015 dataset, which we have annotated with pixel-accurate ground truth, and on the Cityscapes dataset. Our experiments demonstrate that models trained on augmented imagery generalize better than those trained on synthetic data or on limited amounts of annotated real data.
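
    The core idea above, pasting rendered virtual objects over real photographs and reusing the rendering's alpha channel as pixel-accurate ground truth, can be sketched in a few lines of Python. The fragment below is an illustration under assumed file names and a plain alpha blend; the paper's actual pipeline additionally handles placement, lighting, and post-processing.

        # A minimal compositing sketch: blend an RGBA rendering of a virtual
        # object onto a real background photo and derive an instance mask.
        # File names, placement, and the simple alpha blend are assumptions.
        import numpy as np
        from PIL import Image

        def composite(background_path, rendering_path, top_left=(0, 0)):
            """Paste an RGBA rendering onto an RGB photo; return image and mask."""
            bg = np.array(Image.open(background_path).convert("RGB"), dtype=np.float32)
            fg = np.array(Image.open(rendering_path).convert("RGBA"), dtype=np.float32)

            h, w = fg.shape[:2]            # the rendering must fit inside the photo
            y, x = top_left
            alpha = fg[..., 3:4] / 255.0   # per-pixel opacity of the virtual object

            region = bg[y:y + h, x:x + w]
            bg[y:y + h, x:x + w] = alpha * fg[..., :3] + (1.0 - alpha) * region

            # The alpha channel doubles as pixel-accurate instance ground truth.
            mask = np.zeros(bg.shape[:2], dtype=np.uint8)
            mask[y:y + h, x:x + w] = (alpha[..., 0] > 0.5).astype(np.uint8)
            return Image.fromarray(bg.astype(np.uint8)), mask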

    Procedural Modeling and Physically Based Rendering for Synthetic Data Generation in Automotive Applications

    Full text link
    We present an overview and evaluation of a new, systematic approach for generating highly realistic, annotated synthetic data for training deep neural networks in computer vision tasks. The main contribution is a procedural world-modeling approach that enables high variability coupled with physically accurate image synthesis, a departure from the hand-modeled virtual worlds and approximate image-synthesis methods used in real-time applications. The benefits of our approach include flexible, physically accurate, and scalable image synthesis; implicit wide coverage of classes and features; and complete data introspection for annotations, all of which contribute to quality and cost efficiency. To evaluate our approach and the efficacy of the resulting data, we use semantic segmentation for autonomous vehicles and robotic navigation as the main application, and we train multiple deep learning architectures using synthetic data with and without fine-tuning on organic (i.e., real-world) data. The evaluation shows that our approach improves the neural networks' performance and that even modest implementation efforts produce state-of-the-art results. (Comment: The project web page at http://vcl.itn.liu.se/publications/2017/TKWU17/ contains a version of the paper with high-resolution images as well as additional material.)
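
    To make the procedural world-modeling idea concrete, the toy sketch below samples a scene description from parametric distributions; each sample would be handed to a physically based renderer that also emits pixel-perfect labels. All parameter names and ranges are hypothetical, not those used in the paper.

        # Toy procedural scene sampling: variability is controlled by code
        # rather than hand-modeled assets. Parameters are illustrative only.
        import random

        def sample_scene(seed=None):
            rng = random.Random(seed)
            n_vehicles = rng.randint(0, 12)
            return {
                "sun_elevation_deg": rng.uniform(5, 80),   # lighting variability
                "cloud_cover": rng.random(),
                "road_curvature": rng.gauss(0.0, 0.2),
                "vehicles": [rng.choice(["sedan", "suv", "truck", "van"])
                             for _ in range(n_vehicles)],
                "camera_height_m": rng.uniform(1.2, 1.8),
            }

        # Every call yields a new annotated training scene once rendered.
        print(sample_scene(seed=42))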

    The Effectiveness of Augmented Reality for Astronauts on Lunar Missions: An Analog Study

    Get PDF
    The use of augmented reality and head-up displays is becoming more prominent in industries such as aviation, automotive, and medicine. An augmented reality device such as the Microsoft HoloLens can project holograms onto the user's natural field of view to assist with the completion of a variety of tasks. So far, however, little research and development has been done in the space sector on astronauts using these head-up displays. Future lunar missions could incorporate augmented reality to ease astronauts' task load and improve accuracy. This study evaluated the usability, subjective workload, and task performance of 22 participants using the Microsoft HoloLens to complete tasks analogous to those performed by astronauts on a lunar mission, including navigation, rock sample collection, and maintenance tasks. Results from the usability survey, NASA-TLX, and usability interview suggested that augmented reality could support astronaut missions by reducing workload and task errors. Usability feedback collected from the participants informed improvements to the user interface and corroborated the aforementioned results. The researcher concluded that further research, including testing by National Aeronautics and Space Administration astronauts, is needed on both the development and the usability of augmented reality interfaces.
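
    For readers unfamiliar with the NASA-TLX workload measure cited above, the sketch below shows the standard computation: six subscale ratings (0-100) averaged either directly (raw TLX) or weighted by pairwise-comparison weights that sum to 15. The example values are illustrative, not data from this study.

        # NASA-TLX workload score: raw (unweighted mean) and weighted variants.
        # Ratings and weights below are made-up example values.
        SUBSCALES = ["mental", "physical", "temporal",
                     "performance", "effort", "frustration"]

        def nasa_tlx(ratings, weights=None):
            """Raw TLX is the mean rating; weighted TLX uses 15 pairwise weights."""
            if weights is None:                      # raw (unweighted) TLX
                return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)
            assert sum(weights.values()) == 15, "pairwise weights must total 15"
            return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15.0

        ratings = {"mental": 55, "physical": 20, "temporal": 40,
                   "performance": 30, "effort": 50, "frustration": 25}
        weights = {"mental": 4, "physical": 1, "temporal": 3,
                   "performance": 2, "effort": 3, "frustration": 2}
        print(nasa_tlx(ratings), nasa_tlx(ratings, weights))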

    Multimodal interaction for deliberate practice

    Get PDF

    Effectiveness and User Experience of Augmented and Mixed Reality for Procedural Task Training

    Get PDF
    Use of augmented reality (AR) and mixed reality (MR) technologies for training is increasing, due in part to opportunities for increased immersion, safer training, and reduced costs. However, AR/MR training effectiveness and user experience, particularly for head-mounted displays (HMDs), are not well understood. The purpose of this study is to investigate user perceptions and retention of AR/MR training delivered through an HMD for a procedural task. This two-part study utilized a within-subjects experimental design with 30 participants to determine how instruction method (paper vs. AR vs. MR) and time of procedure recall (immediate vs. post-test vs. retention) influenced completion time, perceived task difficulty, perceived confidence in successfully completing the task, workload, user experience, and trainee reactions. Results indicate differences between instruction methods for user experience and preference, with significantly higher user-experience ratings for MR and lower preference rankings for AR. Findings also show decreased performance, increased perceived task difficulty, and decreased confidence as time since training increased, with no significant differences in these measures between instruction methods. Completion times and workload were also found to be comparable between instruction methods. This work provides insight into objective and subjective differences between paper-, AR-, and MR-based training experiences, which can be used to determine which type of training is best suited for a particular use case. Recommendations for appropriately matching training modalities and scenarios, as well as for how to successfully design AR/MR training experiences, are discussed.
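
    A within-subjects design like the one described is commonly analyzed with a repeated-measures ANOVA. The sketch below shows one way to do this in Python with statsmodels; the CSV file and column names are assumptions for illustration, not the study's actual data or analysis code.

        # Repeated-measures ANOVA on completion time with two within-subject
        # factors (instruction method and recall time). Hypothetical data file.
        import pandas as pd
        from statsmodels.stats.anova import AnovaRM

        # Assumed columns: subject, method (paper/AR/MR),
        # recall (immediate/post/retention), completion_s
        df = pd.read_csv("training_study.csv")
        result = AnovaRM(df, depvar="completion_s", subject="subject",
                         within=["method", "recall"],
                         aggregate_func="mean").fit()
        print(result)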

    IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting

    Full text link
    Although action recognition for procedural tasks has received notable attention, it has a fundamental flaw: no measure of success for the actions is provided. This limits the applicability of such systems, especially within the industrial domain, since the outcome of procedural actions is often significantly more important than their mere execution. To address this limitation, we define the novel task of procedure step recognition (PSR), focusing on recognizing the correct completion and order of procedural steps. Alongside the new task, we also present the multi-modal IndustReal dataset. Unlike currently available datasets, IndustReal contains procedural errors (such as omissions) as well as execution errors. A significant portion of these errors is exclusively present in the validation and test sets, making IndustReal suitable for evaluating the robustness of algorithms to new, unseen mistakes. Additionally, to encourage reproducibility and allow for scalable approaches trained on synthetic data, the 3D models of all parts are publicly available. Annotations and benchmark performance are provided for action recognition and assembly state detection, as well as for the new PSR task. IndustReal, along with the code and model weights, is available at: https://github.com/TimSchoonbeek/IndustReal (Comment: Accepted for WACV 2024; 15 pages, 9 figures, including supplementary material.)
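
    To illustrate what PSR evaluates, the sketch below scores a stream of recognized "step completed" events against a reference procedure, counting omissions and order violations. This is an illustrative scorer, not the official IndustReal benchmark metric, and the step names are made up.

        # Toy PSR scoring: how many reference steps were recognized as
        # completed, and how many recognized pairs appear out of order.
        def score_psr(predicted_steps, reference_steps):
            """Return (fraction of steps recognized, order violations)."""
            recognized = [s for s in predicted_steps if s in reference_steps]
            completion = len(set(recognized)) / len(reference_steps)

            # Count pairs of recognized steps reported in the wrong order.
            order = {step: i for i, step in enumerate(reference_steps)}
            violations = sum(1 for i in range(len(recognized))
                             for j in range(i + 1, len(recognized))
                             if order[recognized[i]] > order[recognized[j]])
            return completion, violations

        gt = ["attach_base", "insert_axle", "mount_wheel", "fasten_bolt"]
        pred = ["attach_base", "mount_wheel", "insert_axle"]
        print(score_psr(pred, gt))   # (0.75, 1): one omission, one order violation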

    Interactive Assembly Instructions (Interaktiiviset kokoonpano-ohjeet)

    Get PDF
    Industrial products are increasingly varied, and the assembly of customized or unique products is slow, expensive, and prone to errors. Conventional static assembly drawings and instructions are suboptimal in supporting complex and dynamic assembly operations. The main objective of the study was to investigate whether interactive assembly instructions could substitute for the documents currently used to instruct assembly in the case company. Two approaches, 3D instructions and augmented reality (AR) instructions, were developed based on a literature review. The 3D instructions presented the assembly procedure in steps in which the assembly of the parts is animated, and were based directly on the 3D model of the assembly object. The AR instructions utilized the same assembly sequence as the 3D instructions and were viewed using a head-mounted display, which presented the assembly-step animations spatially overlaid on the physical assembly. The developed instructions were evaluated in a user study. The tests were observed by the author, and the participants answered a post-study questionnaire concerning subjective efficiency and user acceptance. Both the AR instructions and the 3D instructions received positive feedback and were rated as more efficient than the currently used assembly drawings. The features of the interactive assembly instructions directly address the problems of the current assembly documents. Hence, it was concluded that interactive assembly instructions could be used instead of the current assembly drawings and work instructions. However, the complexity of the case company's products requires that the instructions be configurable per product variant to enable their implementation.
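
    One way to realize the thesis's requirement that a single step sequence drive both the 3D viewer and the AR overlay, while remaining configurable per product variant, is a shared step data structure. The sketch below is a hypothetical illustration; all field names are invented, not taken from the thesis.

        # Hypothetical shared representation for step-based assembly
        # instructions, consumable by both a 3D viewer and an AR headset.
        from dataclasses import dataclass, field

        @dataclass
        class AssemblyStep:
            index: int
            part_ids: list        # parts installed in this step (from the 3D model)
            animation: str        # ID/path of the insertion animation clip
            notes: str = ""       # optional textual hint shown with the step

        @dataclass
        class InstructionSet:
            product_variant: str  # supports variant-specific configuration
            steps: list = field(default_factory=list)

        instructions = InstructionSet("crane-X200", [
            AssemblyStep(0, ["frame-01"], "anim/frame_place"),
            AssemblyStep(1, ["bolt-12", "bolt-13"], "anim/bolt_insert",
                         "torque 40 Nm"),
        ])
        print(instructions.steps[1].notes)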

    Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias

    Full text link
    Large language models (LLMs) have recently been leveraged as training-data generators for various natural language processing (NLP) tasks. While previous research has explored different approaches to training models on generated data, it generally relies on simple class-conditional prompts, which may limit the diversity of the generated data and inherit the systematic biases of the LLM. We therefore investigate training-data generation with diversely attributed prompts (e.g., specifying attributes like length and style), which have the potential to yield diverse and attributed generated data. Our investigation focuses on datasets with high cardinality and diverse domains, where we demonstrate that attributed prompts outperform simple class-conditional prompts in terms of the resulting model's performance. Additionally, we present a comprehensive empirical study of data generation encompassing vital aspects like bias, diversity, and efficiency, and highlight three key observations: first, synthetic datasets generated by simple prompts exhibit significant biases, such as regional bias; second, attribute diversity plays a pivotal role in enhancing model performance; last, attributed prompts match the performance of simple class-conditional prompts while using only 5% of the ChatGPT querying cost associated with the latter. We release the generated dataset and the prompts used to facilitate future research. The data and code will be available at https://github.com/yueyu1030/AttrPrompt (Comment: Work in progress. A shorter version is accepted to the ICML DMLR workshop.)
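
    The contrast between class-conditional and attributed prompts can be made concrete with a small sketch. The template and attribute values below are illustrative assumptions, not the prompts released with AttrPrompt.

        # Attributed prompting sketch: sample attributes (length, style, ...)
        # and splice them into the generation prompt to diversify the data.
        import random

        ATTRIBUTES = {
            "length": ["short", "medium", "long"],
            "style": ["formal review", "casual comment", "news-like summary"],
            "focus": ["plot", "acting", "soundtrack"],
        }

        def attributed_prompt(label, rng=random):
            attrs = {k: rng.choice(v) for k, v in ATTRIBUTES.items()}
            return (f"Write a {attrs['length']} {attrs['style']} about a movie, "
                    f"focusing on the {attrs['focus']}. "
                    f"The overall sentiment must be {label}.")

        def class_conditional_prompt(label):
            # The simple baseline: no attributes, so every query looks alike.
            return f"Write a movie review with {label} sentiment."

        print(attributed_prompt("positive"))
        print(class_conditional_prompt("positive"))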