
    Efficient Image Stitching through Mobile Offloading

    Image stitching is the task of combining images with overlapping regions into one large image. It requires a sequence of complex computation steps; on a mobile device in particular, execution can take a long time and consume a lot of energy. Mobile offloading may alleviate these problems, as it aims at improving performance and saving energy when executing complex applications on mobile devices. In this paper we investigate to what extent mobile offloading can improve the performance and energy efficiency of image stitching on mobile devices. We demonstrate our approach by stitching two or four images, but the process can easily be extended to an arbitrary number of images. We study three methods for offloading parts of the computation to a resourceful server and evaluate them using several metrics. In the first offloading strategy, all contributing images are sent, processed, and the combined image is returned. In the second strategy, the images are offloaded, but not all stitching steps are executed on the remote server; instead, a smaller XML file is returned to the mobile client. The XML file contains the homography information the mobile device needs to perform the last stitching step, the combination of the images. In the third strategy, the images are converted to grey scale before being transmitted to the server, and an XML file is returned. The metrics considered are the execution time, the size of the data to be transmitted, and the memory usage. We find that the first strategy achieves the lowest total execution time, but it requires more data to be transmitted than either of the other strategies.
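    As a rough illustration of the second strategy's client-side step, the sketch below parses a homography from XML and performs the final image combination with OpenCV. The XML schema, file names, and canvas sizing are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of the client-side step in the second offloading strategy:
# the server returns a homography as XML, and the device warps and
# combines the images itself. The XML schema and file names are
# illustrative assumptions.
import xml.etree.ElementTree as ET
import cv2
import numpy as np

def load_homography(xml_path):
    """Parse a 3x3 homography from XML (assumed schema: <homography>9 floats</homography>)."""
    root = ET.parse(xml_path).getroot()
    values = [float(v) for v in root.text.split()]
    return np.array(values, dtype=np.float64).reshape(3, 3)

def combine(base, other, H):
    """Warp `other` into the frame of `base` and paste `base` on top."""
    h, w = base.shape[:2]
    canvas = cv2.warpPerspective(other, H, (w * 2, h))  # generous canvas width
    canvas[0:h, 0:w] = base
    return canvas

base = cv2.imread("left.jpg")
other = cv2.imread("right.jpg")
H = load_homography("homography.xml")
cv2.imwrite("stitched.jpg", combine(base, other, H))
```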

    The Study and Literature Review of a Feature Extraction Mechanism in Computer Vision

    Detecting features in an image is a challenging task in computer vision and numerous image processing applications. For example, numerous algorithms exist to detect the corners in an image. Corners are formed by the intersection of multiple edges and sometimes may not lie on the boundary of objects in the image. This paper concentrates on the study of the Harris corner detection algorithm, which accurately detects the corners existing in an image. The Harris corner detector is a widely used interest point detector due to its strong invariance to rotation, scale, illumination variation, and image noise. It is based on the local auto-correlation function of a signal, which measures the local changes of the signal with patches shifted by a small amount in different directions. In our experiments we show results for grey-scale as well as colour images, covering the individual regions present in the image. This algorithm is more reliable than the conventional methods.
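    A minimal sketch of Harris corner detection with OpenCV is shown below; the block size, Sobel aperture, Harris parameter k, and threshold are common illustrative choices rather than values from the paper.

```python
# Minimal Harris corner detection sketch with OpenCV.
import cv2
import numpy as np

img = cv2.imread("input.jpg")
gray = np.float32(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

# R = det(M) - k * trace(M)^2, where M is the second-moment
# (local auto-correlation) matrix accumulated over a 2x2 neighbourhood.
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)

# Mark responses above a fraction of the maximum as corners (in red).
img[response > 0.01 * response.max()] = [0, 0, 255]
cv2.imwrite("corners.jpg", img)
```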

    Towards high-accuracy augmented reality GIS for architecture and geo-engineering

    Architecture and geo-engineering are application domains where professionals need to take critical decisions. These professionals require high-precision tools to assist them in their daily decision-making process. Augmented Reality (AR) shows great potential to ease the association between the abstract 2D drawings and 3D models representing the infrastructure under review and the actual perception of these objects in reality. Visualization tools based on AR allow the virtual models and reality to be registered in the user's field of view. However, the architecture and geo-engineering context requires high-accuracy, real-time positioning from these AR systems. This is not a trivial task, especially in urban environments or on construction sites where the surroundings may be crowded and highly dynamic. This project investigates the accuracy requirements of mobile AR GIS as well as the main challenges to address when tackling high-accuracy AR based on omnidirectional panoramas.

    Handheld image acquisition with real-time vision for human-computer interaction on mobile applications

    Master's dissertation (Integrated Master in Biomedical Engineering and Biophysics, Clinical Engineering and Medical Instrumentation), Universidade de Lisboa, Faculdade de Ciências, 2019. Many important diseases manifest themselves in the retina, both primary retinal conditions and systemic disorders. Diabetic retinopathy, glaucoma and age-related macular degeneration are some of the most frequent ocular disorders and the leading causes of blindness in developed countries. Since these disorders are becoming increasingly prevalent, there has been a growing effort to provide high-coverage screening among the most susceptible population. As its function requires the retina to see the outside world, the optical components in front of it must be transparent for image formation. This makes the retinal tissue, and thereby brain tissue, accessible for imaging in a non-invasive manner. There are several approaches to visualizing the retina, including fluorescein angiography, optical coherence tomography and fundus photography. Fraunhofer's EyeFundusScope (EFS) prototype is a handheld smartphone-based fundus camera that does not require pupil dilation. It employs machine learning algorithms to process the image in search of lesions that are often associated with diabetic retinopathy, making it a pre-diagnostic tool. The robustness of this computer vision algorithm, as well as the diagnostic performance of ophthalmologists and neurologists, is strongly related to the quality of the images acquired, and the consistency of handheld capture deeply depends on proper human interaction. In order to improve the user's contribution to the retinal acquisition procedure, a new graphical user interface was designed and implemented in the EFS Acquisition App. The intended approach is to make the EFS easier to use by non-ophthalmic trained personnel, in both clinical and non-clinical environments. Comprising several interaction elements created to suit the needs of the acquisition procedure, the graphical user interface should help the user to position and align the EFS illumination beam with the patient's pupil, as well as keep track of the time between acquisitions on the same eye. Initially, several versions of rotational interaction elements were designed and later implemented in the EFS Acquisition App. These use data from the smartphone's inertial sensors to give real-time feedback to the user while moving the EFS. Besides the rotational interaction elements, a time-lapse indicator and an eye indicator were also designed and implemented. Usability tests took place after three configurations had been implemented and corrected with the help of a model eye ophthalmoscope trainer; a protocol for the different use-case scenarios was elaborated, and a tutorial was created.
Results from the usability tests show that the new graphical user interface had a very positive outcome. The majority of users adapted very quickly to the new interface, and for many it contributed to a successful acquisition task. In the future, the combination of inertial sensor data and image recognition may prove to be the foundation of a more efficient interaction technique in clinical practice. Furthermore, the new graphical user interface could provide the EFS with an application for educational purposes.
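    The interaction concept lends itself to a small sketch. The EFS app itself runs on Android; the hedged Python sketch below only illustrates the kind of computation involved, deriving device pitch and roll from accelerometer readings and mapping them to alignment feedback, with made-up tolerance values.

```python
# Hedged sketch of the idea behind rotational interaction elements:
# turn a gravity-dominated accelerometer sample into tilt angles and
# a textual alignment hint. Values and tolerances are illustrative.
import math

def pitch_roll(ax, ay, az):
    """Tilt angles (degrees) from one accelerometer sample (m/s^2)."""
    pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    roll = math.degrees(math.atan2(ay, az))
    return pitch, roll

def alignment_hint(pitch, roll, tolerance_deg=2.0):
    """Feedback comparable to what a rotational UI element conveys."""
    if abs(pitch) <= tolerance_deg and abs(roll) <= tolerance_deg:
        return "aligned - hold steady"
    return f"tilt by {-pitch:+.1f} deg (pitch), {-roll:+.1f} deg (roll)"

pitch, roll = pitch_roll(0.4, -0.2, 9.7)  # example sample
print(alignment_hint(pitch, roll))
```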

    Streaming and User Behaviour in Omnidirectional Videos

    Omnidirectional videos (ODVs) have gone beyond the passive paradigm of traditional video, offering higher degrees of immersion and interaction. The revolutionary novelty of this technology is the possibility for users to interact with the surrounding environment and to feel a sense of engagement and presence in a virtual space. Users are clearly the main driving force of immersive applications, and consequently the services need to be properly tailored to them. In this context, this chapter highlights the importance of the new role of users in ODV streaming applications, and thus the need to understand their behaviour while navigating within ODVs. A comprehensive overview of the research efforts aimed at advancing ODV streaming systems is also presented. In particular, the state-of-the-art solutions examined in this chapter are divided into system-centric and user-centric streaming approaches: the former is a fairly straightforward extension of well-established solutions for the 2D video pipeline, while the latter benefits from understanding users' behaviour and enables more personalised ODV streaming.
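    User-centric ODV streaming is commonly realised as tiled, viewport-adaptive delivery. The sketch below illustrates that core idea under assumed parameters (an 8x4 equirectangular tile grid and a 100-degree viewport); it is a generic illustration, not drawn from any specific system discussed in the chapter.

```python
# Illustrative viewport-adaptive tile selection: request high quality
# only for equirectangular tiles whose centre falls inside the user's
# predicted viewport. Grid and field of view are assumptions.
def tiles_in_viewport(yaw_deg, pitch_deg, cols=8, rows=4, fov_deg=100.0):
    """Return (col, row) tiles whose centre lies inside the viewport."""
    selected = []
    for row in range(rows):
        tile_pitch = 90.0 - (row + 0.5) * 180.0 / rows
        for col in range(cols):
            tile_yaw = (col + 0.5) * 360.0 / cols - 180.0
            d_yaw = (tile_yaw - yaw_deg + 180.0) % 360.0 - 180.0  # wrap around
            if abs(d_yaw) <= fov_deg / 2 and abs(tile_pitch - pitch_deg) <= fov_deg / 2:
                selected.append((col, row))
    return selected

# High quality for tiles in the predicted viewport, base quality elsewhere.
viewport_tiles = tiles_in_viewport(yaw_deg=30.0, pitch_deg=0.0)
print(f"{len(viewport_tiles)} tiles at high quality:", viewport_tiles)
```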

    Localisation and tracking of stationary users for extended reality

    In this thesis, we investigate the topics of localisation and tracking in the context of Extended Reality. In many on-site or outdoor Augmented Reality (AR) applications, users are standing or sitting in one place and performing mostly rotational movements, i.e. they are stationary. This type of stationary motion also occurs in Virtual Reality (VR) applications such as panorama capture by moving a camera in a circle. Both applications require us to track the motion of a camera in potentially very large and open environments. State-of-the-art methods such as Structure-from-Motion (SfM) and Simultaneous Localisation and Mapping (SLAM) tend to rely on scene reconstruction from significant translational motion in order to compute camera positions. This can often lead to failure in application scenarios such as tracking for seated sports spectators, or stereo panorama capture, where the translational movement is small compared to the scale of the environment. To begin with, we investigate the topic of localisation, as it is key to providing global context for many stationary applications. To achieve this, we capture our own datasets in a variety of large open spaces, including two sports stadia. We then develop and investigate localisation techniques in the context of these sports stadia using a variety of state-of-the-art approaches. We cover geometry-based methods to handle dynamic aspects of a stadium environment, as well as appearance-based methods, and compare them to a state-of-the-art SfM system to identify the most applicable methods for server-based and on-device localisation. Recent work in SfM has shown that the type of stationary motion we target can be reliably estimated by applying spherical constraints to the pose estimation. In this thesis, we extend these concepts into a real-time keyframe-based SLAM system for the purposes of AR, and develop a unique data structure for simplifying keyframe selection. We show, through both synthetic and real-data tests, that our constrained approach tracks more robustly in these challenging stationary scenarios than state-of-the-art SLAM. In the application of capturing stereo panoramas for VR, this thesis demonstrates the unsuitability of standard SfM techniques for reconstructing these circular videos. We apply and extend recent research in spherically constrained SfM to creating stereo panoramas and compare this with state-of-the-art general SfM in a technical evaluation. With a user study, we show that the motion requirements of our SfM approach are similar to the natural motion of users, and that a constrained SfM approach is sufficient for providing stereoscopic effects when viewing the panoramas in VR.
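    As a hedged illustration of the spherical constraint, the sketch below builds a camera pose from three rotation parameters only, with the camera centre fixed to a sphere around the user. The radius, axis conventions, and use of SciPy are assumptions for the example, not the thesis's implementation.

```python
# Spherical motion constraint sketch: a 6-DoF pose collapses to 3 DoF
# because the camera centre is constrained to a sphere of fixed radius
# around the user (e.g. a head rotating in place).
import numpy as np
from scipy.spatial.transform import Rotation

def spherical_pose(rotvec, radius=1.0):
    """World-to-camera pose (R, t) parameterised by a rotation vector,
    with the centre offset along the camera's viewing axis."""
    R = Rotation.from_rotvec(rotvec).as_matrix()
    centre = radius * R.T @ np.array([0.0, 0.0, -1.0])  # world-space centre
    t = -R @ centre  # standard relation t = -R C
    return R, t

# A constrained bundle adjustment would optimise only these three
# rotation parameters per keyframe instead of a full 6-DoF pose.
R, t = spherical_pose(np.array([0.0, 0.3, 0.0]))
```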

    Implementation of a distributed real-time video panorama pipeline for creating high quality virtual views

    Today, we are continuously looking for more immersive video systems. Such systems, however, require more content, which can be costly to produce. A full panorama, covering the regions of interest, can contain all the information required, but can be difficult to view in its entirety. In this thesis, we discuss a method for creating virtual views from a cylindrical panorama, allowing multiple users to create individual virtual cameras from the same panorama video. We discuss how this method can be used for video delivery, but emphasise the creation of the initial panorama. The panorama must be created in real-time and with very high quality. We design and implement a prototype recording pipeline, installed at a soccer stadium, as part of the Bagadus project. We describe a pipeline capable of producing 4K panorama videos from five HD cameras, in real-time, with possibilities for further upscaling. We explain how the cylindrical panorama can be created with minimal computational cost and without visible seams. The cameras of our prototype system record video in the incomplete Bayer format, in which each pixel captures only one colour component, and we also investigate which debayering algorithms are best suited for recording multiple high-resolution video streams in real-time.
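    The core cylindrical warp can be sketched briefly. The version below projects a single camera image onto a cylinder via inverse mapping with OpenCV; the focal length is an assumed calibration value, and the real-time, multi-camera aspects of the pipeline are omitted.

```python
# Cylindrical projection sketch for one camera (pinhole model, principal
# point at the image centre). Focal length f is in pixels and assumed.
import cv2
import numpy as np

def cylindrical_warp(img, f):
    """Project an image onto a cylinder of radius f via inverse mapping."""
    h, w = img.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    ys, xs = np.indices((h, w), dtype=np.float32)
    theta = (xs - cx) / f          # angle around the cylinder
    hgt = (ys - cy) / f            # height on the cylinder
    # Back-project cylinder coordinates to the source image plane:
    # x = f tan(theta) + cx,  y = hgt * f / cos(theta) + cy
    map_x = (f * np.tan(theta) + cx).astype(np.float32)
    map_y = (hgt * f / np.cos(theta) + cy).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)

pano_strip = cylindrical_warp(cv2.imread("cam3.jpg"), f=1200.0)
```

    After each camera strip is warped like this, neighbouring strips overlap by a known amount and can be blended along a seam, which is what keeps the per-frame cost low enough for real-time use.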

    Drone-based panorama stitching: A study of SIFT, FLANN, and RANSAC techniques

    This paper documents the tasks I accomplished during my internship and project at UPC. It provides an overview of the project's structure, objectives, and task distribution. A summary is given of the Web Application part of the project, which was handled by my teammate. The paper also details the drone and payloads used in the project and their functionalities. In the parts I was responsible for, I conducted thorough investigations and tests on the Raspberry Pi camera to obtain the best image quality during every flight test. I delved into the entire process of basic panorama stitching, encompassing feature detection, descriptor matching, and transformation estimation based on the homography matrix. I compared popular feature detectors and descriptor matchers in terms of processing speed and performance, subsequently developing a panorama stitching algorithm for images captured by the drone. Finally, I provide a detailed discussion of some extra tasks that were not completed and points that could be improved upon. The paper not only stands as a detailed account of our contributions but also serves as an inspiration and a guide for future enhancements of drone-based panorama stitching.
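    The stitching core described above maps naturally onto OpenCV. The sketch below chains SIFT detection, FLANN matching with Lowe's ratio test, and RANSAC homography estimation; file names and thresholds are common defaults, not necessarily the report's exact settings.

```python
# SIFT + FLANN + RANSAC homography sketch with OpenCV.
import cv2
import numpy as np

img1 = cv2.imread("drone_left.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("drone_right.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# FLANN with KD-trees, suited to SIFT's floating-point descriptors.
flann = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
matches = flann.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]  # ratio test

# Robust homography from the surviving correspondences.
src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
print(f"{int(inliers.sum())}/{len(good)} inlier matches")
```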