3,355 research outputs found

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those requiring low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.

    Calibration and Sensitivity Analysis of a Stereo Vision-Based Driver Assistance System

    At http://intechweb.org/, under the "Books" tab, search for the title "Stereo Vision" and Chapter 1.

    DefSLAM: Tracking and Mapping of Deforming Scenes from Monocular Sequences

    Monocular simultaneous localization and mapping (SLAM) algorithms perform robustly when observing rigid scenes; however, they fail when the observed scene deforms, for example, in medical endoscopy applications. In this article, we present DefSLAM, the first monocular SLAM capable of operating in deforming scenes in real time. Our approach intertwines Shape-from-Template (SfT) and Non-Rigid Structure-from-Motion (NRSfM) techniques to deal with the exploratory sequences typical of SLAM. A deformation tracking thread recovers the pose of the camera and the deformation of the observed map, at frame rate, by means of SfT processing a template that models the scene shape-at-rest. A deformation mapping thread runs in parallel with the tracking to update the template, at keyframe rate, by means of an isometric NRSfM processing a batch of full perspective keyframes. In our experiments, DefSLAM processes close-up sequences of deforming scenes, both in a laboratory-controlled experiment and in medical endoscopy sequences, producing accurate 3-D models of the scene with respect to the moving camera.

    Retrospective Motion Correction in Magnetic Resonance Imaging of the Brain

    Magnetic Resonance Imaging (MRI) is a tremendously useful diagnostic imaging modality that provides outstanding soft tissue contrast. However, subject motion is a significant unsolved problem; motion during image acquisition can cause blurring and distortions in the image, limiting its diagnostic utility. Current techniques for addressing head motion include optical tracking, which can be impractical in clinical settings due to challenges associated with camera cross-calibration and marker fixation. Another category of techniques is MRI navigators, which use specially acquired MRI data to track the motion of the head. This thesis presents two techniques for motion correction in MRI: the first is spherical navigator echoes (SNAVs), which are rapidly acquired k-space navigators. The second is a deep convolutional neural network trained to predict an artefact-free image from motion-corrupted data. Prior to this thesis, SNAVs had been demonstrated for motion measurement but not motion correction, and they required the acquisition of a 26 s baseline scan during which the subject could not move. In this work, a novel baseline approach is developed where the acquisition is reduced to 2.6 s. Spherical navigators were interleaved into a spoiled gradient echo sequence (SPGR) on a stand-alone MRI system and a turbo-FLASH sequence (tfl) on a hybrid PET/MRI system to enable motion measurement throughout image acquisition. The SNAV motion measurements were then used to retrospectively correct the image data. While MRI navigator methods, particularly SNAVs that can be acquired very rapidly, are useful for motion correction, they do require pulse sequence modifications. A deep learning technique may be a more general solution. In this thesis, a conditional generative adversarial network (cGAN) is trained to perform motion correction on image data with simulated motion artefacts.
We simulate motion in previously acquired brain images and use the image pairs (corrupted + original) to train the cGAN. MR image data was qualitatively and quantitatively improved following correction using the SNAV motion estimates. This was also true for the simultaneously acquired MR and PET data on the hybrid system. Motion-corrected images were more similar to the no-motion reference images than the uncorrected images were. The deep learning approach was also successful for motion correction. The trained cGAN was evaluated on 5 subjects, and artefact suppression was observed in all images.

    A model-based approach for combined tracking and resolution enhancement of faces in low resolution video

    Wide area surveillance situations require many sensors, thus making the use of high-resolution cameras prohibitive because of high costs and exponential growth in storage. Small and low-cost CCTV cameras may produce poor quality video, and high-resolution CCD cameras in wide area surveillance can still yield low-resolution images of the object o

    Monocular SLAM for deformable scenarios.

    The problem of localizing a sensor within an uncertain map that is estimated at the same time is known as Simultaneous Localization and Mapping (SLAM). It is a challenging problem comparable to the chicken-and-egg paradigm: to locate the sensor we need to know the map, but to build the map we need the position of the sensor. When a visual sensor such as a camera is used, it is called Visual SLAM or VSLAM. Visual sensors for SLAM are divided into those that provide depth information (for example, RGB-D cameras or stereo rigs) and those that do not (for example, monocular cameras or event cameras). In this thesis we have focused our research on SLAM with monocular cameras. Due to the lack of depth perception, monocular SLAM is intrinsically harder than SLAM with depth sensors. State-of-the-art work in monocular VSLAM has normally assumed that the scene remains rigid throughout the sequence, which is a feasible assumption for industrial and urban environments. The rigidity assumption provides sufficient constraints for the problem and allows a reliable map to be reconstructed after processing several images. In recent years, interest in SLAM has reached medical fields, where SLAM algorithms could help guide the surgeon or localize a robot. However, unlike industrial or urban scenarios, in sequences inside the body anything may deform, so the rigidity assumption becomes invalid in practice and, by extension, so do monocular SLAM algorithms. Therefore, our goal is to push the limits of SLAM algorithms and conceive the first monocular SLAM system capable of coping with scene deformation. Current SLAM systems compute the camera position and the map structure in two concurrent threads: localization and mapping.
Localization processes every image to locate the sensor continuously, whereas mapping builds the map of the scene. We have adopted this structure and conceived both deformable localization and deformable mapping, which are able to recover the scene even under deformation. Our first contribution is deformable localization. Deformable localization uses the map structure to recover the camera pose from a single image. Simultaneously, as the map deforms during the sequence, it also recovers the map deformation for each frame. We have proposed two families of deformable localization. In the first deformable localization algorithm, we assume that all points are embedded in a surface called the template. We can recover the surface deformation thanks to a global deformation model that allows the most likely deformation of the object to be estimated. With our second deformable localization algorithm, we show that it is possible to recover the map deformation without a global deformation model by representing the map as individual surfels. Our experimental results showed that, by recovering the map deformation, both methods outperform rigid methods in both robustness and accuracy. Our second contribution is the conception of deformable mapping. It is the back-end of the SLAM algorithm and processes a batch of images to recover the map structure for all the images and to grow the map by assembling partial observations of it. Deformable localization and mapping run in parallel and together make up the first deformable monocular SLAM: DefSLAM.
An extended evaluation of our method showed, in both laboratory-controlled sequences and medical sequences, that it successfully processes sequences in which current monocular SLAM systems fail. Our third contribution is two methods for exploiting photometric information in deformable monocular SLAM. On the one hand, SD-DefSLAM leverages semi-direct matching to obtain much more reliable matching of map points in new images; as a consequence, it proved to be more robust and stable in medical sequences. On the other hand, we propose a Direct and Sparse Deformable Localization method in which we use a direct photometric error to track the deformation of a map modelled as a set of disconnected 3D surfels. We can recover the deformation of multiple disconnected surfaces, non-isometric deformations, or surfaces with changing topology.

    Model based methods for locating, enhancing and recognising low resolution objects in video

    Visual perception is our most important sense, which enables us to detect and recognise objects even in low detail video scenes. While humans are able to perform such object detection and recognition tasks reliably, most computer vision algorithms struggle with wide angle surveillance videos that make automatic processing difficult due to low resolution and poor detail objects. Additional problems arise from varying pose and lighting conditions as well as non-cooperative subjects. All these constraints pose problems for automatic scene interpretation of surveillance video, including object detection, tracking and object recognition. Therefore, the aim of this thesis is to detect, enhance and recognise objects by incorporating a priori information and by using model based approaches. Motivated by the increasing demand for automatic methods for object detection, enhancement and recognition in video surveillance, different aspects of the video processing task are investigated with a focus on human faces. In particular, the challenge of fully automatic face pose and shape estimation by fitting a deformable 3D generic face model under varying pose and lighting conditions is tackled. Principal Component Analysis (PCA) is utilised to build an appearance model that is then used within a particle filter based approach to fit the 3D face mask to the image. This recovers face pose and person-specific shape information simultaneously. Experiments demonstrate the use at different resolutions and under varying pose and lighting conditions. Following that, a combined tracking and super resolution approach enhances the quality of poor detail video objects. A 3D object mask is subdivided such that every mask triangle is smaller than a pixel when projected into the image and then used for model based tracking. The mask subdivision then allows for super resolution of the object by combining several video frames.
This approach achieves better results than traditional super resolution methods without the use of interpolation or deblurring. Lastly, object recognition is performed in two different ways. The first recognition method is applied to characters and used for license plate recognition. A novel character model is proposed to create different appearances which are then matched with the image of unknown characters for recognition. This allows for simultaneous character segmentation and recognition, and high recognition rates are achieved for low resolution characters down to only five pixels in size. While this approach is only feasible for objects with a limited number of different appearances, like characters, the second recognition method is applicable to any object, including human faces. Therefore, a generic 3D face model is automatically fitted to an image of a human face and recognition is performed on a mask level rather than image level. This approach requires neither an initial pose estimation nor the selection of feature points; the face alignment is provided implicitly by the mask fitting process.

    Combination of analysis techniques for efficient track reconstruction in high multiplicity events

    A novel combination of established data analysis techniques for reconstructing all charged-particle tracks in high energy collisions is proposed. It uses all information available in a collision event while keeping competing choices open as long as possible. Suitable track candidates are selected by transforming measured hits to a binned, three- or four-dimensional, track parameter space. It is accomplished by the use of templates taking advantage of the translational and rotational symmetries of the detectors. Track candidates and their corresponding hits, the nodes, form a usually highly connected network, a bipartite graph, where we allow for multiple hit to track assignments, the edges. The graph is cut into very many minigraphs by removing a few of its vulnerable components, edges and nodes. Finally, the hits are distributed among the track candidates by exploring a deterministic decision tree. A depth-limited search is performed, maximising the number of hits on tracks and minimising the sum of track-fit χ². Simplified models of LHC silicon trackers, as well as the relevant physics processes, are employed to study the performance (efficiency, purity, timing) of the proposed method in the case of single or many simultaneous proton-proton collisions (high pileup), and for single heavy-ion collisions at the highest available energies. Comment: 11 pages, 12 figures, submitted to EPJ