
    Forward model for quantitative pulse-echo speed-of-sound imaging

    Computed ultrasound tomography in echo mode (CUTE) allows determining the spatial distribution of speed-of-sound (SoS) inside tissue using handheld pulse-echo ultrasound (US). This technique is based on measuring the changing phase of beamformed echoes obtained under varying transmit (Tx) and/or receive (Rx) steering angles. The SoS is reconstructed by inverting a forward model that describes how the spatial distribution of SoS is related to the spatial distribution of the echo phase shift. CUTE holds promise as a novel diagnostic modality that complements conventional US in a single, real-time handheld system. Here we demonstrate that, in order to obtain robust quantitative results, the forward model must contain two features that have not been taken into account so far: a) the phase shift must be detected between pairs of Tx and Rx angles that are centred around a set of common mid-angles, and b) it must account for an additional phase shift induced by the error of the reconstructed position of echoes. In a phantom study mimicking liver imaging, this new model leads to a substantially improved quantitative SoS reconstruction compared to the previously used model. The importance of the new model as a prerequisite for an accurate diagnosis is corroborated in preliminary volunteer results.
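The inversion described in the abstract is, at its core, a regularized linear inverse problem: measured echo phase shifts are modeled as a linear operator applied to the unknown slowness (inverse SoS) distribution. A minimal sketch of such an inversion, with an illustrative toy operator rather than the authors' actual forward model:

```python
import numpy as np

def reconstruct_sos(L, dphi, lam=1e-2):
    """Recover slowness deviations s from phase shifts dphi via
    Tikhonov-regularized least squares:
        argmin_s ||L s - dphi||^2 + lam ||s||^2
    L is a (hypothetical) forward matrix mapping slowness to phase shift."""
    n = L.shape[1]
    A = L.T @ L + lam * np.eye(n)   # normal equations with regularization
    return np.linalg.solve(A, L.T @ dphi)

# toy example: 3 phase-shift measurements, 2 unknown slowness values
L = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])          # illustrative path-length matrix
s_true = np.array([0.02, -0.01])
dphi = L @ s_true                   # noiseless synthetic measurements
s_hat = reconstruct_sos(L, dphi, lam=1e-8)
```

With negligible regularization and noiseless data the toy problem recovers the true slowness almost exactly; in practice the regularization weight trades off noise robustness against resolution.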

    Unsupervised Odometry and Depth Learning for Endoscopic Capsule Robots

    In the last decade, many medical companies and research groups have tried to convert passive capsule endoscopes, an emerging and minimally invasive diagnostic technology, into actively steerable endoscopic capsule robots that will provide more intuitive disease detection, targeted drug delivery and biopsy-like operations in the gastrointestinal (GI) tract. In this study, we introduce a fully unsupervised, real-time odometry and depth learner for monocular endoscopic capsule robots. We establish the supervision by warping view sequences and using the re-projection error as the loss function, which we adopt to train the multi-view pose estimation and single-view depth estimation networks. Detailed quantitative and qualitative analyses of the proposed framework, performed on non-rigidly deformable ex-vivo porcine stomach datasets, prove the effectiveness of the method in terms of motion estimation and depth recovery. Comment: submitted to IROS 201
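The self-supervision signal described above is a photometric reprojection loss: a source view is warped into the target view using the predicted depth and relative pose, and the pixel-wise difference drives training. A simplified numpy sketch (nearest-neighbour sampling, pinhole camera; the actual networks and bilinear sampling of the paper are not reproduced):

```python
import numpy as np

def photometric_loss(target, warped):
    """L1 photometric reprojection loss - the self-supervision signal."""
    return np.mean(np.abs(target - warped))

def warp_nearest(src, depth, K, K_inv, R, t):
    """Warp `src` into the target view given per-pixel target depth and a
    relative pose (R, t), using nearest-neighbour sampling."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    cam = (K_inv @ pix) * depth.reshape(1, -1)   # back-project to 3D
    cam2 = R @ cam + t.reshape(3, 1)             # move into source frame
    proj = K @ cam2                              # project into source image
    u = np.clip(np.round(proj[0] / proj[2]).astype(int), 0, w - 1)
    v = np.clip(np.round(proj[1] / proj[2]).astype(int), 0, h - 1)
    return src[v, u].reshape(h, w)

# identity pose and unit depth: the warp is a no-op, so the loss is zero
img = np.arange(16.0).reshape(4, 4)
warped = warp_nearest(img, np.ones((4, 4)), np.eye(3), np.eye(3),
                      np.eye(3), np.zeros(3))
loss = photometric_loss(img, warped)
```

Minimizing this loss over predicted depth and pose is what removes the need for ground-truth labels.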

    Utilizing radiation for smart robotic applications using visible, thermal, and polarization images.

    The domain of this research is the use of computer vision methodologies in utilizing radiation for smart robotic applications for driving assistance. Radiation can be emitted, reflected or transmitted by an object. Understanding the nature and the properties of the radiation forming an image is essential in interpreting the information in that image, which can then be used by a machine, e.g. a smart vehicle, to make a decision and perform an action. Throughout this work, different types of images are used to help a robotic vehicle make a decision and perform a certain action. This work presents three smart robotic applications: the first one deals with polarization images, the second with thermal images and the third with visible images. Each of these image types is formed by light (radiation), but in a different way from the others; the information embedded in an image depends on the way it was formed and how the light was generated. For polarization imaging, a direct method utilizing shading and polarization for unambiguous shape recovery, without the need for nonlinear optimization routines, is proposed. The proposed method utilizes polarization and shading simultaneously to find the surface normals, thus eliminating the reconstruction ambiguity. This can help a smart vehicle gain knowledge about the terrain surface geometry. Regarding thermal imaging, an automatic method for constructing an annotated thermal imaging pedestrian dataset is proposed. This is done by transferring detections from registered visible images simultaneously captured at day-time, where pedestrian detection is well developed. Histogram of Oriented Gradients (HOG) features are extracted from the constructed dataset and then fed to a discriminatively trained deformable part-based classifier that can be used to detect pedestrians at night.
    The resulting classifier was tested for night driving assistance and succeeded in detecting pedestrians even in situations where visible-imaging pedestrian detectors failed because of low light or the glare of oncoming traffic. For visible images, a new feature based on HOG is proposed for pedestrian detection. The proposed feature was integrated into two state-of-the-art pedestrian detectors: the discriminatively trained Deformable Part based Models (DPM) and the Integral Channel Features (ICF) using fast feature pyramids. The proposed approach is based on computing the image mixed partial derivatives, which are used to redefine the gradients of some pixels and to reweigh the vote at all pixels with respect to the original HOG. The approach was tested on the PASCAL2007, INRIA and Caltech datasets and was shown to have outstanding performance.
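The HOG-style voting mentioned above weights each pixel's orientation vote by its gradient magnitude. A hypothetical sketch of reweighting those votes with the mixed partial derivative, as the abstract describes; the dissertation's exact reweighting formula is not specified here, so the blending factor `alpha` and the normalization are illustrative assumptions:

```python
import numpy as np

def gradient_votes(img, alpha=0.5):
    """Sketch of HOG-style gradient votes reweighted by the mixed
    partial derivative d2I/dxdy (hypothetical scheme, not the
    dissertation's exact formula)."""
    iy, ix = np.gradient(img.astype(float))       # first derivatives
    ixy = np.gradient(ix, axis=0)                 # mixed partial d2I/dxdy
    mag = np.hypot(ix, iy)                        # standard HOG vote weight
    ang = np.arctan2(iy, ix)                      # orientation of each vote
    # boost votes where the mixed partial derivative is large
    vote = mag * (1.0 + alpha * np.abs(ixy) / (np.abs(ixy).max() + 1e-9))
    return vote, ang

flat = np.ones((5, 5))                            # constant image: no votes
v_flat, _ = gradient_votes(flat)
ramp = np.tile(np.arange(5.0), (5, 1))            # x-ramp: unit votes
v_ramp, _ = gradient_votes(ramp)
```

On a constant patch all votes are zero; on a pure x-ramp the mixed partial vanishes, so the votes reduce to the plain gradient magnitude of 1.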

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
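The event stream described above is just a sequence of (time, x, y, polarity) tuples. One common, simple way to make such a stream digestible by frame-based algorithms is to accumulate events into a signed event frame; a minimal sketch (many alternatives exist, e.g. time surfaces or voxel grids):

```python
import numpy as np

def events_to_frame(events, shape):
    """Accumulate a stream of (t, x, y, polarity) events into a signed
    event frame: +1 per ON event, -1 per OFF event at each pixel."""
    frame = np.zeros(shape, dtype=np.int32)
    for t, x, y, p in events:
        frame[y, x] += 1 if p > 0 else -1
    return frame

# each event encodes time, pixel location, and sign of brightness change
events = [(0.001, 2, 1, +1),   # two ON events at pixel (x=2, y=1)
          (0.002, 2, 1, +1),
          (0.003, 0, 0, -1)]   # one OFF event at pixel (x=0, y=0)
frame = events_to_frame(events, (3, 4))
```

This discards the microsecond timing that makes the sensor attractive, which is exactly why the survey emphasizes methods that process events natively.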

    Face modeling for face recognition in the wild.

    Face understanding is considered one of the most important topics in the computer vision field, since the face is a rich source of information in social interaction. Not only does the face provide information about the identity of people, but also about their membership in broad demographic categories (including sex, race, and age), and about their current emotional state. Facial landmark extraction is the cornerstone of the success of different facial analysis and understanding applications. In this dissertation, a novel facial model is designed for facial landmark detection in unconstrained real-life environments from different image modalities, including infrared and visible images. In the proposed facial landmark detector, a part-based model is incorporated with holistic face information. In the part-based model, the face is modeled by the appearance of different face parts (e.g., right eye, left eye, left eyebrow, nose, mouth) and their geometric relation. The appearance is described by a novel feature referred to as the pixel difference feature. This representation is three times faster to compute than state-of-the-art feature representations. To model the geometric relation between the face parts, the complex Bingham distribution is adapted from the statistics community into computer vision for modeling the geometric relationship between the facial elements. The global information is incorporated with the local part model using a regression model. The model outperforms the state of the art in detecting facial landmarks. The proposed facial landmark detector is tested in two computer vision problems: boosting the performance of face detectors by rejecting pseudo faces, and camera steering in a multi-camera network.
    To highlight the applicability of the proposed model to different image modalities, it has been studied in two face understanding applications: face recognition from visible images, and physiological measurement for autistic individuals from thermal images. Recognizing identities from faces under different poses, expressions and lighting conditions against a complex background is still an unsolved problem, even with accurate landmark detection. Therefore, a learned similarity measure is proposed. The proposed measure responds only to differences in identity and filters out illumination and pose variations; it makes use of statistical inference in the image plane. Additionally, the pose challenge is tackled by two new approaches: assigning different weights to different face parts based on their visibility in the image plane at different pose angles, and synthesizing virtual facial images for each subject at different poses from a single frontal image. The proposed framework is demonstrated to be competitive with top-performing state-of-the-art methods when evaluated on standard benchmarks for face recognition in the wild. The second face understanding application is physiological measurement for autistic individuals from infrared images. In this framework, accurately detecting and tracking the superficial temporal artery (STA) while the subject is moving, playing, and interacting in social communication is a must. It is very challenging to detect and track the STA, since the appearance of the STA region changes over time and it is not discriminative enough from other areas of the face region. A novel concept in detection, called supporter collaboration, is introduced. In supporter collaboration, the STA is detected and tracked with the help of face landmarks and geometric constraints. This research advances the field of emotion recognition.
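The pixel difference feature mentioned in the abstract describes part appearance by intensity differences over a fixed set of pixel-pair locations, which is why it is cheap to compute. An illustrative sketch under that reading; the pair locations here are arbitrary, and the dissertation's actual sampling scheme is not reproduced:

```python
import numpy as np

def pixel_difference_feature(patch, pairs):
    """Describe a patch by intensity differences over fixed pixel pairs.
    `pairs` is a list of ((y1, x1), (y2, x2)) coordinate pairs -
    chosen arbitrarily here for illustration."""
    return np.array([patch[y1, x1] - patch[y2, x2]
                     for (y1, x1), (y2, x2) in pairs], dtype=float)

patch = np.arange(9.0).reshape(3, 3)        # toy 3x3 intensity patch
pairs = [((0, 0), (2, 2)),                  # corner-to-corner difference
         ((1, 0), (1, 2))]                  # left-to-right difference
feat = pixel_difference_feature(patch, pairs)
```

Each feature dimension is a single subtraction, with no gradients or histograms to compute, which is consistent with the claimed speed advantage.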

    Forum Bildverarbeitung 2022

    The field of image processing links camera sensing (imaging sensors) with the processing of the sensor data (the images). This combination gives the discipline its particular appeal. This proceedings volume of the "Forum Bildverarbeitung", held on 24 and 25 November 2022 in Karlsruhe as an event of the Karlsruher Institut für Technologie and the Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung, contains the papers of the submitted contributions.