28 research outputs found

    Recovering 3D Shape with Absolute Size from Endoscope Images Using RBF Neural Network

    Medical diagnosis judges the status of a polyp from its size and 3D shape as seen in a medical endoscope image. However, doctors make this judgment empirically from the endoscope image, and more accurate 3D shape recovery from the 2D image has been demanded to support it. The VBW (Vogel-Breuß-Weickert) model was proposed as a high-speed method to recover 3D shape under point-light-source illumination and perspective projection. However, the VBW model recovers only relative shape; it cannot recover the shape with its exact size. Here, a shape-modification step is introduced to recover the exact shape from the VBW result. An RBF neural network (RBF-NN) provides the mapping between input and output: the input is the gradient parameters output by the VBW model for a generated sphere, and the output is the true gradient parameters of that sphere. Learning this mapping modifies the gradient, and depth can then be recovered from the modified gradient parameters. The performance of the proposed approach is confirmed via computer simulation and a real experiment.
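The gradient-correction step can be pictured as a small regression problem. The sketch below is illustrative only: it assumes a Gaussian-kernel RBF network fitted by ridge-regularized least squares, and a synthetic, hypothetical distortion standing in for the VBW bias; none of the function names or constants come from the paper.

```python
import numpy as np

def rbf_fit(X, Y, sigma=0.3):
    """Fit an RBF network with Gaussian kernels centred on the training inputs.
    X: (n, d) inputs (e.g. VBW gradient parameters); Y: (n, k) targets
    (e.g. true gradient parameters). Returns the weight matrix."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    Phi = np.exp(-d2 / (2 * sigma ** 2))
    # Small ridge term keeps the kernel system well conditioned
    return np.linalg.solve(Phi + 1e-6 * np.eye(len(X)), Y)

def rbf_predict(X_train, W, X_new, sigma=0.3):
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2)) @ W

# Toy example: learn to undo a known, made-up distortion of gradient parameters.
rng = np.random.default_rng(0)
true_pq = rng.uniform(-1, 1, size=(200, 2))        # "true" sphere gradients
vbw_pq = 0.7 * true_pq + 0.05 * true_pq ** 2       # hypothetical VBW bias
W = rbf_fit(vbw_pq, true_pq)
corrected = rbf_predict(vbw_pq, W, vbw_pq)
print(np.abs(corrected - true_pq).mean())           # small residual
```

Once the corrected gradients are in hand, depth would be integrated from them; that integration step is not sketched here.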

    Integrated multipoint-laser endoscopic airway measurements by transoral approach

    Objectives: Optical and technical characteristics usually do not allow objective endoscopic distance measurements, and so far no standardized method for endoscopic distance measurement is available. The aim of this study was to evaluate the feasibility and accuracy of transoral airway measurements with a multipoint-laser endoscope. Methods: The semirigid endoscope includes a multipoint laser measurement system that projects 49 laser points (wavelength 639 nm, power < 5 mW) into the optical axis of the endoscopic view. Distances, areas, and depths can be measured in real time. Transoral endoscopic airway measurements were performed on nine human cadavers and correlated with CT measurements. Results: A preliminary experiment showed an optimum distance of 5 to 6 cm between the endoscope tip and the object. There was a mean measurement error of 3.26% ± 2.53%. Against CT, Spearman correlation coefficients of 0.95 for the laryngeal and 0.93 for the tracheal measurements were obtained.
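The abstract does not spell out the projection geometry. A common arrangement for such systems is a set of beams running parallel to the optical axis at a known baseline offset, so each dot's pixel displacement encodes depth by triangulation. A minimal sketch under that assumption, with all numbers hypothetical:

```python
# Hypothetical geometry: each laser beam runs parallel to the optical
# axis at a known baseline offset b (mm) from the camera centre; with
# focal length f (in pixels), a dot imaged at pixel offset x from the
# principal point lies at depth Z = f * b / x.
def depth_from_dot(f_px: float, baseline_mm: float, offset_px: float) -> float:
    if offset_px <= 0:
        raise ValueError("dot must be displaced from the principal point")
    return f_px * baseline_mm / offset_px

# A dot 20 px from the principal point, f = 800 px, baseline 1.5 mm:
z = depth_from_dot(800, 1.5, 20)
print(z)  # 60.0 (mm), i.e. within the reported 5-6 cm working range
```

With per-dot depths known, distances and areas between dots follow from ordinary 3D geometry.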

    Laser-Endoscopic Measurement of the Upper Airway with a Multipoint-Laser Endoscope

    Abstract: Rigid and flexible endoscopies are standard examinations in otorhinolaryngology (ENT) today. With the further development and increased use of TLM (transoral laser microsurgery) and TORS (transoral robotic surgery), preoperative measurement of findings is gaining importance. Despite several scientific studies, no satisfactory standardized method for endoscopic measurement of the upper airway has been found so far; optical and technical conditions have not permitted objective endoscopic measurement. The aim of this study was to evaluate the accuracy and applicability of transoral measurements with a multipoint-laser endoscope. The present study used a multipoint-laser endoscope that projects 49 laser points (wavelength 639 nm, power < 5 mW) into the optical axis of the endoscopic image. By constructing a 3D coordinate system, distance, depth and area measurements can be performed in real time within the endoscopic image. After a model experiment, endoscopic measurements of the larynx and trachea were performed on nine cadaver specimens and subsequently compared with CT measurements. Seven of the nine specimens could be used for the evaluation. The optimum distance between the endoscope tip and the examined object was 5-6 cm. The mean measurement error was 3.26% ± 2.53%. Comparison with the CT measurements yielded excellent Spearman correlation coefficients of 0.95 (p = 0.01) for the laryngeal measurements and 0.93 (p < 0.01) for the tracheal measurements. In summary, multipoint-laser measurement can be regarded as a promising method for daily use in diagnostic and surgical ENT.

    Multispectral image analysis in laparoscopy – A machine learning approach to live perfusion monitoring

    Modern visceral surgery is often performed through small incisions. Compared to open surgery, these minimally invasive interventions result in smaller scars, fewer complications and a quicker recovery. While this benefits the patient, it has the drawback of limiting the physician's perception largely to visual feedback through a camera mounted on a rod lens: the laparoscope. Conventional laparoscopes are limited by "imitating" the human eye. Multispectral cameras remove this arbitrary restriction of recording only red, green and blue colors. Instead, they capture many specific bands of light. Although these could help characterize important indications such as ischemia and early-stage adenoma, the lack of powerful digital image processing has prevented the technique's full potential from being realized. The primary objective of this thesis was to pioneer fluent functional multispectral imaging (MSI) in laparoscopy. The main technical obstacles were: (1) the lack of image analysis concepts that provide both high accuracy and speed; (2) multispectral image recording is slow, typically taking seconds to minutes; (3) obtaining a quantitative ground truth for the measurements is hard or even impossible. To overcome these hurdles and enable functional laparoscopy, physical models are combined with powerful machine learning techniques for the first time in this field. The physical model is employed to create highly accurate simulations, which in turn teach the algorithm to rapidly relate multispectral pixels to underlying functional changes. To reduce the domain shift introduced by learning from simulations, a novel transfer learning approach automatically adapts generic simulations to match almost arbitrary recordings of visceral tissue. In combination with the only available video-rate-capable multispectral sensor, the method pioneers fluent perfusion monitoring with MSI.
This system was carefully tested in a multistage process involving in silico quantitative evaluations, tissue phantoms and a porcine study. Clinical applicability was ensured through in-patient recordings in the context of partial nephrectomy; in these, the novel system characterized ischemia live during the intervention. Verified against a fluorescence reference, the results indicate that fluent, non-invasive ischemia detection and monitoring is now possible. In conclusion, this thesis presents the first multispectral laparoscope capable of video-rate functional analysis. The system was successfully evaluated in in-patient trials, and future work should be directed towards evaluating the system in a larger study. Due to the broad applicability and the large potential clinical benefit of the presented functional estimation approach, I am confident that descendants of this system will be an integral part of the next-generation operating room.
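The core idea, training a regressor on simulated spectra and applying it per pixel, can be sketched with a toy linear mixture model. This is not the thesis' Monte Carlo tissue model or its learning pipeline: the endmember spectra, noise level and ridge regressor below are all stand-in assumptions.

```python
import numpy as np

# Simulate multispectral pixels as noisy mixtures of two made-up
# endmember spectra (oxy/deoxy haemoglobin stand-ins), then learn a
# linear map from a pixel to its oxygen saturation.
rng = np.random.default_rng(1)
oxy   = np.array([0.9, 0.6, 0.3, 0.2, 0.4, 0.7])    # toy 6-band spectrum
deoxy = np.array([0.4, 0.7, 0.8, 0.6, 0.3, 0.2])

sat = rng.uniform(0, 1, 5000)                         # ground-truth saturation
pixels = sat[:, None] * oxy + (1 - sat)[:, None] * deoxy
pixels += 0.01 * rng.standard_normal(pixels.shape)    # sensor noise

# Ridge regression: pixel -> saturation (bias column appended)
X = np.hstack([pixels, np.ones((len(pixels), 1))])
w = np.linalg.solve(X.T @ X + 1e-6 * np.eye(7), X.T @ sat)

est = X @ w
print(np.abs(est - sat).mean())                       # small mean error
```

In the real system the "simulation" is a physically accurate tissue model and the regressor is far more expressive, but the train-on-simulation, apply-per-pixel structure is the same.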

    Pattern Recognition

    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one-, two- or three-dimensional; the processing is done in real time or takes hours and days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research in the field of pattern recognition.

    Accurate depth from defocus estimation with video-rate implementation

    The science of measuring depth from images at video rate using "defocus" has been investigated. The method required two differently focused images acquired from a single viewpoint using a single camera. The relative blur between the images was used to determine the in-focus axial point of each pixel and hence its depth. The depth estimation algorithm of Watanabe and Nayar was employed to recover the depth estimates, but the broadband filters, referred to as Rational filters, were designed using a new procedure: the Two-Step Polynomial Approach. The filters designed by the new model were largely insensitive to object texture and were shown to model the blur more precisely than the previous method. Experiments with real planar images demonstrated a maximum RMS depth error of 1.18% for the proposed filters, compared to 1.54% for the previous design. The software required five 2D convolutions to be processed in parallel, and these were effectively implemented on an FPGA using a two-channel, five-stage pipelined architecture; however, the precision of the filter coefficients and variables had to be limited within the processor. The number of multipliers required for each convolution was reduced from 49 to 10 (a 79.5% reduction) using a Triangular design procedure. Experimental results suggested that the pipelined processor provided depth estimates comparable in accuracy to the full-precision Matlab output, and generated depth maps of 400 x 400 pixels in 13.06 ms, which is faster than video rate. The defocused images (near- and far-focused) were optically registered for magnification using telecentric optics. A frequency-domain approach based on phase correlation was employed to measure the radial shifts due to magnification and to optimally position the external aperture. The telecentric optics ensured correct pixel-to-pixel registration between the defocused images and provided more accurate depth estimates.
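The underlying principle, that the balance of high-frequency content between the near- and far-focused images varies with depth, can be illustrated without the Rational filters themselves. The sketch below substitutes a simple Laplacian response for the broadband filters and synthetic Gaussian blur for true defocus; it is a toy, not the thesis' algorithm.

```python
import numpy as np

def highfreq_energy(img):
    """Mean squared response of a discrete Laplacian (periodic boundary),
    a crude stand-in for the broadband focus filters."""
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0)
         + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)
    return float((lap ** 2).mean())

def gaussian_blur(img, sigma):
    """Gaussian blur of a square image via Fourier multiplication."""
    f = np.fft.fftfreq(img.shape[0])
    g = np.exp(-2 * (np.pi * sigma) ** 2 * (f[:, None] ** 2 + f[None, :] ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * g))

rng = np.random.default_rng(2)
scene = rng.standard_normal((64, 64))   # textured test plane
near = gaussian_blur(scene, 0.5)        # near-focused image: mild defocus
far  = gaussian_blur(scene, 2.0)        # far-focused image: strong defocus
ratio = highfreq_energy(near) / highfreq_energy(far)
print(ratio > 1)  # True: the less-defocused image retains more detail
```

In the actual method, a calibrated pair of Rational filters turns such a relative-blur measure into a metric depth per pixel; the calibration is what the Two-Step Polynomial Approach redesigns.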

    Machine Learning and Deep Learning applications for the protection of nuclear fusion devices

    This thesis addresses the use of artificial intelligence methods for the protection of nuclear fusion devices, with reference to the Joint European Torus (JET) Tokamak and the Wendelstein 7-X (W7-X) Stellarator. JET is currently the world's largest operational Tokamak and the only one operated with deuterium-tritium fuel, while W7-X is the world's largest and most advanced Stellarator. For the work on JET, research focused on the prediction of "disruptions", i.e. sudden terminations of plasma confinement. For the development and testing of machine learning classifiers, a total of 198 disrupted discharges and 219 regularly terminated discharges from JET were used. Convolutional Neural Networks (CNNs) were proposed to extract the spatiotemporal characteristics of plasma temperature, density and radiation profiles. Since the CNN is a supervised algorithm, a label must be explicitly assigned to each time window of the dataset during training. All segments belonging to regularly terminated discharges were labelled 'stable'. For each disrupted discharge, segments were labelled 'unstable' by automatically identifying the pre-disruption phase using an algorithm developed during the PhD. The CNN's performance was evaluated using disrupted and regularly terminated discharges from a decade of JET experimental campaigns (2011 to 2020), showing the robustness of the algorithm. Concerning W7-X, the research involved the real-time measurement of heat fluxes on plasma-facing components. THEODOR is a code currently used at W7-X for computing heat fluxes offline. However, for heat-load control, fast heat-flux estimation in real time is required. Part of the PhD work was dedicated to refactoring and optimizing the THEODOR code, with the aim of speeding up calculation times and making it compatible with real-time use. In addition, a Physics-Informed Neural Network (PINN) model was proposed to bring heat-flux computation to GPUs for real-time implementation.
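The labelling rule described for the CNN training set can be sketched directly. The onset-detection algorithm itself is not described in the abstract, so its output (the pre-disruption onset time) is taken as a given input here, and the function name is hypothetical:

```python
# Label sliding time windows of a discharge for supervised training:
# every window of a regularly terminated discharge is 'stable'; in a
# disrupted discharge, windows at or after the automatically detected
# pre-disruption onset are 'unstable'.
def label_windows(window_starts, pre_disruption_onset=None):
    """window_starts: start times (s) of each window; onset=None means a
    regularly terminated discharge (all windows 'stable')."""
    if pre_disruption_onset is None:
        return ['stable'] * len(window_starts)
    return ['unstable' if t >= pre_disruption_onset else 'stable'
            for t in window_starts]

print(label_windows([0.0, 0.5, 1.0, 1.5], pre_disruption_onset=1.0))
# ['stable', 'stable', 'unstable', 'unstable']
```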

    Image-set, Temporal and Spatiotemporal Representations of Videos for Recognizing, Localizing and Quantifying Actions

    This dissertation addresses the problem of learning video representations, defined here as transforming a video so that its essential structure is made more visible or accessible for action recognition and quantification. In the literature, a video can be represented by a set of images, by modeling motion or temporal dynamics, or by a 3D graph with pixels as nodes. This dissertation contributes a set of models to localize, track, segment, recognize and assess actions: (1) image-set models that aggregate subset features given by regularized, normalized CNNs; (2) image-set models based on inter-frame principal recovery and sparse coding of residual actions; (3) temporally local models with spatially global motion estimated by robust feature matching and local motion estimated by action detection with an added motion model; (4) spatiotemporal models (a 3D graph and a 3D CNN) that treat time as a spatial dimension; and (5) supervised hashing that jointly learns the embedding and the quantization. State-of-the-art performance is achieved on tasks such as quantifying facial pain and human diving.
The primary conclusions of this dissertation are as follows: (i) an image set can capture facial actions as a collective representation; (ii) sparse and low-rank representations can disentangle expression, identity and pose cues, and can be learned via an image-set model as well as a linear model; (iii) the norm is related to recognizability, and similarity metrics and loss functions matter; (iv) combining the MIL-based boosting tracker with a Particle Filter motion model yields a good trade-off between appearance similarity and motion consistency; (v) segmenting an object locally makes it amenable to shape priors, and knowledge such as shape priors can feasibly be learned online from Web data with weak supervision; (vi) representing videos as 3D graphs works locally in both space and time, and 3D CNNs work effectively when given temporally meaningful clips; (vii) rich labeled images or videos help to learn better hash functions, after learning binary embedded codes, than random projections do. In addition, the models proposed for videos can be adapted to other sequential images, such as volumetric medical images, which are not covered in this dissertation.
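As a toy illustration of the hashing idea (embed, then binarize, then retrieve by Hamming distance), the sketch below uses a random projection as a stand-in embedding, which is precisely the baseline the dissertation argues learned hash functions improve upon:

```python
import numpy as np

# Binary hashing for retrieval: project features, quantize by sign,
# compare codes by Hamming distance. The projection here is random;
# the dissertation learns the embedding and quantization jointly.
rng = np.random.default_rng(3)
feats = rng.standard_normal((100, 32))     # hypothetical video-level features
P = rng.standard_normal((32, 16))          # stand-in (unlearned) embedding
codes = (feats @ P > 0).astype(np.uint8)   # 16-bit binary codes

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int((a != b).sum())

# Distances from the first item's code to every code in the database
d = [hamming(codes[0], c) for c in codes]
print(d[0])  # 0: an item is at Hamming distance 0 from itself
```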
