10 research outputs found

    TelsNet: temporal lesion network embedding in a transformer model to detect cervical cancer through colposcope images

    Cervical cancer ranks as the fourth most prevalent malignancy among women globally. Timely identification and intervention in cases of cervical cancer hold the potential for achieving complete remission and cure. In this study, we built a deep learning model based on the self-attention mechanism of the transformer architecture to classify cervix images and aid the diagnosis of cervical cancer. The colposcope images are first segmented using an enhanced multivariate Gaussian mixture model optimized with the Mexican axolotl algorithm, and the Temporal Lesion Convolution Neural Network (TelsNet) then classifies them. TelsNet is a transformer-based neural network that uses temporal convolutional neural networks to identify cancerous regions in colposcope images. Our experiments show that TelsNet achieved an accuracy of 92.7%, with a sensitivity of 73.4% and a specificity of 82.1%. We compared the performance of our model with various state-of-the-art methods, and our results demonstrate that TelsNet outperformed the other methods. The findings have the potential to significantly simplify the process of detecting and accurately classifying cervical cancers at an early stage, leading to improved rates of remission and better overall outcomes for patients globally.

    Design of a Novel Low Cost Point of Care Tampon (POCkeT) Colposcope for Use in Resource Limited Settings

    Introduction: Current WHO guidelines for cervical cancer screening in low- and middle-income countries involve visual inspection with acetic acid (VIA) of the cervix, followed by treatment with cryotherapy during the same visit or a subsequent visit if a suspicious lesion is found. Implementation of these guidelines is hampered by a lack of trained health workers, reliable technology, and access to screening facilities. A low-cost, ultra-portable Point of Care Tampon based digital colposcope (POCkeT Colposcope) for use at the community level, which has the unique form factor of a tampon, can be inserted into the vagina to capture images of the cervix that are on par with those of a state-of-the-art colposcope, at a fraction of the cost. A repository of images is to be compiled that can be used to empower front-line workers to become more effective through virtual dynamic training. By task shifting to the community setting, this technology could potentially provide significantly greater cervical screening access where the most vulnerable women live. The POCkeT Colposcope’s concentric LED ring provides comparable white and green field illumination at a fraction of the electrical power required in commercial colposcopes. Evaluation was performed using standard optical imaging targets to assess the POCkeT Colposcope against the state-of-the-art digital colposcope and other VIAM technologies. Results: Our POCkeT Colposcope has comparable resolving power, color reproduction accuracy, minimal lens distortion, and illumination when compared to commercially available colposcopes. In vitro and pilot in vivo imaging results are promising, with our POCkeT Colposcope capturing images of comparable quality to commercial systems. Methods: Rapid 3D printing, consumer-grade light sources, and cameras were used to construct the Trans-Vaginal Digital Colposcope (TVDC).
The TVDC’s concentric LED ring provides comparable white and green field illumination at a fraction of the electrical power required in commercial colposcopes, and crossed polarizers provide a reduction in glare. Evaluation was performed using standard optical imaging targets to assess the TVDC against the state-of-the-art digital colposcope and other VIA technologies. Results: Our TVDC has comparable resolving power, color reproduction accuracy, minimal lens distortion, and illumination when compared to commercially available colposcopes. In vitro and pilot in vivo imaging results are promising, with our TVDC capturing images of comparable quality to commercial systems. Conclusion: The TVDC is capable of capturing images suitable for cervical lesion analysis. Our portable, low-cost system will be useful for increasing access to cervical cancer screening and diagnostics in resource-limited settings by providing a more readily portable and easy-to-use device for medical personnel. The image data and supporting information published in the article "Design of a Novel Low Cost Trans-Vaginal Digital Colposcope for use in Resource Limited Settings" are available at: http://dukespace.lib.duke.edu/dspace/handle/10161/8357. National Institutes of Health (US) 5R21CA162747-0

    Multisensory Imagery Cues for Object Separation, Specularity Detection and Deep Learning based Inpainting

    Multisensory imagery cues have been actively investigated in diverse applications in the computer vision community to provide additional geometric information that is either absent or difficult to capture from mainstream two-dimensional imaging. The inherent features of multispectral polarimetric light field imagery (MSPLFI) include object distribution over spectra, surface properties, shape, shading and pixel flow in light space. The aim of this dissertation is to explore these inherent properties to exploit new structures and methodologies for the tasks of object separation, specularity detection and deep learning-based inpainting in MSPLFI. In the first part of this research, an application to separate foreground objects from the background in both outdoor and indoor scenes using multispectral polarimetric imagery (MSPI) cues is examined. Based on the pixel neighbourhood relationship, an on-demand clustering technique is proposed and implemented to separate artificial objects from the natural background in a complex outdoor scene. However, because indoor scenes contain only artificial objects, with vast variations in energy levels among spectra, a multiband fusion technique followed by a background segmentation algorithm is proposed to separate the foreground from the background. In this regard, first, each spectrum is decomposed into low and high frequencies using the fast Fourier transform (FFT) method. Second, principal component analysis (PCA) is applied on both frequency images of the individual spectrum and then combined with the first principal components as a fused image. Finally, a polarimetric background segmentation (BS) algorithm based on the Stokes vector is proposed and implemented on the fused image. The performance of the proposed approaches is evaluated and compared using publicly available MSPI datasets and the dice similarity coefficient (DSC).
The proposed multiband fusion and BS methods demonstrate better fusion quality and higher segmentation accuracy compared with other studies for several metrics, including mean absolute percentage error (MAPE), peak signal-to-noise ratio (PSNR), Pearson correlation coefficient (PCOR), mutual information (MI), accuracy, Geometric Mean (G-mean), precision, recall and F1-score. In the second part of this work, a twofold framework for specular reflection detection (SRD) and specular reflection inpainting (SRI) in transparent objects is proposed. The SRD algorithm is based on the mean, the covariance and the Mahalanobis distance for predicting anomalous pixels in MSPLFI. The SRI algorithm first selects four-connected neighbouring pixels from sub-aperture images and then replaces the SRD pixel with the closest matched pixel. For both algorithms, a 6D MSPLFI transparent object dataset is captured from multisensory imagery cues due to the unavailability of this kind of dataset. The experimental results demonstrate that the proposed algorithms achieve higher SRD accuracy and better SRI quality than the existing approaches reported in this part in terms of F1-score, G-mean, accuracy, the structural similarity index (SSIM), the PSNR, the mean squared error (IMMSE) and the mean absolute deviation (MAD). However, because it synthesises SRD pixels based on the pixel neighbourhood relationship, the proposed inpainting method produces artefacts and errors when inpainting large specularity areas with irregular holes. Therefore, in the last part of this research, the emphasis is on inpainting large specularity areas with irregular holes based on deep feature extraction from multisensory imagery cues. The proposed six-stage deep learning inpainting (DLI) framework is based on the generative adversarial network (GAN) architecture and consists of a generator network and a discriminator network.
First, the global flow of pixels in the sub-aperture images is calculated by applying the large displacement optical flow (LDOF) method. The proposed training algorithm combines global flow with local flow and coarse inpainting results predicted from the baseline method. The generator attempts to generate best-matched features, while the discriminator seeks to predict the maximum difference between the predicted results and the actual results. The experimental results demonstrate that in terms of the PSNR, MSSIM, IMMSE and MAD, the proposed DLI framework achieves better inpainting quality than the baseline method and the previous part of this research.
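
The abstract above describes specular reflection detection as flagging anomalous pixels from the mean, the covariance and the Mahalanobis distance. The idea can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the two-feature pixel representation, the function name `mahalanobis_flags` and the threshold of 3.0 are all assumptions chosen for illustration.

```python
import math

def mahalanobis_flags(samples, threshold=3.0):
    """Flag 2-D feature vectors whose Mahalanobis distance from the
    sample mean exceeds `threshold` (hypothetical value, not from the source)."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    # Sample covariance matrix entries (unbiased, divides by n - 1).
    s_xx = sum((x - mean_x) ** 2 for x, _ in samples) / (n - 1)
    s_yy = sum((y - mean_y) ** 2 for _, y in samples) / (n - 1)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (n - 1)
    det = s_xx * s_yy - s_xy ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    flags = []
    for x, y in samples:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = v^T S^{-1} v, using the closed-form inverse of a 2x2 matrix.
        d_squared = (dx * dx * s_yy - 2.0 * dx * dy * s_xy + dy * dy * s_xx) / det
        flags.append(math.sqrt(d_squared) > threshold)
    return flags

# Assumed toy data: 20 ordinary "pixels" on a small grid plus one bright outlier.
pixels = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(4)]
pixels.append((10.0, 10.0))
flags = mahalanobis_flags(pixels)
```

Because the outlier is included when estimating the mean and covariance, it inflates both; a real detector would likely estimate them from pixels known to be free of specular highlights, or use a robust estimator.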

    Laparoscopic Image Recovery and Stereo Matching

    Laparoscopic imaging can play a significant role in minimally invasive surgical procedures. However, laparoscopic images often suffer from insufficient and irregular light sources, specular highlight surfaces, and a lack of depth information. These problems can negatively influence surgeons during surgery and lead to erroneous visual tracking and potential surgical risks. Thus, developing effective image-processing algorithms for laparoscopic vision recovery and stereo matching is of significant importance. Most related algorithms are effective on natural images, but less effective on laparoscopic images. The first purpose of this thesis is to restore low-light laparoscopic vision, where an effective image enhancement method is proposed by identifying different illumination regions and designing the enhancement criteria for desired image quality. This method can enhance the low-light region while reducing noise amplification during the enhancement process. In addition, this thesis also proposes a simplified Retinex optimization method for non-uniform illumination enhancement. By integrating the prior information of the illumination and reflectance into the optimization process, this method can significantly enhance the dark region while preserving naturalness, texture details, and image structures. Moreover, due to the replacement of the total variation term with two $l_2$-norm terms, the proposed algorithm has a significant computational advantage. Second, a global optimization method for specular highlight removal from a single laparoscopic image is proposed. This method consists of a modified dichromatic reflection model and a novel diffuse chromaticity estimation technique. By exploiting the limited color variation of the laparoscopic image, the estimated diffuse chromaticity can approximate the true diffuse chromaticity, which allows us to effectively remove the specular highlight while preserving texture detail.
Third, a robust edge-preserving stereo matching method is proposed, based on sparse feature matching, left and right illumination equalization, and refined disparity optimization processes. The sparse feature matching and illumination equalization techniques can provide a good disparity map initialization so that our refined disparity optimization can quickly obtain an accurate disparity map. This approach is particularly promising on surgical tool edges, smooth soft tissues, and surfaces with strong specular highlights.
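
To illustrate why a good diffuse chromaticity estimate makes highlight removal possible, the sketch below uses the standard dichromatic reflection model, in which a pixel is a weighted sum of a diffuse chromaticity and an illuminant chromaticity. It is not the thesis's modified model or its estimation technique: the function name `separate_specular`, the white-illuminant default and the toy pixel values are assumptions.

```python
def separate_specular(pixel, diffuse_chroma, illum_chroma=(1 / 3, 1 / 3, 1 / 3)):
    """Least-squares fit of pixel = m_d * diffuse_chroma + m_s * illum_chroma.

    Returns (m_d, m_s, diffuse_rgb). A white illuminant is assumed by default.
    The weights are not clamped, so noisy input can yield small negative values.
    """
    # Normal equations of the 3x2 linear system in the unknowns (m_d, m_s).
    a11 = sum(d * d for d in diffuse_chroma)
    a12 = sum(d * s for d, s in zip(diffuse_chroma, illum_chroma))
    a22 = sum(s * s for s in illum_chroma)
    b1 = sum(d * p for d, p in zip(diffuse_chroma, pixel))
    b2 = sum(s * p for s, p in zip(illum_chroma, pixel))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("diffuse and illuminant chromaticities are parallel")
    m_d = (b1 * a22 - b2 * a12) / det
    m_s = (a11 * b2 - a12 * b1) / det
    # Dropping the specular term leaves the highlight-free colour.
    diffuse_rgb = tuple(m_d * d for d in diffuse_chroma)
    return m_d, m_s, diffuse_rgb

# Assumed toy pixel: reddish tissue (weight 0.6) plus a white highlight (weight 0.3).
tissue_chroma = (0.6, 0.3, 0.1)
observed = tuple(0.6 * c + 0.3 / 3 for c in tissue_chroma)
m_d, m_s, diffuse_rgb = separate_specular(observed, tissue_chroma)
```

The separation only works when the diffuse chromaticity differs from the illuminant chromaticity, which is why the quality of the chromaticity estimate matters so much in practice.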

    Epälambertilaiset pinnat ja niiden haasteet konenäössä (Non-Lambertian surfaces and their challenges in computer vision)

    This thesis concerns non-Lambertian surfaces and their challenges, solutions and study in computer vision. The physical theory for understanding the phenomenon is built first, using the Lambertian reflectance model, which defines Lambertian surfaces as ideally diffuse surfaces whose luminance is isotropic and whose luminous intensity obeys Lambert's cosine law. Of these two assumptions, non-Lambertian surfaces violate at least the cosine law and are consequently specularly reflecting surfaces, whose perceived brightness depends on the viewpoint. Thus non-Lambertian surfaces also violate brightness and colour constancy, which assume that the brightness and colour of the same real-world points stay constant across images. These assumptions are used, for example, in tracking and feature matching, and thus non-Lambertian surfaces pose complications for object reconstruction and navigation, among other tasks in the field of computer vision. After formulating the theoretical foundation of the necessary physics and a more general reflectance model called the bi-directional reflectance distribution function, a comprehensive literature review of significant studies regarding non-Lambertian surfaces is conducted. The primary topics of the survey include photometric stereo and navigation systems, while considering other potential fields, such as fusion methods and illumination invariance. The goal of the survey is to formulate a detailed and in-depth answer to what methods can be used to solve the challenges posed by non-Lambertian surfaces, what these methods' strengths and weaknesses are, which datasets are used and what remains to be answered by further research. After the survey, a dataset is collected and presented, and an outline of another dataset to be published in an upcoming paper is presented. Then a general discussion about the survey and the study is undertaken, and conclusions along with proposed future steps are introduced.
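
The viewpoint dependence described in this abstract can be made concrete with a small sketch. The diffuse term follows Lambert's cosine law and takes no view direction at all; the specular term uses the Phong model, which is an illustrative choice and is not mentioned in the abstract. The albedo, the specular coefficient `k_s` and the `shininess` exponent are assumed values.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lambertian(albedo, normal, light):
    """Lambert's cosine law: brightness depends only on the angle between
    the surface normal and the light direction, never on the viewer."""
    return albedo * max(0.0, dot(normal, light))

def phong_specular(normal, light, view, k_s=0.5, shininess=20):
    """Phong specular lobe (assumed model): peaks when the viewer looks
    along the mirror reflection of the light direction."""
    n_dot_l = dot(normal, light)
    if n_dot_l <= 0.0:
        return 0.0
    mirror = tuple(2.0 * n_dot_l * n - l for n, l in zip(normal, light))
    return k_s * max(0.0, dot(mirror, view)) ** shininess

normal = (0.0, 0.0, 1.0)
light = normalize((1.0, 0.0, 1.0))    # light 45 degrees off the normal
view_a = (0.0, 0.0, 1.0)              # camera looking straight down the normal
view_b = normalize((-1.0, 0.0, 1.0))  # camera in the mirror direction

diffuse = lambertian(0.8, normal, light)
brightness_a = diffuse + phong_specular(normal, light, view_a)
brightness_b = diffuse + phong_specular(normal, light, view_b)
```

The same surface point has the same `diffuse` value from both cameras, but `brightness_a` and `brightness_b` differ once the specular lobe is added, which is exactly the situation in which brightness constancy fails for feature matching and tracking.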