A Survey on Segmentation Techniques in Skin Cancer Images
Skin cancer is among the most hazardous and common types of cancer, and melanoma is its deadliest form. Because of the cost of having dermatologists screen every patient, there is a need for an automated system to assess a patient's risk of melanoma from images of their skin lesions captured with a dermatoscope. Segmentation is essential for distinguishing the skin lesion within such images: it is the classification of the input image into skin and non-skin pixels based on skin texture. This paper presents a review of six different types of skin lesion segmentation techniques for dermoscopic images of skin cancer and other pigmented lesions. The main aims of segmentation are accuracy, speed, and computational efficiency.
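The abstract does not name the six techniques it reviews, but global thresholding is among the classic lesion-segmentation baselines such a survey typically covers. As a minimal, hedged illustration of threshold-based skin/non-skin classification, here is a sketch of Otsu's method on a synthetic image (the image and all parameters are invented for the example):

```python
import numpy as np

def otsu_threshold(gray):
    """Return the intensity threshold that maximises the between-class
    variance of an 8-bit grayscale image (Otsu's method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256, dtype=float)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0.0 or w1 == 0.0:
            continue  # one class would be empty at this threshold
        mu0 = (levels[:t] * prob[:t]).sum() / w0
        mu1 = (levels[t:] * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def segment_lesion(gray):
    """Label pixels darker than the Otsu threshold as lesion (True),
    since pigmented lesions are typically darker than surrounding skin."""
    return gray < otsu_threshold(gray)

# Synthetic example: a dark "lesion" disc on brighter "skin".
img = np.full((64, 64), 200, dtype=np.uint8)
yy, xx = np.mgrid[:64, :64]
img[(yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2] = 60
mask = segment_lesion(img)
```

Real dermoscopic images have hair, illumination gradients, and colour variation, which is precisely why the survey compares more sophisticated techniques against simple baselines like this one.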
SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
Generic image inpainting aims to complete a corrupted image by borrowing surrounding information, which barely generates novel content. By contrast, multi-modal inpainting provides more flexible and useful control over the inpainted content: e.g., a text prompt can be used to describe an object with richer attributes, and a mask can be used to constrain the shape of the inpainted object rather than being considered only as a missing area. We propose a new diffusion-based model named SmartBrush for completing a missing region with an object using both text and shape guidance. While previous work such as DALL-E 2 and Stable Diffusion can do text-guided inpainting, they do not support shape guidance and tend to modify the background texture surrounding the generated object. Our model incorporates both text and shape guidance with precision control. To better preserve the background, we propose a novel training and sampling strategy that augments the diffusion U-Net with object-mask prediction. Lastly, we introduce a multi-task training strategy that jointly trains inpainting with text-to-image generation to leverage more training data. Extensive experiments show that our model outperforms all baselines in terms of visual quality, mask controllability, and background preservation.
SCDNET: A novel convolutional network for semantic change detection in high resolution optical remote sensing imagery
With the continuing improvement of remote-sensing (RS) sensors, it is crucial to monitor Earth surface changes at fine scale and in great detail. Thus, semantic change detection (SCD), which can locate and identify "from-to" change information simultaneously, is gaining growing attention in the RS community. However, due to the scarcity of large-scale SCD datasets, most existing SCD methods focus on scene-level changes, where semantic change maps are generated with only coarse boundaries or scarce category information. To address this issue, we propose a novel convolutional network for large-scale SCD (SCDNet). It is based on a Siamese UNet architecture, which consists of two encoders and two decoders with shared weights. First, multi-temporal images are given as input to the encoders to extract multi-scale deep representations. A multi-scale atrous convolution (MAC) unit is inserted at the end of the encoders to enlarge the receptive field and capture multi-scale information. Then, difference feature maps are generated for each scale and combined with feature maps from the encoders to serve as inputs for the decoders. An attention mechanism and a deep-supervision strategy are further introduced to improve network performance. Finally, we utilize a softmax layer to produce a semantic change map for each temporal image. Extensive experiments on two large-scale high-resolution SCD datasets demonstrate the effectiveness and superiority of the proposed method.
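The core Siamese idea — run both temporal images through the same weights, then form difference feature maps — can be sketched in a toy form. In this hedged illustration a single linear convolution stands in for the full encoder, and the MAC unit, attention, decoders, and deep supervision are all omitted; every name and number below is invented for the example:

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive single-channel 'valid' 2-D convolution; one linear layer
    standing in for an encoder branch."""
    kh, kw = w.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + kh, j:j + kw] * w).sum()
    return out

def siamese_difference(img_t1, img_t2, shared_w):
    """Pass both temporal images through the SAME weights (the
    shared-weight Siamese idea) and return the absolute-difference
    feature map that would feed the decoders."""
    f1 = conv2d_valid(img_t1, shared_w)
    f2 = conv2d_valid(img_t2, shared_w)
    return np.abs(f1 - f2)

rng = np.random.default_rng(0)
shared_w = rng.standard_normal((3, 3))
t1 = rng.random((16, 16))
t2 = t1.copy()
t2[4:8, 4:8] += 1.0  # a localised "change" between the two dates
diff = siamese_difference(t1, t2, shared_w)
```

Because the weights are shared, identical regions produce identical features and the difference map is exactly zero there, so the response concentrates around the changed patch — the signal the decoders then turn into a per-date semantic change map.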
Side information in robust principal component analysis: algorithms and applications
Dimensionality reduction and noise removal are fundamental machine learning tasks that are vital to artificial intelligence applications. Principal component analysis has long been utilised in computer vision to achieve the above-mentioned goals. Recently, it has been enhanced with robustness to outliers in the form of robust principal component analysis. Both convex and non-convex programs have been developed to solve this new formulation, some with exact convergence guarantees. Its effectiveness can be witnessed in image and video applications ranging from image denoising and alignment to background separation and face recognition. However, robust principal component analysis is by no means perfect. This dissertation identifies its limitations, explores various promising options for improvement and validates the proposed algorithms on both synthetic and real-world datasets.
Common algorithms approximate the NP-hard formulation of robust principal component analysis with convex envelopes. Though under certain assumptions exact recovery can be guaranteed, the relaxation margin is too big to be squandered. In this work, we propose to apply gradient descent on the Burer-Monteiro bilinear matrix factorisation to squeeze this margin given available subspaces. This non-convex approach improves upon conventional convex approaches in terms of both accuracy and speed. On the other hand, there is often accompanying side information when an observation is made. The ability to assimilate such auxiliary sources of data can ameliorate the recovery process. In this work, we investigate in depth such possibilities for incorporating side information in restoring the true underlying low-rank component from gross sparse noise. Lastly, tensors, also known as multi-dimensional arrays, represent real-world data more naturally than matrices. It is thus advantageous to adapt robust principal component analysis to tensors. Since there is no exact equivalence between tensor rank and matrix rank, we employ the notions of Tucker rank and CP rank as our optimisation objectives. Overall, this dissertation carefully defines the problems when facing real-world computer vision challenges, extensively and impartially evaluates the state-of-the-art approaches, proposes novel solutions and provides sufficient validation on both simulated data and popular real-world datasets for various mainstream computer vision tasks.
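The general Burer-Monteiro idea — gradient descent on a bilinear factorisation L = UVᵀ, with a sparse term absorbing gross errors — can be sketched as follows. This is a simplified toy illustration, not the dissertation's exact algorithm (which further exploits side information); λ, the step size, and the iteration count are arbitrary choices for the example:

```python
import numpy as np

def soft(x, lam):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def rpca_bm(M, rank, lam=0.1, step=0.02, iters=2000, seed=0):
    """Toy RPCA: alternate gradient steps on the Burer-Monteiro
    factors U, V of the low-rank part L = U @ V.T with
    soft-thresholding updates of the sparse part S."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = 0.1 * rng.standard_normal((m, rank))   # small random init
    V = 0.1 * rng.standard_normal((n, rank))
    for _ in range(iters):
        L = U @ V.T
        S = soft(M - L, lam)       # absorb gross sparse corruptions
        R = M - L - S              # residual drives the factor updates
        U = U + step * R @ V       # gradient step on U
        V = V + step * R.T @ U     # gradient step on V (uses updated U)
    L = U @ V.T
    return L, soft(M - L, lam)

# Synthetic test: rank-1 ground truth corrupted by a few large spikes.
rng = np.random.default_rng(1)
L0 = np.outer(rng.standard_normal(20), rng.standard_normal(20))
S0 = np.zeros((20, 20))
S0.flat[rng.choice(400, size=10, replace=False)] = 5.0
L_hat, S_hat = rpca_bm(L0 + S0, rank=1)
```

Working directly with the m×r and n×r factors avoids both the nuclear-norm relaxation and repeated SVDs of the full matrix, which is the source of the accuracy and speed gains the abstract claims for the non-convex approach.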
Deep learning for cardiac image segmentation: A review
Deep learning has become the most widely used approach for cardiac image segmentation in recent years. In this paper, we provide a review of over 100 cardiac image segmentation papers using deep learning, covering common imaging modalities including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US), and major anatomical structures of interest (ventricles, atria and vessels). In addition, a summary of publicly available cardiac image datasets and code repositories is included to provide a base for encouraging reproducible research. Finally, we discuss the challenges and limitations of current deep learning-based approaches (scarcity of labels, model generalizability across different domains, interpretability) and suggest potential directions for future research.