326 research outputs found

    Local Manifold Augmentation for Multiview Semantic Consistency

    Multiview self-supervised representation learning is rooted in exploring semantic consistency across data with complex intra-class variation. Such variation is not directly accessible and is therefore simulated by data augmentations. However, commonly adopted augmentations are handcrafted and limited to simple geometrical and color changes, which cannot cover the abundant intra-class variation. In this paper, we propose to extract the underlying data variation from datasets and construct a novel augmentation operator, named local manifold augmentation (LMA). LMA is achieved by training an instance-conditioned generator to fit the distribution on the local manifold of data and sampling multiview data from it. LMA can create an infinite number of data views, preserve semantics, and simulate complicated variations in object pose, viewpoint, lighting condition, background, etc. Experiments show that with LMA integrated, self-supervised learning methods such as MoCov2 and SimSiam gain consistent improvement on prevalent benchmarks including CIFAR10, CIFAR100, STL10, ImageNet100, and ImageNet. Furthermore, LMA leads to representations that are more invariant to viewpoint, object pose, and illumination changes and more robust to various real distribution shifts reflected by ImageNet-V2, ImageNet-R, ImageNet Sketch, etc.

    ILSGAN: Independent Layer Synthesis for Unsupervised Foreground-Background Segmentation

    Unsupervised foreground-background segmentation aims at extracting salient objects from cluttered backgrounds, where Generative Adversarial Network (GAN) approaches, especially layered GANs, show great promise. However, without human annotations, they are prone to producing foreground and background layers with non-negligible semantic and visual confusion, dubbed "information leakage", which notably degrades the generated segmentation mask. To alleviate this issue, we propose a simple yet effective explicit layer independence modeling approach, termed Independent Layer Synthesis GAN (ILSGAN), which pursues independent foreground-background layer generation by encouraging their discrepancy. Specifically, it minimizes the mutual information between visible and invisible regions of the foreground and background to spur interlayer independence. Through in-depth theoretical and experimental analyses, we show that explicit layer independence modeling is critical to suppressing information leakage and contributes to impressive segmentation performance gains. Our ILSGAN also achieves strong state-of-the-art generation quality and segmentation performance on complex real-world data. Comment: Accepted by AAAI 202

    Learning Foreground-Background Segmentation from Improved Layered GANs

    Deep learning approaches rely heavily on high-quality human supervision, which is nonetheless expensive, time-consuming, and error-prone, especially for image segmentation tasks. In this paper, we propose a method to automatically synthesize paired photo-realistic images and segmentation masks for training a foreground-background segmentation network. In particular, we learn a generative adversarial network that decomposes an image into foreground and background layers, and we avoid trivial decompositions by maximizing mutual information between generated images and latent variables. The improved layered GANs can synthesize higher-quality datasets, from which higher-performing segmentation networks can be learned. Moreover, the segmentation networks are in turn employed to stabilize the training of the layered GANs, and the two are trained alternately. Experiments on a variety of single-object datasets show that our method achieves competitive generation quality and segmentation performance compared to related methods.
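
    The layered decomposition described in this abstract can be illustrated with a minimal sketch. It assumes the common layered-GAN formulation in which an image is an alpha composite of a foreground layer and a background layer under a soft mask; the abstract does not spell out this formula, and the function name `compose_layers` and the random arrays standing in for generator outputs are hypothetical.

```python
import numpy as np

def compose_layers(foreground, background, mask):
    """Blend two H x W x C layers with an H x W mask (assumed formulation).

    Where mask is 1 the foreground is visible, where it is 0 the
    background is visible. The mask doubles as the segmentation label
    paired with the composed image.
    """
    mask = np.clip(mask, 0.0, 1.0)[..., None]  # add a channel axis for broadcasting
    return mask * foreground + (1.0 - mask) * background

# Random arrays stand in for the outputs of a (hypothetical) layered generator.
rng = np.random.default_rng(0)
fg = rng.random((4, 4, 3))
bg = rng.random((4, 4, 3))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0  # a small square "object" in the centre

image = compose_layers(fg, bg, mask)  # (image, mask) forms one synthetic training pair
```

    Under this assumption, each synthesized image comes with its mask for free, which is what makes the generated pairs usable for training a segmentation network.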

    Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer

    Video-based human pose transfer is a video-to-video generation task that animates a plain source human image based on a series of target human poses. Because highly structural garment patterns and discontinuous poses are difficult to transfer, existing methods often generate unsatisfactory results such as distorted textures and flickering artifacts. To address these issues, we propose a novel Deformable Motion Modulation (DMM) that utilizes geometric kernel offset with adaptive weight modulation to simultaneously perform feature alignment and style transfer. Unlike the normal style modulation used in style transfer, the proposed modulation mechanism adaptively reconstructs smoothed frames from style codes according to the object shape through an irregular receptive field of view. To enhance spatio-temporal consistency, we leverage bidirectional propagation to extract the hidden motion information from a warped image sequence generated by noisy poses. The proposed feature propagation significantly enhances the motion prediction ability through forward and backward propagation. Both quantitative and qualitative experimental results demonstrate superiority over the state of the art in terms of image fidelity and visual continuity. The source code is publicly available at github.com/rocketappslab/bdmm. Comment: ICCV 202

    Sarcoma of the Larynx: Treatment Results and Literature Review

    Background: Sarcomas of the larynx are rare neoplasms that constitute less than 1% of laryngeal malignancies. A Medline search found no large series focusing on laryngeal sarcomas. We reviewed the cases of laryngeal sarcoma treated in our cancer center and compared our experiences and treatment results with those from other centers. Methods: A retrospective review of 10 patients with laryngeal sarcoma treated in our institute between 1980 and 2000 was done to identify tumor characteristics, therapeutic modalities, and treatment outcomes. Results: The patients showed a male predominance (9/10) and presented 8 types of pathology. Nine patients underwent surgery, including 2 total laryngectomies, 4 partial laryngectomies, and 3 endoscopic laser cordectomies. During a median follow-up of 92 months, the 5-year overall survival and disease-specific survival were 76% and 90%, respectively. Two patients developed recurrence, including 1 local recurrence and 1 distant metastasis. Conclusion: Surgical intervention was the first choice in the treatment of laryngeal sarcomas. The prognosis is relatively good when compared with sarcoma originating from other anatomic sites.

    Locally Advanced Oncocytic Carcinoma of the Nasal Cavity Treated With Surgery and Intensity-modulated Radiotherapy

    Oncocytic carcinomas of the nasal cavity are extremely rare. We report 1 patient whose primary tumor and neck lymphadenopathies were under control nearly 2 years after combined surgery and radiotherapy. An 80-year-old man with a history of nasal oncocytoma had undergone excision twice previously. Computed tomography demonstrated a locally advanced recurrent tumor invading the paranasal sinuses and orbit, with lymphadenopathies in the right neck. Skull base surgery was performed. Pathological examination revealed oncocytic carcinoma. Positron emission tomography showed hypermetabolic lesions in the surgical bed and right neck. The patient subsequently received intensity-modulated radiotherapy to the primary site and the whole neck. Follow-up computed tomography 4 months later showed marked shrinkage of the neck lymphadenopathies. There was no progression after nearly 2 years. Although these tumors have historically been regarded as radioresistant, the combined treatment of surgery followed by radiotherapy may offer the best chance for control of locally advanced disease.

    Speciation of toxic pollutants in Pb/Zn smelter slags by X-ray Absorption Spectroscopy in the context of the literature

    Pb/Zn smelter slag is a hazardous industrial waste from the Imperial Smelting Process (ISP). The speciation of zinc, lead, copper and arsenic in the slag controls their recovery or fate in the environment but has been little investigated. X-ray Absorption Spectroscopy (XAS) was applied to this complex, poorly crystalline material for the first time to gain new insights into the speciation of elements at low concentration. Zn, Cu, As K-edge and Pb L3-edge XAS was carried out for a Pb/Zn slag from a closed ISP facility in England, supported by Fe, S and P K-edge XAS. Results are presented in the context of a full review of the literature. X-ray fluorescence showed that concentrations of Zn, Pb, Cu and As were 8.4, 1.6, 0.48 and 0.45 wt.%, respectively. Wüstite (FeO) was the only crystalline phase identified by X-ray diffraction, but XAS provided a more complete understanding of the matrix. Zn was found to be mainly present in glass, ZnS, and possibly solid solutions with Fe oxides; Pb was mainly present in glass and apatite minerals (e.g., Pb5(PO4)3OH); Cu was mainly speciated as Cu2S, with some metallic Cu and a weathering product, Cu(OH)2; As speciation was likely dominated by arsenic (III) and (V) oxides and sulfides.