679 research outputs found

    Coarse-to-Fine Annotation Enrichment for Semantic Segmentation Learning

    Rich, high-quality annotated data is critical for semantic segmentation learning, yet acquiring dense, pixel-wise ground truth is both labor- and time-consuming. Coarse annotations (e.g., scribbles, coarse polygons) offer an economical alternative, but training on them alone rarely yields satisfactory performance. To generate high-quality annotated data at low time cost for accurate segmentation, we propose a novel annotation enrichment strategy that expands the existing coarse annotations of the training data to a finer scale. Extensive experiments on the Cityscapes and PASCAL VOC 2012 benchmarks show that neural networks trained with the enriched annotations from our framework yield a significant improvement over those trained with the original coarse labels, and are highly competitive with the performance obtained using human-annotated dense annotations. The proposed method also outperforms other state-of-the-art weakly-supervised segmentation methods. Comment: CIKM 2018 International Conference on Information and Knowledge Management

    Weakly-supervised Semantic Segmentation in Cityscape via Hyperspectral Image

    High-resolution hyperspectral images (HSIs) contain the response of each pixel in different spectral bands, which can be used to effectively distinguish various objects in complex scenes. Although HSI cameras have become low-cost, algorithms based on them have not been well explored. In this paper, we focus on a novel topic: weakly-supervised semantic segmentation of cityscapes via HSIs. It is based on the idea that high-resolution HSIs of city scenes contain rich spectral information that can be easily associated with semantics without manual labeling, which enables low-cost, highly reliable semantic segmentation in complex scenes. Specifically, we theoretically analyze HSIs and introduce a weakly-supervised HSI semantic segmentation framework that uses spectral information to refine coarse labels to a finer degree. The experimental results show that our method obtains highly competitive labels, with even finer edges than manually produced fine labels in some classes. The results also show that the refined labels effectively improve semantic segmentation performance. The combination of HSIs and semantic segmentation shows that HSIs have great potential in high-level visual tasks.

    Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

    The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research efforts that seek to automatically process facsimiles and extract information from them are multiplying, with document layout analysis as a first essential step. Although the identification and categorization of segments of interest in document images have seen significant progress in recent years thanks to deep learning techniques, many challenges remain, including the use of finer-grained segmentation typologies and the handling of complex, heterogeneous documents such as historical newspapers. Moreover, most approaches consider visual features only, ignoring the textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among other things, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models over a strong visual baseline, as well as better robustness to high material variance.

    A Study on Seed Information Enrichment Techniques for Robust Interactive Image Segmentation Algorithms

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2021. 2. 이경무.

    Segmenting the area that corresponds to a desired object in an image is essential to computer vision problems, because most algorithms operate on semantic units when interpreting or analyzing images. However, segmenting the desired object from a given image is an ambiguous problem: the target object varies depending on the user and the purpose. Interactive segmentation techniques have been proposed to solve this problem. In this approach, segmentation is steered in the desired direction through interaction with the user, so the seed information provided by the user plays an important role. If the seed provided by a user contains abundant information, segmentation accuracy increases. However, providing rich seed information places a heavy burden on users. The main goal of the present study was therefore to obtain satisfactory segmentation results using simple seed information. We primarily focused on converting the provided sparse seed information into a rich state so that accurate segmentation results can be derived. To this end, we took a minimal user input and enriched it through various seed enrichment techniques. We proposed a total of three interactive segmentation techniques, based on (1) seed expansion, (2) seed generation, and (3) seed attention. These enrich the seed by expanding the area around a seed, generating a new seed at a new position, and attending to semantic information, respectively. First, in seed expansion, we expanded the scope of the seed: we integrated reliable pixels around the initial seed into the seed set through a two-stage expansion step. Because the extended seed covers a wider area than the initial seed, the seed's scarcity and imbalance problems were resolved. Next, in seed generation, we created a seed at a new point rather than around the existing seed. We trained the system to imitate user behavior by providing a new seed point in the erroneous region. By learning the user's intention, our model could efficiently create a new seed point. The generated seed helped segmentation and could be used as additional information for weakly supervised learning. Finally, through seed attention, we put semantic information into the seed. Unlike the previous models, we integrated the segmentation process and the seed enrichment process. We reinforced the seed information by adding semantic information to the seed instead of expanding it spatially. The seed information was enriched through mutual attention with feature maps generated during the segmentation process. The proposed models outperformed existing techniques in various experiments. Notably, even with sparse seed information, our proposed seed enrichment techniques gave far more accurate segmentation results than the other existing methods.

    Korean abstract: Cutting out the desired object region from an image is an essential element of computer vision problems, because most algorithms operate on semantic units when interpreting or analyzing images. However, segmenting an object region from an image is an ambiguous problem, since the desired object region differs with the user and the purpose. To address this, interactive image segmentation techniques are used, which steer the segmentation in the desired direction through interaction with the user. Here the seed information provided by the user plays an important role: the more accurate the seed information that carries the user's intent, the higher the segmentation accuracy. However, providing rich seed information places a heavy burden on the user, so the main goal is to obtain satisfactory segmentation results from simple seed information. We focused on transforming the sparse seed information that is provided, because accurate segmentation results can be obtained once the seed information has been made rich. This thesis therefore proposes techniques for enriching seed information. We assume a minimal user input and transform it through various seed enrichment techniques. We propose three interactive image segmentation techniques in total, based on seed expansion, seed generation, and seed attention, which respectively expand the region around a seed, generate a seed at a new point, and attend to semantic information. First, the technique based on seed expansion aims to enlarge the seed region: through a two-stage expansion process, similar pixels around the initial seed are incorporated into the seed region. Using the seed expanded in this way resolves the problems caused by seed sparsity and imbalance. Next, the technique based on seed generation creates a seed at a new point rather than around the existing seed. We trained the system by imitating the way a user provides a new seed in a region where an error has occurred. By learning the user's intent, seeds can be generated effectively. The generated seeds not only raise segmentation accuracy but can also be used as data for weakly supervised learning. Finally, the technique based on seed attention puts semantic information into the seed. Unlike the previously proposed techniques, we propose a model in which the segmentation operation and the seed enrichment operation are integrated. The seed information is enriched as it interacts with the feature maps of the segmentation network. The proposed models achieved better performance than existing techniques in various experiments. In particular, when seeds were scarce, the seed enrichment techniques showed excellent interactive image segmentation performance.

    Contents:
    1 Introduction: 1.1 Previous Works; 1.2 Proposed Methods
    2 Interactive Segmentation with Seed Expansion: 2.1 Introduction; 2.2 Proposed Method (2.2.1 Background; 2.2.2 Pyramidal RWR; 2.2.3 Seed Expansion; 2.2.4 Refinement with Global Information); 2.3 Experiments (2.3.1 Dataset; 2.3.2 Implement Details; 2.3.3 Performance; 2.3.4 Contribution of Each Part; 2.3.5 Seed Consistency; 2.3.6 Running Time); 2.4 Summary
    3 Interactive Segmentation with Seed Generation: 3.1 Introduction; 3.2 Related Works; 3.3 Proposed Method (3.3.1 System Overview; 3.3.2 Markov Decision Process; 3.3.3 Deep Q-Network; 3.3.4 Model Architecture); 3.4 Experiments (3.4.1 Implement Details; 3.4.2 Performance; 3.4.3 Ablation Study; 3.4.4 Other Datasets); 3.5 Summary
    4 Interactive Segmentation with Seed Attention: 4.1 Introduction; 4.2 Related Works; 4.3 Proposed Method (4.3.1 Interactive Segmentation Network; 4.3.2 Bi-directional Seed Attention Module); 4.4 Experiments (4.4.1 Datasets; 4.4.2 Metrics; 4.4.3 Implement Details; 4.4.4 Performance; 4.4.5 Ablation Study; 4.4.6 Seed enrichment methods); 4.5 Summary
    5 Conclusions: 5.1 Summary
    Bibliography
    Abstract in Korean
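
    The seed expansion idea in the thesis abstract above (absorbing reliable pixels around the user's initial seed into the seed set) can be illustrated with a minimal sketch. This is plain intensity-based region growing, not the thesis's two-stage procedure or its pyramidal RWR; the function name `expand_seed`, the tolerance parameter `tol`, and the toy image are all assumptions made for illustration.

```python
from collections import deque


def expand_seed(image, seeds, tol):
    """Grow a seed set by absorbing 4-connected neighbours of similar intensity.

    image: list of rows of numeric intensities (hypothetical grayscale input).
    seeds: iterable of (row, col) pixels marked by the user.
    tol:   maximum allowed absolute difference from the mean seed intensity
           (an assumed reliability criterion, not the one used in the thesis).
    """
    seeds = set(seeds)
    if not seeds:
        return set()
    height, width = len(image), len(image[0])
    # Reference intensity: mean over the initial user seeds only.
    ref = sum(image[r][c] for r, c in seeds) / len(seeds)

    expanded = set(seeds)
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < height and 0 <= nc < width):
                continue  # neighbour lies outside the image
            if (nr, nc) in expanded:
                continue  # already part of the seed set
            if abs(image[nr][nc] - ref) <= tol:
                expanded.add((nr, nc))
                queue.append((nr, nc))
    return expanded


# Toy image: a dark region (columns 0-1) next to a bright region (columns 2-3).
image = [
    [10, 11, 90, 91],
    [12, 10, 92, 90],
    [11, 13, 89, 93],
]
grown = expand_seed(image, seeds=[(0, 0)], tol=5)
print(sorted(grown))
```

    With a single seed in the dark region, the expanded seed covers that region and stops at the bright boundary, which gives a rough sense of how a wider seed can reduce the scarcity problem the abstract mentions.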

    Spott : on-the-spot e-commerce for television using deep learning-based video analysis techniques

    Spott is an innovative second-screen mobile multimedia application that offers viewers relevant information on objects (e.g., clothing, furniture, food) they see and like on their television screens. The application enables interaction between TV audiences and brands, so producers and advertisers can offer potential consumers tailored promotions, e-shop items, and/or free samples. In line with current views on innovation management, the technological excellence of the Spott application is coupled with iterative user involvement throughout the entire development process. This article discusses both of these aspects and how they affect each other. First, we focus on the technological building blocks that facilitate the (semi-)automatic interactive tagging of objects in the video streams. The majority of these building blocks make extensive use of novel, state-of-the-art deep learning concepts and methodologies. We show how these deep learning-based video analysis techniques facilitate video summarization, semantic keyframe clustering, and (similar) object retrieval. Second, we provide insights into the user tests that were performed to evaluate and optimize the application's user experience. The lessons learned from these open field tests have already been an essential input to the technology development and will further shape future modifications to the Spott application.

    Iterative Few-shot Semantic Segmentation from Image Label Text

    Few-shot semantic segmentation aims to learn to segment objects of unseen classes with the guidance of only a few support images. Most previous methods rely on pixel-level labels for the support images. In this paper, we focus on a more challenging setting in which only image-level labels are available. We propose a general framework that first generates coarse masks with the help of the powerful vision-language model CLIP, and then iteratively and mutually refines the mask predictions of support and query images. Extensive experiments on the PASCAL-5i and COCO-20i datasets demonstrate that our method not only outperforms state-of-the-art weakly supervised approaches by a significant margin, but also achieves results comparable to or better than recent supervised methods. Moreover, our method generalizes well to images in the wild and to uncommon classes. Code will be available at https://github.com/Whileherham/IMR-HSNet. Comment: ijcai 202

    Methods for the acquisition and analysis of volume electron microscopy data
