217 research outputs found

    TAG ME: An Accurate Name Tagging System for Web Facial Images using Search-Based Face Annotation

    Demand for social media has grown rapidly, and much of its content is multimedia such as images, audio, and video. Motivated by this, we propose TAG ME, a framework for name tagging (labeling) of web facial images, which are readily available on the internet. TAG ME performs name tagging through search-based face annotation (SBFA): given a facial image drawn from a database of weakly labeled web images, it assigns a correct and accurate name to that face. One challenging problem for the search-based face annotation strategy is how to effectively carry out annotation using the list of most similar face images and their weak labels, which are often noisy and incomplete. TAG ME addresses this with an effective semi-supervised label refinement (SSLR) method that purifies the labels of web and non-web facial images using machine-learning techniques. Second, the learning problem is cast as convex optimization and solved with efficient optimization algorithms that scale to large datasets. Finally, to further speed up the system, TAG ME adopts a clustering-based approximation algorithm, which improves scalability considerably.
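The retrieval-and-vote core of search-based face annotation can be illustrated in a few lines. This is a minimal sketch, not the TAG ME system itself: the function name, the toy 2-D features, and plain majority voting over nearest neighbors are all illustrative stand-ins for the paper's learned refinement.

```python
from collections import Counter

def annotate_face(query_feature, indexed_faces, k=5):
    """Toy sketch of search-based face annotation (SBFA): retrieve the
    k indexed faces most similar to the query and let their (possibly
    noisy) weak labels vote. `indexed_faces` is a list of
    (feature_vector, label) pairs; features are plain lists compared
    by squared Euclidean distance.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Rank the index by visual similarity to the query and keep top-k.
    neighbors = sorted(indexed_faces,
                       key=lambda fl: sq_dist(query_feature, fl[0]))[:k]
    # Majority vote over the neighbors' weak labels.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

faces = [([0.10, 0.20], "Alice"), ([0.15, 0.22], "Alice"),
         ([0.90, 0.80], "Bob"),   ([0.88, 0.79], "Bob"),
         ([0.12, 0.19], "Carol")]  # "Carol" plays the role of a noisy weak label
print(annotate_face([0.11, 0.21], faces, k=3))  # → Alice
```

In the paper, the voting step is replaced by SSLR, which refines the noisy label matrix before annotation rather than trusting raw majority counts.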

    Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

    Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems is presented: image tag assignment, refinement, and tag-based image retrieval. While existing works vary in their targeted tasks and methodology, they rely on the key functionality of tag relevance, i.e., estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how that information is exploited, this paper introduces a taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison among the state of the art, a new experimental protocol is presented, with training sets containing 10k, 100k, and 1m images and an evaluation on three test sets contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and to foster progress for the near future. Comment: to appear in ACM Computing Surveys.
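A common way to realize the tag relevance function the survey is organized around is neighbor voting: a tag is deemed relevant to an image when its visual neighbors use it more often than the collection at large would by chance. The sketch below is a hypothetical, minimal version of that idea; the function name and the set-of-tags data layout are assumptions, not the survey's protocol.

```python
def neighbor_vote_relevance(tag, neighbor_tags, collection_tags):
    """Toy neighbor-voting tag relevance: count how many of an image's
    visual neighbors carry `tag`, then subtract how many would carry it
    by chance given the tag's prior frequency in the whole collection.
    `neighbor_tags`: list of tag sets, one per visual neighbor.
    `collection_tags`: list of tag sets for the full collection.
    A positive score means the tag is over-represented among neighbors.
    """
    votes = sum(tag in tags for tags in neighbor_tags)
    prior = sum(tag in tags for tags in collection_tags) / len(collection_tags)
    return votes - len(neighbor_tags) * prior

# "dog" appears in 2 of 3 neighbors but only 2 of 10 collection images.
neighbors = [{"dog"}, {"dog", "grass"}, {"car"}]
collection = [{"dog"}, {"dog"}, {"cat"}, {"car"}, {"sun"},
              {"sea"}, {"tree"}, {"sky"}, {"road"}, {"grass"}]
print(neighbor_vote_relevance("dog", neighbors, collection))  # → 1.4
```

The survey's taxonomy then classifies methods by what extra information (tags of neighbors, social context, learned models) feeds such a relevance function and how.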

    Multimedia ontology matching by using visual and textual modalities

    Ontologies have been intensively applied to improving multimedia search and retrieval by providing explicit meaning to visual content. Several multimedia ontologies have recently been proposed as knowledge models suitable for narrowing the well-known semantic gap and for enabling the semantic interpretation of images. Since these ontologies were created in different application contexts, establishing links between them, a task known as ontology matching, promises to fully unlock their potential in support of multimedia search and retrieval. This paper proposes and empirically compares two extensional ontology matching techniques applied to an important semantic image retrieval issue: automatically associating common-sense knowledge with multimedia concepts. First, we extend a previously introduced textual concept-matching approach to use both textual and visual representations of images. In addition, a novel matching technique based on a multi-modal graph is proposed. We argue that the textual and visual modalities have to be seen as complementary rather than exclusive sources of extensional information in order to improve the efficiency of ontology matching in the multimedia domain. An experimental evaluation is included in the paper.
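Extensional matching, as opposed to matching on concept names or definitions, compares concepts through the instances (here, annotated images) attached to them. A minimal sketch under simplifying assumptions: the function name is hypothetical, and plain Jaccard overlap of instance identities stands in for the paper's textual and visual instance representations.

```python
def extensional_similarity(instances_a, instances_b):
    """Toy extensional ontology matching: score a candidate match
    between a concept from ontology A and one from ontology B by the
    Jaccard overlap of their instance sets. Real systems compare
    textual/visual descriptions of the instances rather than requiring
    shared identifiers, but the extensional principle is the same.
    """
    a, b = set(instances_a), set(instances_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Two concepts annotating overlapping image collections score highly.
print(extensional_similarity({"img1", "img2", "img3"},
                             {"img2", "img3", "img4"}))  # → 0.5
```

The multi-modal graph technique in the paper can be seen as generalizing this pairwise score: textual and visual evidence become complementary edges in one graph instead of two separate similarity computations.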

    A graph theory-based online keywords model for image semantic extraction

    Image captions and keywords are semantic descriptions of the dominant visual content features in a targeted visual scene. Traditional image keyword extraction involves intensive data- and knowledge-level operations using computer vision and machine learning techniques. However, recent studies have shown that the gap between pixel-level processing and the semantic definition of an image is difficult to bridge by visual features alone. In this paper, augmented image semantic information is introduced by harnessing the functions of online image search engines. A graphical model named the "Head-words Relationship Network" (HWRN) has been devised to tackle the aforementioned problems. The proposed algorithm starts by retrieving online images visually similar to the input image; the text content of their hosting webpages is then extracted, classified, and analysed for semantic clues. The relationships among the "head-words" from relevant webpages can then be modelled and quantified using linguistic tools. Experiments on the prototype system have proven the effectiveness of this novel approach. Performance evaluation against benchmark state-of-the-art approaches has also shown satisfactory results and promising future applications.
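The relationship-modelling step can be pictured as building a weighted co-occurrence graph over head-words. This is a hedged sketch, not the HWRN model itself: the function name and the assumption that head-words have already been extracted from each hosting webpage are illustrative simplifications.

```python
from collections import defaultdict
from itertools import combinations

def build_headword_graph(pages_headwords):
    """Toy head-word relationship graph in the spirit of HWRN: nodes
    are head-words extracted from the webpages hosting visually
    similar images, and each edge weight counts how often a pair of
    head-words co-occurs on the same page. Upstream steps (image
    retrieval, text extraction, head-word classification) are assumed
    done; `pages_headwords` is a list of per-page head-word lists.
    """
    graph = defaultdict(int)
    for words in pages_headwords:
        # Deduplicate within a page, then weight every unordered pair.
        for u, v in combinations(sorted(set(words)), 2):
            graph[(u, v)] += 1
    return dict(graph)

pages = [["beach", "sea"], ["sea", "beach", "sun"]]
print(build_headword_graph(pages))
# → {('beach', 'sea'): 2, ('beach', 'sun'): 1, ('sea', 'sun'): 1}
```

Keyword candidates for the input image would then be ranked by their connectivity in this graph, which is where the paper's linguistic quantification comes in.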

    A Two-View Learning Approach for Image Tag Ranking

    Singapore Ministry of Education Academic Research Fund Tier

    ๊ฐ•์ธํ•œ ๋Œ€ํ™”ํ˜• ์˜์ƒ ๋ถ„ํ•  ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์œ„ํ•œ ์‹œ๋“œ ์ •๋ณด ํ™•์žฅ ๊ธฐ๋ฒ•์— ๋Œ€ํ•œ ์—ฐ๊ตฌ

    Ph.D. dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, College of Engineering, February 2021. Advisor: Kyoung Mu Lee. Segmenting the area corresponding to a desired object in an image is essential to computer vision, because most algorithms operate on semantic units when interpreting or analyzing images. However, segmenting the desired object from a given image is an ambiguous problem: the target object varies depending on the user and the purpose. To solve this problem, interactive segmentation techniques have been proposed, in which segmentation is steered in the desired direction through interaction with the user. Here, the seed information provided by the user plays an important role. If the seed provided by a user contains abundant information, the accuracy of segmentation increases; however, providing rich seed information places a heavy burden on the user. Therefore, the main goal of the present study was to obtain satisfactory segmentation results from simple seed information. We primarily focused on converting the provided sparse seed information into a rich state from which accurate segmentation results can be derived. To this end, a minimal user input is taken and enriched through various seed-enrichment techniques. A total of three interactive segmentation techniques are proposed, based on: (1) seed expansion, (2) seed generation, and (3) seed attention. The enrichment types comprise expanding the area around a seed, generating a new seed at a new position, and attending to semantic information. First, in seed expansion, we expand the scope of the seed: reliable pixels around the initial seed are integrated into the seed set through a two-stage expansion step. Because the extended seed covers a wider area than the initial seed, the seed's scarcity and imbalance problems are resolved. Next, in seed generation, we create a seed at a new point rather than around the existing seed.
We trained the system to imitate user behavior by providing a new seed point in the erroneous region. By learning the user's intention, our model can efficiently create a new seed point. The generated seed aids segmentation and can also serve as additional information for weakly supervised learning. Finally, through seed attention, we put semantic information into the seed. Unlike the previous models, we integrate both the segmentation process and the seed-enrichment process: the seed is reinforced by adding semantic information instead of spatial expansion, and the seed information is enriched through mutual attention with feature maps generated during segmentation. The proposed models show superiority over existing techniques across various experiments. Notably, even with sparse seed information, the proposed seed-enrichment techniques give far more accurate segmentation results than existing methods.
Contents:
1 Introduction
  1.1 Previous Works
  1.2 Proposed Methods
2 Interactive Segmentation with Seed Expansion
  2.1 Introduction
  2.2 Proposed Method (Background; Pyramidal RWR; Seed Expansion; Refinement with Global Information)
  2.3 Experiments (Dataset; Implementation Details; Performance; Contribution of Each Part; Seed Consistency; Running Time)
  2.4 Summary
3 Interactive Segmentation with Seed Generation
  3.1 Introduction
  3.2 Related Works
  3.3 Proposed Method (System Overview; Markov Decision Process; Deep Q-Network; Model Architecture)
  3.4 Experiments (Implementation Details; Performance; Ablation Study; Other Datasets)
  3.5 Summary
4 Interactive Segmentation with Seed Attention
  4.1 Introduction
  4.2 Related Works
  4.3 Proposed Method (Interactive Segmentation Network; Bi-directional Seed Attention Module)
  4.4 Experiments (Datasets; Metrics; Implementation Details; Performance; Ablation Study; Seed Enrichment Methods)
  4.5 Summary
5 Conclusions
  5.1 Summary
Bibliography
Abstract in Korean