Temporal Attention-Gated Model for Robust Sequence Classification
Typical techniques for sequence classification are designed for
well-segmented sequences which have been edited to remove noisy or irrelevant
parts. Therefore, such methods cannot be easily applied to the noisy sequences
expected in real-world applications. In this paper, we present the Temporal
Attention-Gated Model (TAGM) which integrates ideas from attention models and
gated recurrent networks to better deal with noisy or unsegmented sequences.
Specifically, we extend the concept of attention model to measure the relevance
of each observation (time step) of a sequence. We then use a novel gated
recurrent network to learn the hidden representation for the final prediction.
An important advantage of our approach is interpretability since the temporal
attention weights provide a meaningful value for the salience of each time step
in the sequence. We demonstrate the merits of our TAGM approach, both for
prediction accuracy and interpretability, on three different tasks: spoken
digit recognition, text-based sentiment analysis and visual event recognition.
Comment: Accepted by CVPR 201
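As a rough illustration of how a temporal attention weight can gate a recurrent update, the sketch below blends each step's candidate state into the running hidden state in proportion to its salience. The update rule, the scalar state, the tanh candidate and the function name `attention_gated_summary` are assumptions made for illustration; the abstract does not spell out the model's equations, and in a trained model the salience scores would come from a learned attention module rather than being passed in.

```python
import math


def attention_gated_summary(observations, saliences):
    """Fold a sequence into one hidden value using per-step salience gates.

    Hypothetical sketch: each salience is assumed to lie in [0, 1].
    A salience of 0 leaves the hidden state unchanged (the step is ignored);
    a salience of 1 replaces it with that step's candidate state.
    """
    hidden = 0.0
    for x, a in zip(observations, saliences):
        candidate = math.tanh(x)  # stand-in for a learned candidate state
        hidden = (1.0 - a) * hidden + a * candidate
    return hidden


# Only the middle step is marked salient, so the noisy first and last
# observations do not affect the final representation.
summary = attention_gated_summary([5.0, 0.5, -3.0], [0.0, 1.0, 0.0])
```

Because the gate is a single number per time step, the salience values can be read off directly, which is the kind of interpretability the abstract refers to.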
A Study on Seed Information Enrichment Techniques for Robust Interactive Image Segmentation Algorithms
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2021. 2. Kyoung Mu Lee.
Segmentation of an area corresponding to a desired object in an image is essential
to computer vision problems. This is because most algorithms are performed in
semantic units when interpreting or analyzing images. However, segmenting the
desired object from a given image is an ambiguous task, because the target object varies
depending on the user and the purpose. To resolve this ambiguity, interactive segmentation
techniques have been proposed. In this approach, segmentation is steered in the
desired direction through interaction with the user, so the seed information
provided by the user plays an important role. If the seed provided by a user contains
abundant information, the accuracy of segmentation increases. However, providing
rich seed information places much burden on the users. Therefore, the main goal of
the present study was to obtain satisfactory segmentation results using simple seed
information.
We primarily focused on converting the provided sparse seed information into a rich
state so that accurate segmentation results can be derived. To this end, a minimal
user input was taken and enriched through various seed enrichment techniques.
A total of three interactive segmentation techniques were proposed, based on (1)
seed expansion, (2) seed generation, and (3) seed attention. The corresponding forms
of seed enrichment are expansion of the area around a seed, generation of a new seed
at a new position, and attention to semantic information.
First, in seed expansion, we expanded the scope of the seed. We integrated reliable
pixels around the initial seed into the seed set through a two-stage expansion
step. Because the extended seed covers a wider area than the
initial seed, the seed's scarcity and imbalance problems were resolved. Next, in seed
generation, we created a seed at a new point rather than around the existing seed. We trained
the system to imitate user behavior by providing a new seed point in the
erroneous region. By learning the user's intention, our model could efficiently create
a new seed point. The generated seed helped segmentation and could be used as additional
information for weakly supervised learning. Finally, through seed attention,
we put semantic information into the seed. Unlike the previous models, we integrated
both the segmentation process and the seed enrichment process. We reinforced the seed
information by adding semantic information to the seed instead of expanding it spatially.
The seed information was enriched through mutual attention with feature maps
generated during the segmentation process.
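The seed expansion idea, absorbing reliable pixels around the initial seed into the seed set, can be sketched with a much simpler stand-in than the thesis itself uses. The example below is an assumption-laden illustration: it treats a pixel as "reliable" when its intensity is within a fixed tolerance of the mean seed intensity and grows the seed set by breadth-first search over 4-connected neighbours. The function name `expand_seed`, the tolerance criterion and the toy image are all hypothetical; the table of contents indicates the thesis builds on a pyramidal RWR formulation, which this sketch does not implement.

```python
from collections import deque


def expand_seed(image, seeds, tolerance):
    """Grow a seed set over similar, connected pixels (plain region growing).

    image: 2D list of intensities; seeds: list of (row, col) tuples.
    Hypothetical reliability test: a pixel joins the seed set when its
    intensity differs from the mean seed intensity by at most `tolerance`.
    """
    height, width = len(image), len(image[0])
    reference = sum(image[r][c] for r, c in seeds) / len(seeds)
    expanded = set(seeds)
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < height and 0 <= nc < width
                    and (nr, nc) not in expanded
                    and abs(image[nr][nc] - reference) <= tolerance):
                expanded.add((nr, nc))
                queue.append((nr, nc))
    return expanded


# A single click in the dark top-left region expands to cover that region only.
toy_image = [
    [10, 11, 50],
    [12, 10, 52],
    [49, 51, 50],
]
expanded_seed = expand_seed(toy_image, [(0, 0)], tolerance=3)
```

A single seed pixel becomes four here, which shows in miniature how expansion addresses seed scarcity without asking the user for more input.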
The proposed models outperformed existing techniques
in various experiments. Notably, even with sparse seed information, our proposed
seed enrichment techniques gave far more accurate segmentation results
than the other existing methods.
1 Introduction
1.1 Previous Works
1.2 Proposed Methods
2 Interactive Segmentation with Seed Expansion
2.1 Introduction
2.2 Proposed Method
2.2.1 Background
2.2.2 Pyramidal RWR
2.2.3 Seed Expansion
2.2.4 Refinement with Global Information
2.3 Experiments
2.3.1 Dataset
2.3.2 Implement Details
2.3.3 Performance
2.3.4 Contribution of Each Part
2.3.5 Seed Consistency
2.3.6 Running Time
2.4 Summary
3 Interactive Segmentation with Seed Generation
3.1 Introduction
3.2 Related Works
3.3 Proposed Method
3.3.1 System Overview
3.3.2 Markov Decision Process
3.3.3 Deep Q-Network
3.3.4 Model Architecture
3.4 Experiments
3.4.1 Implement Details
3.4.2 Performance
3.4.3 Ablation Study
3.4.4 Other Datasets
3.5 Summary
4 Interactive Segmentation with Seed Attention
4.1 Introduction
4.2 Related Works
4.3 Proposed Method
4.3.1 Interactive Segmentation Network
4.3.2 Bi-directional Seed Attention Module
4.4 Experiments
4.4.1 Datasets
4.4.2 Metrics
4.4.3 Implement Details
4.4.4 Performance
4.4.5 Ablation Study
4.4.6 Seed enrichment methods
4.5 Summary
5 Conclusions
5.1 Summary
Bibliography
Abstract (in Korean)
Semi-supervised Salient Object Detection with Effective Confidence Estimation
The success of existing salient object detection models relies on a large
pixel-wise labeled training dataset, which is time-consuming and expensive to
obtain. We study semi-supervised salient object detection, with access to a
small number of labeled samples and a large number of unlabeled samples.
Specifically, we present a pseudo-label-based learning framework with a
Conditional Energy-based Model. We model the stochastic nature of human
saliency labels using the stochastic latent variable of the Conditional
Energy-based Model. It further enables generation of a high-quality pixel-wise
uncertainty map, highlighting the reliability of corresponding pseudo label
generated for the unlabeled sample. This minimises the contribution of
low-certainty pseudo labels in optimising the model, preventing the error
propagation. Experimental results show that the proposed strategy can
effectively explore the contribution of unlabeled data. With only 1/16 labeled
samples, our model achieves competitive performance compared with
state-of-the-art fully-supervised models.
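To make the role of the uncertainty map concrete, the following sketch shows one way low-certainty pseudo labels could be down-weighted in a training loss. It is an illustration under stated assumptions, not the paper's formulation: the per-pixel weight `1 - uncertainty`, the binary cross-entropy loss and the function name `confidence_weighted_bce` are hypothetical, and the uncertainty values are simply passed in rather than produced by a Conditional Energy-based Model.

```python
import math


def confidence_weighted_bce(predictions, pseudo_labels, uncertainties, eps=1e-7):
    """Binary cross-entropy over pixels, weighted by pseudo-label confidence.

    All three inputs are flat lists of equal length with values in [0, 1].
    Hypothetical weighting: weight = 1 - uncertainty, so a pixel whose
    pseudo label is fully uncertain contributes nothing to the loss.
    """
    total, weight_sum = 0.0, 0.0
    for p, y, u in zip(predictions, pseudo_labels, uncertainties):
        weight = 1.0 - u
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        bce = -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
        total += weight * bce
        weight_sum += weight
    return total / weight_sum if weight_sum > 0 else 0.0


# The second pixel disagrees strongly with its pseudo label, but that label
# is marked fully uncertain, so only the first pixel drives the loss.
loss = confidence_weighted_bce([0.9, 0.1], [1.0, 1.0], [0.0, 1.0])
```

Normalising by the total weight keeps the loss scale comparable between batches with many and few confident pixels; whether the original method normalises this way is not stated in the abstract.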