1,116 research outputs found
Video Diffusion on User-Generated Content Websites: An Empirical Analysis Based on Bilibili
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Interdisciplinary Program in Technology Management, Economics, and Policy, August 2019.
User-generated content (UGC) emphasizes user value and, building on the development of Web 2.0, plays an important role in both information diffusion and marketing. It is therefore necessary to identify which factors influence information diffusion in UGC.
This thesis, based on the video UGC website Bilibili, seeks to identify the influential factors in the video diffusion process. Unlike previous research, which mainly focuses on the social network, this thesis mainly uses video characteristic data to explore the factors that affect video diffusion. First, after plotting the growth trend in view counts over one month, the view trends on Bilibili indicate that videos follow a distinctive diffusion curve.
Then, through careful analysis of each diffusion period, this thesis finds that the influential factors differ across periods. The analysis shows that greater interaction between users and content attracts more people to watch a video, which improves its diffusion rate, and it also shows the impact of general comments below the video, sharing activity, and video quality on video diffusion.
Based on this thesis, some practical implications are also introduced: the findings can provide a basis for web designers, for people who use UGC as a marketing tool, and for users who want to become UGC content producers.
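The thesis does not publish its estimation code; the following is only a minimal sketch of how such a period-wise analysis could be set up, assuming a hypothetical per-video, per-day table (columns such as new_views, comments, shares, likes, quality_score and a three-stage split are illustrative inventions, not the author's data).

```python
# Hypothetical sketch (not the thesis code): fit one OLS regression per
# diffusion period to see how interaction signals relate to daily view growth.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "day_since_upload": rng.integers(1, 31, n),
    "comments": rng.poisson(20, n),
    "shares": rng.poisson(5, n),
    "likes": rng.poisson(50, n),
    "quality_score": rng.uniform(0, 1, n),
})
df["new_views"] = (50 + 3 * df["comments"] + 8 * df["shares"]
                   + rng.normal(0, 10, n))           # synthetic stand-in data

def assign_period(day):                               # assumed three-stage split
    return "burst" if day <= 3 else ("decay" if day <= 14 else "long_tail")

df["period"] = df["day_since_upload"].map(assign_period)

for period, group in df.groupby("period"):
    fit = smf.ols("new_views ~ comments + shares + likes + quality_score",
                  data=group).fit()
    print(period, fit.params.round(2).to_dict())
```

Fitting the periods separately is one simple way to let the coefficients of the same attributes differ across diffusion stages, which is the comparison the abstract describes.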
User-generated content (UGC) emphasizes user value and, based on the development of Web 2.0, plays an important role in information diffusion and market management. It is therefore necessary to identify which factors influence information diffusion on UGC websites. Based on the Bilibili website, this thesis tries to find the influential factors in the video diffusion process. Unlike the prior literature, which mainly focuses on social networks, this thesis mainly uses video characteristic data to explore the factors that affect video diffusion. First, the growth trend of view counts within one month is traced, and video diffusion is found to proceed in three stages. Second, the factors affecting video diffusion at each stage are analyzed. The results show that the higher the interaction between users and content, the more people watch the video, which can improve the speed of video diffusion. Based on this thesis, several important implications are introduced, such as providing a basis for web designers, for people who use UGC as a marketing tool, and for users who want to become UGC content producers.
Table of contents:
1. Introduction 1
2. Literature review 4
2.1 UGC and Bilibili 4
2.2 Video diffusion in UGC 6
2.3 Hypotheses 9
3. Data and Methodology 12
3.1 Data 12
3.2 Methodology 16
4. Analysis results and Conclusion 19
4.1 Analysis result 19
4.2 Conclusion 21
4.2.1 Conclusion of daily views 21
4.2.2 Conclusion of video attributes 22
5. Implications and limitations 26
5.1 Implications 26
5.2 Limitations 28
References 29
Appendix 37
Abstract (in Korean) 43
Robust Video-Based Industrial Process Anomaly Detection
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Mechanical Engineering, August 2021.
Industrial video anomaly detection is an important problem in industrial inspection, possessing features that are distinct from video anomaly detection in other application domains such as surveillance. No public datasets pertinent to the problem are available, and accordingly, robust models suited for industrial video anomaly detection have yet to be developed. In this thesis, the key differences that distinguish the industrial video anomaly detection problem from its generic counterparts are examined: the relatively small amount of video data available, the lack of diversity among frames within the video clips, and the absence of labels that indicate anomalies. We then propose a robust framework for industrial video inspection that addresses these specific challenges. One novel aspect of our framework is a model that masks regions in frames that are irrelevant to the inspection task. We show that our framework outperforms existing methods when validated on a novel database that replicates video clips of real-world automated tasks.
Among the many problems in the broad field of industrial inspection, industrial video anomaly detection is of great importance, yet it has not received attention commensurate with that importance. There is no public dataset for studying the problem, and accordingly no prior work has developed techniques specialized for industrial video anomaly detection on such data. This thesis analyzes and clarifies the characteristics that distinguish the industrial video anomaly detection problem from generic video anomaly detection. Unlike the generic setting, the industrial setting offers only a limited amount of usable data and lacks the labels needed for training, so models that depend on such labels cannot be developed. For these reasons, when existing models are applied to industrial video anomaly detection, false alarms caused by the appearance and motion of elements irrelevant to the inspected operation occur far too often. Based on this analysis, an industrial video anomaly detection approach capable of robust detection is devised. In this approach, separately from the anomaly detection model, a second model is used to mask out the elements in the frames that are irrelevant to the inspected motion. To verify the usefulness of the proposed approach, a database was built by filming robot motions that exhibit characteristics similar to real industrial processes, and the performance of the models was measured on it. By releasing the proposed robust video anomaly detection approach and the database through this thesis, we hope to contribute to stimulating more diverse research in this field.
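The abstract does not give the exact masking operation; the sketch below only illustrates the general idea it describes, assuming a hypothetical per-frame anomaly_score callable and a precomputed relevance heatmap (for example, one derived from an attention or CAM map) that is binarized and used to suppress task-irrelevant regions before scoring.

```python
# Minimal sketch (assumptions, not the thesis implementation): suppress
# task-irrelevant regions with a binary mask before anomaly scoring.
import numpy as np

def build_mask(heatmap: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binarize a relevance heatmap (e.g., a CAM) into a 0/1 mask."""
    normalized = (heatmap - heatmap.min()) / (heatmap.max() - heatmap.min() + 1e-8)
    return (normalized >= threshold).astype(np.float32)

def masked_anomaly_scores(frames, mask, anomaly_score):
    """Apply the mask to every frame, then score each masked frame.

    frames: (T, H, W) grayscale clip, mask: (H, W), anomaly_score: callable
    returning a scalar per frame (hypothetical detector interface).
    """
    masked = frames * mask[None, :, :]            # zero out irrelevant pixels
    return np.array([anomaly_score(f) for f in masked])

# Toy usage with a trivial detector: deviation from a masked reference frame.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.random((8, 64, 64)).astype(np.float32)
    heat = rng.random((64, 64)).astype(np.float32)
    mask = build_mask(heat)
    reference = clip[0] * mask
    scores = masked_anomaly_scores(
        clip, mask, lambda f: float(np.abs(f - reference).mean()))
    print(scores)
```

The point of masking before scoring is that appearance changes outside the inspected region can no longer raise the score, which is how the abstract motivates the reduction of false alarms.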
1 Introduction 1
1.1 Related Works 8
1.2 Contributions of Our Work 15
1.3 Organization 16
2 Preliminaries 18
2.1 Weakly Supervised Learning 19
2.1.1 Supervised Learning and Unsupervised Learning 19
2.1.2 Demystification on Weakly Supervised Learning 21
2.2 Class Activation Maps 22
2.2.1 Overview on Visualizing Activations 22
2.2.2 Overview on CAM 24
2.2.3 Overview on Grad-CAM 25
2.2.4 Overview on Eigen-CAM 26
2.3 Dynamic Time Warping 27
2.4 Label Smoothing 28
2.4.1 Review on Cross Entropy Function 30
2.4.2 Summary on Label Smoothing 31
3 Robust Framework for Industrial Video Anomaly Detection 32
3.1 Components of the Framework 34
3.1.1 Anomaly Detection Model 34
3.1.2 Background Masking Model 34
3.1.3 Fusing Results from the Components of the Framework 37
3.2 Details of the Weakly Supervised Learning Method 37
3.2.1 Partition Order Prediction Task 37
3.2.2 Partition Order Labels 39
3.2.3 Conditioning the Labels 43
4 Experiments 45
4.1 Database for Industrial Video Anomaly Detection 46
4.2 Ideal Background Mask 48
4.2.1 Acquiring Ideal Masks 48
4.2.2 Enhancing Robustness in VAD Using Masks 48
4.3 Masking Using the Proposed Method vs Using an Ideal Mask 49
4.4 Performance Enhancement Using the Proposed Method 50
4.5 Ablation Study 54
4.5.1 Number of Layers for Eigen-CAM 54
4.5.2 Threshold on Attention Maps 54
4.5.3 Temporal Smoothing Window Size 56
5 Conclusion 57
A Appendix 59
A.1 Experimental Results for All Tasks in the Database 59
Bibliography 60
Abstract (in Korean) 68
A Study on Large-Scale Video Learning Using Narrative Descriptions
Thesis (Doctoral) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, February 2021.
Extensive contributions are being made to develop intelligent agents that can recognize and communicate with the world. In this vein, various video-language tasks have drawn a lot of interest in computer vision research, including image/video captioning, video retrieval, and video question answering.
Video-language learning can be applied to high-level computer vision tasks and to various future industries such as search engines, social marketing, automated driving, and robotics, for example through question answering and dialog generation about the surrounding environment.
However, despite these developments, video-language learning suffers from a much higher degree of complexity than its image counterparts.
This thesis investigates methodologies for learning the relationship between videos and free-form language, including explanations, conversations, and question-and-answer pairs, so that the machine can easily adapt to target downstream tasks.
First, we introduce several methods for efficiently learning the relationship between long sentences and videos. We introduce approaches for supervising human attention transfer for the video attention model, showing that the video attention mechanism can benefit from explicit human gaze labels. Next, we introduce an end-to-end semantic attention method, which further reduces the complexity of the visual attention algorithm by using representative visual concept words detected by an attention-based detector. As a follow-up to these methods, we introduce JSFusion (Joint Sequence Fusion), which enables efficient video retrieval and QA through many-to-many matching in the attention model.
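The full JSFusion architecture is given in the published paper; purely as a rough illustration of the many-to-many matching idea mentioned above, the sketch below builds a dense frame-word similarity matrix with attention weights and pools it into a clip-sentence score (feature shapes and pooling choices are assumptions, not the thesis code).

```python
# Rough illustration of many-to-many video-language matching (assumed shapes):
# every frame feature is compared against every word feature, and the dense
# similarity structure is pooled into a single matching score.
import torch
import torch.nn.functional as F

def matching_score(frame_feats: torch.Tensor, word_feats: torch.Tensor) -> torch.Tensor:
    """frame_feats: (T, D) video frames, word_feats: (L, D) sentence tokens."""
    d = frame_feats.size(-1)
    sim = frame_feats @ word_feats.T / d ** 0.5        # (T, L) pairwise similarities
    attn = F.softmax(sim, dim=-1)                      # each frame attends over words
    attended = attn @ word_feats                       # (T, D) word context per frame
    per_frame = F.cosine_similarity(frame_feats, attended, dim=-1)  # (T,)
    return per_frame.mean()                            # clip-sentence score

# Toy usage: rank two candidate sentences for one video clip.
video = torch.randn(16, 256)
captions = [torch.randn(12, 256), torch.randn(9, 256)]
print([float(matching_score(video, c)) for c in captions])
```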
Next, we introduce CiSIN (Character in Story Identification Network), which uses attention to improve the performance of character grounding and character re-identification in movies. Finally, we introduce Transitional Adaptation, which encourages caption generation models to generate coherent narratives for long videos.
In summary, this thesis presents novel approaches for automatic video description generation and retrieval, and shows the benefits of extracting linguistic knowledge about objects and motion in videos as well as the advantage of multimodal audio-visual learning for understanding videos. Since the proposed methods adapt easily to any video-language task, they are expected to be applied to the latest models and to bring additional performance improvements.
Moving forward, we plan to design an unsupervised video learning framework that can solve many challenges in the industry by integrating an unlimited amount of video, audio, and free-form language data from the web.
Vision-language learning is an important and actively studied field: it can be applied not only to high-level computer vision tasks such as image/video captioning, visual question answering, video retrieval, scene understanding, and event detection, but also, through question answering and dialogue generation about the surrounding environment, to internet search and to many future industries such as social marketing, automated driving, and robotics.
Computer vision and natural language processing have each made great progress in their own domains and, with the advent of deep learning, have come to complement each other and improve learning results, producing a strong synergy.
Despite these advances, however, video-language learning often remains difficult because the complexity of the problem is considerably higher.
This dissertation aims to learn the relationship between videos and the corresponding free-form language, including descriptions, dialogue, and question answering, more efficiently, and to improve models so that they can respond well to their target tasks.
First, several methods are introduced for efficiently learning the relationship between long sentences and videos, whose visual complexity is higher than that of images. We introduce a method that supervises the video-language model with human attention, followed by a semantic attention method that further reduces the complexity of the attention algorithm by using representative visual words first detected in the video, and a joint sequence fusion method that enables efficient video retrieval and question answering based on many-to-many matching in the attention model.
Next, we introduce the Character in Story Identification Network, in which the attention model moves beyond object-word relationships to perform person search and person re-identification in video simultaneously and synergistically, and finally we introduce a method that, through self-supervised learning, guides an attention-based language model to generate coherent descriptions of long videos.
In summary, the new methods proposed in this dissertation serve as technical stepping stones for video-language tasks such as video captioning, video retrieval, and video question answering; the attention modules learned through video captioning were transplanted into the networks for retrieval, question answering, and person search, achieving state-of-the-art performance on these new problems. This experimentally shows that the linguistic knowledge obtained from video-language learning is of great help to multimodal audio-visual video learning. As future work, building on these studies, we plan to build an unsupervised learning model that integrates and learns from the large-scale language, video, and audio data on the web in order to solve many open challenges in industry.
Chapter 1
Introduction
1.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4
1.2 Outline of the thesis . . . . . . . . . . . . . . . . . . . . . . . . .8
Chapter 2
Related Work
2.1 Video Captioning . . . . . . . . . . . . . . . . . . . . . . . . . . .9
2.2 Video Retrieval with Natural Language . . . . . . . . . . . . . . 12
2.3 Video Question and Answering . . . . . . . . . . . . . . . . . . . 13
2.4 Cross-modal Representation Learning for Vision and Language Tasks . . . . 15
Chapter 3 Human Attention Transfer for Video Captioning 18
3.1 Introduction
3.2 Video Datasets for Caption and Gaze . . . . . . . . . . . . . . . 21
3.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3.1 Video Pre-processing and Description . . . . . . . . . . . 22
3.3.2 The Recurrent Gaze Prediction (RGP) Model . . . . . . . 23
3.3.3 Construction of Visual Feature Pools . . . . . . . . . . . . 24
3.3.4 The Decoder for Caption Generation . . . . . . . . . . . . 26
3.3.5 Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4.1 Evaluation of Gaze Prediction . . . . . . . . . . . . . . . . 29
3.4.2 Evaluation of Video Captioning . . . . . . . . . . . . . . . 32
3.4.3 Human Evaluation via AMT . . . . . . . . . . . . . . . . 35
3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Chapter 4 Semantic Word Attention for Video QA and Video Captioning
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.1.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.2 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2.2 An Attention Model for Concept Detection . . . . . . . . 42
4.2.3 Video-to-Language Models . . . . . . . . . . . . . . . . . 45
4.2.4 A Model for Description . . . . . . . . . . . . . . . . . . . 45
4.2.5 A Model for Fill-in-the-Blank . . . . . . . . . . . . . . . . 48
4.2.6 A Model for Multiple-Choice Test . . . . . . . . . . . . . 50
4.2.7 A Model for Retrieval . . . . . . . . . . . . . . . . . . . . 51
4.3 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3.1 The LSMDC Dataset and Tasks . . . . . . . . . . . . . . 52
4.3.2 Quantitative Results . . . . . . . . . . . . . . . . . . . . . 54
4.3.3 Qualitative Results . . . . . . . . . . . . . . . . . . . . . . 56
4.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Chapter 5 Joint Sequence Fusion Attention for Multimodal Sequence Data
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.3.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.3.2 The Joint Semantic Tensor . . . . . . . . . . . . . . . . . 65
5.3.3 The Convolutional Hierarchical Decoder . . . . . . . . . . 66
5.3.4 An Illustrative Example of How the JSFusion Model Works 68
5.3.5 Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.3.6 Implementation of Video-Language Models . . . . . . . . 69
5.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.4.1 LSMDC Dataset and Tasks . . . . . . . . . . . . . . . . . 71
5.4.2 MSR-VTT-(RET/MC) Dataset and Tasks . . . . . . . . . 73
5.4.3 Quantitative Results . . . . . . . . . . . . . . . . . . . . . 74
5.4.4 Qualitative Results . . . . . . . . . . . . . . . . . . . . . . 76
5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Chapter 6 Character Re-Identification and Character Grounding for Movie Understanding
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.3.1 Video Preprocessing . . . . . . . . . . . . . . . . . . . . . 84
6.3.2 Visual Track Embedding . . . . . . . . . . . . . . . . . . . 85
6.3.3 Textual Character Embedding . . . . . . . . . . . . . . . 86
6.3.4 Character Grounding . . . . . . . . . . . . . . . . . . . . 87
6.3.5 Re-Identification . . . . . . . . . . . . . . . . . . . . . . . 88
6.3.6 Joint Training . . . . . . . . . . . . . . . . . . . . . . . . 90
6.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.4.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . 92
6.4.2 Quantitative Results . . . . . . . . . . . . . . . . . . . . . 93
6.4.3 Qualitative Results . . . . . . . . . . . . . . . . . . . . . . 95
6.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Chapter 7 Transitional Adaptation of Pretrained Models for Visual Storytelling
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
7.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
7.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.3.1 The Visual Encoder . . . . . . . . . . . . . . . . . . . . . 104
7.3.2 The Language Generator . . . . . . . . . . . . . . . . . . 104
7.3.3 Adaptation Training . . . . . . . . . . . . . . . . . . . . . 105
7.3.4 The Sequential Coherence Loss . . . . . . . . . . . . . . . 105
7.3.5 Training with the Adaptation Loss . . . . . . . . . . . . . 107
7.3.6 Fine-tuning and Inference . . . . . . . . . . . . . . . . . . 107
7.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.4.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . 109
7.4.2 Quantitative Results . . . . . . . . . . . . . . . . . . . . . 112
7.4.3 Further Analyses . . . . . . . . . . . . . . . . . . . . . . . 112
7.4.4 Human Evaluation Results . . . . . . . . . . . . . . . . . 115
7.4.5 Qualitative Results . . . . . . . . . . . . . . . . . . . . . . 116
7.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Chapter 8 Conclusion
8.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8.2 Future Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Bibliography ... 123
Summary (in Korean) ... 148
Acknowledgements ... 150
Dysphagia in Parkinson's Disease
Dysphagia is a frequent symptom in Parkinson's disease (PD) and the main cause of aspiration pneumonia and death in patients with PD. It is also associated with nutritional problems, pulmonary complications, and the quality of life of PD patients. Its prevalence in PD patients is very high, varying from 77% to 95%, but the exact pathophysiology and mechanism remain obscure. Dysphagia associated with PD has been reported to affect all stages of swallowing, including the oral, pharyngeal, and esophageal phases, but the oral and pharyngeal phases are more often abnormal than the esophageal phase. There are several treatment strategies for dysphagia in PD patients, such as rehabilitative treatment (including speech therapy and respiratory muscle strengthening), pharmacologic treatment, deep brain stimulation, and surgical treatment. However, the effects of these treatments are still limited, so an individualized and interdisciplinary approach is recommended.
A Study on Test-Time Adaptive Methodologies for Video Frame Interpolation
Thesis (Doctoral) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2021.
Computationally handling videos has been one of the foremost goals in computer vision. In particular, analyzing the complex dynamics, including motion and occlusion, between two frames is of fundamental importance in understanding the visual contents of a video. Research on video frame interpolation, a problem whose goal is to synthesize high-quality intermediate frames between two input frames, specifically investigates the low-level characteristics within the consecutive frames of a video. The topic has recently been gaining popularity and can be applied to various real-world applications such as generating slow-motion effects, novel view synthesis, or video stabilization. Existing methods for video frame interpolation aim to design complex new architectures to effectively estimate and compensate for the motion between two input frames. However, natural videos contain a wide variety of different scenarios, including foreground/background appearance and motion, frame rate, and occlusion. Therefore, even with a huge amount of training data, it is difficult for a single model to generalize well to all possible situations.
This dissertation introduces novel methodologies for test-time adaptation to tackle the problem of video frame interpolation. In particular, I propose to make three different aspects of the deep-learning-based framework adaptive: (1) feature activations, (2) network weights, and (3) architectural structures. Specifically, I first present how adaptively scaling the feature activations of a deep neural network with respect to each input frame using attention models allows for accurate interpolation. Unlike previous approaches that depend heavily on optical flow estimation models, the proposed channel-attention-based model can achieve high-quality frame synthesis without explicit motion estimation. Then, meta-learning is employed for fast adaptation of the parameter values of the frame interpolation models. By learning to adapt to each input video clip, the proposed framework can consistently improve the performance of many existing models with just a single gradient update to their parameters. Lastly, I introduce an input-adaptive dynamic architecture that can assign different inference paths to each local region of the input frames. By deciding the scaling factors of the inputs and the network depth of the early exit in the interpolation model, the dynamic framework can greatly improve computational efficiency while maintaining, and sometimes even surpassing, the performance of the baseline interpolation method.
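The single-gradient-update adaptation is described here only at a high level; as a minimal sketch of that general idea (and not the author's actual code or training setup), the snippet below adapts a hypothetical interpolation model on a self-supervised triplet taken from the test clip itself before synthesizing the unseen middle frame.

```python
# Minimal sketch of single-step test-time adaptation for frame interpolation.
# Assumption: `model(frame_a, frame_b)` returns the predicted middle frame.
import copy
import torch
import torch.nn.functional as F

def adapt_and_interpolate(model, frames, inner_lr=1e-4):
    """frames: consecutive test frames [f0, f1, f2], each (1, C, H, W)."""
    adapted = copy.deepcopy(model)            # keep the original weights intact
    optimizer = torch.optim.SGD(adapted.parameters(), lr=inner_lr)

    pred_f1 = adapted(frames[0], frames[2])   # reconstruct a frame we already have
    loss = F.l1_loss(pred_f1, frames[1])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                          # one gradient update on the test clip

    with torch.no_grad():
        return adapted(frames[1], frames[2])  # interpolate the actual unseen frame

# Toy usage with a stand-in model that maps two stacked frames to one frame.
class ToyInterpolator(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(6, 3, kernel_size=3, padding=1)

    def forward(self, a, b):
        return self.net(torch.cat([a, b], dim=1))

clip = [torch.rand(1, 3, 64, 64) for _ in range(3)]
print(adapt_and_interpolate(ToyInterpolator(), clip).shape)
```

Meta-learning enters by training the initial weights so that this one inner update is maximally useful, which is why a single step can already help at test time.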
The effectiveness of the proposed test-time adaptation methodologies is extensively evaluated with multiple benchmark datasets for video frame interpolation. Thorough ablation studies with various hyperparameter settings and baseline networks also demonstrate the superiority of adaptation to the test-time inputs, which is a new research direction orthogonal to the other state-of-the-art frame interpolation approaches.
Processing video data computationally is one of the important goals of computer vision, and it requires analyzing complex information such as motion and occlusion between two video frames. Video frame interpolation is the problem of accurately synthesizing the intermediate frame between two input frames, and it has been studied by analyzing the fine, pixel-level characteristics of consecutive frames while accounting for motion and occlusion. The field has recently received much attention because it can be used in various real-world applications such as generating slow-motion effects, synthesizing objects seen from other viewpoints, and video stabilization. Existing methods have focused on effectively estimating and compensating for pixel-level motion between the two input frames. Real videos, however, contain highly diverse conditions: the motion of various objects and complex backgrounds, the resulting occlusions, and frame rates that vary from video to video. Training a single model that generalizes well to every condition is therefore very difficult, even with a large amount of training data.
This dissertation proposes test-time adaptive methodologies for the video frame interpolation problem. In particular, three algorithms are presented that make a deep-learning-based framework adaptive in (1) its feature activations, (2) its model parameters, and (3) its network architecture. The first algorithm adaptively rescales the internal feature activations of the deep network for each input frame, and accurate interpolation is obtained by using attention models. Unlike most existing approaches, which extract pixel-level motion with an optical flow estimation model, the proposed channel-attention-based model can synthesize highly accurate intermediate frames without a separate motion model. The second algorithm uses meta-learning so that the parameter values of a frame interpolation model can be adapted for each input video sequence. By learning to update the parameters adaptively for each sequence, the proposed framework consistently improved the performance of any existing interpolation model with only a single gradient update. Finally, a framework in which the network structure changes dynamically with the input is proposed, so that each spatially partitioned region of the frame follows a different inference path and a substantial amount of unnecessary computation can be avoided. By adjusting the scale of the input frames and the depth of the interpolation model, the proposed dynamic network greatly increased computational efficiency while maintaining the performance of the baseline model.
The effectiveness of the three proposed adaptive methodologies was evaluated thoroughly on several benchmark datasets for video frame interpolation. In particular, comparison and ablation experiments with various hyperparameter settings and several baseline models demonstrated the effect of test-time adaptation. This is a new line of research that can be applied on top of the latest results in video frame interpolation, and further extensions in many directions are expected.
1 Introduction 1
1.1 Motivations 1
1.2 Proposed method 3
1.3 Contributions 5
1.4 Organization of dissertation 6
2 Feature Adaptation based Approach 7
2.1 Introduction 7
2.2 Related works 10
2.2.1 Video frame interpolation 10
2.2.2 Attention mechanism 12
2.3 Proposed Method 12
2.3.1 Overview of network architecture 13
2.3.2 Main components 14
2.3.3 Loss 16
2.4 Understanding our model 17
2.4.1 Internal feature visualization 18
2.4.2 Intermediate image reconstruction 21
2.5 Experiments 23
2.5.1 Datasets 23
2.5.2 Implementation details 25
2.5.3 Comparison to the state-of-the-art 26
2.5.4 Ablation study 36
2.6 Summary 38
3 Meta-Learning based Approach 39
3.1 Introduction 39
3.2 Related works 42
3.3 Proposed method 44
3.3.1 Video frame interpolation problem set-up 44
3.3.2 Exploiting extra information at test time 45
3.3.3 Background on MAML 48
3.3.4 MetaVFI: Meta-learning for frame interpolation 49
3.4 Experiments 54
3.4.1 Settings 54
3.4.2 Meta-learning algorithm selection 56
3.4.3 Video frame interpolation results 58
3.4.4 Ablation studies 66
3.5 Summary 69
4 Dynamic Architecture based Approach 71
4.1 Introduction 71
4.2 Related works 75
4.2.1 Video frame interpolation 75
4.2.2 Adaptive inference 76
4.3 Proposed Method 77
4.3.1 Dynamic framework overview 77
4.3.2 Scale and depth finder (SD-finder) 80
4.3.3 Dynamic interpolation model 82
4.3.4 Training 83
4.4 Experiments 85
4.4.1 Datasets 85
4.4.2 Implementation details 86
4.4.3 Quantitative comparison 87
4.4.4 Visual comparison 93
4.4.5 Ablation study 97
4.5 Summary 100
5 Conclusion 103
5.1 Summary of dissertation 103
5.2 Future works 104
Bibliography 107
Abstract (in Korean) 120
Comparison of the Macintosh Laryngoscope and the GlideScope Video Laryngoscope in a Cadaver Model of Foreign Body Airway Obstruction
Purpose: The GlideScope video laryngoscope (GL) has been known to help inexperienced health care providers manage even difficult airways. The purpose of this study was to compare the foreign body removal efficacy of the Macintosh laryngoscope (ML) and the GL in a setting of airway obstruction. Methods: Participants were asked to remove a simulated foreign body (a 2 × 2 cm rice cake) from the supraglottic area of a freshly embalmed cadaver. This simulated a normal airway and a difficult airway with cervical spine immobilization. Participants performed the removal maneuver 4 times in random order using a Magill forceps with both the ML and the GL. We measured the time to removal (sec) and the preference of the participant (5-point scale) and compared results according to the type of laryngoscope. Successful removal was defined as a removal time of less than 120 sec. Results: Forty participants were enrolled in this simulation experiment. The success rate, time to removal, and provider preference were not significantly different between the two types of laryngoscope. In a subgroup analysis of experienced providers, the time to removal was significantly shorter in the ML group than in the GL group (14 vs 20 sec, p<0.05). The preference of experienced providers was also significantly higher for the ML than for the GL. Conclusion: This study suggests that the ML has efficacy comparable to the GL for foreign body removal and is acceptable to experienced providers.
Cross-Layer Adaptive Streaming over HTTP
Thesis (Master's) -- Seoul National University Graduate School: Department of Computer Science and Engineering, February 2016.
Demand for video content has recently grown explosively with the spread of mobile devices, the expansion of internet network infrastructure, and the emergence of global video streaming services. As a result, streaming now accounts for more than 70% of all internet traffic, and this trend keeps intensifying. Against this background, HTTP-based adaptive streaming, which predicts the user's network conditions and provides the best possible quality accordingly, has been actively studied for several years, and MPEG (Moving Picture Experts Group), the working group for video and audio standardization, has standardized DASH (Dynamic Adaptive Streaming over HTTP). Various studies have tried to predict network conditions more accurately with client-side algorithms, but because DASH fundamentally operates over TCP-based HTTP, its performance is strongly bound to the characteristics of TCP. The primary goal of TCP, however, is reliable delivery, which can conflict with DASH's ultimate goal of user QoE (Quality of Experience). This thesis therefore proposes CLASH, a cross-layer adaptive streaming framework in which the DASH player at the application layer cooperates with the TCP layer on the client. On top of this framework, a system was implemented that minimizes the impact on streaming quality when problems that affect TCP, such as packet loss, occur, and the system was validated through experiments.
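DASH standardizes the media segments and manifest while leaving rate adaptation to the client; as a generic illustration of the kind of client-side logic such players use (not the CLASH implementation, which additionally exploits cross-layer signals from TCP), the sketch below picks a bitrate from an assumed ladder based on smoothed segment throughput.

```python
# Generic throughput-based DASH bitrate selection (illustrative values only).
AVAILABLE_BITRATES_KBPS = [400, 800, 1500, 3000, 6000]   # assumed bitrate ladder

def estimate_throughput_kbps(samples, alpha=0.8):
    """Exponentially weighted average of recent segment download rates."""
    est = samples[0]
    for s in samples[1:]:
        est = alpha * est + (1 - alpha) * s
    return est

def select_bitrate(throughput_samples, safety_factor=0.8):
    """Pick the highest rung that fits under the smoothed throughput estimate."""
    budget = estimate_throughput_kbps(throughput_samples) * safety_factor
    candidates = [b for b in AVAILABLE_BITRATES_KBPS if b <= budget]
    return candidates[-1] if candidates else AVAILABLE_BITRATES_KBPS[0]

# Example: measured download rates (kbps) of the last few segments.
print(select_bitrate([5200, 4800, 3900, 4100]))
```

The conflict the abstract describes arises because such application-layer estimates react slowly when TCP itself stalls on loss; the cross-layer idea is to feed lower-layer events directly into this selection loop.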
Chapter 1 Introduction 4
1.1 Research Background 4
1.2 Organization of the Thesis 6
Chapter 2 Limitations of Existing Work 7
2.1 Transport Protocol (TCP) 7
2.2 HTTP Adaptive Streaming Protocol (DASH) 8
Chapter 3 The CLASH Framework 11
3.1 Cross-Layer Streaming Considering User Conditions 11
3.2 Components and Operating Principles of CLASH 14
3.3 System Implementation 17
1. Packet Monitoring System 17
2. Client DASH Algorithm 19
3. Buffer Monitoring Technique 19
4. Loss Finding Technique Using Raw Sockets 20
Chapter 4 System Analysis and Evaluation 21
4.1 Experimental Environment 21
4.2 Performance Evaluation 23
4.3 Discussion of Results 27
Chapter 5 Conclusion 29
References 31
Abstract 33
Design and Evaluation of a Real-Time Telemedicine System Using RTP in a Wireless Environment
Biomedical Engineering Program / Master's course.
[Korean] Telemedicine based on real-time multimedia services over the Internet, which is free of environmental constraints, makes it possible to diagnose patients and take appropriate measures without restrictions of time and space, and thus has a positive effect on patient survival and recovery rates. For this purpose, emergency medical services must be provided comprehensively and simultaneously by various specialists. To transmit real-time video over the Internet, conditions such as sufficient channel bandwidth, low delay, and low packet loss must be guaranteed. The current Internet, however, provides no function at the network layer to satisfy the QoS (Quality of Service) required for video transmission, so the quality of service desired by the user must be guaranteed above the network layer. The Real-time Transport Protocol (RTP) and the Real-time Transport Control Protocol (RTCP), which operate above the transport layer, were proposed for this purpose; they can be understood as the place between the application layer and the transport layer where packet attributes are inserted. Using RTP and RTCP, the time constraints of video transmission can be taken into account and losses occurring in the network can be handled.
In this thesis, a real-time telemedicine system using RTP was designed and implemented. First, UDP-based RTP was applied to give real-time behavior to the medical data. When TCP is used on a wireless network, real-time delivery is not fully guaranteed, and because wireless link errors are misinterpreted as other errors such as congestion, bandwidth is wasted by unnecessary adjustments of the window size. UDP, on the other hand, has the drawbacks of unreliable delivery caused by unconditional transmission and of providing no QoS guarantee. For this reason, the system applies RTP on top of UDP, which can fully utilize the available bandwidth, to provide real-time behavior, and transmits important patient data with higher priority. Using this approach, the effect of guaranteeing real-time behavior below the application layer was confirmed. Second, a transmission rate control algorithm was proposed to respond actively to network conditions. RTCP was used to collect network performance information accurately and periodically: it gathers information such as the current bandwidth, delay, and packet error rate, and informs the sender of the maximum data size that can be delivered reliably. By analyzing this information to decide and apply a suitable transmission rate, bandwidth can be used stably, and by preventing the congestion caused by unconditional transmission, the system showed good performance in guaranteeing the quality of the patient video.
[English] For quick and correct treatment of disease, a real-time telemedicine system was designed that enables medical examination and treatment by several specialists simultaneously. The system must transmit good-quality data in real time over networks with differing bandwidths. The key issue of a real-time telemedicine system is the end-to-end delay constraint. Therefore, UDP (User Datagram Protocol) is more suitable for transporting multimedia data than TCP (Transmission Control Protocol), but UDP does not control network congestion or guarantee QoS (Quality of Service). Among the many research results that complement UDP are RTP (Real-time Transport Protocol) and RTCP (Real-time Transport Control Protocol). RTP and RTCP are mainly designed for use with UDP for multimedia transport over the Internet and provide media synchronization and network QoS feedback to compensate for the weaknesses of UDP. In this thesis, the RTP/RTCP over UDP protocol is implemented on a Windows PC system and integrated with an MPEG-4 video codec system. The performance of channel rate control algorithms for VBR (Variable Bit Rate) video transmission is compared on the implemented test system: a TCP-friendly rate control algorithm and a modified algorithm with a bounded minimum and maximum bitrate. It is shown that the modified algorithm provides a wider range of controlled bit rates, depending on the packet loss ratio and the throughput, than the TCP or UDP algorithms. We also designed and evaluated the real-time multimedia telemedicine system using RTP over CDMA 1X-EVDO. To evaluate this system, we designed the telemedicine system based on RTP, which can guarantee real-time transmission at the transport layer, something that cannot be guaranteed at the network layer, and then designed the RTCP protocol. The RTCP packets allow network traffic and packet loss rate to be analyzed; using RTCP reports, we can control QoS through transmission rate control or priority control. The performance of the proposed algorithm and system in terms of throughput variation, RTT (Round Trip Time), jitter, and PSNR (Peak Signal to Noise Ratio) was better than that of UDP or native RTP.
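The abstract mentions a TCP-friendly rate control algorithm modified with bounded minimum and maximum bitrates; as an illustration of what such a controller computes, the sketch below evaluates the standard TCP-friendly throughput equation (as in RFC 3448-style TFRC) from an RTCP-reported loss rate and round-trip time and then clamps the result, with the packet size, RTO approximation, and bounds chosen purely for illustration.

```python
# Illustrative TCP-friendly rate control with bounded output (not the thesis code).
# X = s / ( R*sqrt(2bp/3) + t_RTO * (3*sqrt(3bp/8)) * p * (1 + 32*p^2) )
import math

def tcp_friendly_rate(packet_size_bytes, rtt_s, loss_rate, b=1):
    """Allowed sending rate in bytes/s from the TFRC throughput equation."""
    if loss_rate <= 0:
        return float("inf")                      # no observed loss: unconstrained
    t_rto = 4 * rtt_s                            # common simplification for the RTO
    denom = (rtt_s * math.sqrt(2 * b * loss_rate / 3)
             + t_rto * 3 * math.sqrt(3 * b * loss_rate / 8)
             * loss_rate * (1 + 32 * loss_rate ** 2))
    return packet_size_bytes / denom

def bounded_bitrate_kbps(rtcp_loss, rtcp_rtt_s, min_kbps=64, max_kbps=768):
    """Clamp the TCP-friendly estimate to an assumed encoder bitrate range."""
    rate_kbps = tcp_friendly_rate(1000, rtcp_rtt_s, rtcp_loss) * 8 / 1000
    return max(min_kbps, min(max_kbps, rate_kbps))

# Example: 2% packet loss and 150 ms RTT reported via RTCP receiver reports.
print(round(bounded_bitrate_kbps(0.02, 0.150)))
```

Bounding the output keeps the sender within the range the video encoder can actually produce, which matches the "bounded minimum and maximum bitrate" modification the abstract compares against plain TFRC.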
A Study on the Post-Planar Tendencies in Park Hyun-ki's Early Works
Thesis (Master's) -- Seoul National University Graduate School: College of Fine Arts, Interdisciplinary Program in Art Management, August 2018.
This thesis analyzes the early works of Park Hyun-ki (1942-2000) produced from 1974 to 1985. Although Park Hyun-ki is called a pioneer of Korean video art and holds an important position in art history, research on him has been relatively scarce, and its scope has been skewed toward his video works. Accordingly, this study set out to broaden the range of interpretation of his oeuvre, focusing on the early, experimental period from 1974 to 1985, during which he attempted not only video but also various post-planar tendencies such as objet, environment, and performance.
To this end, the thesis first examines the situation of Korean art from the 1970s to the 1980s, the background against which Park Hyun-ki's work took shape. From the late 1960s, after Art Informel, movements arose in the Korean art world to embrace new art that departed from the flat surface of painting. From the 1970s, post-planar tendencies such as objet, environment, and performance were introduced in earnest by several experimental small groups, and these tendencies continued into the 1980s. From the mid-1970s, such tendencies also began to gather in the Daegu art scene, centered on the Daegu Contemporary Art Festival; this continued into the 1980s and influenced Park Hyun-ki, who was active in Daegu in his early period.
Against this background, the main body of the thesis examines Park Hyun-ki's early works from 1974 to 1985 in terms of three broad tendencies. By closely examining the works corresponding to objet, environment, and performance in the order in which they were presented, it considers the questions of incomplete perception and cognition that Park Hyun-ki sought to address. This study is therefore significant in that it examines Park Hyun-ki's body of work from a more diverse perspective.
Keywords: Park Hyun-ki, post-planar tendencies, objet, environment, performance, Daegu Contemporary Art Festival
Chapter 1 Introduction 1
1.1 Research Background and Purpose 1
1.2 Research Content 2
1.3 Prior Research 3
Chapter 2 The Formative Background of Park Hyun-ki's Work 6
2.1 Post-Planar Tendencies of the 1970s-80s 6
1. Objet 11
2. Environment 16
3. Performance 20
2.2 The Daegu Contemporary Art Festival and Park Hyun-ki 25
1. Post-Planar Tendencies of the 1970s-80s 25
2. Park Hyun-ki's Activities 30
Chapter 3 Analysis of Park Hyun-ki's Early Works 36
3.1 Objet 36
3.2 Environment 44
3.3 Performance 53
Chapter 4 Conclusion 61
References 64
List of Plates 72
Plates 77
Abstract 92
Study on Cold Air Flow Characteristics in a Domestic Refrigerator Using an Image Intensifier CCD Camera
Environmental concerns and restriction policies on energy efficiency require continuous improvements in refrigerator energy efficiency. Hence, it is very important to manufacture household refrigerators with a shroud design and cold air flow that are economically acceptable. This study was performed on the cold air flow characteristics in a domestic refrigerator whose freezer measures 400 mm × 410 mm × 670 mm and whose cold storage room measures 510 mm × 710 mm × 670 mm, with a fan. The domestic refrigerator is divided into two completely separate parts, one for cooling the freezer and the other for cooling the food room. This refrigerator has an evaporator, which is located in the freezer compartment. The experiments were conducted at three different internal temperatures in the freezer (5℃, 15℃, 30℃) and at two different internal temperatures in the cold storage room (10℃, 30℃). It was shown that the ratio of cold air distribution between the freezing room and the cold storage room was almost 7:3. This experimental investigation also indicates that the velocity of the cold air flow in the 40 mm, 80 mm, 370 mm, and 670 mm sections is faster than in the other sections.
The first objective of this work was to analyze the results to obtain insight into the effects of the variables and to assess the cold air flow characteristics at the three freezer temperatures and the two cold storage room temperatures. The second consideration was to reduce refrigerator energy consumption; in other words, this paper aims to improve the energy efficiency of the refrigerator.
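The table of contents below indicates that the flow field was measured with a PIV-style approach using the image intensifier CCD camera; purely as a generic illustration of the two-frame building block of such measurements (and not necessarily the exact particle tracking algorithm used in the study), the sketch below estimates the displacement of one interrogation window between two frames from the peak of their cross-correlation.

```python
# Generic two-frame PIV building block (illustrative only): find the displacement
# of an interrogation window by locating the cross-correlation peak.
import numpy as np
from scipy.signal import correlate2d

def window_displacement(frame_a, frame_b, y, x, size=32):
    """Return (dy, dx) displacement of the window at (y, x) from frame_a to frame_b."""
    win_a = frame_a[y:y + size, x:x + size].astype(float)
    win_b = frame_b[y:y + size, x:x + size].astype(float)
    win_a -= win_a.mean()                      # remove background intensity
    win_b -= win_b.mean()
    corr = correlate2d(win_b, win_a, mode="full")
    peak_y, peak_x = np.unravel_index(np.argmax(corr), corr.shape)
    return peak_y - (size - 1), peak_x - (size - 1)

# Toy check: a field shifted by (2, 3) pixels should be recovered.
rng = np.random.default_rng(1)
a = rng.random((128, 128))
b = np.roll(a, shift=(2, 3), axis=(0, 1))
print(window_displacement(a, b, 40, 40))
```

Dividing each image pair into many such windows yields the instantaneous velocity maps that the results chapter then averages over time.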
Table of Contents
Abstract = i
Nomenclature = iii
Chapter 1 Introduction = 1
Chapter 2 Experiments and Image Processing = 3
2.1 Experimental Apparatus = 3
2.2 Overview of PIV = 6
2.3 Configuration of the PIV System = 17
2.3.1 Illumination and Tracer Particles = 17
2.3.2 Image Input and Storage Devices = 19
2.3.3 Image Intensifier = 20
2.4 Image Processing = 22
2.4.1 Preprocessing = 22
2.4.2 Identical-Particle Tracking = 27
2.4.3 Postprocessing = 34
Chapter 3 Results and Discussion = 35
3.1 Instantaneous and Time-Averaged Velocity Distributions in the Freezer = 35
3.2 Instantaneous and Time-Averaged Velocity Distributions in the Cold Storage Room = 72
3.3 Mean Kinetic Energy Distribution = 88
3.4 Comparison of Velocity Distributions between the Freezer and the Cold Storage Room = 105
Chapter 4 Conclusion = 109
References = 111
Acknowledgements = 11
- …