16 research outputs found

    Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation

    Full text link
    Remote sensing (RS) image retrieval is of great significance for geological information mining. Over the past two decades, a large amount of research on this task has been carried out, focusing mainly on three core issues: feature extraction, similarity metrics, and relevance feedback. Due to the complexity and diversity of ground objects in high-resolution remote sensing (HRRS) images, there is still room for improvement in current retrieval approaches. In this paper, we analyze the three core issues of RS image retrieval and provide a comprehensive review of existing methods. Furthermore, with the goal of advancing the state of the art in HRRS image retrieval, we focus on the feature extraction issue and investigate how powerful deep representations can be used to address this task. We conduct a systematic investigation of the factors that may affect the performance of deep features. By optimizing each factor, we achieve remarkable retrieval results on publicly available HRRS datasets. Finally, we explain the experimental phenomena in detail and draw conclusions from our analysis. Our work can serve as a guide for research on content-based RS image retrieval.
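The similarity-metric stage named among the three core issues can be sketched in a few lines. As an illustrative assumption (not this paper's specific method), deep features are compared by cosine similarity and the database is ranked accordingly:

```python
import numpy as np

def retrieve(query_feat, db_feats, top_k=5):
    """Rank database images by cosine similarity to the query feature."""
    # L2-normalise so that a dot product equals cosine similarity.
    q = query_feat / np.linalg.norm(query_feat)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = db @ q                      # one similarity score per database image
    return np.argsort(-sims)[:top_k]   # indices of the most similar images

# Toy example: 4 database "features"; entry 2 points the same way as the query.
db = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]])
print(retrieve(np.array([0.7, 0.7]), db, top_k=2))
```

The same ranking function works for any global descriptor, which is why feature extraction and the similarity metric can be studied independently.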

    Aggregated Deep Local Features for Remote Sensing Image Retrieval

    Get PDF
    Remote sensing image retrieval remains a challenging topic due to the special nature of remote sensing imagery. Such images contain many different semantic objects, which clearly complicates the retrieval task. In this paper, we present an image retrieval pipeline that uses attentive, local convolutional features and aggregates them using the Vector of Locally Aggregated Descriptors (VLAD) to produce a global descriptor. We study various system parameters, such as multiplicative and additive attention mechanisms and descriptor dimensionality. We propose a query expansion method that requires no external inputs. Experiments demonstrate that, even without training, the local convolutional features and global representation outperform other systems. After system tuning, we achieve state-of-the-art or competitive results. Furthermore, we observe that our query expansion method increases overall system performance by about 3%, using only the top three retrieved images. Finally, we show how dimensionality reduction produces compact descriptors with increased retrieval performance and fast retrieval computation times, e.g. 50% faster than current systems. Comment: Published in Remote Sensing. The first two authors contributed equally.
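The VLAD aggregation step described above can be illustrated with a minimal NumPy sketch. The codebook, descriptor sizes, and data here are toy assumptions, not the paper's configuration:

```python
import numpy as np

def vlad(local_descs, centroids):
    """Aggregate local descriptors into a VLAD global descriptor.

    For each local descriptor, accumulate its residual from the nearest
    codebook centroid, then flatten and L2-normalise the result.
    """
    k, d = centroids.shape
    # Assign each local descriptor to its nearest centroid.
    dists = np.linalg.norm(local_descs[:, None, :] - centroids[None, :, :], axis=2)
    assign = np.argmin(dists, axis=1)
    v = np.zeros((k, d))
    for desc, c in zip(local_descs, assign):
        v[c] += desc - centroids[c]            # residual accumulation
    v = v.flatten()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v               # global descriptor of size k*d

centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
descs = np.array([[1.0, 0.0], [9.0, 10.0], [11.0, 10.0]])
print(vlad(descs, centroids).shape)  # (4,)
```

Note how the two residuals around the second centroid cancel, leaving only the first cell nonzero; in practice the local descriptors would come from the attentive convolutional features the paper describes.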

    Deep learning in remote sensing: a review

    Get PDF
    Standing at the paradigm shift towards data-intensive science, machine learning techniques are becoming increasingly important. In particular, as a major breakthrough in the field, deep learning has proven to be an extremely powerful tool in many fields. Shall we embrace deep learning as the key to all? Or should we resist a 'black-box' solution? There are controversial opinions in the remote sensing community. In this article, we analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources to make deep learning in remote sensing ridiculously simple to start with. More importantly, we advocate that remote sensing scientists bring their expertise into deep learning and use it as an implicit general model to tackle unprecedented, large-scale, influential challenges such as climate change and urbanization. Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine.

    Learning Photo Retrieval through the In-App Search Function of Google Photos

    Get PDF
    Master's thesis, Seoul National University, Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Digital Contents and Information Studies), August 2019. Advisor: 이쀑식. The practice of retrieving photos on personal smartphones has expanded: pictures are no longer only browsed by scrolling up and down, but can also be found simply by typing a keyword. Object recognition technology has changed how people view and browse personal photos; it not only groups similar photos but also assigns labels that describe what each group depicts. Google Photos, for instance, applies object recognition and search to help users manage their personal photos. This novel way of searching a personal album is expected to change how people retrieve a particular photo from the thousands accumulated in their cloud storage. However, the technology is at an early stage and often fails to leave a positive impression on users: there is a gap between the object recognition performed by the device and the user's own perception. When a query is typed into the search bar, for instance, the result is either empty or a countless number of matches. The purpose of this study is to identify the points of inconvenience in smartphone photo albums that use object recognition and to deliver a better photo search user experience. Previous studies and preliminary research were thoroughly reviewed to understand the inner workings of object recognition and to build a general picture of how people use personal photo search systems. In the main study, six photo search tasks per day were carried out for a week by a total of 16 participants in their 20s and 30s. Search strategy tips were given to the participants in order to observe which strategies they used when searching for photos. After a total of 672 search tasks and the strategies used in them were collected, a post-questionnaire followed.
As a result of the analysis, a learning process was observed as users worked with the search system, and the study identified how users learned the photo search functions through the provided strategies. Over the 42 retrieval tasks, the average retrieval time of the 16 users gradually decreased: compared to the first day, the average retrieval time on the last day fell by 31%, from 51 seconds to 35 seconds. The average search success rate also rose by about 11% over the 42 tasks performed during the week, and the participants' average number of search attempts decreased by 28%. As experience with photo retrieval accumulated alongside the provided strategies, both learning and improved retrieval were confirmed in photo search based on object recognition. Regarding individual learning patterns, 12 of the 16 participants showed evidence of learning, 3 showed no learning, and the remaining participant appeared unaffected by learning. Where learning did not occur or had no effect, it can be inferred that the difference depends on which strategies were used in the initial searches and how participants adapted to the search function. Finally, the most commonly used of the provided strategies, applied in 44.35% of all searches, was 'use the correct name, choosing between high-level words (abstract concepts) and low-level words (concrete concepts)'. It was followed by 'use a comma (,)', 'use search terms that appear on the screen, such as colors', and 'use person categories (women, men)'. Strategies that users formed on their own accounted for 12.20% of the total, and their use did not change over time. Among these user-made strategies, 'use administrative area names' accounted for 39.47%, followed by 'use the automatic classification of people' and 'use buildings'.
Among the strategies that emerged, 'word choice tailored to Google Photos' showed that users recognized the characteristics of object recognition through experience and formed search terms by predicting its behavior. In other words, as experience with object-recognition-based photo retrieval accumulates, users develop an understanding of its characteristics. Through this analysis, the study examined where object recognition technology causes difficulty when used for search in a smartphone photo album, and added brief suggestions on how to address it. This study approached the difficulties of using object recognition in smartphone photo albums from the user's perspective. From an HCI (Human-Computer Interaction) standpoint, it focused on how strategies derived from the device's point of view are accepted and transformed by users. It is also meaningful in that it attempted to observe, through an experiment, the interaction that occurs when object recognition research, which had concentrated only on improving recognition accuracy, is put in front of actual users. Finally, it is meaningful in that it discussed ways to employ object recognition for the usefulness and sustainable use of photography as a medium.

    Learning Low Dimensional Convolutional Neural Networks for High-Resolution Remote Sensing Image Retrieval

    No full text
    Learning powerful feature representations for image retrieval has always been a challenging task in the field of remote sensing. Traditional methods focus on extracting low-level hand-crafted features, which are not only time-consuming to design but also tend to achieve unsatisfactory performance due to the complexity of remote sensing images. In this paper, we investigate how to extract deep feature representations based on convolutional neural networks (CNNs) for high-resolution remote sensing image retrieval (HRRSIR). To this end, several effective schemes are proposed to generate powerful feature representations for HRRSIR. In the first scheme, a CNN pre-trained on a different problem is treated as a feature extractor, since there are no sufficiently sized remote sensing datasets to train a CNN from scratch. In the second scheme, we investigate learning features that are specific to our problem, first by fine-tuning the pre-trained CNN on a remote sensing dataset and then by proposing a novel CNN architecture based on convolutional layers and a three-layer perceptron. The novel CNN has fewer parameters than the pre-trained and fine-tuned CNNs and can learn low-dimensional features from limited labelled images. The schemes are evaluated on several challenging, publicly available datasets. The results indicate that the proposed schemes, particularly the novel CNN, achieve state-of-the-art performance.
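The paper learns its low-dimensional descriptors with a custom CNN; as a generic stand-in for that idea, a PCA-style projection shows how high-dimensional deep features can be compressed into compact retrieval descriptors. The dimensions and random data below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def pca_compress(feats, out_dim):
    """Project high-dimensional descriptors onto a compact subspace."""
    mean = feats.mean(axis=0)
    centered = feats - mean
    # Principal directions come from the SVD of the centered feature matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = vt[:out_dim].T                      # d x out_dim projection matrix
    return centered @ proj, mean, proj         # compact descriptors + model

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 64))             # 100 descriptors, 64-D each
compact, mean, proj = pca_compress(feats, 8)
print(compact.shape)  # (100, 8)
```

A query descriptor is mapped with the same `mean` and `proj` before matching, so database and query features live in the same compact space.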

    Toward Global Localization of Unmanned Aircraft Systems using Overhead Image Registration with Deep Learning Convolutional Neural Networks

    Get PDF
    Global localization, in which an unmanned aircraft system (UAS) estimates its unknown current location without access to its take-off location or other locational data from its flight path, is a challenging problem. This research brings together aspects of the remote sensing, geoinformatics, and machine learning disciplines by framing the global localization problem as a geospatial image registration problem in which overhead aerial and satellite imagery serve as a proxy for UAS imagery. A literature review is conducted covering the use of deep learning convolutional neural networks (DLCNNs) for global localization and other related geospatial imagery applications. Differences between geospatial imagery taken from the overhead perspective and terrestrial imagery are discussed, as well as difficulties in using geospatial overhead imagery for image registration due to a lack of suitable machine learning datasets. Geospatial analysis is conducted to identify suitable areas for future UAS imagery collection. One of these areas, Jerusalem northeast (JNE), is selected as the area of interest (AOI) for this research. Multi-modal, multi-temporal, and multi-resolution geospatial overhead imagery is aggregated from a variety of publicly available sources and processed to create a controlled image dataset called Jerusalem northeast rural controlled imagery (JNE RCI). JNE RCI is tested on coarse-grained image registration with the handcrafted feature-based methods SURF and SIFT and with a non-handcrafted, pre-trained, fine-tuned VGG-16 DLCNN. Both the handcrafted and non-handcrafted feature-based methods had difficulty with the coarse-grained registration process. The format of JNE RCI is determined to be unsuitable for coarse-grained registration with DLCNNs, and the process of creating a new supervised machine learning dataset, Jerusalem northeast machine learning (JNE ML), is covered in detail. 
    A multi-resolution grid-based approach is used, where each grid cell ID is treated as the supervised training label for that respective resolution. Pre-trained fine-tuned VGG-16 DLCNNs, two custom-architecture two-channel DLCNNs, and a custom chain DLCNN are trained on JNE ML for each spatial resolution of subimages in the dataset. All DLCNNs used could more accurately coarsely register the JNE ML subimages than the pre-trained fine-tuned VGG-16 DLCNN on JNE RCI. This shows that the process for creating JNE ML is valid and that the dataset is suitable for applying machine learning to the coarse-grained registration problem. All custom-architecture two-channel DLCNNs and the custom chain DLCNN were able to more accurately coarsely register the JNE ML subimages than the fine-tuned pre-trained VGG-16 approach. Both the two-channel custom DLCNNs and the chain DLCNN generalized well to new imagery that these networks had not previously trained on. Through the contributions of this research, a foundation is laid for future work on the UAS global localization problem within the rural forested JNE AOI.
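The grid-cell labeling scheme described above can be sketched in a few lines: the AOI is split into a regular grid and each coordinate is mapped to a cell ID that serves as the classification label. The bounding box, cell count, and the `grid_cell_id` helper below are illustrative assumptions, not the thesis's actual implementation:

```python
def grid_cell_id(x, y, bounds, n_cells):
    """Map a coordinate inside an AOI to a grid-cell ID used as a class label.

    bounds = (min_x, min_y, max_x, max_y); the AOI is split into an
    n_cells x n_cells grid, and cells are numbered row-major from 0.
    """
    min_x, min_y, max_x, max_y = bounds
    # Clamp so points exactly on the upper edge fall in the last cell.
    col = min(int((x - min_x) / (max_x - min_x) * n_cells), n_cells - 1)
    row = min(int((y - min_y) / (max_y - min_y) * n_cells), n_cells - 1)
    return row * n_cells + col

bounds = (0.0, 0.0, 1.0, 1.0)                  # unit-square AOI for illustration
print(grid_cell_id(0.3, 0.8, bounds, 4))       # row 3, col 1 -> cell 13
```

Repeating this at several grid resolutions yields one label per resolution for each subimage, which is how a multi-resolution hierarchy of classifiers can be supervised.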