11 research outputs found

    Feature transforms for image data augmentation

    Get PDF
    A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are evaluated for their effectiveness in creating ensembles of neural networks. The novelty of this research lies in considering different data augmentation strategies to generate the training sets from which several classifiers are trained and then combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with a different approach. We build ensembles on the data level by adding images generated by fourteen augmentation approaches, three of which (based on FT, RT, and DCT) are proposed here for the first time. Pretrained ResNet50 networks are fine-tuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produces classifiers that not only compete with the state-of-the-art but often surpass the best approaches reported in the literature.
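The abstract describes the FT/RT/DCT augmentations only at a high level. One common way to perturb an image in a transform domain, sketched below under the assumption of random coefficient dropout (the function name and the dropout scheme are illustrative, not the paper's exact recipe), is to take the forward transform, zero a fraction of the coefficients, and invert:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_augment(image, keep_fraction=0.9, rng=None):
    """Create a perturbed copy of a 2-D image by randomly zeroing a
    fraction of its DCT coefficients and transforming back."""
    rng = np.random.default_rng(rng)
    coeffs = dctn(image, norm="ortho")            # forward 2-D DCT
    mask = rng.random(coeffs.shape) < keep_fraction
    mask[0, 0] = True                             # always keep the DC term
    return idctn(coeffs * mask, norm="ortho")     # inverse transform

# Example: augment a random 8x8 patch
img = np.random.default_rng(1).random((8, 8))
aug = dct_augment(img, keep_fraction=0.5, rng=0)   # a perturbed variant
full = dct_augment(img, keep_fraction=1.0, rng=0)  # keeps every coefficient
```

With `keep_fraction=1.0` the orthonormal DCT/IDCT pair reconstructs the input exactly, which is a useful sanity check; smaller fractions yield progressively stronger perturbations.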

    Tree Mapping and Species Detection at the City Scale Using Multiple Sensing Platforms and Deep Learning

    Get PDF
    Thesis (Master's) -- Seoul National University Graduate School: College of Agriculture and Life Sciences, Department of Landscape Architecture and Rural Systems Engineering (Landscape Ecology), February 2023. Advisor: Youngryel Ryu. Precise estimation of the number of trees and individual tree locations with species information across a city forms a solid foundation for enhancing ecosystem services. However, mapping individual trees at the city scale remains challenging due to the heterogeneous patterns of urban tree distribution. Here, we present a novel framework that merges multiple sensing platforms and leverages various deep neural networks to produce a fine-grained urban tree map. We mapped trees and detected species relying only on RGB images taken by multiple sensing platforms (airborne, citizen and vehicle), which fueled six deep learning models. We divided the entire process into three steps, since each platform has its own strengths. First, we produced individual tree location maps by converting the central points of detection bounding boxes from airborne imagery into actual coordinates. Since many trees were obscured by building shadows, we applied a Generative Adversarial Network (GAN) to delineate hidden trees in the airborne images. Second, we used tree bark photos collected by citizens for species mapping in urban parks and forests. Species information for all tree bark photos was classified automatically after non-tree parts of the images were segmented out. Third, we classified the species of roadside trees using a camera mounted on a car, augmenting our species mapping framework with street-level tree data. We estimated the distance from the car to street trees from the number of lanes detected in the images. Finally, we assessed our results by comparing them with Light Detection and Ranging (LiDAR), GPS and field data. We estimated that over 1.2 million trees exist in the 121.04 kmΒ² city and generated more accurate individual tree positions than conventional field survey methods.
Among them, we detected the species of more than 63,000 trees. The most frequently detected species was Prunus yedoensis (21.43 %), followed by Ginkgo biloba (19.44 %), Zelkova serrata (18.68 %), Pinus densiflora (7.55 %) and Metasequoia glyptostroboides (5.97 %). Comprehensive experimental results demonstrate that tree bark photos and street-level imagery taken by citizens and vehicles are conducive to delivering accurate and quantitative information on the distribution of urban tree species. Contents: 1. Introduction; 2. Methodology (Data collection; Deep learning overall; Tree counting and mapping; Tree species detection; Evaluation); 3. Results (Evaluation of deep learning performance; Tree counting and mapping; Tree species detection); 4. Discussion (Multiple sensing platforms for urban areas; Potential of citizen and vehicle sensors; Implications); 5. Conclusion; Bibliography; Abstract in Korean.

    Two and three dimensional segmentation of multimodal imagery

    Get PDF
    The role of segmentation in the realms of image understanding/analysis, computer vision, pattern recognition, remote sensing and medical imaging has been significantly augmented in recent years by accelerated scientific advances in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for facilitating meaningful segregation of 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this effect, using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition some initial portion of the input image content. Pixels with higher gradient densities are included through the dynamic generation of segments as the algorithm progresses, producing an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture and intensity information, along with the aforementioned initial partition map, are used in a multivariate refinement procedure that fuses groups with similar characteristics to yield the final output segmentation. Experimental results obtained in comparison to published/state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery demonstrate the advantages of the proposed method. Furthermore, to improve computational efficiency, we propose an extension of the aforementioned methodology in a multi-resolution framework, demonstrated on color images.
Finally, this research also encompasses a 3-D extension of the aforementioned algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes.
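The initial edge-free grouping described above can be pictured as follows. This sketch uses a plain grayscale gradient magnitude and connected-component labelling rather than the vector (multichannel) gradient detector the dissertation employs; the function name and threshold are illustrative:

```python
import numpy as np
from scipy import ndimage

def initial_region_map(image, grad_thresh=0.1):
    """Label connected groups of low-gradient ("edge-free") pixels,
    leaving high-gradient pixels unassigned (label 0) for later
    inclusion as segments are grown."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    smooth = magnitude < grad_thresh           # pixels without edges
    labels, n_regions = ndimage.label(smooth)  # 0 marks edge pixels
    return labels, n_regions

# Example: two flat regions separated by a vertical step edge
img = np.zeros((10, 10))
img[:, 5:] = 1.0
labels, n = initial_region_map(img)  # two labeled regions, edge left at 0
```

In the full framework the unlabeled high-gradient band between the two regions would then be absorbed during the refinement stage.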

    Study on Co-occurrence-based Image Feature Analysis and Texture Recognition Employing Diagonal-Crisscross Local Binary Pattern

    Get PDF
    In this thesis, we focus on several important aspects of real-world image texture analysis and recognition. We survey various features that are suitable for texture analysis. Apart from the variety of features, different types of texture datasets are also discussed in depth. No previous work thoroughly covers the important databases and analyzes them from various viewpoints. Drawing on many references, we categorize texture databases into a few basic groups and place related datasets within them. Next, we exhaustively analyze eleven second-order statistical features, or cues, based on co-occurrence matrices to understand image texture surfaces. These features are exploited to analyze properties of image texture and are also categorized based on their angular orientations and their applicability. Finally, we propose a method called the diagonal-crisscross local binary pattern (DCLBP) for texture recognition, along with two other extensions of the local binary pattern. Compared to the local binary pattern and a few other extensions, our proposed method performs well on two very challenging benchmark datasets: the KTH-TIPS (Textures under varying Illumination, Pose and Scale) database and the USC-SIPI (University of Southern California - Signal and Image Processing Institute) Rotations Texture dataset. Doctoral dissertation, Kyushu Institute of Technology; degree number ε·₯εšη”²η¬¬354号; conferred 27 September 2013. Chapters: 1 Introduction; 2 Features for Texture Analysis; 3 In-depth Analysis of Texture Databases; 4 Analysis of Features Based on Co-occurrence Image Matrix; 5 Categorization of Features Based on Co-occurrence Image Matrix; 6 Texture Recognition Based on Diagonal-Crisscross Local Binary Pattern; 7 Conclusions and Future Work.
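DCLBP itself is the thesis's contribution and is not specified in the abstract; as background, the classic 3x3 local binary pattern it extends can be sketched as follows (a minimal grayscale version, with illustrative names):

```python
import numpy as np

def lbp_image(gray):
    """Classic 3x3 local binary pattern: each interior pixel is encoded
    by thresholding its 8 neighbours against the centre value and
    packing the results into an 8-bit code."""
    g = np.asarray(gray, dtype=float)
    center = g[1:-1, 1:-1]
    # neighbour offsets, clockwise from top-left, with their bit positions
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(center, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = g[1 + dy: g.shape[0] - 1 + dy, 1 + dx: g.shape[1] - 1 + dx]
        code |= (neigh >= center).astype(np.uint8) << bit
    return code

# A perfectly flat patch: every neighbour ties the centre, so all bits set
flat = np.full((5, 5), 7.0)
codes = lbp_image(flat)  # 3x3 array of the all-ones code 255
```

A texture descriptor is then typically the histogram of these codes; DCLBP changes which neighbour pairs are compared, not this overall pipeline.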

    Varied Image Data Augmentation Methods for Building Ensemble

    Get PDF
    Convolutional Neural Networks (CNNs) are used in many domains, but their need for large datasets to train robustly without overfitting makes them hard to apply in medical and similar fields. When large quantities of samples cannot be easily collected, various methods can still be applied to mitigate the problem, depending on the sample type. Among these, data augmentation has recently been in the spotlight, mostly because of the simplicity and effectiveness of some of the more widely adopted methods. The research question addressed in this work is whether data augmentation techniques can help develop robust and efficient machine learning systems for classification purposes in different domains. To that end, we introduce new image augmentation techniques that make use of the Fourier Transform (FT), Discrete Cosine Transform (DCT), Radon Transform (RT), Hilbert Transform (HT), Singular Value Decomposition (SVD), Local Laplacian Filters (LLF) and the Hampel filter (HF). We define different ensemble methods by combining various classical data augmentation methods with the newer ones presented here. We performed an extensive empirical evaluation on 15 different datasets to validate our proposal. The obtained results show that the newly proposed data augmentation methods can be very effective even when used alone, and that ensembles trained with different augmentation methods can outperform some of the best approaches reported in the literature as well as compete with state-of-the-art custom methods. All resources are available at https://github.com/LorisNanni.
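The abstract does not specify how the ensemble members are combined; a common choice for fusing classifiers trained on differently augmented data is score-level averaging (the sum rule), sketched here with illustrative toy scores:

```python
import numpy as np

def sum_rule_fusion(score_matrices):
    """Fuse per-classifier score matrices (n_samples x n_classes),
    e.g. softmax outputs, by averaging them and taking the argmax class."""
    fused = np.mean(score_matrices, axis=0)
    return fused.argmax(axis=1)

# Two toy classifiers scoring 3 samples over 2 classes
clf_a = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
clf_b = np.array([[0.7, 0.3], [0.7, 0.3], [0.1, 0.9]])
pred = sum_rule_fusion([clf_a, clf_b])  # β†’ array([0, 0, 1])
```

On the middle sample the two classifiers disagree, and the average tips the decision toward the more confident one, which is the usual motivation for sum-rule fusion.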

    UAVs for the Environmental Sciences

    Get PDF
    This book gives an overview of the usage of UAVs in environmental sciences covering technical basics, data acquisition with different sensors, data processing schemes and illustrating various examples of application

    Food Recognition and Volume Estimation in a Dietary Assessment System

    Full text link
    Recently, obesity has become an epidemic and one of the most serious worldwide public health concerns of the 21st century. Obesity diminishes average life expectancy, and there is now convincing evidence that poor diet, in combination with physical inactivity, is a key determinant of an individual's risk of developing chronic diseases such as cancer, cardiovascular disease or diabetes. Assessing what people eat is fundamental to establishing the link between diet and disease. Food records are considered the best approach for assessing energy intake. However, this method requires literate and highly motivated subjects. This is a particular problem for adolescents and young adults, who are the least likely to undertake food records. The ready access of the majority of the population to mobile phones (with integrated camera, improved memory capacity, network connectivity and faster processing capability) has opened up new opportunities for dietary assessment. The dietary information extracted from dietary assessment provides valuable insights into the causes of disease, which greatly helps practicing dietitians and researchers develop approaches for mounting intervention programs for prevention. In such systems, the camera in the mobile phone is used for capturing images of food consumed, and these images are then processed to automatically estimate the nutritional content of the food. However, food objects are deformable objects that exhibit variations in appearance, shape, texture and color, so the food classification and volume estimation in these systems suffer from lower accuracy. Improving food recognition and volume estimation accuracy is a challenging task. This thesis presents new techniques for food classification and food volume estimation. For food recognition, emphasis was given to texture features.
Existing food recognition techniques assume that food images will be viewed at similar scales and from the same viewpoints. However, this assumption fails in practical applications, because it is difficult to ensure that a user of a dietary assessment system will hold the camera at the same scale and orientation as the target food images in the database. A new scale- and rotation-invariant feature generation approach based on Gabor filter banks is proposed. To obtain scale and rotation invariance, the proposed approach identifies the dominant orientation of the filtered coefficients and applies a circular shifting operation to place this value at the first scale of the dominant direction. The advantages of this technique are that it does not require the scale factor to be known in advance and that it is scale- and rotation-invariant both separately and concurrently. This approach is further modified to improve accuracy by applying a Gaussian window along the scale dimension, which reduces the impact of the high and low frequencies of the filter outputs, enabling better matching within the same class. Besides fully automatic classification, semi-automatic classification and group classification are also considered to gauge the improvement. To estimate the volume of a food item, a stereo pair is used to recover the structure as a 3D point cloud. A slice-based volume estimation approach is proposed that converts the 3D point cloud into a series of 2D slices. The proposed approach eliminates the need to know the distance between the two cameras by using disparities and depth information from a fiducial marker. The experimental results show that the proposed approach can provide an accurate estimate of food volume.
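The circular-shift idea for rotation invariance can be illustrated on a one-dimensional vector of per-orientation filter energies; the thesis applies it to full Gabor filter-bank responses across scales and orientations, so this reduction (and its names) is only a sketch of the alignment step:

```python
import numpy as np

def rotation_invariant_descriptor(orientation_energies):
    """Circularly shift a vector of per-orientation filter responses so
    that the dominant orientation comes first; two rotated versions of
    the same texture then yield the same descriptor."""
    e = np.asarray(orientation_energies, dtype=float)
    return np.roll(e, -int(np.argmax(e)))

# A texture and a "rotated" copy (responses shifted by two orientation bins)
base = np.array([0.2, 0.9, 0.4, 0.1, 0.3, 0.5])
rotated = np.roll(base, 2)
d1 = rotation_invariant_descriptor(base)
d2 = rotation_invariant_descriptor(rotated)  # identical to d1
```

Because rotating the input only cyclically permutes the orientation responses, aligning each vector on its own maximum cancels the rotation before any matching is done.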

    Abstracts on Radio Direction Finding (1899 - 1995)

    Get PDF
    The files on this record represent the various databases that originally composed the CD-ROM issue of the "Abstracts on Radio Direction Finding" database, which is now part of the Dudley Knox Library's Abstracts and Selected Full Text Documents on Radio Direction Finding (1899 - 1995) Collection. (See Calhoun record https://calhoun.nps.edu/handle/10945/57364 for further information on this collection and the bibliography.) Due to issues of technological obsolescence preventing current and future audiences from accessing the bibliography, DKL exported and converted the various databases contained in the CD-ROM into the three files on this record. The contents of these files are: 1) RDFA_CompleteBibliography_xls.zip [RDFA_CompleteBibliography.xls: Metadata for the complete bibliography, in Excel 97-2003 Workbook format; RDFA_Glossary.xls: Glossary of terms, in Excel 97-2003 Workbook format; RDFA_Biographies.xls: Biographies of leading figures, in Excel 97-2003 Workbook format]; 2) RDFA_CompleteBibliography_csv.zip [RDFA_CompleteBibliography.TXT: Metadata for the complete bibliography, in CSV format; RDFA_Glossary.TXT: Glossary of terms, in CSV format; RDFA_Biographies.TXT: Biographies of leading figures, in CSV format]; 3) RDFA_CompleteBibliography.pdf: A human-readable display of the bibliographic data, as a means of double-checking any possible deviations due to conversion.