11 research outputs found
Feature transforms for image data augmentation
A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose some new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are considered in order to quantify their effectiveness in creating ensembles of neural networks. The novelty of this research is to consider different strategies for data augmentation to generate training sets from which to train several classifiers that are combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with a different approach. We build ensembles at the data level by adding images generated by combining fourteen augmentation approaches, three of which (based on FT, RT, and DCT) are proposed here for the first time. Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles at the data level by combining different data augmentation methods produces classifiers that not only compete with the state of the art but often surpass the best approaches reported in the literature.
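The DCT-based augmentation idea described above can be sketched roughly as follows. This is a minimal illustration only, assuming a per-channel 2-D DCT whose coefficients are perturbed with small random noise before inverting; the function name, noise model and parameters are hypothetical and not taken from the paper:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_augment(image, rng=None, noise_scale=0.05):
    """Perturb an image in the DCT domain and transform back.

    A hypothetical sketch of DCT-domain augmentation: random noise,
    scaled to the mean coefficient magnitude, is added to the DCT
    coefficients of each colour channel.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(image.shape, dtype=np.float64)
    for c in range(image.shape[2]):  # process each colour channel
        coeffs = dctn(image[..., c].astype(np.float64), norm="ortho")
        coeffs += rng.normal(0.0, noise_scale * np.abs(coeffs).mean(),
                             size=coeffs.shape)
        out[..., c] = idctn(coeffs, norm="ortho")
    return np.clip(out, 0, 255).astype(image.dtype)
```

Each call produces a slightly different image, so repeated calls over the training set yield an augmented set of the kind the ensemble is built from.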
Urban-scale tree mapping and species detection using multiple sensing platforms and deep learning
Thesis (Master's) -- Seoul National University Graduate School, Department of Landscape Architecture and Rural Systems Engineering, February 2023.
Precise estimation of the number of trees and individual tree locations, with species information, across an entire city forms a solid foundation for enhancing ecosystem services. However, mapping individual trees at the city scale remains challenging due to the heterogeneous patterns of urban tree distribution. Here, we present a novel framework that merges multiple sensing platforms while leveraging various deep neural networks to produce a fine-grained urban tree map. We mapped trees and detected species relying only on RGB images taken by multiple sensing platforms (airborne, citizen and vehicle), which fueled six deep learning models. We divided the entire process into three steps, since each platform has its own strengths. First, we produced individual tree location maps by converting the central points of bounding boxes detected in airborne imagery into actual coordinates. Since many trees were obscured by building shadows, we applied a Generative Adversarial Network (GAN) to delineate hidden trees in the airborne images. Second, we selected tree bark photos collected by citizens for species mapping in urban parks and forests. Species information for all tree bark photos was automatically classified after non-tree parts of the images were segmented. Third, we classified the species of roadside trees using a camera mounted on a car, augmenting our species mapping framework with street-level tree data. We estimated the distance from the car to street trees from the number of lanes detected in the images. Finally, we assessed our results by comparing them with Light Detection and Ranging (LiDAR), GPS and field data. We estimated that over 1.2 million trees exist in the 121.04 km² city and generated more accurate individual tree positions, outperforming conventional field survey methods. Among them, we detected the species of more than 63,000 trees.
The most frequently detected species was Prunus yedoensis (21.43 %), followed by Ginkgo biloba (19.44 %), Zelkova serrata (18.68 %), Pinus densiflora (7.55 %) and Metasequoia glyptostroboides (5.97 %). Comprehensive experimental results demonstrate that tree bark photos and street-level imagery taken by citizens and vehicles are conducive to delivering accurate and quantitative information on the distribution of urban tree species.
1. Introduction
2. Methodology
2.1. Data collection
2.2. Deep learning overall
2.3. Tree counting and mapping
2.4. Tree species detection
2.5. Evaluation
3. Results
3.1. Evaluation of deep learning performance
3.2. Tree counting and mapping
3.3. Tree species detection
4. Discussion
4.1. Multiple sensing platforms for urban areas
4.2. Potential of citizen and vehicle sensors
4.3. Implications
5. Conclusion
Bibliography
Abstract in Korean
Two and three dimensional segmentation of multimodal imagery
The role of segmentation in the realms of image understanding/analysis, computer vision, pattern recognition, remote sensing and medical imaging has in recent years been significantly augmented by accelerated scientific advances in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for facilitating meaningful segregation of 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this effect, using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition some initial portion of the input image content. Pixels that contain higher gradient densities are included by the dynamic generation of segments as the algorithm progresses to generate an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture and intensity information, along with the aforementioned initial partition map, are used in a multivariate refinement procedure to fuse groups with similar characteristics, yielding the final output segmentation. Experimental results obtained in comparison to published state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery demonstrate the advantages of the proposed method. Furthermore, to achieve improved computational efficiency, we propose an extension of the aforementioned methodology in a multi-resolution framework, demonstrated on color images.
Finally, this research also encompasses a 3-D extension of the aforementioned algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes.
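The edge-driven initialisation described above can be approximated in a few lines. This is an illustrative sketch, not the authors' exact algorithm: per-channel Sobel gradients are combined into a vector-gradient magnitude, low-gradient pixels are grouped into connected components, and high-gradient pixels are left unlabeled (0) for later assignment; the threshold value is a hypothetical parameter:

```python
import numpy as np
from scipy import ndimage

def initial_region_map(image, grad_thresh=0.1):
    """Group edge-free pixels into labeled regions (illustrative sketch)."""
    img = image.astype(np.float64)
    # combine per-channel Sobel gradients into a vector-gradient magnitude
    mag = np.zeros(img.shape[:2])
    for c in range(img.shape[2]):
        gx = ndimage.sobel(img[..., c], axis=1)
        gy = ndimage.sobel(img[..., c], axis=0)
        mag += gx**2 + gy**2
    mag = np.sqrt(mag)
    mag /= mag.max() + 1e-12                  # normalise to [0, 1]
    smooth = mag < grad_thresh                # edge-free pixels
    labels, n = ndimage.label(smooth)         # individually label each group
    return labels, n                          # 0 marks unassigned edge pixels
```

On a two-tone test image this yields one labeled component per flat region, with the edge band between them left for the refinement stage.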
Study on Co-occurrence-based Image Feature Analysis and Texture Recognition Employing Diagonal-Crisscross Local Binary Pattern
In this thesis, we focus on several important aspects of real-world image texture analysis and recognition. We survey various features that are suitable for texture analysis. Apart from the variety of features, different types of texture datasets are also discussed in depth. No prior work thoroughly covers the important databases and analyzes them from various viewpoints. We categorize texture databases based on many references, splitting these datasets into a few basic groups and then placing related datasets within them. Next, we exhaustively analyze eleven second-order statistical features or cues based on co-occurrence matrices to understand image texture surfaces. These features are exploited to analyze properties of image texture and are also categorized based on their angular orientations and their applicability. Finally, we propose a method called the diagonal-crisscross local binary pattern (DCLBP) for texture recognition, along with two other extensions of the local binary pattern. Compared to the local binary pattern and a few other extensions, our proposed method performs satisfactorily well on two very challenging benchmark datasets: the KTH-TIPS (Textures under varying Illumination, Pose and Scale) database and the USC-SIPI (University of Southern California Signal and Image Processing Institute) Rotations Texture dataset.
Doctoral dissertation, Kyushu Institute of Technology (degree no. 354; conferred September 27, 2013).
CHAPTER 1 INTRODUCTION | CHAPTER 2 FEATURES FOR TEXTURE ANALYSIS | CHAPTER 3 IN-DEPTH ANALYSIS OF TEXTURE DATABASES | CHAPTER 4 ANALYSIS OF FEATURES BASED ON CO-OCCURRENCE IMAGE MATRIX | CHAPTER 5 CATEGORIZATION OF FEATURES BASED ON CO-OCCURRENCE IMAGE MATRIX | CHAPTER 6 TEXTURE RECOGNITION BASED ON DIAGONAL-CRISSCROSS LOCAL BINARY PATTERN | CHAPTER 7 CONCLUSIONS AND FUTURE WORK
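The general flavour of LBP variants that treat diagonal and crisscross neighbours separately can be sketched as follows. This is not the thesis' exact DCLBP definition, only an illustrative split of the eight LBP neighbours into a crisscross set (N, S, E, W) and a diagonal set, each thresholded against the centre pixel to give two 4-bit codes per pixel:

```python
import numpy as np

def split_lbp(gray):
    """Compute 4-bit crisscross and diagonal neighbour codes per pixel
    (an illustrative sketch, not the thesis' exact DCLBP)."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                                  # centre pixels
    cross = [g[:-2, 1:-1], g[2:, 1:-1], g[1:-1, :-2], g[1:-1, 2:]]
    diag = [g[:-2, :-2], g[:-2, 2:], g[2:, :-2], g[2:, 2:]]
    code_cross = sum(((n >= c).astype(np.int32) << i) for i, n in enumerate(cross))
    code_diag = sum(((n >= c).astype(np.int32) << i) for i, n in enumerate(diag))
    return code_cross, code_diag

def histogram_feature(gray):
    """Concatenate the normalised 16-bin histograms of both codes."""
    cc, dd = split_lbp(gray)
    h1 = np.bincount(cc.ravel(), minlength=16)
    h2 = np.bincount(dd.ravel(), minlength=16)
    return np.concatenate([h1, h2]) / cc.size          # 32-bin descriptor
```

The resulting 32-bin descriptor can then be compared across texture images with any standard histogram distance.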
Varied Image Data Augmentation Methods for Building Ensemble
Convolutional Neural Networks (CNNs) are used in many domains, but their requirement of large datasets for robust training without overfitting makes them hard to apply in medical and similar fields. However, when large quantities of samples cannot be easily collected, various methods can still be applied to mitigate the problem, depending on the sample type. Data augmentation has recently been in the spotlight, mostly because of the simplicity and effectiveness of some of the more widely adopted methods. The research question addressed in this work is whether data augmentation techniques can help in developing robust and efficient machine learning systems for classification in different domains. To that end, we introduce new image augmentation techniques that make use of the Fourier Transform (FT), Discrete Cosine Transform (DCT), Radon Transform (RT), Hilbert Transform (HT), Singular Value Decomposition (SVD), Local Laplacian Filters (LLF) and the Hampel filter (HF). We define different ensemble methods by combining various classical data augmentation methods with the newer ones presented here. We performed an extensive empirical evaluation on 15 different datasets to validate our proposal. The results show that the newly proposed data augmentation methods can be very effective even when used alone. Ensembles trained with different augmentation methods can outperform some of the best approaches reported in the literature as well as compete with state-of-the-art custom methods. All resources are available at https://github.com/LorisNanni.
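At prediction time, ensembles of this kind are typically combined at the score level. A minimal sketch of sum-rule fusion, assuming each augmentation-specific model has already produced softmax outputs (the function name is illustrative, not from the paper):

```python
import numpy as np

def ensemble_predict(prob_list):
    """Fuse classifiers trained on differently augmented training sets
    by averaging their class scores (sum rule).

    prob_list: list of (n_samples, n_classes) softmax output arrays,
    one per augmentation-specific model.
    """
    fused = np.mean(np.stack(prob_list, axis=0), axis=0)  # average scores
    return fused.argmax(axis=1)                           # fused class labels
```

The sum rule is a common default for such ensembles because it needs no extra training and is robust to individually noisy members.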
UAVs for the Environmental Sciences
This book gives an overview of the usage of UAVs in the environmental sciences, covering technical basics, data acquisition with different sensors and data processing schemes, and illustrating various examples of application.
Food Recognition and Volume Estimation in a Dietary Assessment System
Recently, obesity has become an epidemic and one of the most serious worldwide public
health concerns of the 21st century. Obesity diminishes average life expectancy, and
there is now convincing evidence that poor diet, in combination with physical inactivity,
is a key determinant of an individual's risk of developing chronic diseases such as
cancer, cardiovascular disease or diabetes. Assessing what people eat is fundamental
to establishing the link between diet and disease. Food records are considered the best
approach for assessing energy intake. However, this method requires literate and highly
motivated subjects. This is a particular problem for adolescents and young adults who
are the least likely to undertake food records. The ready access of the majority of the
population to mobile phones (with integrated cameras, improved memory capacity, network
connectivity and faster processing capability) has opened up new opportunities for
dietary assessment. The dietary information extracted from such assessments provides
valuable insights into the causes of disease, greatly helping practicing dietitians and
researchers to develop approaches for mounting preventive intervention programs.
In such systems, the camera in the mobile phone is used for capturing images
of food consumed, and these images are then processed to automatically estimate the
nutritional content of the food. However, food objects are deformable and exhibit
variations in appearance, shape, texture and color, so food classification and
volume estimation in these systems suffer from low accuracy. Improving food recognition
accuracy and volume estimation accuracy is a challenging task.
This thesis presents new techniques for food classification and food volume estimation.
For food recognition, emphasis was given to texture features. The existing food
recognition techniques assume that the food images will be viewed at similar scales and
from the same viewpoints. However, this assumption fails in practical applications, because
it is difficult to ensure that a user in a dietary assessment system will put his/her
camera at the same scale and orientation to capture food images as that of the target food
images in the database. A new scale and rotation invariant feature generation approach
that applies Gabor filter banks is proposed. To obtain scale and rotation invariance,
the proposed approach identifies the dominant orientation of the filtered coefficients and
applies a circular shifting operation to place this value at the first scale of the dominant
direction. The advantages of this technique are that it does not require the scale factor to
be known in advance and that it is scale- and rotation-invariant, both separately and concurrently.
This approach is modified to achieve improved accuracy by applying a Gaussian window
along the scale dimension which reduces the impact of high and low frequencies of
the filter outputs, enabling better matching within the same classes. Besides automatic
classification, semi-automatic classification and group classification are also considered
to gauge the improvement. To estimate the volume of a food item, a stereo pair is used to
recover the structure as a 3D point cloud. A slice-based volume estimation
approach is proposed that converts the 3D point cloud into a series of 2D slices.
The proposed approach eliminates the need to know the distance between the two
cameras by using disparities and depth information from a fiducial marker. The
experimental results show that the proposed approach can provide an accurate estimate
of food volume.
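The slice-based idea above can be sketched as follows. This is an illustrative approximation only, not the thesis' exact algorithm: the point cloud is cut into horizontal slices, each slice's footprint is approximated by the area of its 2-D bounding box (a deliberate simplification; the thesis recovers true scale from the fiducial marker), and the slice volumes are summed:

```python
import numpy as np

def slice_volume(points, slice_height=0.01):
    """Estimate the volume of a 3D point cloud by slicing along z.

    points: (n, 3) array of x, y, z coordinates in metric units.
    Each slice's area is approximated by its x-y bounding box
    (an illustrative simplification).
    """
    z = points[:, 2]
    volume = 0.0
    for lo in np.arange(z.min(), z.max(), slice_height):
        sl = points[(z >= lo) & (z < lo + slice_height)]
        if len(sl) < 3:          # too few points to define an area
            continue
        # bounding-box area of the slice projected onto the x-y plane
        area = np.ptp(sl[:, 0]) * np.ptp(sl[:, 1])
        volume += area * slice_height
    return volume
```

On a densely sampled unit cube this recovers a volume close to 1, since every slice's bounding box spans the full cross-section.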
Abstracts on Radio Direction Finding (1899 - 1995)
The files on this record represent the various databases that originally composed the CD-ROM issue of "Abstracts on Radio Direction Finding" database, which is now part of the Dudley Knox Library's Abstracts and Selected Full Text Documents on Radio Direction Finding (1899 - 1995) Collection. (See Calhoun record https://calhoun.nps.edu/handle/10945/57364 for further information on this collection and the bibliography).
Due to issues of technological obsolescence preventing current and future audiences from accessing the bibliography, DKL exported and converted into the three files on this record the various databases contained in the CD-ROM.
The contents of these files are:
1) RDFA_CompleteBibliography_xls.zip [RDFA_CompleteBibliography.xls: Metadata for the complete bibliography, in Excel 97-2003 Workbook format; RDFA_Glossary.xls: Glossary of terms, in Excel 97-2003 Workbook format; RDFA_Biographies.xls: Biographies of leading figures, in Excel 97-2003 Workbook format];
2) RDFA_CompleteBibliography_csv.zip [RDFA_CompleteBibliography.TXT: Metadata for the complete bibliography, in CSV format; RDFA_Glossary.TXT: Glossary of terms, in CSV format; RDFA_Biographies.TXT: Biographies of leading figures, in CSV format];
3) RDFA_CompleteBibliography.pdf: A human-readable display of the bibliographic data, as a means of double-checking any possible deviations due to conversion.