11 research outputs found
Feature transforms for image data augmentation
A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose some new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are considered in order to quantify their effectiveness in creating ensembles of neural networks. The novelty of this research is to consider different strategies for data augmentation to generate training sets from which to train several classifiers that are combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with a different approach. We build ensembles at the data level by adding images generated by combining fourteen augmentation approaches, three of which (based on FT, RT, and DCT) are proposed here for the first time. Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles at the data level by combining different data augmentation methods produces classifiers that not only compete with the state of the art but often surpass the best approaches reported in the literature.
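The DCT-based augmentation idea described above can be sketched roughly as follows. This is a minimal illustration only, assuming a per-channel 2-D DCT whose coefficients are perturbed with small random noise before inverting; the function name, noise model and parameters are hypothetical and not taken from the paper:

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_augment(image, rng=None, noise_scale=0.05):
    """Perturb an image in the DCT domain and transform back.

    A hypothetical sketch of DCT-domain augmentation: random noise,
    scaled to the mean coefficient magnitude, is added to the DCT
    coefficients of each colour channel.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(image.shape, dtype=np.float64)
    for c in range(image.shape[2]):  # process each colour channel
        coeffs = dctn(image[..., c].astype(np.float64), norm="ortho")
        coeffs += rng.normal(0.0, noise_scale * np.abs(coeffs).mean(),
                             size=coeffs.shape)
        out[..., c] = idctn(coeffs, norm="ortho")
    return np.clip(out, 0, 255).astype(image.dtype)
```

Each call produces a slightly different image, so repeated calls over the training set yield an augmented set of the kind the ensemble is built from.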
Urban-scale tree mapping and species detection using multiple sensing platforms and deep learning
Thesis (Master's) -- Seoul National University Graduate School, Department of Landscape Architecture and Rural Systems Engineering, February 2023.
Precise estimation of the number of trees and individual tree locations, with species information, across an entire city forms a solid foundation for enhancing ecosystem services. However, mapping individual trees at the city scale remains challenging due to the heterogeneous patterns of urban tree distribution. Here, we present a novel framework that merges multiple sensing platforms while leveraging various deep neural networks to produce a fine-grained urban tree map. We mapped trees and detected species relying only on RGB images taken by multiple sensing platforms (airborne, citizen and vehicle), which fueled six deep learning models. We divided the entire process into three steps, since each platform has its own strengths. First, we produced individual tree location maps by converting the central points of bounding boxes detected in airborne imagery into actual coordinates. Since many trees were obscured by building shadows, we applied a Generative Adversarial Network (GAN) to delineate hidden trees in the airborne images. Second, we selected tree bark photos collected by citizens for species mapping in urban parks and forests. Species information for all tree bark photos was automatically classified after non-tree parts of the images were segmented. Third, we classified the species of roadside trees using a camera mounted on a car, augmenting our species mapping framework with street-level tree data. We estimated the distance from the car to street trees from the number of lanes detected in the images. Finally, we assessed our results by comparing them with Light Detection and Ranging (LiDAR), GPS and field data. We estimated that over 1.2 million trees exist in the 121.04 km² city and generated more accurate individual tree positions, outperforming conventional field survey methods. Among them, we detected the species of more than 63,000 trees.
The most frequently detected species was Prunus yedoensis (21.43 %), followed by Ginkgo biloba (19.44 %), Zelkova serrata (18.68 %), Pinus densiflora (7.55 %) and Metasequoia glyptostroboides (5.97 %). Comprehensive experimental results demonstrate that tree bark photos and street-level imagery taken by citizens and vehicles are conducive to delivering accurate and quantitative information on the distribution of urban tree species.
1. Introduction
2. Methodology
2.1. Data collection
2.2. Deep learning overall
2.3. Tree counting and mapping
2.4. Tree species detection
2.5. Evaluation
3. Results
3.1. Evaluation of deep learning performance
3.2. Tree counting and mapping
3.3. Tree species detection
4. Discussion
4.1. Multiple sensing platforms for urban areas
4.2. Potential of citizen and vehicle sensors
4.3. Implications
5. Conclusion
Bibliography
Abstract in Korean
Two and three dimensional segmentation of multimodal imagery
The role of segmentation in the realms of image understanding/analysis, computer vision, pattern recognition, remote sensing and medical imaging has in recent years been significantly augmented by accelerated scientific advances in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for facilitating meaningful segregation of 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this effect, using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition some initial portion of the input image content. Pixels that contain higher gradient densities are included by the dynamic generation of segments as the algorithm progresses to generate an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture and intensity information, along with the aforementioned initial partition map, are used in a multivariate refinement procedure to fuse groups with similar characteristics, yielding the final output segmentation. Experimental results obtained in comparison to published state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery demonstrate the advantages of the proposed method. Furthermore, to achieve improved computational efficiency, we propose an extension of the aforementioned methodology in a multi-resolution framework, demonstrated on color images.
Finally, this research also encompasses a 3-D extension of the aforementioned algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes.
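The edge-driven initialisation described above can be approximated in a few lines. This is an illustrative sketch, not the authors' exact algorithm: per-channel Sobel gradients are combined into a vector-gradient magnitude, low-gradient pixels are grouped into connected components, and high-gradient pixels are left unlabeled (0) for later assignment; the threshold value is a hypothetical parameter:

```python
import numpy as np
from scipy import ndimage

def initial_region_map(image, grad_thresh=0.1):
    """Group edge-free pixels into labeled regions (illustrative sketch)."""
    img = image.astype(np.float64)
    # combine per-channel Sobel gradients into a vector-gradient magnitude
    mag = np.zeros(img.shape[:2])
    for c in range(img.shape[2]):
        gx = ndimage.sobel(img[..., c], axis=1)
        gy = ndimage.sobel(img[..., c], axis=0)
        mag += gx**2 + gy**2
    mag = np.sqrt(mag)
    mag /= mag.max() + 1e-12                  # normalise to [0, 1]
    smooth = mag < grad_thresh                # edge-free pixels
    labels, n = ndimage.label(smooth)         # individually label each group
    return labels, n                          # 0 marks unassigned edge pixels
```

On a two-tone test image this yields one labeled component per flat region, with the edge band between them left for the refinement stage.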
Study on Co-occurrence-based Image Feature Analysis and Texture Recognition Employing Diagonal-Crisscross Local Binary Pattern
In this thesis, we focus on several important aspects of real-world image texture analysis and recognition. We survey various features that are suitable for texture analysis. Apart from the variety of features, different types of texture datasets are also discussed in depth. No prior work thoroughly covers the important databases and analyzes them from various viewpoints. We categorize texture databases based on many references, splitting these datasets into a few basic groups and then placing related datasets within them. Next, we exhaustively analyze eleven second-order statistical features or cues based on co-occurrence matrices to understand image texture surfaces. These features are exploited to analyze properties of image texture and are also categorized based on their angular orientations and their applicability. Finally, we propose a method called the diagonal-crisscross local binary pattern (DCLBP) for texture recognition, along with two other extensions of the local binary pattern. Compared to the local binary pattern and a few other extensions, our proposed method performs satisfactorily well on two very challenging benchmark datasets: the KTH-TIPS (Textures under varying Illumination, Pose and Scale) database and the USC-SIPI (University of Southern California Signal and Image Processing Institute) Rotations Texture dataset.
Doctoral dissertation, Kyushu Institute of Technology (degree no. 354; conferred September 27, 2013).
CHAPTER 1 INTRODUCTION | CHAPTER 2 FEATURES FOR TEXTURE ANALYSIS | CHAPTER 3 IN-DEPTH ANALYSIS OF TEXTURE DATABASES | CHAPTER 4 ANALYSIS OF FEATURES BASED ON CO-OCCURRENCE IMAGE MATRIX | CHAPTER 5 CATEGORIZATION OF FEATURES BASED ON CO-OCCURRENCE IMAGE MATRIX | CHAPTER 6 TEXTURE RECOGNITION BASED ON DIAGONAL-CRISSCROSS LOCAL BINARY PATTERN | CHAPTER 7 CONCLUSIONS AND FUTURE WORK
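The general flavour of LBP variants that treat diagonal and crisscross neighbours separately can be sketched as follows. This is not the thesis' exact DCLBP definition, only an illustrative split of the eight LBP neighbours into a crisscross set (N, S, E, W) and a diagonal set, each thresholded against the centre pixel to give two 4-bit codes per pixel:

```python
import numpy as np

def split_lbp(gray):
    """Compute 4-bit crisscross and diagonal neighbour codes per pixel
    (an illustrative sketch, not the thesis' exact DCLBP)."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                                  # centre pixels
    cross = [g[:-2, 1:-1], g[2:, 1:-1], g[1:-1, :-2], g[1:-1, 2:]]
    diag = [g[:-2, :-2], g[:-2, 2:], g[2:, :-2], g[2:, 2:]]
    code_cross = sum(((n >= c).astype(np.int32) << i) for i, n in enumerate(cross))
    code_diag = sum(((n >= c).astype(np.int32) << i) for i, n in enumerate(diag))
    return code_cross, code_diag

def histogram_feature(gray):
    """Concatenate the normalised 16-bin histograms of both codes."""
    cc, dd = split_lbp(gray)
    h1 = np.bincount(cc.ravel(), minlength=16)
    h2 = np.bincount(dd.ravel(), minlength=16)
    return np.concatenate([h1, h2]) / cc.size          # 32-bin descriptor
```

The resulting 32-bin descriptor can then be compared across texture images with any standard histogram distance.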
Varied Image Data Augmentation Methods for Building Ensemble
Convolutional Neural Networks (CNNs) are used in many domains, but their requirement of large datasets for robust training without overfitting makes them hard to apply in medical and similar fields. However, when large quantities of samples cannot be easily collected, various methods can still be applied to mitigate the problem, depending on the sample type. Data augmentation has recently been in the spotlight, mostly because of the simplicity and effectiveness of some of the more widely adopted methods. The research question addressed in this work is whether data augmentation techniques can help in developing robust and efficient machine learning systems for classification in different domains. To that end, we introduce new image augmentation techniques that make use of the Fourier Transform (FT), Discrete Cosine Transform (DCT), Radon Transform (RT), Hilbert Transform (HT), Singular Value Decomposition (SVD), Local Laplacian Filters (LLF) and the Hampel filter (HF). We define different ensemble methods by combining various classical data augmentation methods with the newer ones presented here. We performed an extensive empirical evaluation on 15 different datasets to validate our proposal. The results show that the newly proposed data augmentation methods can be very effective even when used alone. Ensembles trained with different augmentation methods can outperform some of the best approaches reported in the literature as well as compete with state-of-the-art custom methods. All resources are available at https://github.com/LorisNanni.
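At prediction time, ensembles of this kind are typically combined at the score level. A minimal sketch of sum-rule fusion, assuming each augmentation-specific model has already produced softmax outputs (the function name is illustrative, not from the paper):

```python
import numpy as np

def ensemble_predict(prob_list):
    """Fuse classifiers trained on differently augmented training sets
    by averaging their class scores (sum rule).

    prob_list: list of (n_samples, n_classes) softmax output arrays,
    one per augmentation-specific model.
    """
    fused = np.mean(np.stack(prob_list, axis=0), axis=0)  # average scores
    return fused.argmax(axis=1)                           # fused class labels
```

The sum rule is a common default for such ensembles because it needs no extra training and is robust to individually noisy members.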
UAVs for the Environmental Sciences
This book gives an overview of the usage of UAVs in the environmental sciences, covering technical basics, data acquisition with different sensors and data processing schemes, and illustrating various examples of application.
Food Recognition and Volume Estimation in a Dietary Assessment System
Recently, obesity has become an epidemic and one of the most serious worldwide public
health concerns of the 21st century. Obesity diminishes average life expectancy, and
there is now convincing evidence that poor diet, in combination with physical inactivity,
is a key determinant of an individual's risk of developing chronic diseases such as
cancer, cardiovascular disease or diabetes. Assessing what people eat is fundamental
to establishing the link between diet and disease. Food records are considered the best
approach for assessing energy intake. However, this method requires literate and highly
motivated subjects. This is a particular problem for adolescents and young adults who
are the least likely to undertake food records. The ready access of the majority of the
population to mobile phones (with integrated cameras, improved memory capacity, network
connectivity and faster processing capability) has opened up new opportunities for
dietary assessment. The dietary information extracted from such assessments provides
valuable insights into the causes of disease, greatly helping practicing dietitians and
researchers to develop approaches for mounting preventive intervention programs.
In such systems, the camera in the mobile phone is used for capturing images
of food consumed, and these images are then processed to automatically estimate the
nutritional content of the food. However, food objects are deformable and exhibit
variations in appearance, shape, texture and color, so food classification and
volume estimation in these systems suffer from low accuracy. Improving food recognition
accuracy and volume estimation accuracy is a challenging task.
This thesis presents new techniques for food classification and food volume estimation.
For food recognition, emphasis was given to texture features. The existing food
recognition techniques assume that the food images will be viewed at similar scales and
from the same viewpoints. However, this assumption fails in practical applications, because
it is difficult to ensure that a user in a dietary assessment system will put his/her
camera at the same scale and orientation to capture food images as that of the target food
images in the database. A new scale and rotation invariant feature generation approach
that applies Gabor filter banks is proposed. To obtain scale and rotation invariance,
the proposed approach identifies the dominant orientation of the filtered coefficients and
applies a circular shifting operation to place this value at the first scale of the dominant
direction. The advantages of this technique are that it does not require the scale factor to
be known in advance and that it is scale- and rotation-invariant, both separately and concurrently.
This approach is modified to achieve improved accuracy by applying a Gaussian window
along the scale dimension which reduces the impact of high and low frequencies of
the filter outputs, enabling better matching within the same classes. Besides automatic
classification, semi-automatic classification and group classification are also considered
to gauge the improvement. To estimate the volume of a food item, a stereo pair is used to
recover the structure as a 3D point cloud. A slice-based volume estimation
approach is proposed that converts the 3D point cloud into a series of 2D slices.
The proposed approach eliminates the need to know the distance between the two
cameras by using disparities and depth information from a fiducial marker. The
experimental results show that the proposed approach can provide an accurate estimate
of food volume.
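The slice-based idea above can be sketched as follows. This is an illustrative approximation only, not the thesis' exact algorithm: the point cloud is cut into horizontal slices, each slice's footprint is approximated by the area of its 2-D bounding box (a deliberate simplification; the thesis recovers true scale from the fiducial marker), and the slice volumes are summed:

```python
import numpy as np

def slice_volume(points, slice_height=0.01):
    """Estimate the volume of a 3D point cloud by slicing along z.

    points: (n, 3) array of x, y, z coordinates in metric units.
    Each slice's area is approximated by its x-y bounding box
    (an illustrative simplification).
    """
    z = points[:, 2]
    volume = 0.0
    for lo in np.arange(z.min(), z.max(), slice_height):
        sl = points[(z >= lo) & (z < lo + slice_height)]
        if len(sl) < 3:          # too few points to define an area
            continue
        # bounding-box area of the slice projected onto the x-y plane
        area = np.ptp(sl[:, 0]) * np.ptp(sl[:, 1])
        volume += area * slice_height
    return volume
```

On a densely sampled unit cube this recovers a volume close to 1, since every slice's bounding box spans the full cross-section.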
Abstracts on Radio Direction Finding (1899 - 1995)
The files on this record represent the various databases that originally composed the CD-ROM issue of "Abstracts on Radio Direction Finding" database, which is now part of the Dudley Knox Library's Abstracts and Selected Full Text Documents on Radio Direction Finding (1899 - 1995) Collection. (See Calhoun record https://calhoun.nps.edu/handle/10945/57364 for further information on this collection and the bibliography).
Due to issues of technological obsolescence preventing current and future audiences from accessing the bibliography, DKL exported and converted into the three files on this record the various databases contained in the CD-ROM.
The contents of these files are:
1) RDFA_CompleteBibliography_xls.zip [RDFA_CompleteBibliography.xls: Metadata for the complete bibliography, in Excel 97-2003 Workbook format; RDFA_Glossary.xls: Glossary of terms, in Excel 97-2003 Workbook format; RDFA_Biographies.xls: Biographies of leading figures, in Excel 97-2003 Workbook format];
2) RDFA_CompleteBibliography_csv.zip [RDFA_CompleteBibliography.TXT: Metadata for the complete bibliography, in CSV format; RDFA_Glossary.TXT: Glossary of terms, in CSV format; RDFA_Biographies.TXT: Biographies of leading figures, in CSV format];
3) RDFA_CompleteBibliography.pdf: A human-readable display of the bibliographic data, as a means of double-checking any possible deviations due to conversion.