
    A novel bottleneck residual and self-attention fusion-assisted architecture for land use recognition in remote sensing images

    Rapid yearly population growth is causing hazards to spread swiftly around the world, with a detrimental impact on both human life and the global economy. Remote sensing helps safeguard against weather-related threats and natural disasters by enabling accurate early prediction. Convolutional neural networks, a cornerstone of deep learning, have recently been used to reliably identify land use in remote sensing images. This work proposes a novel bottleneck residual and self-attention fusion-assisted architecture for land use recognition from remote sensing images. First, we propose using a fast neural-style approach to generate cloud-effect satellite images, with a 5-layered residual block CNN to estimate the neural-style loss. We then propose two novel architectures, a 3-layered bottleneck CNN and a 3-layered bottleneck self-attention CNN, for the classification of land use images. Both architectures are trained on the original and the neural-style generated datasets. Features are then extracted from the deep layers and merged using an innovative serial approach based on weighted entropy. A novel Chimp Optimization technique is applied to the fused features to refine them further by removing redundant and superfluous information. Finally, the selected features are classified using neural network classifiers. The experimental procedure yielded accuracy rates of 99.0% and 99.4% on the two datasets, respectively. Compared with state-of-the-art (SOTA) methods, the proposed framework demonstrated enhanced precision and accuracy.
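
    As a rough illustration of the building blocks named above, the following PyTorch sketch pairs a bottleneck residual block with a simple spatial self-attention layer. The channel counts, layer depths, and attention formulation are illustrative assumptions, not the authors' exact architecture.

        # Illustrative sketch only: a bottleneck residual block followed by a
        # simple self-attention layer; sizes and the attention form are assumed.
        import torch
        import torch.nn as nn

        class BottleneckResidual(nn.Module):
            def __init__(self, channels, reduction=4):
                super().__init__()
                mid = channels // reduction
                self.body = nn.Sequential(
                    nn.Conv2d(channels, mid, 1), nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
                    nn.Conv2d(mid, mid, 3, padding=1), nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
                    nn.Conv2d(mid, channels, 1), nn.BatchNorm2d(channels),
                )
                self.act = nn.ReLU(inplace=True)

            def forward(self, x):
                return self.act(x + self.body(x))  # residual (skip) connection

        class SelfAttention2d(nn.Module):
            """Non-local style spatial self-attention over feature maps."""
            def __init__(self, channels):
                super().__init__()
                self.query = nn.Conv2d(channels, channels // 8, 1)
                self.key = nn.Conv2d(channels, channels // 8, 1)
                self.value = nn.Conv2d(channels, channels, 1)
                self.gamma = nn.Parameter(torch.zeros(1))  # learnable mixing weight

            def forward(self, x):
                b, c, h, w = x.shape
                q = self.query(x).flatten(2).transpose(1, 2)       # (b, hw, c//8)
                k = self.key(x).flatten(2)                         # (b, c//8, hw)
                v = self.value(x).flatten(2)                       # (b, c, hw)
                attn = torch.softmax(q @ k, dim=-1)                # (b, hw, hw)
                out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
                return x + self.gamma * out

        block = nn.Sequential(BottleneckResidual(64), SelfAttention2d(64))
        print(block(torch.randn(1, 64, 32, 32)).shape)             # torch.Size([1, 64, 32, 32])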

    MatSpectNet: Material Segmentation Network with Domain-Aware and Physically-Constrained Hyperspectral Reconstruction

    Achieving accurate material segmentation for 3-channel RGB images is challenging due to the considerable variation in a material's appearance. Hyperspectral images, which are sets of spectral measurements sampled at multiple wavelengths, theoretically offer distinct information for material identification, as variations in the intensity of electromagnetic radiation reflected by a surface depend on the material composition of a scene. However, existing hyperspectral datasets are limited in the number of images and material categories available for the dense material segmentation task, and collecting and annotating hyperspectral images with a spectral camera is prohibitively expensive. To address this, we propose a new model, MatSpectNet, which segments materials using hyperspectral images recovered from RGB images. The network leverages the principles of colour perception in modern cameras to constrain the reconstructed hyperspectral images and employs a domain adaptation method to generalise the hyperspectral reconstruction capability from a spectral recovery dataset to material segmentation datasets. The reconstructed hyperspectral images are further filtered using learned response curves and enhanced with human perception. The performance of MatSpectNet is evaluated on the LMD dataset as well as the OpenSurfaces dataset. Our experiments demonstrate that MatSpectNet attains a 1.60% increase in average pixel accuracy and a 3.42% improvement in mean class accuracy compared with the most recent publication. The project code is attached to the supplementary material and will be published on GitHub.
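
    The physical constraint alluded to above can be sketched with the standard camera model, in which an RGB pixel is the spectral radiance integrated against the sensor's response curves. The Gaussian curves below are invented placeholders, not MatSpectNet's learned responses.

        # Sketch of the camera-physics relation: RGB = spectral cube projected
        # through per-channel response curves. Curve shapes are placeholders.
        import numpy as np

        wavelengths = np.linspace(400, 700, 31)        # 31 spectral bands, in nm

        def gaussian_response(center, width=40.0):
            return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

        # Hypothetical R/G/B response curves (rows) over the 31 bands (columns).
        response = np.stack([gaussian_response(c) for c in (610.0, 540.0, 465.0)])

        def hyperspectral_to_rgb(cube):
            """cube: (H, W, 31) spectral image -> (H, W, 3) RGB via the response curves."""
            rgb = np.tensordot(cube, response, axes=([2], [1]))
            return rgb / rgb.max()                     # simple normalisation for display

        # A reconstruction network can be constrained so that projecting its
        # predicted cube back through `response` reproduces the input RGB image.
        cube = np.random.rand(8, 8, wavelengths.size)
        print(hyperspectral_to_rgb(cube).shape)        # (8, 8, 3)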

    Specialized translation at work for a small, expanding company: my experience of internationalizing Bioretics© S.r.l. into Chinese

    Global markets are currently immersed in two all-encompassing and unstoppable processes: internationalization and globalization. While the former pushes companies to look beyond the borders of their country of origin to forge relationships with foreign trading partners, the latter fosters standardization across countries by reducing spatiotemporal distances and breaking down geographical, political, economic and socio-cultural barriers. In recent decades, another domain has emerged to propel these unifying drives: Artificial Intelligence, together with its advanced technologies aiming to implement human cognitive abilities in machines. The "Language Toolkit – Le lingue straniere al servizio dell'internazionalizzazione dell'impresa" project, promoted by the Department of Interpreting and Translation (Forlì Campus) in collaboration with the Romagna Chamber of Commerce (Forlì-Cesena and Rimini), seeks to help Italian SMEs make their way into the global market. It is precisely within this project that this dissertation was conceived. Its purpose is to present the translation and localization project from English into Chinese of a series of texts produced by Bioretics© S.r.l.: an investor deck, the company website and part of the installation and use manual of the Aliquis© framework software, its flagship product. This dissertation is structured as follows: Chapter 1 presents the project and the company in detail; Chapter 2 outlines the internationalization and globalization processes and the Artificial Intelligence market in both Italy and China; Chapter 3 provides the theoretical foundations for every aspect related to specialized translation, including website localization; Chapter 4 describes the resources and tools used to perform the translations; Chapter 5 proposes an analysis of the source texts; and Chapter 6 comments on the translation strategies and choices.

    Potassium deficiency diagnosis method of apple leaves based on MLR-LDA-SVM

    Introduction: Machine learning and image processing technology are now widely used in plant disease diagnosis. This study addresses the subjectivity, cost, and timeliness challenges associated with traditional methods of diagnosing potassium deficiency in apple tree leaves. Methods: The study proposes a model that combines image processing and machine learning techniques to improve detection accuracy in each growth period. Leaf images were collected at different growth stages and processed through denoising and segmentation. Color and shape features of the leaves were extracted, and a multiple regression analysis model was used to screen for key features. Linear discriminant analysis was then employed to optimize the data and obtain the optimal shape and color feature factors of apple tree leaves during each growth period. Various machine-learning methods, including SVM, DT, and KNN, were used to diagnose potassium deficiency. Results: The MLR-LDA-SVM model was found to be the optimal model based on comprehensive evaluation indicators. Field experiments were conducted to verify the accuracy of the diagnostic model, which achieved high diagnostic accuracy during different growth periods. Discussion: The model can accurately diagnose whether potassium deficiency exists in apple tree leaves during each growth period, providing theoretical guidance for intelligent and precise water and fertilizer management in orchards.
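
    A rough scikit-learn analogue of the MLR-LDA-SVM pipeline could look like the sketch below; the regression-based feature screening is approximated with a generic univariate selector, and the data, feature counts, and hyperparameters are placeholders.

        # Placeholder pipeline: screen features, compress them with LDA, then
        # classify with an SVM, evaluated by cross-validation.
        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        # Placeholder data: rows = leaf samples, columns = colour/shape features.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 24))
        y = rng.integers(0, 2, size=200)               # 0 = healthy, 1 = potassium-deficient

        model = make_pipeline(
            StandardScaler(),
            SelectKBest(f_classif, k=10),               # screen key features
            LinearDiscriminantAnalysis(n_components=1), # optimise/compress features
            SVC(kernel="rbf", C=1.0),                   # final diagnosis
        )
        print(cross_val_score(model, X, y, cv=5).mean())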

    UDP-YOLO: High Efficiency and Real-Time Performance of Autonomous Driving Technology

    In recent years, autonomous driving technology has gradually come into view. It senses the surrounding environment using radar, laser, ultrasound, GPS, computer vision and other technologies, identifies obstacles and signboards, and plans a suitable path to control the vehicle. However, problems arise when this technology is applied in foggy environments: the probability of recognizing objects drops, and some objects cannot be recognized at all because the blurring caused by fog leads to incorrectly planned paths. In view of this defect, and considering that autonomous driving must respond quickly to objects while driving, this paper extends the dark channel prior defogging algorithm and proposes the UDP-YOLO network for autonomous driving. The work is divided into two parts: 1. Image processing: the dataset is first checked for the presence of fog, the foggy images are then defogged with the defogging algorithm, and the defogged images are finally subjected to adaptive brightness enhancement; 2. Target detection: the UDP-YOLO network proposed in this paper is used to detect objects in the defogged dataset. The results show that the proposed model achieves a large performance improvement while maintaining speed.
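
    The pre-processing stage described above (fog removal followed by adaptive brightness enhancement) can be sketched with OpenCV as follows; CLAHE is used here as a stand-in for the paper's adaptive brightness step, and a dark channel prior defogging baseline is sketched after the following abstract.

        # Sketch of the second pre-processing step: local luminance equalisation
        # on an (already defogged) frame before it is passed to the detector.
        import cv2
        import numpy as np

        def adaptive_brightness(bgr):
            """Equalise luminance locally while leaving colour channels untouched."""
            lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
            l, a, b = cv2.split(lab)
            clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
            return cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)

        frame = (np.random.rand(240, 320, 3) * 255).astype(np.uint8)   # placeholder frame
        enhanced = adaptive_brightness(frame)                          # feed this to the detector
        print(enhanced.shape)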

    Single Remote Sensing Image Dehazing Using Robust Light-Dark Prior

    Haze, generated by floaters (semitransparent clouds, fog, snow, etc.) in the atmosphere, can significantly degrade the utility of remote sensing images (RSIs). However, existing techniques for single-image dehazing rarely consider that haze is a superimposition of floaters and shadow, and they often aggravate the haze shadow and darken dark regions. In this paper, a single-RSI dehazing method based on a robust light-dark prior (RLDP) is proposed, which utilizes the proposed hybrid model and is robust to outlier pixels. In the proposed RLDP method, the haze is first removed with a robust dark channel prior (RDCP). Then, the shadow is removed with a robust light channel prior (RLCP). Further, a cube root mean enhancement (CRME)-based stable-state search criterion is proposed to solve the difficult problem of setting the patch size. Experimental results on benchmark and Landsat 8 RSIs demonstrate that the RLDP method effectively removes haze.
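
    For reference, a minimal, non-robust dark channel prior dehazing baseline (after He et al.) can be written as below; the RLDP robustification and the CRME-based patch-size search are not reproduced, and the patch size and omega are ordinary defaults.

        # Plain dark-channel-prior baseline: estimate the dark channel, the
        # atmospheric light, and the transmission, then recover the scene radiance.
        import cv2
        import numpy as np

        def dark_channel(img, patch=15):
            kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
            return cv2.erode(img.min(axis=2), kernel)  # per-pixel min, then local min

        def dehaze_dcp(img, patch=15, omega=0.95, t0=0.1):
            img = img.astype(np.float64) / 255.0
            dark = dark_channel(img, patch)
            # Atmospheric light: brightest pixels of the dark channel.
            flat = dark.reshape(-1)
            idx = flat.argsort()[-max(1, flat.size // 1000):]
            A = img.reshape(-1, 3)[idx].max(axis=0)
            # Transmission estimate and scene radiance recovery.
            t = 1.0 - omega * dark_channel(img / A, patch)
            t = np.clip(t, t0, 1.0)[..., None]
            return np.clip((img - A) / t + A, 0.0, 1.0)

        hazy = (np.random.rand(120, 160, 3) * 255).astype(np.uint8)    # placeholder image
        print(dehaze_dcp(hazy).shape)                                  # (120, 160, 3)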

    SAR-to-Optical Image Translation via Thermodynamics-inspired Network

    Synthetic aperture radar (SAR) is prevalent in the remote sensing field, but its images are difficult for humans to interpret visually. Recently, SAR-to-optical (S2O) image conversion methods have provided a prospective solution for interpretation. However, because of the huge domain difference between optical and SAR images, existing methods suffer from low image quality and geometric distortion in the produced optical images. Motivated by the analogy between pixels during S2O image translation and molecules in a heat field, a Thermodynamics-inspired Network for SAR-to-Optical Image Translation (S2O-TDN) is proposed in this paper. Specifically, we design a Third-order Finite Difference (TFD) residual structure in light of the TFD equation of thermodynamics, which allows us to efficiently extract inter-domain invariant features and facilitates learning of the nonlinear translation mapping. In addition, we exploit the first law of thermodynamics (FLT) to devise an FLT-guided branch that promotes the transition of feature values from an unstable diffusion state to a stable one, aiming to regularize feature diffusion and preserve image structures during S2O image translation. S2O-TDN follows an explicit design principle derived from thermodynamic theory and enjoys the advantage of explainability. Experiments on the public SEN1-2 dataset show the advantages of the proposed S2O-TDN over current methods, with finer textures and better quantitative results.
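
    Purely as an illustration of a finite-difference-inspired residual update, the sketch below combines three successive residual evaluations with fixed coefficients; the coefficients and structure are assumptions for illustration and are not the TFD equations used in S2O-TDN.

        # Schematic higher-order residual block: three residual evaluations are
        # blended into one update, in the spirit of ODE/finite-difference blocks.
        import torch
        import torch.nn as nn

        def conv_block(channels):
            return nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )

        class FiniteDifferenceResidual(nn.Module):
            def __init__(self, channels):
                super().__init__()
                self.f1, self.f2, self.f3 = (conv_block(channels) for _ in range(3))

            def forward(self, x):
                k1 = self.f1(x)                        # first evaluation
                k2 = self.f2(x + 0.5 * k1)             # second evaluation
                k3 = self.f3(x + 0.5 * k2)             # third evaluation
                return x + (k1 + 2.0 * k2 + k3) / 4.0  # weighted combination

        print(FiniteDifferenceResidual(32)(torch.randn(1, 32, 64, 64)).shape)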

    Enhancing Image Quality: A Comparative Study of Spatial, Frequency Domain, and Deep Learning Methods

    Image restoration and noise reduction methods have been developed to restore deteriorated images and improve their quality. These methods have gained substantial importance in recent times, mainly due to the growing use of digital imaging across diverse domains, including medical imaging, surveillance, satellite imaging, and many others. In this paper, we conduct a comparative analysis of three distinct approaches to image restoration: the spatial method, the frequency domain method, and the deep learning method. The study was conducted on a dataset of 10,000 images, and the performance of each method was evaluated using accuracy and loss metrics. The results show that the deep learning method outperformed the other two, achieving a validation accuracy of 72.68% after 10 epochs. The spatial method achieved a validation accuracy of 69.98% after 10 epochs, while the FFT frequency domain method reached a validation accuracy of 52.87% after 10 epochs, significantly lower than the other two methods. The study demonstrates that deep learning is a promising approach for image classification tasks and outperforms traditional methods such as spatial and frequency domain techniques.
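
    The frequency-domain branch of such a comparison can be illustrated with a simple ideal low-pass filter applied in the FFT domain; the image and cutoff radius below are arbitrary placeholders.

        # Suppress high-frequency content (often noise) by masking the shifted
        # 2-D FFT of a grayscale image and transforming back.
        import numpy as np

        def fft_lowpass(img, cutoff=30):
            """Zero frequency components farther than `cutoff` from the DC term."""
            F = np.fft.fftshift(np.fft.fft2(img))
            h, w = img.shape
            y, x = np.ogrid[:h, :w]
            mask = (y - h / 2) ** 2 + (x - w / 2) ** 2 <= cutoff ** 2
            return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

        noisy = np.random.rand(128, 128)               # placeholder noisy image
        print(fft_lowpass(noisy).shape)                # (128, 128)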

    Improving Classification in Single and Multi-View Images

    Image classification is a sub-field of computer vision that focuses on identifying objects within digital images. To improve image classification, we must address the following areas: 1) single- and multi-view data quality, using data pre-processing techniques; 2) deep feature learning, to extract alternative representations of the data; and 3) the decision or prediction of labels. This dissertation presents a series of four published papers that explore different improvements to image classification. In our first paper, we explore the Siamese network architecture to create a Convolutional Neural Network based similarity metric. We learn the priority features that differentiate two given input images. The proposed metric achieves a state-of-the-art Fβ measure. In our second paper, we explore multi-view data classification. We investigate the application of Generative Adversarial Networks (GANs) to multi-view image classification and few-shot learning. Experimental results show that our method outperforms state-of-the-art research. In our third paper, we take on the challenge of improving the ResNet backbone model, focusing on improving channel attention mechanisms. We utilize Discrete Wavelet Transform compression to address the channel representation problem. Experimental results on ImageNet show that our method outperforms the baseline SENet-34 and the SOTA FcaNet-34 at no extra computational cost. In our fourth paper, we further investigate the potential of orthogonalization of filters for extracting diverse information for channel attention. We prove that using only random constant orthogonal filters is sufficient to achieve good channel attention. We test our proposed method on the ImageNet, Places365, and Birds datasets for image classification, and on MS-COCO for object detection and instance segmentation. Our method outperforms FcaNet and WaveNet and achieves state-of-the-art results.
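
    The wavelet-based channel attention explored in the third and fourth papers can be sketched as below: each channel is summarised by one-level Haar statistics instead of plain global average pooling, and the summary drives an SE-style gate. The exact descriptors used in those papers differ; this only illustrates the general idea.

        # Wavelet-flavoured channel attention sketch (PyTorch). The descriptor
        # choice (LL mean plus detail energy) is an assumption, not the papers' design.
        import torch
        import torch.nn as nn

        def haar_stats(x):
            """x: (B, C, H, W) with even H, W -> (B, 2*C) Haar-based descriptor."""
            a = x[..., 0::2, 0::2]; b = x[..., 0::2, 1::2]
            c = x[..., 1::2, 0::2]; d = x[..., 1::2, 1::2]
            ll = (a + b + c + d) / 2.0                 # approximation subband
            lh = (a + b - c - d) / 2.0                 # detail subbands
            hl = (a - b + c - d) / 2.0
            hh = (a - b - c + d) / 2.0
            detail_energy = (lh ** 2 + hl ** 2 + hh ** 2).mean(dim=(2, 3))
            return torch.cat([ll.mean(dim=(2, 3)), detail_energy], dim=1)

        class WaveletChannelAttention(nn.Module):
            def __init__(self, channels, reduction=16):
                super().__init__()
                self.fc = nn.Sequential(
                    nn.Linear(2 * channels, channels // reduction), nn.ReLU(inplace=True),
                    nn.Linear(channels // reduction, channels), nn.Sigmoid(),
                )

            def forward(self, x):
                gate = self.fc(haar_stats(x)).unsqueeze(-1).unsqueeze(-1)
                return x * gate                        # reweight channels

        print(WaveletChannelAttention(64)(torch.randn(2, 64, 32, 32)).shape)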

    Streamlined Global and Local Features Combinator (SGLC) for High Resolution Image Dehazing

    Image dehazing aims to remove atmospheric fog or haze from an image. Although dehazing models have evolved considerably in recent years, few have precisely tackled the problem of high-resolution hazy images. For this kind of image, a model must work either on a downscaled version of the image or on patches cropped from it, and in both cases accuracy drops. This is primarily due to the inherent failure to combine global and local features when the image size increases. A dehazing model requires global features to understand the general scene and local features to handle fine, pixel-level details. In this study, we propose the Streamlined Global and Local Features Combinator (SGLC) to solve these issues and to optimize the application of any dehazing model to high-resolution images. The SGLC contains two successive blocks. The first is the Global Features Generator (GFG), which generates a first version of the dehazed image containing strong global features. The second is the Local Features Enhancer (LFE), which improves the local feature details inside the previously generated image. When tested on the Uformer architecture for dehazing, SGLC increased the PSNR metric by a significant margin. Any other model can be incorporated inside the SGLC process to improve its efficiency on high-resolution input data. (Accepted at a CVPR 2023 workshop.)
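
    The two-stage idea can be sketched as below: a "global" model runs on a downscaled copy of the high-resolution input, its output is upsampled, and fixed-size patches are then refined by a "local" model conditioned on that coarse result. The two networks here are trivial placeholders, not the actual GFG and LFE blocks of SGLC.

        # Coarse-to-fine sketch for high-resolution dehazing: global pass on a
        # downscaled copy, then patch-wise local refinement of the full image.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        global_net = nn.Conv2d(3, 3, 3, padding=1)     # placeholder for the global stage
        local_net = nn.Conv2d(6, 3, 3, padding=1)      # placeholder for the local stage

        @torch.no_grad()
        def dehaze_high_res(img, working_size=512, patch=256):
            # Stage 1: global features on a downscaled copy, upsampled back.
            small = F.interpolate(img, size=(working_size, working_size),
                                  mode="bilinear", align_corners=False)
            coarse = F.interpolate(global_net(small), size=img.shape[-2:],
                                   mode="bilinear", align_corners=False)
            # Stage 2: local refinement patch by patch, conditioned on the coarse result.
            out = coarse.clone()
            _, _, H, W = img.shape
            for y in range(0, H, patch):
                for x in range(0, W, patch):
                    tile = torch.cat([img[..., y:y + patch, x:x + patch],
                                      coarse[..., y:y + patch, x:x + patch]], dim=1)
                    out[..., y:y + patch, x:x + patch] = local_net(tile)
            return out

        print(dehaze_high_res(torch.randn(1, 3, 1024, 1024)).shape)   # torch.Size([1, 3, 1024, 1024])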