Business Category Classification via Indistinctive Satellite Image Analysis Using Deep Learning
Satellite image analysis has numerous useful applications across domains. Remote sensing and deep learning technologies have made it easier to extract visual information by intelligently interpreting clear visual cues. However, satellite imagery also has potential for more complex tasks, such as recommending business locations and categories based on the implicit patterns and structures of regions of interest. This task is significantly more challenging due to the absence of obvious visual cues and the highly similar appearance of different locations. This study analyzes satellite image similarity between business categories and investigates the ability of state-of-the-art deep learning models to learn non-obvious visual cues. Specifically, a satellite image dataset is constructed from business locations and annotated with business categories for structural similarity analysis, followed by business category classification via fine-tuning of deep learning classifiers. The models are then analyzed by visualizing the learned features to determine whether they capture hidden information for this task. Experiments show that business locations have significantly high structural similarity (SSIM) scores regardless of category, and the deep learning models achieved a top accuracy of only 60%. Feature visualization using Grad-CAM shows that the models learn biased features and disregard highly informative details such as roads. It is concluded that typical learning models and strategies are insufficient to solve this complex visual problem effectively; further research is needed to formulate solutions for such non-obvious classification tasks, with the potential to support business recommendation applications.
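The SSIM comparison described above can be sketched with a simplified single-window SSIM over whole grayscale arrays. This is a minimal sketch, not the study's implementation (which is unspecified); a sliding-window SSIM, as in scikit-image, would be the more faithful standard form.

```python
import numpy as np

def global_ssim(a: np.ndarray, b: np.ndarray, data_range: float = 255.0) -> float:
    """Simplified SSIM computed over the whole image as a single window."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    # standard SSIM stabilizing constants
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )
```

Identical images score exactly 1.0; the study's observation is that satellite patches of *different* business categories still score near the top of the scale, which is what makes the classes hard to separate.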
Multimodal hybrid deep learning approach to detect tomato leaf disease using attention based dilated convolution feature extractor with logistic regression classification
Automatic leaf disease detection techniques reduce the time-consuming effort of monitoring large crop farms and enable early identification of disease symptoms on plant leaves. Tomato crops are susceptible to a variety of diseases that can reduce yield. In recent years, advanced deep learning methods have been applied successfully to plant disease detection based on symptoms observed on leaves, but these methods have limitations. This study proposes a high-performance tomato leaf disease detection approach, namely attention-based dilated CNN logistic regression (ADCLR). First, we develop a feature extraction method using an attention-based dilated CNN to extract the most relevant features quickly. In preprocessing, we apply bilateral filtering to smooth the image while preserving larger features, and Otsu segmentation to remove noise in a fast and simple way. We then use a Conditional Generative Adversarial Network (CGAN) to generate synthetic images from the preprocessed images; the synthetic images handle class imbalance and noisy or mislabeled data to obtain good prediction results. The extracted features are normalized to lower their dimensionality. Finally, the features extracted from the preprocessed data are combined and classified using a fast and simple logistic regression (LR) classifier. Experimental outcomes show state-of-the-art performance on the Plant Village tomato leaf disease database, achieving 100%, 100%, and 96.6% training, testing, and validation accuracy, respectively, for the multiclass task. The experimental analysis demonstrates that the proposed multimodal approach can detect tomato leaf disease precisely, simply, and quickly. We plan to extend the model into a cloud-based automated leaf disease classification system for different plants.
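The Otsu segmentation step named in the preprocessing above can be sketched in plain NumPy. This is a minimal sketch of the textbook method (maximizing between-class variance); the authors' actual implementation is not specified, and a library routine such as OpenCV's `cv2.threshold` with `THRESH_OTSU` would be the usual choice.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the intensity threshold that maximizes between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    total_mean = float(np.dot(np.arange(256), prob))
    best_t, best_var = 0, -1.0
    w0 = 0.0   # cumulative weight of the background class
    mu0 = 0.0  # cumulative (unnormalized) mean of the background class
    for t in range(256):
        w0 += prob[t]
        mu0 += t * prob[t]
        w1 = 1.0 - w0
        if w0 == 0.0 or w1 <= 0.0:
            continue  # one class is empty; no valid split at this t
        between = (total_mean * w0 - mu0) ** 2 / (w0 * w1)
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

Pixels above the returned threshold form the foreground mask, which is what the pipeline's later feature extraction would operate on.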
Study Of Shape-Based Semi-Local Point Descriptor For Building Recognition
In this thesis project, the author carried out a study of low-level and middle-level features that can be used for man-made object recognition. The study focuses on a very specific type of man-made structure, i.e. buildings, captured by ground-level imaging devices.
Comparison of CNN-based Algorithms for Halal Logo Recognition
The market for halal products has been growing continuously, and with it the demand for halal product verification. As a certification symbol, a unique halal logo can be displayed on products, and each halal certification body designs its own logo. However, there are instances where an irresponsible party creates a fake halal logo and displays it on its products, deceiving Muslim consumers. In Malaysia, the Department of Islamic Development (JAKIM) has introduced a standard halal logo for locally manufactured products, and it currently recognizes halal logos from foreign certification bodies around the world for products imported into Malaysia. Our work proposes the use of deep learning methods to identify and recognize the various halal logos from different countries. Three deep learning methods, namely YOLOv5, Back Propagation Neural Network, and MobileNetV2-SSD, are compared, and the Back Propagation Neural Network outperforms the other two with an F1-score of 0.949. This method is then implemented in a mobile application that captures a halal logo from a product, then recognizes the logo and its country of origin.
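For context, the F1-score reported above is the harmonic mean of precision and recall. A minimal sketch from raw detection counts (the counts in the usage note are hypothetical, chosen only to illustrate a score near 0.949, not taken from the paper):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall, from detection counts."""
    precision = tp / (tp + fp)  # fraction of predicted logos that were correct
    recall = tp / (tp + fn)     # fraction of true logos that were found
    return 2 * precision * recall / (precision + recall)
```

For example, hypothetical counts of 93 true positives, 5 false positives, and 5 false negatives give precision = recall = 93/98 ≈ 0.949, hence F1 ≈ 0.949.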
Combinatorial Filtering and Machine Learning Approach to Improve Vessel Segmentation for Retinal Image Analysis
• In assessing prevalent eye diseases such as Glaucoma, Diabetic Retinopathy (DR), and Age-related Macular Degeneration (AMD), retinal image understanding is crucial [1],[2].
• Fundus cameras capture fundus images, which visualize the interior surface of the eye.
• Fundus images are analyzed by ophthalmologists to identify abnormalities within the retinal structures.
• Retinal blood vessel (RBV) and Optic Disc (OD) structures found in fundus images are segmented and localized to diagnose diseases.
Vision Based Gesture Recognition from RGB Video frames Using Morphological Image Processing Techniques
With the large and growing population all over the world, novel human-computer interaction systems and techniques can help improve our way of life. Vision-based gesture recognition technology can help maintain the safety and meet the needs of the disabled as well as others. Gesture recognition from video frames is a challenging task due to the high variability of each gesture's features across different people. In this work, we propose a vision-based hand gesture recognition algorithm whose input image frames come from RGB video data. Gesture-based systems are more natural, spontaneous, and straightforward. Previous works attempted to recognize hand gestures in different scenarios; according to our studies, a gesture recognition system can be either wearable-sensor based or vision based. Our proposed method is vision based. In our proposed system, image acquisition starts from RGB videos captured with a Kinect sensor. We blur the image frames, one after another, to remove background noise, then convert the images of a whole video into HSV color mode. After that, we apply dilation, erosion, filtering, and thresholding operations; these morphological image processing techniques convert the images to black-and-white format. Finally, using the well-known SVM classification algorithm, we recognize the hand gestures with an accuracy of 91.01%, higher than the state of the art. In conclusion, the proposed algorithm aims to create a better vision-based hand gesture recognition system with a unique solution in this domain.
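The dilation and erosion operations named in the pipeline above can be sketched on a binary mask in plain NumPy. This is a minimal sketch with a fixed 3×3 square structuring element; the authors' implementation is not specified, and OpenCV's `cv2.dilate`/`cv2.erode` would be the usual choice in practice.

```python
import numpy as np

def dilate(mask: np.ndarray) -> np.ndarray:
    """Binary dilation with a 3x3 square structuring element (0/1 int mask)."""
    padded = np.pad(mask, 1, mode="constant")  # outside the image counts as 0
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            # OR together all 8-neighborhood shifts of the mask
            out |= padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
    return out

def erode(mask: np.ndarray) -> np.ndarray:
    """Binary erosion via duality: complement of the dilated complement.

    Note: with this padding, pixels outside the image are treated as
    foreground, so shapes touching the border do not erode away there.
    """
    return 1 - dilate(1 - mask)
```

In the pipeline above, erosion removes speckle noise from the thresholded black-and-white frames and dilation restores the eroded hand silhouette before SVM classification.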
Content based image retrieval and classification using Speeded-Up Robust Features (SURF) and grouped Bag-of-Visual-Words (BoVW)
This paper presents work in progress on a proposed method for Content-Based Image Retrieval (CBIR) and classification. The proposed method uses the interest point detector and descriptor called Speeded-Up Robust Features (SURF) combined with Bag-of-Visual-Words (BoVW). The combination yields good retrieval and classification results compared to other methods. Moreover, a new dictionary-building method in which each group has its own dictionary is also proposed. Our method is tested on the highly diverse COREL1000 database and shows more discriminative classification and retrieval results.
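The BoVW step can be sketched as nearest-centroid quantization of local descriptors into a normalized histogram. This is a minimal sketch under stated assumptions: the stand-in descriptor and codebook arrays below are illustrative (real SURF descriptors are 64-dimensional and, in OpenCV, live in the non-free contrib module), and the per-group dictionaries the paper proposes would simply mean one such codebook per image group.

```python
import numpy as np

def bovw_histogram(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Quantize local descriptors against a visual-word codebook.

    descriptors: (n, d) local features (e.g. SURF); codebook: (k, d) centroids.
    Returns an L1-normalized k-bin histogram usable as a retrieval/classification
    feature vector.
    """
    # squared Euclidean distance from every descriptor to every centroid
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(np.float64)
    return hist / hist.sum()
```

Retrieval then reduces to comparing these histograms (e.g. by Euclidean or chi-squared distance) between the query image and the database images.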
Vision-based hand gesture recognition from RGB video data using SVM
With the growing world population, novel human-computer interaction systems and techniques can be used to help improve quality of life. Gesture-based technology can help maintain the safety and meet the needs of the disabled as well as the general public. Gesture recognition from video streams is a challenging task due to the high variability of each gesture's features across different people. In this work, we propose vision-based hand gesture recognition from RGB video data using SVM. Gesture-based interfaces are more natural, spontaneous, and straightforward. Previous works attempted to recognize hand gestures in different scenarios; from our studies, a gesture recognition system can be either wearable-sensor based or vision based. Our proposed method is vision based. In our proposed system, image acquisition starts from RGB videos captured with a Kinect sensor. We blur the image frames extracted from the videos to remove background noise, then convert the images into HSV color mode. After that, we apply dilation, erosion, filtering, and thresholding to convert the images to black-and-white format. Finally, using the well-known SVM classification algorithm, the hand gestures are recognized. In conclusion, the framework aims to create a better vision-based hand gesture recognition system with novel techniques.
© 2019 Society of Photo-Optical Instrumentation Engineers (SPIE).
Vision-Based Verification of Outdoor Fiber Distribution Point Installation
• Inspection of new installations of Fiber Distribution Point (FDP) poles in the network infrastructure is required for all new points, and each installation must meet the requirements.
• Currently, network developers are required to send a supervisor to the site to verify each new FDP installation before accepting and approving the contractors' work and closing the project.
• This current process delays project closure, due to the manual site verification required for each installation.
• There is a need to automate geotagging from the coordinates of each new FDP installation, which will help identify the location of FDPs for future work and maintenance.
Video analytics using deep learning for crowd analysis: a review
Gathering a large number of people in a shared physical area is very common in urban culture. Although there are countless examples of mega crowds, the Islamic religious ritual, the Hajj, is considered one of the greatest crowd scenarios in the world. The Hajj is carried out once a year, with a congregation of millions of Muslims visiting the holy city of Makkah at a given time and date. Such a big crowd is always prone to public safety issues and therefore requires proper measures to ensure safe and comfortable arrangements. Through advances in computer vision based scene understanding, automatic analysis of crowd scenes is gaining popularity. However, existing crowd analysis algorithms might not correctly interpret video content in the context of the Hajj, because the Hajj is a unique congregation of millions of people crowded into a small area, which can overwhelm existing sophisticated video and computer vision based algorithms. Through our studies on crowd analysis, crowd counting, density estimation, and Hajj crowd behavior, we identified the need for a review to chart a research direction for abnormal behavior analysis of Hajj pilgrims. This review therefore summarizes research relevant to the broader field of video analytics using deep learning, with a special focus on visual surveillance in the Hajj. The review identifies the challenges and leading-edge techniques of visual surveillance in general, which may be gracefully adapted to Hajj and Umrah applications. The paper presents detailed reviews of existing techniques and approaches employed for crowd analysis from crowd videos, specifically techniques that use deep learning to detect abnormal behavior. These observations give us the impetus to undertake a painstaking yet exhilarating journey into crowd analysis, classification, and detection of any abnormal movement of Hajj pilgrims. Furthermore, because the Hajj pilgrimage is among the most crowded domains for extensive video-based research, this study motivates us to critically analyze crowds on a large scale.