New Hybrid Deep Learning Method to Recognize Human Action from Video
Internet use and available bandwidth have grown tremendously in recent years. Because Internet connectivity is so inexpensive, sharing information (text, audio, and video) has become faster and more widespread. This video content must be analyzed so that it can be classified for different user purposes. Several machine learning approaches to video classification have been developed to save users time and energy, and the use of deep neural networks to recognize human behavior has become a popular topic in recent years. Although significant progress has been made in video recognition, numerous challenges remain. Convolutional neural networks (CNNs) are well known for requiring a fixed-size image input, which limits the network topology and reduces identification accuracy. Although this problem has been solved for still images, it has yet to be solved for video. We present a ten-layer stacked three-dimensional (3D) convolutional network based on spatial pyramid pooling to handle the fixed-size input problem for video frames in video recognition. The network structure is made up of three sections, as the name suggests: a ten-layer stacked 3DCNN, DenseNet, and SPPNet. The KTH dataset was used to test our algorithms. The experimental findings showed that our model outperformed existing models in video-based behavior identification by an accuracy margin of 2%.
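The fixed-size input problem mentioned in this abstract is what spatial pyramid pooling addresses: it pools a feature volume of any size into a fixed number of bins. The following is a minimal pure-Python sketch of that idea for a single 3D feature volume (time, height, width). It is not the authors' network; the function names `bin_edges` and `spp3d` and the pyramid levels are assumptions chosen for illustration.

```python
import math


def bin_edges(size, bins):
    """Split range(size) into `bins` non-empty, possibly overlapping spans."""
    return [
        (math.floor(i * size / bins), math.ceil((i + 1) * size / bins))
        for i in range(bins)
    ]


def spp3d(volume, levels=(1, 2)):
    """Max-pool a nested-list volume [T][H][W] into a fixed-length vector.

    For each pyramid level k, the volume is divided into k*k*k bins and the
    maximum of each bin is kept, so the output length is sum(k**3 for k in
    levels) regardless of the input size.
    """
    t, h, w = len(volume), len(volume[0]), len(volume[0][0])
    pooled = []
    for k in levels:
        for t0, t1 in bin_edges(t, k):
            for h0, h1 in bin_edges(h, k):
                for w0, w1 in bin_edges(w, k):
                    pooled.append(
                        max(
                            volume[a][b][c]
                            for a in range(t0, t1)
                            for b in range(h0, h1)
                            for c in range(w0, w1)
                        )
                    )
    return pooled


def make_volume(t, h, w):
    """Hypothetical helper that builds a toy volume with distinct values."""
    return [[[a * 100 + b * 10 + c for c in range(w)] for b in range(h)]
            for a in range(t)]


# Two clips of different sizes yield vectors of the same length.
print(len(spp3d(make_volume(3, 4, 5))), len(spp3d(make_volume(2, 7, 3))))
```

Because the output length depends only on the pyramid levels, a fully connected layer placed after such a pooling step can accept clips of different lengths and resolutions.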
A review on Video Classification with Methods, Findings, Performance, Challenges, Limitations and Future Work
In recent years, the number of web users and the available bandwidth have grown rapidly. Low-cost Internet connectivity makes sharing information (text, audio, and video) more common and faster. This video content needs to be analyzed to predict its class for different user purposes. Many machine learning approaches have been developed for video classification to save people time and energy. Many review papers on video classification already exist, but they have limitations: limited analysis, poor structure, no mention of research gaps or findings, and no clear description of advantages, disadvantages, and future work. Our review paper overcomes most of these limitations. This study reviews existing video classification procedures, examines the existing methods comparatively and critically, and recommends the most effective and productive process. First, our analysis examines video classification with taxonomical details, the latest applications, processes, and dataset information. Second, it covers the overall difficulties, shortcomings, and potential future work, together with data and performance measurements, in relation to recent work in science, deep learning, and machine learning models. A key task of this review is also to compare video classification systems by their tools, benefits, drawbacks, and other features of the techniques they use. Lastly, we present a quick summary table based on selected features. In terms of precision and independent feature extraction, the RNN (Recurrent Neural Network), CNN (Convolutional Neural Network), and combined approaches perform better than the CNN-dependent method.
HARC-New Hybrid Method with Hierarchical Attention Based Bidirectional Recurrent Neural Network with Dilated Convolutional Neural Network to Recognize Multilabel Emotions from Text
We present a modern hybrid paradigm for managing tacit semantic awareness and qualitative meaning in short texts. The main goals of the proposed technique are to use deep learning to identify multilevel textual sentiment in far less time, with higher accuracy, and with a simpler network structure that trains to better performance. In this analysis, the proposed hybrid deep learning HARC architecture for recognizing multilevel textual sentiment, which combines hierarchical attention with a Convolutional Neural Network (CNN), a Bidirectional Gated Recurrent Unit (BiGRU), and Bidirectional Long Short-Term Memory (BiLSTM), outperforms the other compared approaches. BiGRU and BiLSTM were used in this model to eliminate individual context functions and to adequately manage long-range features. Dilated CNN was used to replicate the retrieved features by forwarding vector instances for better support in the hierarchical attention layer, and it was used to eliminate better text information using higher coupling correlations. Our method handles the most important features to overcome the limitations of handling context and semantics sufficiently. On a variety of datasets, the proposed HARC algorithm outperformed traditional machine learning approaches as well as comparable deep learning models by a margin of 1%. The accuracy of the proposed HARC method was 82.50 percent for IMDB, 98.00 percent for toxic data, 92.31 percent for Cornflower, and 94.60 percent for Emotion recognition data. Our method works better than other basic models and than CNN- and RNN-based hybrid models. In the future, we will work on recognizing more levels of text emotion from longer and more complex text.
HARDC: A novel ECG-based heartbeat classification method to detect arrhythmia using hierarchical attention based dual structured RNN with dilated CNN
In this paper, we have developed a novel hybrid hierarchical attention-based
bidirectional recurrent neural network with dilated CNN (HARDC) method for
arrhythmia classification. This solves problems that arise when traditional
dilated convolutional neural network (CNN) models disregard the correlation
between contexts and gradient dispersion. The proposed HARDC fully exploits the
dilated CNN and bidirectional recurrent neural network unit (BiGRU-BiLSTM)
architecture to generate fusion features. As a result of incorporating both
local and global feature information and an attention mechanism, the model's
performance for prediction is improved. By combining the fusion features with a
dilated CNN and a hierarchical attention mechanism, the trained HARDC model
showed significantly improved classification results and interpretability of
feature extraction on the PhysioNet 2017 challenge dataset. Sequential Z-Score
normalization, filtering, denoising, and segmentation are used to prepare the
raw data for analysis. CGAN (Conditional Generative Adversarial Network) is
then used to generate synthetic signals from the processed data. The
experimental results demonstrate that the proposed HARDC model significantly
outperforms other existing models, achieving an accuracy of 99.60%, an F1 score
of 98.21%, a precision of 97.66%, and a recall of 99.60% using MIT-BIH
generated ECG. In addition, this approach substantially reduces run time when
using dilated CNN compared to normal convolution. Overall, this hybrid model
demonstrates an innovative and cost-effective strategy for ECG signal
compression and high-performance ECG recognition. Our results indicate that an
automated and highly computed method to classify multiple types of arrhythmia
signals holds considerable promise.
Comment: 23 pages
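Two of the building blocks named in this abstract, Z-score normalization and dilated convolution, can be illustrated on a one-dimensional signal. The sketch below uses only the Python standard library and is not the HARDC implementation; the function names `zscore` and `dilated_conv1d`, the toy signal, and the kernel are assumptions made for illustration.

```python
import statistics


def zscore(signal):
    """Rescale a signal to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    if std == 0:
        # A constant signal carries no variation to normalize.
        return [0.0 for _ in signal]
    return [(x - mean) / std for x in signal]


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D convolution (cross-correlation form) with dilated taps.

    Kernel taps are spaced `dilation` samples apart, so a kernel of length K
    covers (K - 1) * dilation + 1 input samples without extra weights.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


beat = [1, 2, 3, 4, 5, 6]    # toy stand-in for an ECG segment
kernel = [1, 0, -1]          # simple difference filter
print(dilated_conv1d(beat, kernel, dilation=1))
print(dilated_conv1d(beat, kernel, dilation=2))
```

With the same three weights, a dilation of 2 covers five input samples instead of three, which is how a dilated layer widens its receptive field without adding parameters.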
Multimodal hybrid deep learning approach to detect tomato leaf disease using attention based dilated convolution feature extractor with logistic regression classification
Automatic leaf disease detection techniques reduce the time-consuming effort of monitoring large crop farms and support early identification of disease symptoms on plant leaves. Tomato crops are susceptible to a variety of diseases that can reduce production. In recent years, advanced deep learning methods have been applied successfully to plant disease detection based on symptoms observed on leaves. However, these methods have some limitations. This study proposes a high-performance tomato leaf disease detection approach, namely attention-based dilated CNN logistic regression (ADCLR). First, we develop a new feature extraction method that uses an attention-based dilated CNN to extract the most relevant features in less time. In preprocessing, we use bilateral filtering to handle larger features and smooth the image, and Otsu image segmentation to remove noise in a fast and simple way. Then, we use a Conditional Generative Adversarial Network (CGAN) model to generate a synthetic image from the image preprocessed in the previous stage. The synthetic image is generated to handle imbalanced, noisy, or wrongly labeled data and so obtain good prediction results. Then, the extracted features are normalized to lower the dimensionality. Finally, the features extracted from the preprocessed data are combined and classified using a fast and simple logistic regression (LR) classifier. The experimental outcomes show state-of-the-art performance on the Plant Village database of tomato leaf disease, achieving 100%, 100%, and 96.6% training, testing, and validation accuracy, respectively, for multiclass classification. The experimental analysis demonstrates that the proposed multimodal approach can be used to detect tomato leaf disease precisely, simply, and quickly. We plan to improve the model to make it a cloud-based automated leaf disease classifier for different plants.
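The Otsu segmentation step mentioned in this abstract picks a gray-level threshold that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal standard-library sketch of that threshold search on a flat list of 8-bit gray values. It is not the authors' pipeline; the function name `otsu_threshold` and the toy pixel data are assumptions for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class and the rest form the other.
    `pixels` is a flat iterable of integers in range(levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue          # no pixels in the lower class yet
        weight1 = total - weight0
        if weight1 == 0:
            break             # every pixel is already in the lower class
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy image: a dark region (value 10) and a bright region (value 200).
toy_pixels = [10] * 50 + [200] * 50
t = otsu_threshold(toy_pixels)
mask = [1 if p > t else 0 for p in toy_pixels]   # 1 marks the bright class
print(t, sum(mask))
```

Any threshold between the two gray levels separates this toy image equally well; the sketch keeps the first such value because it only updates on a strict improvement.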