
    A framework for evaluating human action detection via multidimensional approach

    This work discusses the application of an Artificial Intelligence technique called data extraction, together with a process-based ontology, in constructing experimental qualitative models for video retrieval and detection. We present a framework architecture that uses multimodality features as the knowledge representation scheme to model the behaviors of a number of human actions in video scenes. The main focus of this paper is the design of the two main components (model classifier and inference engine) of a tool abbreviated as VASD (Video Action Scene Detector) for retrieving and detecting human actions from video scenes. The discussion starts by presenting the workflow of the retrieval and detection process and the automated model classifier construction logic. We then demonstrate how the constructed classifiers can be used with multimodality features to detect human actions. Finally, behavioral explanation manifestation is discussed. The simulator is implemented bilingually: Matlab and C++ form the backend, supplying data and theories, while Java handles the front-end GUI and action pattern updating.

    Content modelling for human action detection via multidimensional approach

    Video content analysis is an active research domain due to the availability and growth of audiovisual data in digital format. There is a need to automatically extract video content for efficient access, understanding, browsing and retrieval of videos. To obtain the information that is of interest, and to provide better entertainment, tools are needed to help users extract relevant content and effectively navigate the large amount of available video information. Existing methods do not attempt to model and estimate the semantic content of the video. Detecting and interpreting human presence, actions and activities is one of the most valuable functions of this proposed framework. The general objective of this research is to analyze and process audio-video streams into a robust audiovisual action recognition system by integrating, structuring and accessing multimodal information via a multidimensional retrieval and extraction model. The proposed technique characterizes action scenes by integrating cues obtained from both the audio and video tracks. Information is combined based on visual features (motion, edge, and visual characteristics of objects), audio features and video for recognizing action. This model uses HMM and GMM to provide a framework for fusing these features and to represent the multidimensional structure of the framework. The action-related visual cues are obtained by computing the spatio-temporal dynamic activity from the video shots and by abstracting specific visual events. Simultaneously, the audio features are analyzed by locating and computing several sound effects of action events embedded in the video. Finally, these audio and visual cues are combined to identify the action scenes. Compared with using a single source, either the visual or the audio track alone, such combined audiovisual information provides more reliable performance and allows us to understand the story content of movies in more detail.
To evaluate the usefulness of the proposed framework, several experiments were conducted; the results were obtained using visual features only (77.89% precision; 72.10% recall), audio features only (62.52% precision; 48.93% recall) and combined audiovisual features (90.35% precision; 90.65% recall).
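The reported figures follow the standard definitions of precision and recall over detected action scenes. A minimal sketch, not from the paper and with hypothetical detection counts, of how these two measures are computed:

```python
# Precision/recall from detection counts (illustrative only; the counts
# below are hypothetical and not taken from the paper's experiments).

def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

p, r = precision_recall(tp=90, fp=10, fn=9)
print(round(p, 4), round(r, 4))  # 0.9 0.9091
```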

    Classification of herbs plant diseases via hierarchical dynamic artificial neural network

    When herb plants have a disease, they can display a range of symptoms such as colored spots or streaks on the leaves, stems, and seeds of the plant. These visual symptoms continuously change their color, shape and size as the disease progresses. Once the image of a target is captured digitally, a myriad of image processing algorithms can be used to extract features from it. The usefulness of each of these features depends on the particular patterns to be highlighted in the image. A key point in the implementation of optimal classifiers is the selection of features that characterize the image. In this study, image processing and pattern classification are used to implement a machine vision system that can identify and classify the visual symptoms of herb plant diseases. The image processing is divided into four stages: Image Pre-Processing to remove image noise (fixed-valued impulse noise, random-valued impulse noise and Gaussian noise), Image Segmentation to identify regions in the image likely to qualify as diseased regions, Image Feature Extraction and Selection to extract and select important image features, and Image Classification to classify the image into different herb disease classes. This paper proposes an unsupervised disease pattern recognition and classification algorithm based on a modified Hierarchical Dynamic Artificial Neural Network, which provides adjustable sensitivity-specificity herb disease detection and classification from the analysis of noise-free colored herb images. It also proposes a disease treatment algorithm capable of providing a suitable treatment and control for each identified herb disease.
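The pre-processing stage removes impulse and Gaussian noise before segmentation. As a minimal illustration, not the paper's implementation, a 3x3 median filter is a standard way to suppress fixed- and random-valued impulse (salt-and-pepper) noise:

```python
import numpy as np

def median_filter3x3(img):
    """Suppress impulse noise with a 3x3 median filter.
    Borders are handled by reflecting the edge pixels."""
    padded = np.pad(img, 1, mode="reflect")
    # Stack the 9 shifted views of the image and take the per-pixel median.
    windows = [padded[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)]
    return np.median(np.stack(windows), axis=0).astype(img.dtype)

# A flat grey patch with one "salt" pixel; the median removes the outlier.
img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255
print(median_filter3x3(img)[2, 2])  # 100
```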

    Extracting and integrating multimodality features via multidimensional approach for video retrieval

    This work discusses the application of an Artificial Intelligence technique called data extraction, together with a process-based ontology, in constructing experimental qualitative models for video retrieval and detection. We present a framework architecture that uses multimodality features as the knowledge representation scheme to model the behaviors of a number of human actions in video scenes. The main focus of this paper is the design of the two main components (model classifier and inference engine) of a tool abbreviated as VASD (Video Action Scene Detector) for retrieving and detecting human actions from video scenes. The discussion starts by presenting the workflow of the retrieval and detection process and the automated model classifier construction logic. We then demonstrate how the constructed classifiers can be used with multimodality features to detect human actions. Finally, behavioral explanation manifestation is discussed. The simulator is implemented bilingually: Matlab and C++ form the backend, supplying data and theories, while Java handles the front-end GUI and action pattern updating.

    Machining process classification using PCA reduced histogram features and the support vector machine

    Being able to identify machining processes that produce specific machined surfaces is crucial in modern manufacturing production. Image processing and computer vision technologies have become indispensable tools for automated identification with benefits such as reduction in inspection time and avoidance of human errors due to inconsistency and fatigue. In this paper, the Support Vector Machine (SVM) classifier with various kernels is investigated for the categorization of machined surfaces into the six machining processes of Turning, Grinding, Horizontal Milling, Vertical Milling, Lapping, and Shaping. The effectiveness of the gray-level histogram as the discriminating feature is explored. Experimental results suggest that the SVM with the linear kernel provides superior performance for a dataset consisting of 72 workpiece images
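The pipeline described, gray-level histogram features reduced by PCA and classified with a linear-kernel SVM, can be sketched as follows. The synthetic surface patches, bin count, and component count here are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def grey_histogram(img, bins=64):
    """Normalised gray-level histogram used as the feature vector."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    return hist / hist.sum()

# Synthetic stand-ins for two machining processes: darker vs brighter surfaces.
dark = [rng.integers(40, 120, (32, 32)) for _ in range(20)]
light = [rng.integers(140, 220, (32, 32)) for _ in range(20)]
X = np.array([grey_histogram(im) for im in dark + light])
y = np.array([0] * 20 + [1] * 20)

# Histogram -> PCA -> linear SVM, mirroring the pipeline in the abstract.
clf = make_pipeline(PCA(n_components=8), SVC(kernel="linear"))
clf.fit(X, y)
print(clf.score(X, y))  # 1.0 on this easily separable toy data
```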

    Durian recognition based on multiple features and linear discriminant analysis

    Many fruit recognition approaches today are designed to classify different types of fruit, but little effort has been made on content-based fruit recognition that specifically focuses on durian species. Durian, known as the king of tropical fruits, has several characteristics that are similar across species: the skin has almost the same colour, from green to yellowish brown, with only slightly different shapes and thorn patterns. It is therefore hard to differentiate the species with current methods. It would be valuable to have an automated content-based recognition framework that can automatically represent and recognise a durian species given a durian image as input. This work therefore contributes a new representation method based on multiple features for effective durian recognition. Two kinds of features, based on shape and texture, are considered. Simple shape signatures, namely area, perimeter, and circularity, are used to describe the shape of the durian fruit and its base, while the texture of the fruit is described using the Local Binary Pattern. We extracted these features from 240 durian images and trained the proposed method using several classifiers. Based on 10-fold cross validation, Logistic Regression, Gaussian Naïve Bayesian, and Linear Discriminant Analysis classifiers performed equally well, each achieving 100% accuracy, precision, recall, and F1-score. We further tested the proposed algorithm on a larger dataset consisting of 42,337 fruit images in 64 categories. Experimental results on this larger and more general dataset show that the proposed multiple features trained with the Linear Discriminant Analysis classifier achieve 72.38% accuracy, 73% precision, 72% recall, and 72% F1-score.
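The circularity signature mentioned alongside area and perimeter is conventionally defined as 4*pi*A / P^2, which equals 1 for a perfect circle and falls below 1 as a shape deviates from circularity. A small sketch of the formula only; the paper's thresholds and exact usage are not reproduced:

```python
import math

def circularity(area, perimeter):
    """Shape signature 4*pi*A / P**2: 1.0 for a perfect circle,
    smaller for less circular shapes."""
    return 4 * math.pi * area / perimeter ** 2

# A circle of radius 10 scores 1.0; a square of side 10 scores pi/4.
r = 10.0
print(round(circularity(math.pi * r ** 2, 2 * math.pi * r), 3))  # 1.0
print(round(circularity(10.0 * 10.0, 4 * 10.0), 3))              # 0.785
```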

    A colour-based building recognition using support vector machine

    Many applications apply the concept of image recognition to help humans recognise objects simply by using digital images. A content-based building recognition system can solve the problem of using only text as the search input. In this paper, a building recognition system using colour histograms is proposed for recognising buildings in Ipoh city, Perak, Malaysia. The colour features of each building image are extracted, and a feature vector combining the mean, standard deviation, variance, skewness and kurtosis of the gray levels is formed to represent each building image. These feature values are then used to train the system with a supervised learning algorithm, the Support Vector Machine (SVM). Lastly, the accuracy of the recognition system is evaluated using 10-fold cross validation. The evaluation results show that the building recognition system is well trained and able to effectively recognise the building images with a low misclassification rate.
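The five per-image statistics named above (mean, standard deviation, variance, skewness, kurtosis) can be computed per channel as in this minimal NumPy sketch; the sample channel is hypothetical and the paper's exact normalisation is not reproduced:

```python
import numpy as np

def colour_moments(channel):
    """Five statistical moments of a gray-level/colour channel, of the
    kind combined into the per-image feature vector."""
    x = channel.astype(float).ravel()
    mean = x.mean()
    var = x.var()
    std = np.sqrt(var)
    centred = x - mean
    skewness = (centred ** 3).mean() / std ** 3
    kurtosis = (centred ** 4).mean() / std ** 4  # non-excess kurtosis
    return np.array([mean, std, var, skewness, kurtosis])

# A symmetric two-value channel: zero skewness, kurtosis of 1.
channel = np.array([[50, 150], [150, 50]], dtype=np.uint8)
print(colour_moments(channel))
```

For a real image the five values would be computed per colour channel and concatenated into one feature vector.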

    Deep learning mango fruits recognition based on tensorflow lite

    Agricultural images such as fruits and vegetables have previously been recognised and classified using image analysis and computer vision techniques. Mangoes are currently classified manually, whereby mango sellers must laboriously identify them by hand, which is time-consuming and tedious. In this work, TensorFlow Lite was used as a transfer learning tool; transfer learning is a fast approach to resolving classification problems effectively using small datasets. This work involves six categories: four mango types (Harum Manis, Langra, Dasheri and Sindhri), a category for other types of mangoes, and a non-mango category. Each category dataset comprises 100 images and is split 70/30 between the training and testing sets, respectively. The work was delivered as a mobile-based application that can distinguish various types of mangoes using the proposed transfer learning method. The results of the conducted experiment show that the adopted transfer learning can achieve an accuracy of 95% for mango recognition. A preliminary user acceptance survey was also carried out to investigate the users' requirements, the effectiveness of the proposed functionalities, and the ease of use of the proposed interfaces, with promising results.
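The per-category 70/30 split described above can be sketched as follows; the file names and random seed are hypothetical, not taken from the paper:

```python
import random

def split_70_30(items, seed=0):
    """Shuffle one category's images and split them 70/30 into
    training and testing sets, as described for each 100-image category."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * 0.7)
    return items[:cut], items[cut:]

# Hypothetical file names for one 100-image category.
images = [f"harum_manis_{i:03d}.jpg" for i in range(100)]
train, test = split_70_30(images)
print(len(train), len(test))  # 70 30
```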

    A texture descriptor: BackGround Local Binary Pattern (BGLBP)

    Local Binary Pattern (LBP) is invariant to monotonic changes in the grey-scale domain. This property makes LBP a useful texture descriptor for applications dealing with local illumination changes. However, existing versions of LBP are not able to handle image illumination changes, especially in outdoor environments, and non-patterned illumination changes degrade the performance of background extraction methods. In this paper, an extended version of LBP called BackGround LBP (BGLBP) is presented. BGLBP is designed for the background extraction application but is extendable to other areas as a texture descriptor. BGLBP is an extension of D-LBP, Centre-Symmetric LBP, ULBP, and R-LBP, and it has been designed to inherit the positive properties of these previous versions. The performance of BGLBP as part of a background extraction method is investigated, and a comparison between BGLBP as a general texture descriptor and a number of LBP versions is conducted.
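BGLBP's own definition is given in the paper; as background, the classic 8-neighbour LBP it extends builds each pixel's code by thresholding the 3x3 neighbourhood against the centre value. A minimal NumPy sketch of that base operator:

```python
import numpy as np

def lbp8(img):
    """Classic 8-neighbour LBP over the interior pixels: each neighbour
    contributes one bit, set when it is >= the centre value."""
    img = img.astype(np.int16)
    c = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (di, dj) in enumerate(offsets):
        neigh = img[1 + di:img.shape[0] - 1 + di,
                    1 + dj:img.shape[1] - 1 + dj]
        code |= (neigh >= c).astype(np.int16) << bit
    return code

# On a uniform patch every neighbour equals the centre, so every bit is set.
print(lbp8(np.full((3, 3), 7))[0, 0])  # 255
```

Adding a constant to the whole image leaves every comparison, and hence every code, unchanged, which is the grey-scale invariance noted above.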