
    Spatial frequency based video stream analysis for object classification and recognition in clouds

    The recent rise in multimedia technology has made it easier to perform a number of tasks. One of these tasks is monitoring, where cheap cameras produce large amounts of video data. This video data is then processed for object classification to extract useful information. However, the video obtained by these cheap cameras is often of low quality, resulting in blurred content, and various illumination effects caused by lighting conditions degrade it further. These effects present severe challenges for object classification. We present a cloud-based, blur- and illumination-invariant approach for object classification from images and video data. Bi-dimensional empirical mode decomposition (BEMD) is adopted to decompose a video frame into intrinsic mode functions (IMFs). These IMFs then undergo a first-order Riesz transform to generate monogenic video frames. Each IMF is analyzed by observing the local properties (amplitude, phase, and orientation) generated from its monogenic video frame. We propose a stack-based hierarchy of local pattern features generated from the amplitudes of each IMF, which yields blur- and illumination-invariant object classification. Extensive experimentation on video streams as well as publicly available image datasets reveals that our system maintains high accuracy, from 0.97 down to 0.91, as Gaussian blur increases from 0.5 to 5, and outperforms state-of-the-art techniques under uncontrolled conditions. The system also proved to be scalable, with high throughput, when tested on a number of video streams using cloud infrastructure
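
    A minimal sketch of the Riesz-transform step described above, assuming a single grayscale frame (or one BEMD-extracted IMF) as input; the frequency-domain construction and the function name monogenic_features are illustrative rather than the paper's implementation:

        import numpy as np

        def monogenic_features(frame):
            """Local amplitude, phase and orientation of a 2-D signal
            via the first-order Riesz transform in the frequency domain."""
            rows, cols = frame.shape
            u = np.fft.fftfreq(rows).reshape(-1, 1)   # vertical frequencies
            v = np.fft.fftfreq(cols).reshape(1, -1)   # horizontal frequencies
            radius = np.sqrt(u**2 + v**2)
            radius[0, 0] = 1.0                        # avoid divide-by-zero at DC

            F = np.fft.fft2(frame)
            r1 = np.real(np.fft.ifft2(F * (-1j * u / radius)))  # Riesz component 1
            r2 = np.real(np.fft.ifft2(F * (-1j * v / radius)))  # Riesz component 2

            amplitude = np.sqrt(frame**2 + r1**2 + r2**2)      # local energy
            phase = np.arctan2(np.sqrt(r1**2 + r2**2), frame)  # local phase
            orientation = np.arctan2(r2, r1)                   # local orientation
            return amplitude, phase, orientation

    In the paper's pipeline, each IMF produced by BEMD would pass through such a transform, and the amplitude maps would feed the stacked local-pattern features.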

    BEMDEC: An Adaptive and Robust Methodology for Digital Image Feature Extraction

    The intriguing study of feature extraction, and edge detection in particular, has, as a result of the increased use of imagery, drawn attention not just from computer science but from a variety of scientific fields. However, challenges persist in formulating a feature extraction operator, particularly for edges, that satisfies the necessary properties of low probability of error (i.e., of failing to mark true edges), good accuracy, and a single, consistent response to each edge. Moreover, most work in feature extraction has focused on improving existing approaches rather than devising or adopting new ones. In the image processing subfield, where the needs constantly change, we must equally change the way we think. In this digital world, where images are used for an ever wider variety of purposes, researchers who are serious about addressing the aforementioned limitations must think outside the box and step away from the usual in order to overcome these challenges. In this dissertation, we propose an adaptive and robust, yet simple, digital image feature detection methodology using bidimensional empirical mode decomposition (BEMD), a sifting process that decomposes a signal into its bidimensional intrinsic mode functions (BIMFs). The method is further extended to detect corners and curves, and is thus dubbed BEMDEC, indicating its ability to detect edges, corners and curves. In addition to BEMD, a unique combination of a flexible envelope estimation algorithm, stopping criteria and boundary adjustment made the realization of this multi-feature detector possible. Further application of two morphological operators, binarization and thinning, adds to the quality of the operator
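
    To make the sifting idea concrete, here is a deliberately simplified BEMD sketch in Python; the order-statistics envelope estimation (max/min filtering plus Gaussian smoothing), the fixed window size, and the fixed iteration counts are stand-ins for the flexible envelope estimation and stopping criteria the dissertation actually develops:

        import numpy as np
        from scipy.ndimage import maximum_filter, minimum_filter, gaussian_filter

        def bemd(image, n_imfs=3, sift_iters=5, win=7):
            """Simplified bidimensional EMD: extracts BIMFs by iteratively
            subtracting the mean of morphologically estimated envelopes."""
            residue = image.astype(float).copy()
            imfs = []
            for _ in range(n_imfs):
                h = residue.copy()
                for _ in range(sift_iters):
                    upper = gaussian_filter(maximum_filter(h, size=win), sigma=win)
                    lower = gaussian_filter(minimum_filter(h, size=win), sigma=win)
                    h = h - (upper + lower) / 2.0   # remove local mean surface
                imfs.append(h)
                residue = residue - h               # pass residue to next scale
            return imfs, residue

    Edges, corners and curves would then be located in the fine-scale BIMFs, where they appear as high-amplitude oscillations.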

    Image processing and machine learning techniques used in computer-aided detection system for mammogram screening - a review

    This paper reviews previously developed computer-aided detection (CAD) systems for mammogram screening, because the rising death rate among women due to breast cancer is a global medical issue that can be controlled only by early detection through regular screening. To date, mammography remains the most widely used breast imaging modality. CAD systems have been adopted by radiologists to increase the accuracy of breast cancer diagnosis by avoiding human error and experience-related issues. This study reveals that, despite the high accuracy achieved by earlier CAD systems for breast cancer diagnosis, they are not fully automated. Moreover, false-positive screening cases are numerous, and over-diagnosis of breast cancer exposes patients to harmful overtreatment on which a huge amount of money is wasted. It has also been reported that mammogram screening results with and without CAD systems show no noticeable difference, while the number of cancer cases missed by CAD systems is increasing. Thus, future research is required to improve the performance of CAD systems for mammogram screening and to make them fully automated

    BC-DUnet-Based Segmentation of Fine Cracks in Bridges under a Complex Background

    Cracks are the external expression of potential safety risks in bridge construction. Automatic detection and segmentation of bridge cracks remains a top priority for civil engineers. With the development of image segmentation techniques based on convolutional neural networks, new opportunities emerge in bridge crack detection. Traditional bridge crack detection methods are vulnerable to complex backgrounds and small cracks, which makes effective segmentation difficult. This study presents a bridge crack segmentation method based on a densely connected U-Net (BC-DUnet) with a background elimination module and a cross-attention mechanism. First, a densely connected feature extraction model (DCFEM) integrating the advantages of DenseNet is proposed, which effectively enhances the main feature information of small cracks. Second, a background elimination module (BEM) is proposed, which filters out excess information by assigning different weights so as to retain the main feature information of the crack. Finally, a cross-attention mechanism (CAM) is proposed to enhance the capture of long-range dependencies and further improve the model's pixel-level representation. In comparative experiments against traditional networks such as FCN and U-Net, a pixel accuracy of 98.18% was obtained, and the IoU value increased by 14.12% and 4.04% over FCN and U-Net, respectively. Compared with non-traditional networks such as HU-ResNet and FCN-4s, BC-DUnet achieves higher accuracy and better generalization without being prone to overfitting. The BC-DUnet network proposed here can eliminate the influence of complex backgrounds on the segmentation accuracy of bridge cracks, improve detection efficiency, reduce detection cost, and has practical application value
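
    For illustration, a generic cross-attention block of the kind a CAM builds on, sketched in PyTorch; the layer shapes, the learned gamma residual weight, the class name, and the assumption that channels is divisible by 8 are all illustrative choices, not the paper's exact architecture:

        import torch
        import torch.nn as nn

        class CrossAttention(nn.Module):
            """Queries from one feature map attend to keys/values from
            another, capturing long-range dependence between locations."""
            def __init__(self, channels):
                super().__init__()
                self.q = nn.Conv2d(channels, channels // 8, 1)
                self.k = nn.Conv2d(channels, channels // 8, 1)
                self.v = nn.Conv2d(channels, channels, 1)
                self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

            def forward(self, x, context):
                b, c, h, w = x.shape
                q = self.q(x).flatten(2).transpose(1, 2)        # (b, hw, c/8)
                k = self.k(context).flatten(2)                  # (b, c/8, hw)
                v = self.v(context).flatten(2).transpose(1, 2)  # (b, hw, c)
                attn = torch.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)
                out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
                return x + self.gamma * out                     # residual output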

    IRIS RECOGNITION USING BOX-COUNTING FRACTAL DIMENSION FEATURE EXTRACTION

    Biometrics is a rapidly developing science. Its hallmark of drawing on unique traits of the human body has made biometrics central to modern security systems. Biometric traits fall into two groups: behavioral (gait, typing style, etc.) and physical (iris, retina, face, fingerprint, palm, etc.). The iris was chosen because it is distinctive to each individual and is protected by the cornea, so its pattern remains stable. This iris recognition system consists of three stages: data preprocessing, feature extraction, and matching. In preprocessing, the Hough transform is used to locate the iris region, and Daugman's rubber sheet model is used to normalize the iris dataset into rectangular blocks. Features are then extracted with the box-counting method to obtain fractal dimension values. Dimensions separated by small Euclidean distances belong to the same class, the principle behind the K-Nearest Neighbor classifier used for classification. The data consist of 60 samples from 10 classes. Using 5-fold cross-validation, the highest accuracy obtained was 92.632% for K = 3 with the K-Nearest Neighbor (KNN) method. Keywords: biometrics, iris recognition, fractal dimension
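
    A minimal box-counting sketch, assuming a binarized iris-texture block with a non-empty foreground as input; padding to a power-of-two square and the halving schedule of box sizes are illustrative choices:

        import numpy as np

        def box_counting_dimension(binary_img):
            """Estimate the fractal (box-counting) dimension of a binary
            image: slope of log(count) versus log(1/box size)."""
            # Pad to a square power-of-two side so boxes tile evenly
            side = 1 << int(np.ceil(np.log2(max(binary_img.shape))))
            padded = np.zeros((side, side), dtype=bool)
            padded[:binary_img.shape[0], :binary_img.shape[1]] = binary_img > 0

            sizes, counts = [], []
            size = side // 2
            while size >= 1:
                n = side // size
                # Count boxes of this size containing any foreground pixel
                blocks = padded.reshape(n, size, n, size).any(axis=(1, 3))
                sizes.append(size)
                counts.append(blocks.sum())
                size //= 2
            slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
            return slope

    The resulting dimension values, one per normalized iris block, would then be compared by Euclidean distance in the KNN matching stage.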

    Feature Extraction Methods by Various Concepts using SOM

    Image retrieval systems have gained traction with the increased use of visual and media data. Understanding and managing big data is critical, and much of the required analysis is done in image retrieval applications. Given the considerable difficulty of handling big data with traditional approaches, there is demand for efficient management, particularly regarding accuracy and robustness. To address these issues, we employ content-based image retrieval (CBIR) methods on both supervised and unsupervised image collections. Self-Organizing Maps (SOM), a competitive unsupervised learning technique, are applied in our multilevel fusion methodology to extract the features that are categorized. The proposed methodology beat state-of-the-art algorithms with 90.3% precision, an approximate retrieval precision (ARP) of 0.91, and an approximate retrieval recall (ARR) of 0.82 when tested on several benchmark datasets
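
    A minimal SOM training sketch, assuming row-vector feature samples in data; the grid size, decay schedules, and Gaussian neighbourhood are illustrative defaults rather than the paper's settings:

        import numpy as np

        def train_som(data, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0):
            """Minimal Self-Organizing Map: competitive learning that pulls
            the best-matching unit (BMU) and its neighbours toward each sample."""
            rng = np.random.default_rng(0)
            h, w = grid
            weights = rng.random((h, w, data.shape[1]))
            yy, xx = np.mgrid[0:h, 0:w]
            for epoch in range(epochs):
                lr = lr0 * np.exp(-epoch / epochs)        # decaying learning rate
                sigma = sigma0 * np.exp(-epoch / epochs)  # shrinking neighbourhood
                for x in data:
                    d = np.linalg.norm(weights - x, axis=2)
                    bi, bj = np.unravel_index(d.argmin(), d.shape)  # BMU position
                    dist2 = (yy - bi) ** 2 + (xx - bj) ** 2
                    g = np.exp(-dist2 / (2 * sigma**2))[..., None]  # neighbourhood
                    weights += lr * g * (x - weights)
            return weights

    After training, each image descriptor maps to its BMU, and the map's clusters serve as the categorized features for retrieval.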

    A survey, review, and future trends of skin lesion segmentation and classification

    The computer-aided diagnosis or detection (CAD) approach for skin lesion analysis is an emerging field of research that has the potential to alleviate the burden and cost of skin cancer screening. Researchers have recently shown increasing interest in developing such CAD systems, with the intention of providing dermatologists with a user-friendly tool that reduces the challenges of manual inspection. This article provides a comprehensive literature survey and review of a total of 594 publications (356 on skin lesion segmentation and 238 on skin lesion classification) published between 2011 and 2022. These articles are analyzed and summarized along several dimensions to contribute vital information on the development of CAD systems: relevant and essential definitions and theories; input data (dataset utilization, preprocessing, augmentation, and fixing imbalance problems); method configuration (techniques, architectures, module frameworks, and losses); training tactics (hyperparameter settings); and evaluation criteria. We also investigate a variety of performance-enhancing approaches, including ensembling and post-processing, and discuss these dimensions to reveal current trends based on utilization frequency. In addition, we highlight the primary difficulties of evaluating skin lesion segmentation and classification systems using minimal datasets, as well as potential solutions to these difficulties. Findings, recommendations, and trends are disclosed to inform future research on developing an automated and robust CAD system for skin lesion analysis

    Detection of pathologies in retina digital images: an empirical mode decomposition approach

    Accurate automatic detection of pathologies in retina digital images offers a promising approach for clinical applications. This thesis employs the discrete wavelet transform (DWT) and empirical mode decomposition (EMD) to extract six statistical textural features from retina digital images: the mean, standard deviation, smoothness, third moment, uniformity, and entropy. The purpose is to classify normal and abnormal images. Five pathologies are considered: artery sheathing (Coats' disease), blot hemorrhage, retinal degeneration (circinate), age-related macular degeneration (drusen), and diabetic retinopathy (microaneurysms and exudates). Four classifiers are employed: support vector machines (SVM), quadratic discriminant analysis (QDA), the k-nearest neighbor algorithm (k-NN), and probabilistic neural networks (PNN). For each experiment, ten random folds are generated to perform cross-validation tests. To assess the performance of the classifiers, the average and standard deviation of the correct recognition rate, sensitivity, and specificity are computed for each simulation. The experimental results highlight two main conclusions: first, the outstanding performance of EMD over DWT with all classifiers; second, the superiority of the SVM classifier over QDA, k-NN, and PNN. Finally, principal component analysis (PCA) was employed to reduce the number of features in the hope of improving classifier accuracy. We find, however, no general and significant improvement in performance. In sum, the EMD-SVM system provides a promising approach for the detection of pathologies in retina digital images
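
    The six features named above can be computed from a gray-level histogram as sketched below, assuming an 8-bit grayscale input; the normalizations (dividing by 255 for smoothness and the third moment) follow a common textbook convention and may differ from the thesis:

        import numpy as np

        def texture_features(gray):
            """First-order statistical texture features from the
            normalized gray-level histogram of an 8-bit image."""
            hist, _ = np.histogram(gray, bins=256, range=(0, 256))
            p = hist / hist.sum()                   # gray-level probabilities
            levels = np.arange(256)
            mean = (levels * p).sum()
            var = ((levels - mean) ** 2 * p).sum()
            std = np.sqrt(var)
            smoothness = 1.0 - 1.0 / (1.0 + var / 255.0**2)  # relative contrast
            third_moment = (((levels - mean) / 255.0) ** 3 * p).sum()
            uniformity = (p ** 2).sum()
            entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
            return mean, std, smoothness, third_moment, uniformity, entropy

    In the thesis these statistics are computed on the DWT subbands or the EMD-derived components rather than on the raw image, and the resulting vectors feed the four classifiers.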

    A framework for ancient and machine-printed manuscripts categorization

    Document image understanding (DIU) has attracted a lot of attention and has become an active field of research. Although the ultimate goal of DIU is extracting the textual information of a document image, many steps are involved in such a process, such as categorization, segmentation, and layout analysis. All of these steps are needed to obtain an accurate result from character or word recognition of a document image. One of the important steps in DIU is document image categorization (DIC), which is needed in many situations, such as when document images are written or printed in more than one script, font, or language. This step provides useful information for the recognition system and helps reduce its error by allowing a category-specific optical character recognition (OCR) or word recognition (WR) system to be incorporated. This research focuses on the problem of DIC across different categories of scripts, styles, and languages, and establishes a framework for flexible representation and feature extraction that can be adapted to many DIC problems. The current methods for DIC have many limitations and drawbacks that restrict their practical usage. We propose an efficient framework for document image categorization based on patch representation and Non-negative Matrix Factorization (NMF). This framework is flexible and can be adapted to different categorization problems.

    Many methods exist for script identification of document images, but few address the problem in handwritten manuscripts, and those have many limitations and drawbacks. Our first goal is therefore to introduce a novel method for script identification of ancient manuscripts. The proposed method is based on a patch representation in which the patches are extracted using the skeleton map of a document image. This representation overcomes the fixed layout-level limitation of the current methods. The proposed feature extraction scheme, based on Projective Non-negative Matrix Factorization (PNMF), is robust against noise and handwriting variation and can be used for different scripts. The proposed method outperforms state-of-the-art methods and can be applied at different levels of layout.

    The current methods for font (style) identification are mostly designed for machine-printed document images, and many can only be used at a specific level of layout. We therefore propose a new method for font and style identification of printed and handwritten manuscripts based on patch representation and Non-negative Matrix Tri-Factorization (NMTF). The images are represented by overlapping patches obtained from the foreground pixels, positioned according to the skeleton map to reduce the number of patches. NMTF is used to learn bases for each font (style), and these bases are then used to classify a new image based on the minimum representation error. The method can easily be extended to new fonts, as the bases for each font are learned separately from the other fonts. It is tested on two datasets of machine-printed and ancient manuscripts, and the results confirm its performance compared to state-of-the-art methods.

    Finally, we propose a novel method for language identification of printed and handwritten manuscripts based on patch representation and NMTF. The current methods for language identification rely either on textual data obtained by an OCR engine or on image data encoded and compared against textual data. The OCR-based methods need a lot of processing, and the current image-based methods are not applicable to cursive scripts such as Arabic. In this work we introduce a new method for language identification of machine-printed and handwritten manuscripts based on patch representation and NMTF. The patch representation provides the components of the Arabic script (letters) that cannot be extracted simply by segmentation methods. NMTF is then used for dictionary learning and for generating codebooks that represent a document image as a histogram. The proposed method is tested on two datasets of machine-printed and handwritten manuscripts and compared to n-gram features (text-based), texture features, and codebook features (image-based) to validate its performance.

    The proposed methods are robust against variation in handwriting, changes in font (handwriting style), and the presence of degradation, and they are flexible enough to be used at various levels of layout (from a text line to a paragraph). The methods in this research have been tested on datasets of handwritten and machine-printed manuscripts and compared to state-of-the-art methods. All of the evaluations show the efficiency, robustness, and flexibility of the proposed methods for document image categorization. As mentioned before, the proposed strategies provide a framework for efficient and flexible representation and feature extraction for document image categorization. This framework can be applied at different levels of layout, information from different levels can be merged and mixed, and the framework can be extended to more complex situations and different tasks
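
    As a rough illustration of the patch-plus-factorization idea, the following sketch learns standard NMF bases from image patches and describes an image by a normalized activation histogram; it uses plain NMF from scikit-learn as a stand-in for the PNMF/NMTF variants the thesis develops, and all names and parameters are assumptions:

        import numpy as np
        from sklearn.decomposition import NMF
        from sklearn.feature_extraction.image import extract_patches_2d

        def patch_nmf_features(image, n_bases=32, patch=16, n_patches=500):
            """Learn nonnegative bases from grayscale image patches, then
            describe the image by a histogram of basis activations."""
            rng = np.random.RandomState(0)
            patches = extract_patches_2d(image, (patch, patch),
                                         max_patches=n_patches, random_state=rng)
            X = patches.reshape(len(patches), -1).astype(float)  # one patch per row
            model = NMF(n_components=n_bases, init='nndsvda', max_iter=300)
            acts = model.fit_transform(X)       # per-patch activations of bases
            codes = acts.argmax(axis=1)         # hard-assign each patch to a basis
            hist = np.bincount(codes, minlength=n_bases).astype(float)
            return hist / hist.sum()            # normalized histogram descriptor

    In the thesis, patches are placed on the skeleton map rather than sampled randomly, and the tri-factorization learns per-category bases so that a new image can be assigned by minimum representation error.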