8 research outputs found

    Detection of curved lines with B-COSFIRE filters: A case study on crack delineation

    The detection of curvilinear structures is an important step for various computer vision applications, ranging from medical image analysis for the segmentation of blood vessels, to remote sensing for the identification of roads and rivers, to biometrics and robotics, among others. This is a nontrivial task, especially for the detection of thin or incomplete curvilinear structures surrounded by noise. We propose a general-purpose curvilinear structure detector that uses the brain-inspired trainable B-COSFIRE filters. It consists of four main steps: nonlinear filtering with B-COSFIRE, thinning with non-maximum suppression, hysteresis thresholding, and morphological closing. We demonstrate its effectiveness on a data set of noisy images of cracked pavements, where we achieve state-of-the-art results (F-measure = 0.865). The proposed method can be employed in any computer vision methodology that requires the delineation of curvilinear and elongated structures.
    Comment: Accepted at Computer Analysis of Images and Patterns (CAIP) 201
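The two post-processing steps of the pipeline above, hysteresis thresholding and morphological closing, can be sketched in plain NumPy. This is an illustrative simplification, not the authors' implementation: the `hysteresis` and `closing` functions are hypothetical names, and the `np.roll` neighborhoods wrap around at the image borders, which a production version would handle explicitly.

```python
import numpy as np

def hysteresis(resp, lo, hi):
    """Keep weak pixels (>= lo) only if they connect to a strong pixel (>= hi)."""
    strong, weak = resp >= hi, resp >= lo
    out = strong.copy()
    while True:
        grown = out.copy()
        for dy in (-1, 0, 1):            # grow by one pixel, 8-connectivity
            for dx in (-1, 0, 1):
                grown |= np.roll(np.roll(out, dy, 0), dx, 1)
        grown &= weak                    # but never beyond the weak mask
        if (grown == out).all():         # fixed point reached
            return out
        out = grown

def closing(mask):
    """Morphological closing (dilation then erosion) with a 3x3 element,
    which bridges one-pixel gaps left along a detected crack."""
    def dilate(m):
        r = m.copy()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                r |= np.roll(np.roll(m, dy, 0), dx, 1)
        return r
    def erode(m):
        r = m.copy()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                r &= np.roll(np.roll(m, dy, 0), dx, 1)
        return r
    return erode(dilate(mask))
```

Hysteresis keeps a faint crack segment as long as it touches a confidently detected one, while isolated weak responses (noise) are discarded; closing then fills small gaps in the thinned crack map.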

    Detection of the Number of Branches in Trabecular Bone Using COSFIRE Filters for Osteoporosis Identification

    The jawbone is one of the bones affected by the decrease in bone mineral density caused by osteoporosis. Dental panoramic radiographs can therefore be used to identify osteoporosis. Several previous studies have shown that the number of branches in the trabecular bone structure differs between normal patients and patients with low bone mineral density. However, the low contrast and noise of panoramic radiographs make extracting the bone structure difficult, so a method is needed to enhance it. This study proposes a method for detecting branches in trabecular bone in which the bone structure is first enhanced with a line operator. Branch locations are then detected in the enhanced structure with the COSFIRE method, and the number of branches is used to distinguish radiographs of normal patients from those of osteoporosis patients. Classification was tested on 98 images, comprising 41 images of osteoporosis patients and 57 of normal patients, yielding a sensitivity, specificity, and accuracy of 0.90244, 0.23214, and 0.51546, respectively. These results indicate that the proposed method performs better than previous methods.
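The sensitivity, specificity, and accuracy figures reported above follow the standard confusion-matrix definitions, which can be stated compactly. The counts used below are hypothetical, chosen only to illustrate the formulas, not the study's actual confusion matrix:

```python
def metrics(tp, fn, tn, fp):
    """Standard binary-screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # true-positive rate: detected osteoporosis cases
    specificity = tn / (tn + fp)               # true-negative rate: correctly cleared normals
    accuracy = (tp + tn) / (tp + fn + tn + fp) # overall fraction correct
    return sensitivity, specificity, accuracy
```

Note that a high sensitivity with low specificity, as reported, means the classifier flags most osteoporosis cases but also flags many normal patients.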

    Automatic Assessment of Seed Germination Percentage

    This research investigated automatic assessment of seed germination rate for the top-of-paper germination method. Chili and guinea seeds were used in the experiment, with four repetitions and two sets per germination group (four separate plates with 50 seeds per plate, two sets per seed type, for a total of 400 chili seeds and 400 guinea seeds). Two detection methods based on color analysis were proposed: binary thresholding and maximum likelihood. Image data were collected in an uncontrolled environment, and the results were compared to a hand-labeled ground truth. Both methods achieved accuracy above 93%, which makes implementing this system promising. Binary thresholding is a lightweight method suitable for systems with very limited software resources. Maximum likelihood is more complex but has more potential: it is flexible with respect to lighting conditions and returns few false alarms (fewer than three per image). Maximum likelihood could be implemented in a suitable environment, which could still be a mobile device.
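The lightweight binary-thresholding approach described above can be sketched as a simple color test. This is a hedged illustration of the general idea, not the paper's actual thresholds: the function name `germinated` and the `green_margin`/`min_pixels` parameters are hypothetical, and real seedling detection would calibrate them to the imaging setup.

```python
import numpy as np

def germinated(rgb_patch, green_margin=30, min_pixels=20):
    """Binary-thresholding sketch: a seed patch counts as germinated when
    enough pixels are distinctly green (G exceeds both R and B by a margin),
    i.e. when a sprout is visible against seed and paper colors."""
    r = rgb_patch[..., 0].astype(int)
    g = rgb_patch[..., 1].astype(int)
    b = rgb_patch[..., 2].astype(int)
    green = (g - r > green_margin) & (g - b > green_margin)
    return int(green.sum()) >= min_pixels
```

A maximum-likelihood variant would instead fit per-class color distributions (sprout vs. background) and assign each pixel to the likelier class, which is what makes it more robust to lighting at the cost of complexity.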

    Convolutional Neural Networks Exploiting Attributes of Biological Neurons

    In this era of artificial intelligence, deep neural networks like Convolutional Neural Networks (CNNs) have emerged as front-runners, often surpassing human capabilities. These deep networks are often perceived as the panacea for all challenges. Unfortunately, a common downside of these networks is their ''black-box'' character, which does not necessarily mirror the operation of biological neural systems. Some even have millions or billions of learnable (tunable) parameters, and their training demands extensive data and time. Here, we integrate the principles of biological neurons into certain layers of CNNs. Specifically, we explore the use of neuroscience-inspired computational models of the Lateral Geniculate Nucleus (LGN) and simple cells of the primary visual cortex. By leveraging such models, we aim to extract image features to use as input to CNNs, hoping to enhance training efficiency and achieve better accuracy. We aspire to enable shallow networks with a Push-Pull Combination of Receptive Fields (PP-CORF) model of simple cells as the foundation layer of CNNs to enhance their learning process and performance. To achieve this, we propose a two-tower CNN, one tower shallow and the other a ResNet-18. Rather than extracting features blindly, it seeks to mimic how the brain perceives and extracts features. The proposed system exhibits a noticeable improvement in performance (on average 5%-10%) on the CIFAR-10, CIFAR-100, and ImageNet-100 datasets compared to ResNet-18. We also check the efficiency of only the Push-Pull tower of the network.
    Comment: 20 pages, 6 figures
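The push-pull principle behind the PP-CORF model can be illustrated with a minimal 1D analogue, assuming only the core idea from the simple-cell literature: an excitatory ("push") response to a stimulus of the preferred polarity is inhibited by a fraction of the response to the opposite-polarity stimulus, which suppresses responses to noise. This sketch is not the paper's PP-CORF layer; `push_pull_1d` and the inhibition factor `k` are hypothetical.

```python
import numpy as np

def push_pull_1d(signal, kernel, k=0.7):
    """Push-pull sketch: the half-wave-rectified 'push' response to a kernel
    minus k times the rectified 'pull' response to the inverted kernel."""
    push = np.maximum(np.convolve(signal, kernel, mode="same"), 0.0)
    pull = np.maximum(np.convolve(signal, -kernel, mode="same"), 0.0)
    return push - k * pull
```

In the paper's architecture, a 2D version of such responses would form the fixed foundation layer of the shallow tower, whose output is combined with the ResNet-18 tower.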

    To Perform Road Signs Recognition for Autonomous Vehicles Using Cascaded Deep Learning Pipeline

    An autonomous vehicle is a vehicle that can guide itself without human control. It is capable of sensing its environment and moving with little or no human input. This kind of vehicle has become a concrete reality and may pave the way for future systems where computers take over the art of driving. Advanced artificial intelligence control systems interpret sensory information to identify appropriate navigation paths, as well as obstacles and relevant road signs. In this paper, we introduce an intelligent road sign classifier to help autonomous vehicles recognize and understand road signs. The classifier is based on an artificial intelligence technique, specifically a deep learning model: Convolutional Neural Networks (CNNs). CNNs are widely used deep learning models for pattern recognition problems such as image classification and object detection, and they have been applied successfully to computer vision because they process images in a way that resembles human visual decision making. The proposed pipeline was trained and tested on two different datasets. The proposed CNNs achieved high performance in road sign classification, with a validation accuracy of 99.8% and a testing accuracy of 99.6%. The proposed method can be easily implemented for real-time applications.

    A study on Image Caption using Double Embedding Technique and Bi-RNN

    ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋ฌธ์žฅ ํ‘œํ˜„๋ ฅ์„ ํ–ฅ์ƒ์‹œํ‚ค๊ณ  ์ด๋ฏธ์ง€ ํŠน์ง• ๋ฒกํ„ฐ์˜ ์†Œ๋ฉธ์„ ๋ฐฉ์ง€ํ•  ์ˆ˜ ์žˆ๋Š” ์ด์ค‘ Embedding ๊ธฐ๋ฒ•๊ณผ ๋ฌธ๋งฅ์— ๋งž๋Š” ๋ฌธ์žฅ ์ˆœ์„œ๋ฅผ ์ƒ์„ฑํ•˜๋Š” Bidirectional Recurrent Neural Network(Bi-RNN)์„ ์ ์šฉํ•œ ๋””ํ…Œ์ผํ•œ ์ด๋ฏธ์ง€ ์บก์…˜ ๋ชจ๋ธ์„ ์ œ์•ˆํ•œ๋‹ค. ์ด์ค‘ Embedding ๊ธฐ๋ฒ•์—์„œ, Word Embedding ๊ณผ์ •์ธ Embeddingโ… ์€ ์บก์…˜์˜ ํ‘œํ˜„๋ ฅ์„ ํ–ฅ์ƒ์‹œํ‚ค๊ธฐ ์œ„ํ•ด ๋ฐ์ดํ„ฐ์„ธํŠธ์˜ ์บก์…˜ ๋‹จ์–ด๋ฅผ One-hot encoding ๋ฐฉ์‹์„ ํ†ตํ•ด ๋ฒกํ„ฐํ™”ํ•˜๊ณ  Embeddingโ…ก๋Š” ์บก์…˜ ์ƒ์„ฑ ๊ณผ์ •์—์„œ ๋ฐœ์ƒํ•˜๋Š” ์ด๋ฏธ์ง€ ํŠน์ง•์˜ ์†Œ๋ฉธ์„ ๋ฐฉ์ง€ํ•˜๊ธฐ ์œ„ํ•ด ์ด๋ฏธ์ง€ ํŠน์ง• ๋ฒกํ„ฐ์™€ ๋‹จ์–ด ๋ฒกํ„ฐ๋ฅผ ์œตํ•ฉํ•จ์œผ๋กœ์จ ๋ฌธ์žฅ ๊ตฌ์„ฑ ์š”์†Œ์˜ ๋ˆ„๋ฝ์„ ๋ฐฉ์ง€ํ•œ๋‹ค. ๋˜ํ•œ ๋””์ฝ”๋” ์˜์—ญ์€ ์–ดํœ˜ ๋ฐ ์ด๋ฏธ์ง€ ํŠน์ง•์„ ์–‘๋ฐฉํ–ฅ์œผ๋กœ ํš๋“ํ•˜๋Š” Bi-RNN์œผ๋กœ ๊ตฌ์„ฑํ•˜์—ฌ ๋ฌธ๋งฅ์— ๋งž๋Š” ๋ฌธ์žฅ์˜ ์ˆœ์„œ๋ฅผ ํ•™์Šตํ•œ๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ ์ธ์ฝ”๋”์™€ ๋””์ฝ”๋”๋ฅผ ํ†ตํ•˜์—ฌ ํš๋“๋œ ์ „์ฒด ์ด๋ฏธ์ง€, ๋ฌธ์žฅ ํ‘œํ˜„, ๋ฌธ์žฅ ์ˆœ์„œ ํŠน์ง•๋“ค์„ ํ•˜๋‚˜์˜ ๋ฒกํ„ฐ๊ณต๊ฐ„์ธ Multimodal ๋ ˆ์ด์–ด์— ์œตํ•ฉํ•จ์œผ๋กœ์จ ๋ฌธ์žฅ์˜ ์ˆœ์„œ์™€ ํ‘œํ˜„๋ ฅ์„ ๋ชจ๋‘ ๊ณ ๋ คํ•œ ๋””ํ…Œ์ผํ•œ ์บก์…˜์„ ์ƒ์„ฑํ•œ๋‹ค. ์ œ์•ˆํ•˜๋Š” ๋ชจ๋ธ์€ Flickr 8K ๋ฐ Flickr 30K, MSCOCO์™€ ๊ฐ™์€ ์ด๋ฏธ์ง€ ์บก์…˜ ๋ฐ์ดํ„ฐ์„ธํŠธ๋ฅผ ์ด์šฉํ•˜์—ฌ ํ•™์Šต ๋ฐ ํ‰๊ฐ€๋ฅผ ์ง„ํ–‰ํ•˜์˜€์œผ๋ฉฐ ๊ฐ๊ด€์ ์ธ BLEU์™€ METEOR ์ ์ˆ˜๋ฅผ ํ†ตํ•ด ๋ชจ๋ธ ์„ฑ๋Šฅ์˜ ์šฐ์ˆ˜์„ฑ์„ ์ž…์ฆํ•˜์˜€๋‹ค. ๊ทธ ๊ฒฐ๊ณผ, ์ œ์•ˆํ•œ ๋ชจ๋ธ์€ 3๊ฐœ์˜ ๋‹ค๋ฅธ ์บก์…˜ ๋ชจ๋ธ๋“ค์— ๋น„ํ•ด BLEU ์ ์ˆ˜๋Š” ์ตœ๋Œ€ 20.2์ , METEOR ์ ์ˆ˜๋Š” ์ตœ๋Œ€ 3.65์ ์ด ํ–ฅ์ƒ๋˜์—ˆ๋‹ค.|This thesis proposes a detailed image caption model that applies the double embedding technique to improve sentence expressiveness and to prevent vanishing of image feature vectors. It uses the bidirectional recurrent neural network (Bi-RNN) to generate a sequence of sentences and fit their contexts. 
In the double-embedding technique, embedding โ…  is a word-embedding process used to vectorize dataset captions through one-hot encoding to improve the expressiveness of the captions. Embedding โ…ก prevents missed sentence components by fusing image features and word vectors to prevent image features from vanishing during caption generation. The decoder area, composed of a Bi-RNN that acquires vocabulary and image features in both directions, learns the sequence of sentences that fits their contexts. Finally, through the encoder and decoder, the detailed image caption is generated by considering both sequence and sentence expressiveness by fusing the acquired image features, sentence presentation features, and sentence sequence features into a multimodal layer as a vector space. The proposed model was learned and evaluated using image caption datasets (e.g., Flickr 8K, Flickr 30K, and MSCOCO). The proven BLEU and METEOR scores demonstrate the superiority of the model. The proposed model achieved a BLEU score maximum of 20.2 points and a METEOR score maximum of 3.65 points, which is higher than the scores of other three caption models.๋ชฉ ์ฐจ ๋ชฉ ์ฐจ โ…ฐ ๊ทธ๋ฆผ ๋ฐ ํ‘œ ๋ชฉ์ฐจ โ…ฑ Abstract โ…ณ ์ œ 1 ์žฅ ์„œ ๋ก  01 ์ œ 2 ์žฅ ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ ๋ฐ ํ‰๊ฐ€์ง€ํ‘œ 04 2.1 Convolutional Neural Network 04 2.2 Recurrent Neural Network 08 2.3 Long Short-Term Memory 10 2.4 Gated Recurrent Unit 13 2.5 Bidirectional Recurrent Neural Network 15 2.6 Bi-Lingual Evaluation Understudy 17 2.7 Metric for Evaluation of Translation with Explicit ORdering 20 ์ œ 3 ์žฅ ์ œ์•ˆํ•œ ์ด๋ฏธ์ง€ ์บก์…˜ ๋ชจ๋ธ 23 3.1 ์ด์ค‘ Embedding ๊ธฐ๋ฒ•๊ณผ Bi-RNN์„ ์ด์šฉํ•œ ์บก์…˜ ๊ตฌ์„ฑ ๊ณผ์ • 25 3.2 Multimodal ๋ ˆ์ด์–ด๋ฅผ ์ด์šฉํ•œ ์บก์…˜ ์ƒ์„ฑ ๊ณผ์ • 27 ์ œ 4 ์žฅ ์‹คํ—˜ ๋ฐ ๊ฒฐ๊ณผ 29 4.1 ๋ฐ์ดํ„ฐ์„ธํŠธ ๋ฐ ์ „์ฒ˜๋ฆฌ ๊ณผ์ • 29 4.2 ์‹คํ—˜ ๊ฒฐ๊ณผ ๋ถ„์„ 31 ์ œ 5 ์žฅ ๊ฒฐ ๋ก  41 ์ฐธ ๊ณ  ๋ฌธ ํ—Œ 42Maste
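The BLEU metric used for evaluation is built on modified n-gram precision. The core of its unigram case (BLEU-1) can be sketched with the standard library; this is a minimal illustration assuming a single reference, without the brevity penalty or higher-order n-grams of full BLEU, and the function name `bleu1` is hypothetical:

```python
from collections import Counter

def bleu1(candidate, reference):
    """Modified unigram precision, the core of BLEU-1: each candidate word
    is credited at most as often as it appears in the reference, so that
    repeating a common word cannot inflate the score."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    clipped = sum(min(n, ref[w]) for w, n in cand.items())
    return clipped / max(sum(cand.values()), 1)
```

METEOR differs by also aligning stems and synonyms and by penalizing fragmented word order, which is why the two metrics are usually reported together.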

    Color-blob-based COSFIRE filters for object recognition

    Most object recognition methods rely on contour-defined features obtained by edge detection or region segmentation. They are not robust to diffuse region boundaries, and they do not exploit region color information. We propose color-blob-based COSFIRE (Combination of Shifted Filter Responses) filters that are selective for combinations of diffuse circular regions (blobs) in specific mutual spatial arrangements. Such a filter combines the responses of a certain selection of Difference-of-Gaussians filters, essentially blob detectors, of different scales, in certain channels of a color space, and at certain positions relative to each other. Its parameters are learned in an automatic configuration process that analyzes the properties of a given prototype object of interest. We use these filters to compute features that are effective for the recognition of the prototype objects, and we form feature vectors that we use with an SVM classifier. We evaluate the proposed method on a traffic sign (GTSRB) data set and a butterfly data set. For the GTSRB data set we achieve a recognition rate of 98.94%, which is slightly higher than human performance, and for the butterfly data set we achieve 89.02%. The proposed color-blob-based COSFIRE filters are very effective and outperform the contour-based COSFIRE filters. A COSFIRE filter is trainable: it can be configured with a single prototype pattern and does not require domain knowledge. (C) 2016 Elsevier B.V. All rights reserved.
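The combination-of-shifted-responses idea described above can be sketched in NumPy: detect blobs with Difference-of-Gaussians filters per channel, shift each response by its configured offset toward the filter center, and combine them with an AND-like operation. This is a hedged toy version, not the paper's method: `cosfire_like`, the fixed 1.6 scale ratio, the geometric-mean combination, and the wrap-around `np.roll` shifts are simplifying assumptions.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing via 1D 'same' convolutions per axis."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda row: np.convolve(row, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda col: np.convolve(col, k, mode="same"), 0, out)

def dog_blob(img, sigma):
    """Difference-of-Gaussians: responds to bright blobs of scale ~sigma."""
    return gaussian_blur(img, sigma) - gaussian_blur(img, 1.6 * sigma)

def cosfire_like(channels, parts):
    """AND-like combination of shifted blob responses. Each part is a tuple
    (channel_name, sigma, dy, dx): a blob of scale sigma expected at offset
    (dy, dx) from the filter center. Each response is shifted back to the
    center and the responses are combined by geometric mean, so the filter
    fires only where ALL configured blobs are present."""
    shifted = []
    for name, sigma, dy, dx in parts:
        r = np.maximum(dog_blob(channels[name], sigma), 0.0)
        shifted.append(np.roll(np.roll(r, -dy, axis=0), -dx, axis=1))
    return np.prod(np.stack(shifted), axis=0) ** (1.0 / len(parts))
```

The automatic configuration step in the paper corresponds to reading the `parts` list off a single prototype image (which blobs, at which scales and offsets), which is what makes the filter trainable from one example.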