24 research outputs found

    Application for White Spot Syndrome Virus (WSSV) Monitoring using Edge Machine Learning

    Full text link
    The aquaculture industry, strongly reliant on shrimp exports, faces challenges due to viral infections like the White Spot Syndrome Virus (WSSV) that severely impact output yields. In this context, computer vision can play a significant role in identifying features not immediately evident to skilled or untrained eyes, potentially reducing the time required to report WSSV infections. In this study, the challenge of limited data for WSSV recognition was addressed. A mobile application dedicated to data collection and monitoring was developed to facilitate the creation of an image dataset to train a WSSV recognition model and improve country-wide disease surveillance. The study also includes a thorough analysis of WSSV recognition to address the challenges of imbalanced learning and on-device inference. The models explored, MobileNetV3-Small and EfficientNetV2-B0, achieved F1-scores of 0.72 and 0.99, respectively. The saliency heatmaps of both models were also examined to probe the "black-box" nature of these models and to gain insight into which features in the images are most important in making a prediction. These results highlight the effectiveness and limitations of using models designed for resource-constrained devices and of balancing their performance in accurately recognizing WSSV, providing valuable information and direction for the use of computer vision in this domain. Comment: 6 pages, 7 figures, conference
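
    The abstract gives no training details beyond the model names; the sketch below shows, under stated assumptions, how MobileNetV3-Small might be fine-tuned with class weighting to counter imbalanced learning and then converted for on-device inference. The two-class setup, input size, and class-weight ratio are illustrative assumptions, not details from the paper.

    ```python
    # Hedged sketch: fine-tuning MobileNetV3-Small with class weighting,
    # then converting to TFLite for on-device (edge) inference.
    import tensorflow as tf

    NUM_CLASSES = 2  # assumption: WSSV-positive vs. healthy shrimp

    base = tf.keras.applications.MobileNetV3Small(
        input_shape=(224, 224, 3), include_top=False,
        weights="imagenet", pooling="avg")
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
    model = tf.keras.Model(base.input, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])

    # Class weights make the rarer class contribute more to the loss,
    # a common remedy for imbalance (the ratio here is illustrative).
    # model.fit(train_ds, epochs=10, class_weight={0: 1.0, 1: 5.0})

    # Convert the trained model for on-device inference.
    tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()
    ```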

    Baybayin Character Instance Detection

    Full text link
    The Philippine Government recently passed the "National Writing System Act," which promotes the use of Baybayin in Philippine texts. In support of this effort to promote the use of Baybayin, we present a computer vision system that can aid individuals who cannot easily read Baybayin script. In this paper, we survey the existing methods for identifying Baybayin script using computer vision and machine learning techniques and discuss their capabilities and limitations. Further, we propose a Baybayin Optical Character Instance Segmentation and Classification model using state-of-the-art Convolutional Neural Networks (CNNs) that detects Baybayin character instances in an image and then outputs the Latin-alphabet counterpart of each character instance. Most existing systems are limited to character-level image classification and often misclassify, or do not natively support, characters with diacritics. In addition, these existing models often have specific input requirements that limit them to classifying Baybayin text in controlled settings, such as limitations in clarity and contrast, among others. To our knowledge, our proposed method is the first end-to-end character instance detection model for Baybayin, achieving a mAP50 score of 93.30%, a mAP50-95 score of 80.50%, and an F1-score of 84.84%.
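
    The abstract does not name the detection backbone, though the mAP50/mAP50-95 reporting style is typical of YOLO-family evaluation; below is a minimal sketch of such a pipeline. The weights file, image path, and syllable map are placeholders, not the authors' artifacts.

    ```python
    # Hypothetical sketch of character-instance detection with a YOLO-family
    # model, mapping each detected Baybayin instance to a Latin counterpart.
    from ultralytics import YOLO

    LATIN_MAP = {0: "a", 1: "ba", 2: "ka"}   # illustrative subset of syllables

    model = YOLO("baybayin.pt")              # placeholder trained weights
    results = model("page.jpg")              # detect instances in a document image

    for box in results[0].boxes:
        cls = int(box.cls)
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        print(f"{LATIN_MAP.get(cls, '?')} at ({x1:.0f}, {y1:.0f})-({x2:.0f}, {y2:.0f})")
    ```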

    Modeling human attention by learning from large amount of emotional images

    No full text
    The recent resurgence of neural networks in computer vision has resulted in tremendous improvements in saliency prediction, eventually saturating some saliency metrics. This leads researchers to devise higher-level concepts in images in order to match the key image regions attended to by human observers. In this paper, we propose a saliency model which utilizes the top-down attention mechanism by incorporating emotion-inducing region information into the predictor's feature space. The proposed framework is inspired by psychological and neurological studies showing that emotion attracts attention. Using three publicly available datasets with emotion-rich images, we show that awareness of the emotion-inducing region improves saliency prediction. Saliency metrics for probabilistic models, particularly information gain and KL-divergence, improved with respect to the same architecture without emotion information. Statistical tests show that emotional regions generally exhibit greater improvement than neutral regions, corroborating psychological studies that emotion attracts attention. © 2019 IEEE
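
    For concreteness, here is a minimal sketch of the two probabilistic metrics the abstract cites, KL-divergence and information gain, computed over saliency maps treated as probability distributions; the paper's exact evaluation protocol may differ.

    ```python
    # Minimal sketch of probabilistic saliency metrics.
    import numpy as np

    def to_prob(sal, eps=1e-12):
        """Normalize a saliency map into a probability distribution."""
        sal = sal.astype(np.float64) + eps
        return sal / sal.sum()

    def kl_divergence(gt, pred):
        """KL(gt || pred) between ground-truth and predicted saliency maps."""
        p, q = to_prob(gt), to_prob(pred)
        return np.sum(p * np.log(p / q))

    def information_gain(pred, baseline, fixations):
        """Mean log-likelihood gain (in bits) of the model over a baseline
        at human fixation locations; `fixations` is a boolean mask."""
        p, b = to_prob(pred), to_prob(baseline)
        return np.mean(np.log2(p[fixations]) - np.log2(b[fixations]))

    # Tiny usage example with random stand-in maps.
    rng = np.random.default_rng(0)
    pred, gt = rng.random((32, 32)), rng.random((32, 32))
    fix = rng.random((32, 32)) > 0.95
    print(kl_divergence(gt, pred), information_gain(pred, np.ones((32, 32)), fix))
    ```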

    Investigating biological feature detectors in simple pattern recognition towards complex saliency prediction tasks

    No full text
    The conventional convolution filter in deep architectures has proven its capability to extract semantic information from input images and to use it in different visual tasks. For many researchers in computer vision, this raises the question: have pattern recognition models begun to converge on human performance? This thesis explores a new biologically inspired feature detector for pattern recognition which learns via competition. We describe and exhaustively characterize the proposed alternative feature detector and compare it with the traditional convolution filter. Our experiments show the potential of the proposed feature detector and that its performance is on par with that of the convolution filter. Using the feature detector with the more desirable results, we then design and propose a computational model for one of the primitive pattern recognition tasks of the visual system: saliency map generation. The study provides a methodology for quantifying the contribution of the convolution filter in simple pattern recognition tasks and uses this to benchmark our proposed competition-based feature detectors. Towards an improved computational model for a complex prediction task of visual systems, we further use the biological feature detectors to extract and incorporate emotion-evoking objects in saliency prediction.
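
    The thesis defines its detector precisely; as a stand-in, the sketch below illustrates the general idea of learning via competition with a winner-take-all update over image patches. The patch size, number of detectors, and learning rate are assumptions, not the thesis's parameters.

    ```python
    # Illustrative competition-based feature detector: prototype vectors
    # compete for each patch and only the winner moves toward it
    # (a Hebbian-style rule of the kind that underlies SOM-like learning).
    import numpy as np

    rng = np.random.default_rng(0)
    prototypes = rng.normal(size=(8, 9))   # 8 detectors over 3x3 patches

    def train_step(patch, lr=0.05):
        """Winner-take-all update: only the closest prototype learns."""
        d = np.linalg.norm(prototypes - patch, axis=1)
        w = np.argmin(d)                   # competition selects one winner
        prototypes[w] += lr * (patch - prototypes[w])
        return w

    patch = rng.normal(size=9)             # stand-in for a flattened 3x3 patch
    winner = train_step(patch)
    ```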

    Performance comparison of ASR classifiers for the development of an English CAPT system for Filipino students

    No full text
    Computer Assisted Pronunciation Training (CAPT) systems aim to provide immediate, individualized feedback to the user on the overall quality of the pronunciation made. In such systems, one must be able to extract features from a waveform and represent the words in the vocabulary. This paper presents the performance of Hidden Markov Model (HMM), Support Vector Machine (SVM), and Multilayer Perceptron (MLP) classifiers as automatic speech recognizers for English digits spoken by Filipino speakers. Speech waveforms are translated into sets of feature vectors using Mel-Frequency Cepstral Coefficients (MFCCs). The training set consists of speech samples recorded by native Filipinos who speak English. The HMM-trained model produced a recognition rate of 95.79%, compared to the 86.33% and 91.66% recognition rates of the SVM and MLP, respectively. © 2012 IEEE
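
    A minimal sketch of this kind of pipeline, assuming librosa for MFCC extraction and scikit-learn stand-ins for the SVM and MLP (the HMM recognizer would need a dedicated library such as hmmlearn); the frame averaging and hyperparameters are illustrative, not the paper's.

    ```python
    # Hedged sketch: MFCC features feeding SVM and MLP digit classifiers.
    import librosa
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.neural_network import MLPClassifier

    def mfcc_features(path, n_mfcc=13):
        """Average MFCC frames into one fixed-length vector per utterance."""
        y, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        return mfcc.mean(axis=1)

    # X: feature vectors of recorded digit utterances, y: digit labels 0-9
    # X = np.stack([mfcc_features(p) for p in wav_paths]); y = labels
    # SVC(kernel="rbf").fit(X, y)
    # MLPClassifier(hidden_layer_sizes=(64,)).fit(X, y)
    ```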

    Fast emulation of self-organizing maps for large datasets

    No full text
    The self-organizing map (SOM) methodology performs vector quantization and clustering on a dataset and then projects the obtained clusters to a lower-dimensional space, such as a 2D map, by positioning similar clusters in locations that are spatially closer in the lower-dimensional space. This makes the SOM methodology an effective tool for data visualization. However, in a world where information mined from big data has to be available immediately, SOM becomes an unattractive tool because of its time complexity. In this paper, we propose an alternative visualization methodology for large datasets that emulates the SOM methodology without the speed constraints inherent to SOM. To demonstrate the efficiency and the potential of the proposed scheme as a fast visualization tool, the methodology is used to cluster and project the 3,823 image samples of handwritten digits from the Optical Recognition of Handwritten Digits dataset. Although the dataset is not, by any means, large, it is sufficient to demonstrate the speed-up that can be achieved using the proposed SOM emulation procedure. © 2015 The Authors. Published by Elsevier B.V.
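
    The emulation scheme itself is not detailed in the abstract; the sketch below instead shows the baseline being emulated, a standard SOM projecting the handwritten-digits data onto a 2D grid. MiniSom and scikit-learn's smaller 1,797-sample copy of the same dataset are our stand-ins.

    ```python
    # Baseline SOM projection of the handwritten-digits data onto a 2D grid.
    import numpy as np
    from minisom import MiniSom
    from sklearn.datasets import load_digits

    digits = load_digits()                  # 1,797 samples; the paper uses the
    X = digits.data / 16.0                  # 3,823-sample optdigits train split

    som = MiniSom(20, 20, X.shape[1], sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(X, num_iteration=5000)

    # Each sample lands on its best-matching unit, giving the 2D positions a
    # visualization would scatter-plot, colored by digit label.
    positions = np.array([som.winner(x) for x in X])
    ```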

    A new method for emulating self-organizing maps for visualization of datasets

    No full text
    Several time-critical problems relying on large amounts of data, e.g., business trends, disaster response, and disease outbreaks, require cost-effective, timely, and accurate data summarization and visualization in order to arrive at efficient and effective decisions. The self-organizing map (SOM) is a very effective data clustering and visualization tool, as it provides an intuitive display of data in a lower-dimensional space. However, with O(N²) complexity, SOM becomes inappropriate for large datasets. In this paper, we propose a force-directed visualization method that emulates SOM's capability to display data clusters with O(N) complexity. The main idea is to perform force-directed fine-tuning of the 2D representation of the data. To demonstrate the efficiency and the vast potential of the proposed method as a fast visualization tool, the methodology is used to produce a 2D projection of the MNIST handwritten digits dataset. © 2018 World Scientific Publishing Europe Ltd
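
    The abstract does not spell out how linear complexity is achieved; below is a toy sketch of force-directed fine-tuning in which each point interacts with a fixed-size random sample per pass, keeping the cost linear in N. The affinity function and update rule are assumptions, not the paper's method.

    ```python
    # Toy force-directed fine-tuning of a 2D layout with linear per-pass cost.
    import numpy as np

    rng = np.random.default_rng(0)

    def refine(X, pos, iters=50, k=10, lr=0.1):
        """X: high-dim data (N, D); pos: initial 2D layout (N, 2)."""
        n = len(X)
        for _ in range(iters):
            for i in range(n):
                js = rng.choice(n, size=k, replace=False)  # fixed-size sample
                for j in js:
                    sim = np.exp(-np.linalg.norm(X[i] - X[j]))  # data affinity
                    delta = pos[j] - pos[i]
                    dist = np.linalg.norm(delta) + 1e-9
                    # attract similar points, repel dissimilar ones
                    pos[i] += lr * (sim - 0.5) * delta / dist
        return pos

    X = rng.normal(size=(100, 64))          # stand-in high-dimensional data
    layout = refine(X, rng.normal(size=(100, 2)))
    ```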

    A computer assisted diagnosis system for the identification/auscultation of pulmonary pathologies

    No full text
    Statistics show that pulmonary illnesses are among the primary causes of morbidity and mortality among Filipinos. These illnesses could have been prevented if detected and treated early. With a physician's medical knowledge and experience, early detection of common pulmonary diseases can be performed using a stethoscope. However, given the current physician-to-population ratio in the country, early detection of respiratory diseases may not be performed in most cases, especially in rural areas, causing even benign cases to lead to mortality. In this paper, we present the development of a system that classifies lung sounds for possible pulmonary pathology. Using an electronic stethoscope, lung sounds were collected from healthy individuals and patients with common pulmonary problems for the developed system's training and evaluation. The collected data were pre-processed to remove mechanical and other external noises. Using a Support Vector Machine (SVM) for modelling and classification, the developed system achieved 100% identification of normal lung sounds versus adventitious lung sounds, with an average cross-validation performance of 88%. The developed system, however, has low performance in classifying specific lung sounds, that is, normal vs. crackle vs. wheeze vs. rhonchi, with an average accuracy of 61.42% and an average cross-validation performance of 90%.
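
    A hedged sketch of such a two-stage pipeline: band-pass filtering to suppress mechanical and ambient noise, then an SVM on simple spectral features. The cut-off frequencies, feature choice, and sampling rate are illustrative assumptions, not details from the paper.

    ```python
    # Hedged sketch: noise-suppressing pre-processing plus SVM classification.
    import numpy as np
    from scipy.signal import butter, sosfiltfilt
    from sklearn.svm import SVC

    def preprocess(signal, fs=8000, lo=100.0, hi=2000.0):
        """Zero-phase band-pass around the typical lung-sound band."""
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        return sosfiltfilt(sos, signal)

    def spectral_features(signal, n_bands=32):
        """Log band energies as a fixed-length feature vector."""
        spectrum = np.abs(np.fft.rfft(signal))
        bands = np.array_split(spectrum, n_bands)
        return np.log1p([b.sum() for b in bands])

    # clf = SVC(kernel="rbf").fit(X_train, y_train)  # normal vs. adventitious
    ```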

    Measuring the contribution of filter bank layer to performance of convolutional neural networks

    No full text
    Object identification is essential in diverse automated applications such as in health, business, and national security. It relies on the ability of the image processing scheme to detect visual features under a wide variety of conditions, such as object rotation, translation, and geometric transformation. Machine learning methods, in this case, play an important role in improving object identification performance by resolving whether the extracted visual patterns come from the possibly distorted target object or not. In recent works, systems that employ a Convolutional Neural Network (CNN) as the primary pattern recognition scheme demonstrate superior performance over other object identification systems based on handpicked shape-based features. Several studies credit this to the invariance of CNNs to small distortions and spatial translations, which in turn is attributed to the filter bank layer, or convolution layer. However, no study has carefully tested this claim. Towards studying the source of the CNN's superior performance, a methodology is designed that tracks CNN performance as spatial information for visual features (e.g., edges, corners, and end points) is gradually removed. Using the MNIST dataset, results show that as the spatial correlation information among pixels is slowly decreased, the performance of the CNN in recognizing handwritten digits correspondingly decreases. The drop in accuracy continues until the accuracy approximates the performance of a classifier trained without the filter bank. A similar set of experiments conducted on a more complex dataset, consisting of images of land vehicles, shows the same drop in classification performance as spatial information among pixels is slowly removed. © 2017 - IOS Press and the authors. All rights reserved
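
    One plausible reading of "gradually removing spatial information" is to permute a growing fraction of pixel positions, identically across all images, and re-evaluate the trained CNN at each level; the sketch below implements that reading, which is our assumption rather than the paper's stated procedure.

    ```python
    # Probing sketch: progressively destroy spatial correlation among pixels.
    import numpy as np

    rng = np.random.default_rng(0)

    def shuffle_pixels(images, fraction):
        """Permute `fraction` of pixel positions, using the same permutation
        for every image so the perturbation is a fixed input relabeling."""
        n_pix = images.shape[1] * images.shape[2]
        flat = images.reshape(len(images), n_pix)
        k = int(fraction * n_pix)
        idx = rng.choice(n_pix, size=k, replace=False)
        perm = rng.permutation(idx)
        out = flat.copy()
        out[:, idx] = flat[:, perm]
        return out.reshape(images.shape)

    # Track accuracy of a trained CNN as spatial structure is removed:
    # for f in np.linspace(0.0, 1.0, 11):
    #     acc = model.evaluate(shuffle_pixels(x_test, f), y_test)
    ```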

    Self-organizing maps as feature detectors for supervised neural network pattern recognition

    No full text
    Convolutional neural network (CNN)-based works show that learned features, rather than handpicked features, produce more desirable performance in pattern recognition. This learning approach is modeled on the visual systems of higher organisms, which develop according to the input environment. However, the feature detectors of a CNN are trained using an error-correcting teacher, as opposed to the natural competition that builds node connections. As such, a neural network model using a self-organizing map (SOM) as its feature detector is proposed in this work. As proof of concept, the handwritten digits dataset is used to test the performance of the proposed architecture. The size of the feature detector as well as different arrangements of receptive fields are considered to benchmark the performance of the proposed network. The proposed architecture achieved performance comparable to a vanilla MLP, reaching 96.93% using a 4×4 SOM and six receptive field regions. © Springer International Publishing AG 2016
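
    A minimal sketch of the architecture's shape: a small SOM learns patch prototypes from receptive-field regions, and each region's distances to the prototypes become the features for an ordinary MLP. The 8×8 scikit-learn digits tile neatly into four quadrants rather than the paper's six regions, so the region layout here is an assumption.

    ```python
    # SOM-as-feature-detector sketch: SOM prototypes encode receptive-field
    # patches; the encodings feed a supervised MLP classifier.
    import numpy as np
    from minisom import MiniSom
    from sklearn.datasets import load_digits
    from sklearn.neural_network import MLPClassifier

    digits = load_digits()
    X = digits.images / 16.0                      # (N, 8, 8) images

    def regions(img):
        """Four 4x4 receptive-field regions (quadrants) per image."""
        return [img[r:r+4, c:c+4].ravel() for r in (0, 4) for c in (0, 4)]

    patches = np.array([p for img in X for p in regions(img)])
    som = MiniSom(4, 4, 16, sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(patches, num_iteration=5000)

    W = som.get_weights().reshape(16, 16)         # 16 prototypes, 16-dim each

    def encode(img):
        """Feature = each region's distances to all 16 SOM prototypes."""
        return np.concatenate([np.linalg.norm(W - p, axis=1) for p in regions(img)])

    F = np.array([encode(img) for img in X])
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300).fit(F, digits.target)
    ```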