23 research outputs found

    Recurrent Neural Network-Based Multimodal Deep Learning for Estimating Missing Values in Healthcare

    This estimation method operates by integrating input values that are redundantly collected from heterogeneous devices, selecting a representative value, and estimating missing values with a multimodal RNN. Users access a heterogeneous healthcare platform mainly in a mobile environment. Users who pay relatively close attention to their health own various types of healthcare devices and collect data through their mobile devices. The collected data may be duplicated depending on the device types, and this duplication causes an ambiguity issue: it is difficult to determine which of the multiple values should be taken as the user's actual value. Accordingly, a neural network structure is needed that considers the data value at the time step preceding the current one, and RNNs are appropriate for handling data with such time-series characteristics. Training an RNN requires training data with a common time step; therefore, a single-modal RNN was designed for each variable. Each RNN cell is a gated recurrent unit (GRU), which provides sufficient accuracy in the small-resource environment of mobile devices. The per-variable RNNs can each operate without additional training even if the situation of the user's mobile device changes. In a heterogeneous environment, missing values are generated by various types of errors, including errors caused by battery charge and discharge, sensor failure, equipment exchange, and near-field communication errors; the more such errors occur, the higher the missing-value ratio. For this reason, missing values must be handled to achieve a more stable heterogeneous health platform. In this study, missing values were estimated by means of multimodal deep learning: a multimodal network was designed by connecting the trained single-modal RNNs through a fully connected network (FCN). Each RNN input value exerts mutual influence through the weights of the FCN, which makes it possible to estimate an output value even if any one of the input values is missing. In the evaluation of representative-value selection, using the mean or median yielded the most stable service. In the evaluation of estimation methods, the accuracy of the RNN-based multimodal deep learning method was 3.91 percentage points higher than that of the SVD method.
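
    The following is a minimal PyTorch sketch of the architecture described above: one single-modal GRU per health variable, fused through a fully connected network. The variable count, dimensions, and zero-filling of missing inputs are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (PyTorch) of the described architecture: one single-modal
# GRU per health variable, fused through a fully connected network (FCN).
# Dimensions, the number of variables, and the zero-fill masking scheme
# are illustrative assumptions.
import torch
import torch.nn as nn

class MultimodalImputer(nn.Module):
    def __init__(self, n_variables=3, hidden_size=16):
        super().__init__()
        # One single-modal GRU per variable (e.g., heart rate, steps, weight),
        # each trainable independently as described in the abstract.
        self.encoders = nn.ModuleList(
            [nn.GRU(input_size=1, hidden_size=hidden_size, batch_first=True)
             for _ in range(n_variables)]
        )
        # FCN through whose weights the variables exert mutual influence.
        self.fcn = nn.Sequential(
            nn.Linear(n_variables * hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, n_variables),  # one estimate per variable
        )

    def forward(self, series):
        # series: (batch, seq_len, n_variables); missing entries pre-filled with 0.
        states = []
        for i, gru in enumerate(self.encoders):
            _, h = gru(series[:, :, i:i + 1])  # final hidden state per variable
            states.append(h.squeeze(0))
        fused = torch.cat(states, dim=1)
        return self.fcn(fused)                 # estimated current values

model = MultimodalImputer()
batch = torch.randn(8, 24, 3)                  # 8 users, 24 time steps, 3 variables
print(model(batch).shape)                      # torch.Size([8, 3])
```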

    ViT-Based Multi-Scale Classification Using Digital Signal Processing and Image Transformation

    Classifying time-series data presents difficulties that traditional methodologies struggle to address, such as complexity and dynamic variation. Difficulty with pattern recognition and long-term dependency modeling, high dimensionality and complex interactions between variables, and the incompleteness caused by irregular intervals, missing values, and noise are the main causes of degraded model performance. New classification methodologies are therefore needed to process time-series data effectively and enable real-world applications. Accordingly, this study proposes ViT-based multi-scale classification using digital signal processing and image transformation. It comprises feature extraction through digital signal processing (DSP), image transformation, and vision transformer (ViT)-based classification. In the DSP stage, a total of five features are extracted through sampling, quantization, and the discrete Fourier transform (DFT): the sampling time, the sampled signal, the quantized signal, and the magnitudes and phases extracted through DFT processing. The extracted multi-scale features are then used to generate new images. Finally, a ViT model performs multi-class classification on the generated images. This study confirms the superiority of the proposed approach by comparing traditional models with ViT and convolutional neural network (CNN) models. In particular, by showing excellent classification performance even on the most challenging classes, it demonstrates effective data processing in terms of data diversity. Ultimately, this study suggests a methodology for the analysis and classification of time-series data and shows its potential to be applied to a wide range of data analysis problems.
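
    As a rough illustration of the DSP stage, the NumPy sketch below extracts the five named features (sampling time, sampled signal, quantized signal, DFT magnitudes, and DFT phases) and stacks them into a 2-D array that could be rendered as an image; the quantization level count and the stacking layout are assumptions.

```python
# Hedged NumPy sketch of the DSP stage: sampling, quantization, and DFT
# magnitude/phase extraction, with the five features stacked row-wise into
# an image-like 2-D array. Sampling rate, quantization levels, and layout
# are assumptions chosen for illustration.
import numpy as np

def dsp_features(signal, n_levels=16):
    t = np.arange(len(signal))                    # sampling time indices
    sampled = np.asarray(signal, dtype=float)     # sampled signal
    lo, hi = sampled.min(), sampled.max()
    quantized = np.round((sampled - lo) / (hi - lo + 1e-9) * (n_levels - 1))
    spectrum = np.fft.fft(sampled)                # discrete Fourier transform
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Five multi-scale features, stacked into an image-like 2-D array.
    return np.stack([t, sampled, quantized, magnitude, phase])

signal = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 128)) + 0.1 * np.random.randn(128)
image = dsp_features(signal)
print(image.shape)   # (5, 128): one row per extracted feature
```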

    A Frequency Pattern Mining Model Based on Deep Neural Network for Real-Time Classification of Heart Conditions

    Recently, massive amounts of biometric big data have been collected by sensor-based IoT devices, and the collected data are classified into different types of health big data using various techniques. A personalized analysis technique is the basis for judging the risk factors of personal cardiovascular disorders in real time. The objective of this paper is to provide a model for personalized heart-condition classification that combines a fast, effective preprocessing technique with a deep neural network in order to process biosensor input data accumulated in real time. The model can learn the input data, develop an approximation function, and help users recognize risk situations. For the analysis of the pulse frequency, a fast Fourier transform is applied in the preprocessing stage. Data reduction is then performed using the frequency-by-frequency ratio data of the extracted power spectrum. To analyze the meaning of the preprocessed data, a neural network algorithm is applied; in particular, a deep neural network is used to analyze and evaluate the linear data. A deep neural network can stack multiple layers and establish an operation model of nodes using gradient descent. The completed model was trained by classifying previously collected ECG signals into normal, control, and noise groups; thereafter, ECG signals input in real time through the trained system were classified into normal, control, and noise. To evaluate the performance of the proposed model, this study used the data-operation cost-reduction ratio and the F-measure. With the use of the fast Fourier transform and cumulative frequency percentages, the size of the ECG data was reduced at a ratio of 1:32. According to the F-measure analysis, the deep neural network achieved 83.83% accuracy. Given these results, the modified deep neural network technique can reduce the size of big data in terms of computing work, and it is an effective system for reducing operation time.
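
    A hedged NumPy sketch of the preprocessing idea follows: FFT an ECG window, convert the power spectrum to frequency-by-frequency ratios, and keep only a leading fraction of the bins. The 1:32 reduction target comes from the abstract, but the cutoff scheme shown is an assumption.

```python
# Illustrative NumPy sketch: FFT the ECG window, convert the power spectrum
# to per-frequency ratio data, and keep only the leading bins so the input
# shrinks at roughly a 1:32 ratio. The cutoff scheme is an assumption.
import numpy as np

def reduce_ecg(window, keep_ratio=1 / 32):
    spectrum = np.fft.rfft(window)
    power = np.abs(spectrum) ** 2
    ratios = power / power.sum()              # frequency-by-frequency ratios
    n_keep = max(1, int(len(window) * keep_ratio))
    return ratios[:n_keep]                    # reduced representation for the DNN

ecg_window = np.random.randn(1024)            # stand-in for a real ECG segment
features = reduce_ecg(ecg_window)
print(len(ecg_window), "->", len(features))   # 1024 -> 32
```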

    Detection of Emotion Using Multi-Block Deep Learning in a Self-Management Interview App

    Recently, Korean universities have constructed and operated online mock-interview systems to help students prepare for employment. Through such a system, students can have a mock interview anywhere and at any time, and can correct problems observed during the interview via images stored in real time. For such practice, it is necessary to analyze the student's emotional state in context and to provide coaching through an accurate analysis of the interview. In this paper, we propose the detection of user emotions using multi-block deep learning in a self-management interview application. Unlike the basic structure that learns from whole-face images, the multi-block deep learning method learns after sampling the core facial areas (eyes, nose, mouth, etc.), which are the important factors for emotion analysis, from the detected face. In the multi-block process, sampling is carried out using multiple AdaBoost learning, and similarity measurement is performed to screen and verify the optimal block images. The performance evaluation compares the proposed system with AlexNet, which has mainly been used for facial recognition in the past, in terms of the recognition rate and the extraction time for the specific areas. The extraction time decreased by 2.61% and the recognition rate increased by 3.75%, indicating that the proposed facial recognition method is superior. The proposed deep learning method is expected to provide good-quality, customized interview education for job seekers through a systematic interview system.
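
    A rough sketch of the multi-block sampling step is shown below, using OpenCV's stock AdaBoost-trained Haar cascades to crop the core facial areas. The actual block sizes and the similarity-based screening described above are omitted here, and the specific cascade choice is an assumption.

```python
# Rough sketch of multi-block sampling with OpenCV's AdaBoost-based Haar
# cascades (stock models shipped with opencv-python). Block sizes and the
# similarity-based screening step from the paper are omitted/assumed.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def sample_blocks(image_path):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    blocks = []
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        face = gray[y:y + h, x:x + w]
        blocks.append(face)                                  # whole-face block
        for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(face):
            blocks.append(face[ey:ey + eh, ex:ex + ew])      # eye blocks
    return blocks  # candidate core-area blocks for the downstream network

# Usage: blocks = sample_blocks("interview_frame.jpg")  # hypothetical file
```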

    RGB Channel Combinations Method for Feature Extraction in Image Analysis

    Recent deep learning algorithms for image analysis use diverse methods to extract features from images based on the convolutional neural network (CNN). A CNN's convolution layers treat RGB as three overlapping channels in the feature-extraction process, an architecture that lets information flow through the backbone network without losing the hue information of each channel. Processing this three-channel 3-D input therefore requires a large-scale neural network with many layer blocks. This processing method exhibits high accuracy, but in practical terms it results in significant inefficiencies such as memory overhead and computational overhead. This study proposes the RGB channel combinations method for feature extraction in image analysis to resolve these inefficiencies. The method combines the RGB values into one tensor structure using per-channel weights and a bias, making it possible to pass through the backbone network without damaging hue information. The experimental results confirm that accuracy decreased by 1.42% compared with the pre-existing method, but the number of parameters used by the input layer decreased. The pre-processing used in the proposed method incurs additional computational overhead, but the number of input parameters in the subsequent feature-extraction stage decreased to 1/3. As the proposed method applies to all image analysis algorithms, it is highly extensible and can process a large amount of image data.
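
    A minimal sketch of the channel-combination idea: collapse the three RGB channels into a single-channel tensor with per-channel weights and a bias. The weights shown are illustrative constants; in the proposed method they would be learned.

```python
# Minimal sketch: combine the three RGB channels into one tensor using
# per-channel weights and a bias, so the backbone receives a single input
# channel (1/3 of the input parameters). Weights here are illustrative.
import numpy as np

def combine_rgb(image, weights=(0.4, 0.35, 0.25), bias=0.0):
    # image: (H, W, 3) array; returns a (H, W) single-channel tensor.
    r, g, b = image[..., 0], image[..., 1], image[..., 2]
    return weights[0] * r + weights[1] * g + weights[2] * b + bias

rgb = np.random.rand(224, 224, 3)
combined = combine_rgb(rgb)
print(rgb.shape, "->", combined.shape)   # (224, 224, 3) -> (224, 224)
```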

    Health Risk Detection and Classification Model Using Multi-Model-Based Image Channel Expansion and Visual Pattern Standardization

    Although mammography is an effective screening method for the early detection of breast cancer, it is difficult even for experts to use, since it requires a high level of sensitivity and expertise. Computer-aided detection systems were introduced to improve detection accuracy for mammography, which is difficult to read, and research on finding lesions in mammography images using artificial intelligence has recently been active. However, the images generally used for breast cancer diagnosis are high resolution and thus require high-spec equipment and a significant amount of time and money to learn, recognize, and process. This can lower the accuracy of the diagnosis, since it depends on the performance of the equipment. To solve this problem, this paper proposes a health risk detection and classification model using multi-model-based image channel expansion and visual pattern standardization. The proposed method expands the channels of breast ultrasound images and detects tumors quickly and accurately through a YOLO model. To reduce the amount of computation and enable rapid diagnosis of the detected tumors, the model reduces the dimensionality of the data by normalizing the visual information and uses the result as input to an RNN model that diagnoses breast cancer. When the channels were expanded through the proposed brightness smoothing and visual pattern standardization, the accuracy was highest at 94.9%. Based on the generated images, the study evaluated breast cancer diagnosis performance: the accuracy of the proposed model was 97.3%, versus 95.2% for CRNN, 93.6% for VGG, 62.9% for AlexNet, and 75.3% for GoogleNet, confirming that the proposed model performed best.
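
    The channel-expansion step might look like the hedged OpenCV sketch below: a single-channel ultrasound image is expanded to three channels by stacking the original with a histogram-equalized (brightness-smoothed) and a blurred version. Which transforms the authors actually stack is not specified, so all three are assumptions chosen for illustration.

```python
# Hedged sketch of channel expansion for a single-channel ultrasound image.
# The three stacked transforms (original, equalized, blurred) are
# assumptions, not the authors' exact pipeline.
import cv2
import numpy as np

def expand_channels(gray):
    equalized = cv2.equalizeHist(gray)              # brightness smoothing
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)     # pattern-shaping stand-in
    return np.dstack([gray, equalized, blurred])    # (H, W, 3) for YOLO input

ultrasound = np.random.randint(0, 256, (416, 416), dtype=np.uint8)
expanded = expand_channels(ultrasound)
print(expanded.shape)   # (416, 416, 3)
```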

    Activity Recommendation Model Using Rank Correlation for Chronic Stress Management

    Korean people are exposed to stress due to the constant competitive structure caused by rapid industrialization; effective ways to manage stress and improve quality of life are therefore needed. This study proposes an activity recommendation model using rank correlation for chronic stress management. Using Spearman's rank correlation coefficient, the proposed model finds the correlations between users' Positive Activity for Stress Management (PASM), Negative Activity for Stress Management (NASM), and Perceived Stress Scale (PSS). Spearman's rank correlation coefficient improves the accuracy of recommendations by assigning a basic rank value to each missing value, which mitigates the sparsity and cold-start problems. For the performance evaluation, the F-measure is computed from the average precision and recall over five rounds of recommendations for 20 users. The proposed method performs better than other models because it recommends activities using the correlation between PASM and NASM. The proposed activity recommendation model makes it possible to manage a user's stress effectively by lowering the user's PSS using this correlation.
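
    An illustrative sketch of the recommendation core, assuming made-up data: missing activity ratings receive a basic baseline value before Spearman's rank correlation is computed, which is how the sparsity and cold-start problems are mitigated. The baseline choice (the median) is an assumption.

```python
# Illustrative sketch: fill missing activity ratings with a basic value,
# then compute Spearman's rank correlation between users. Data and the
# median baseline are made up for the example.
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

# Rows: users; columns: stress-management activities (PASM/NASM items).
ratings = pd.DataFrame(
    {"walking": [5, np.nan, 3], "meditation": [4, 2, np.nan], "gaming": [1, 5, 2]},
    index=["user_a", "user_b", "user_c"],
)
baseline = ratings.stack().median()   # basic value standing in for missing entries
filled = ratings.fillna(baseline)

rho, p = spearmanr(filled.loc["user_a"], filled.loc["user_c"])
print(f"rank correlation user_a vs user_c: {rho:.2f} (p={p:.2f})")
```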

    Line-segment Feature Analysis Algorithm Using Input Dimensionality Reduction for Handwritten Text Recognition

    Recently, demand for handwriting recognition, in applications such as automated mail sorting, license plate recognition, and electronic memo pads, has increased exponentially across various industrial fields. In the image recognition field, methods using convolutional neural networks, which show outstanding performance, have also been applied to handwriting recognition. However, owing to the diversity of application fields, the number of dimensions in the training and inference processes keeps increasing. Principal component analysis (PCA) is commonly used for dimensionality reduction, but PCA tends to incur accuracy loss through data compression. Therefore, in this paper we propose a line-segment feature analysis (LFA) algorithm for input dimensionality reduction in handwritten text recognition. The proposed algorithm extracts the line-segment information that constitutes the input image and assigns a unique value to each segment using 3 × 3 and 5 × 5 filters. Using the unique values to identify the number of line segments and summing them produces a 1-D vector of size 512, which is used as the input for machine learning. For the performance evaluation, the Extended Modified National Institute of Standards and Technology (EMNIST) database was used. In the evaluation, PCA achieved 96.6% and 93.86% accuracy with k-nearest neighbors (KNN) and a support vector machine (SVM), respectively, while LFA achieved 97.5% and 98.9% accuracy with KNN and SVM, respectively.
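
    A loose sketch of the line-segment counting idea, under stated assumptions: the image is binarized, convolved with small oriented line kernels, and the per-orientation counts are packed into a fixed-length 512-D vector. The exact 3 × 3 and 5 × 5 filters and the unique-value assignment are not specified in the abstract, so the kernels here are assumptions.

```python
# Loose sketch of LFA: count oriented line segments via small convolution
# kernels and pack the counts into a 512-D vector. The kernels and the
# packing layout are assumptions; the paper's unique-value scheme differs.
import numpy as np
from scipy.signal import convolve2d

KERNELS = {
    "horizontal": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
    "vertical": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "diag_down": np.eye(3, dtype=int),
    "diag_up": np.fliplr(np.eye(3, dtype=int)),
}

def lfa_vector(image, size=512):
    binary = (image > image.mean()).astype(int)
    counts = []
    for kernel in KERNELS.values():
        response = convolve2d(binary, kernel, mode="same")
        counts.append(int((response == kernel.sum()).sum()))  # full-line hits
    vec = np.zeros(size)
    vec[:len(counts)] = counts   # pack counts into the fixed 512-D input vector
    return vec

digit = np.random.rand(28, 28)   # stand-in for an EMNIST image
print(lfa_vector(digit)[:4])
```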

    Driver Facial Expression Analysis Using LFA-CRNN-Based Feature Extraction for Health-Risk Decisions

    As people communicate with each other, they use gestures and facial expressions to convey and understand emotional states; non-verbal means of communication are essential for reading a person's emotional state from external cues. Recently, active studies have been conducted on lifecare services that analyze users' facial expressions, yet such services are currently provided only in health care centers or certain medical institutions rather than in everyday life. Studies are needed to prevent accidents that occur suddenly in everyday life and to cope with emergencies. Thus, we propose facial expression analysis using line-segment feature analysis-convolutional recurrent neural network (LFA-CRNN) feature extraction for health-risk assessments of drivers. The purpose of the analysis is to manage and monitor patients with chronic diseases, whose numbers are increasing rapidly. To prevent automobile accidents and respond to emergencies caused by acute diseases, we propose a service that monitors a driver's facial expressions to assess health risks and alerts the driver to risk-related matters while driving. To identify health risks, deep learning technology is used to recognize expressions of pain and to determine whether a person is in pain while driving. Because the volume of input-image data is large, analyzing facial expressions accurately in real time is difficult for a process with limited resources. Accordingly, a line-segment feature analysis algorithm is proposed to reduce the amount of data, and the LFA-CRNN model was designed for this purpose. Through this model, the severity of a driver's pain is classified into one of nine levels. The LFA-CRNN model consists of one convolution layer whose output is reshaped and delivered to two bidirectional gated recurrent unit (GRU) layers; finally, the data are classified through a softmax layer. To evaluate LFA-CRNN, its performance was compared with the CRNN and AlexNet models on the University of Northern British Columbia and McMaster University (UNBC-McMaster) database.
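
    A hedged PyTorch sketch of the LFA-CRNN as described: one convolution layer, a reshape, two bidirectional GRU layers, and a softmax over nine pain-severity classes. The channel counts, the 512-D LFA input, and the use of the last time step are assumptions.

```python
# Hedged PyTorch sketch of the described LFA-CRNN: one convolution layer,
# a reshape, two bidirectional GRU layers, softmax over nine classes.
# Channel counts and the 512-D LFA input layout are assumptions.
import torch
import torch.nn as nn

class LFACRNN(nn.Module):
    def __init__(self, n_classes=9, hidden=64):
        super().__init__()
        self.conv = nn.Conv1d(1, 32, kernel_size=3, padding=1)   # one conv layer
        self.gru1 = nn.GRU(32, hidden, batch_first=True, bidirectional=True)
        self.gru2 = nn.GRU(2 * hidden, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        # x: (batch, 512) vector from the line-segment feature analysis step.
        h = torch.relu(self.conv(x.unsqueeze(1)))   # (batch, 32, 512)
        h = h.permute(0, 2, 1)                      # reshape for the GRU layers
        h, _ = self.gru1(h)
        h, _ = self.gru2(h)
        logits = self.out(h[:, -1])                 # last time step
        return torch.softmax(logits, dim=1)         # nine pain-severity classes

model = LFACRNN()
scores = model(torch.randn(4, 512))
print(scores.shape)   # torch.Size([4, 9])
```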