A hybrid camera- and ultrasound-based approach for needle localization and tracking using a 3D motorized curvilinear ultrasound probe
Three-dimensional (3D) motorized curvilinear ultrasound probes provide an effective, low-cost tool to guide needle interventions, but localizing and tracking the needle in 3D ultrasound volumes is often challenging. In this study, a new method is introduced to localize and track the needle using 3D motorized curvilinear ultrasound probes. In particular, a low-cost camera mounted on the probe is employed to estimate the needle axis. The camera-estimated axis is used to identify a volume of interest (VOI) in the ultrasound volume that enables high needle visibility. This VOI is analyzed using local phase analysis and the random sample consensus algorithm to refine the camera-estimated needle axis. The needle tip is determined by searching the localized needle axis using a probabilistic approach. Dynamic needle tracking in a sequence of 3D ultrasound volumes is enabled by iteratively applying a Kalman filter to estimate the VOI that includes the needle in the successive ultrasound volume and limiting the localization analysis to this VOI. A series of ex vivo animal experiments was conducted to evaluate the accuracy of needle localization and tracking. The results show that the proposed method can localize the needle in individual ultrasound volumes with maximum errors of 0.7 mm for the needle axis, 1.7° for the needle angle, and 1.2 mm for the needle tip. Moreover, the proposed method can track the needle in a sequence of ultrasound volumes with maximum errors of 1.0 mm for the needle axis, 2.0° for the needle angle, and 1.7 mm for the needle tip. These results suggest the feasibility of applying the proposed method to localize and track the needle using 3D motorized curvilinear ultrasound probes.
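A minimal sketch of the Kalman-filter step described above, assuming a constant-velocity motion model for the needle tip; the state layout, noise covariances, and example measurement are illustrative assumptions, not values from the paper:

```python
import numpy as np

dt = 1.0                                        # time between volumes (assumed)
F = np.block([[np.eye(3), dt * np.eye(3)],
              [np.zeros((3, 3)), np.eye(3)]])   # constant-velocity transition
H = np.hstack([np.eye(3), np.zeros((3, 3))])    # only the 3D tip position is observed
Q = 1e-3 * np.eye(6)                            # process noise (assumed)
R = 1e-2 * np.eye(3)                            # measurement noise (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle; z is the tip position localized in the
    current volume (mm), and the prediction centers the next VOI."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(6) - K @ H) @ P_pred
    return x_new, P_new

x, P = np.zeros(6), np.eye(6)                   # initial state and covariance
x, P = kalman_step(x, P, z=np.array([10.0, 5.0, 30.0]))
voi_center = H @ (F @ x)                        # predicted tip centers the next VOI
```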
A Fusion-Based Approach for Breast Ultrasound Image Classification Using Multiple-ROI Texture and Morphological Analyses
Ultrasound imaging is commonly used for breast cancer diagnosis, but accurate interpretation of breast ultrasound (BUS) images is often challenging and operator-dependent. Computer-aided diagnosis (CAD) systems can be employed to provide radiologists with a second opinion to improve diagnosis accuracy. In this study, a new CAD system is developed to enable accurate BUS image classification. In particular, an improved texture analysis is introduced, in which the tumor is divided into a set of nonoverlapping regions of interest (ROIs). Each ROI is analyzed using gray-level co-occurrence matrix features and a support vector machine classifier to estimate its tumor class indicator. The tumor class indicators of all ROIs are combined using a voting mechanism to estimate the tumor class. In addition, morphological analysis is employed to classify the tumor. A probabilistic approach is used to fuse the classification results of the multiple-ROI texture analysis and the morphological analysis. The proposed approach is applied to classify 110 BUS images that include 64 benign and 46 malignant tumors. The accuracy, specificity, and sensitivity obtained using the proposed approach are 98.2%, 98.4%, and 97.8%, respectively. These results demonstrate that the proposed approach can effectively differentiate benign and malignant tumors.
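A minimal sketch of the multiple-ROI texture voting, using scikit-image GLCM features and a scikit-learn SVM; the feature subset and the simple majority vote are stand-ins for the paper's pipeline, and the probabilistic fusion with the morphological analysis is omitted:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def glcm_features(roi):
    """GLCM features for one ROI (roi must be a uint8 grayscale patch)."""
    glcm = graycomatrix(roi, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

def classify_tumor(rois, clf):
    """Majority vote over per-ROI class indicators (0 = benign, 1 = malignant)."""
    votes = [int(clf.predict(glcm_features(r).reshape(1, -1))[0]) for r in rois]
    return int(np.mean(votes) >= 0.5)

# clf = SVC().fit(train_features, train_labels)  # trained on labeled ROIs beforehand
```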
On human emotion and activity analysis
This thesis investigates methodologies for recognizing human emotions, single-human daily life activities, and human-human interactions by utilizing different types of non-verbal human behavioral signals, such as facial expressions, body postures, actions, and interactions, captured as video input signals. Two recognition schemes have been investigated and developed for recognizing human activity and emotion from an input video. In the first recognition scheme, we propose to decouple the spatial context from the temporal dynamics of human body and facial expressions instead of dealing with both temporal and spatial modalities simultaneously. To achieve this decoupling, we have developed two techniques for automatically identifying temporal phases of human actions and emotions in input videos, such that each temporal phase is completely characterized by its spatial context. Furthermore, we have developed a two-layered framework for recognizing human emotional and action states based on analyzing the temporal dynamics of human emotions/actions using the decoupled spatial context. In the first layer, the decoupled spatial context is utilized to convert the input video into a string of labels, where each label represents the class of the temporal phase to which its corresponding frame belongs. Then, in the second layer, we perform a temporal analysis to classify the sequence of labels generated by the first layer into one of several emotional/action states. In our second approach, we propose a video classification platform based on a Nonlinear AutoRegressive with eXogenous input (NARX) model with a recurrent neural network realization, which we call the recurrent NARX network model. The proposed recurrent NARX network model is utilized for recognizing human emotions and actions from input videos. This approach formulates the video classification problem as a parametric temporal sequence regression problem and solves it in a temporal-spatial fashion. Computer simulations and experiments using publicly available databases were conducted to evaluate the performance of both recognition schemes. Experimental results showed that, using the decoupling recognition scheme, the average recognition rates for human activities and emotions were 98.89% and 93.53%, respectively. These results outperformed the average recognition rates obtained when the recurrent NARX network model was used as a recognition engine by approximately 4%. Unlike human activities that involve a single human, recognizing human-human interactions is more challenging and requires taking into consideration the semantic meanings and the inter-relations of the moving body parts of each human. Hence, for this purpose, we have developed a view-invariant geometric representation that utilizes the 3D joint poses of human body parts to capture the semantic meaning of different spatiotemporal configurations of two interacting persons using RGB-D input data from a Kinect sensor. The proposed representation utilizes the concept of anatomical planes to construct motion and pose profiles for each interacting person, and these two profiles are then concatenated to form a geometric descriptor for two interacting humans. Using the proposed geometric representation, we have developed frameworks to perform human-human interaction analysis from two perspectives: human-human interaction classification and prediction from an input video.
The performance of the proposed human-human interaction classification and prediction frameworks was evaluated using a human-human interaction dataset collected in our lab, which consists of 27,500 frames of 12 individuals performing 12 different interactions. Using the proposed geometric descriptor, the human-human interaction classification framework achieved an average recognition rate of 94.86%, while the human-human interaction prediction framework achieved an average prediction accuracy of 82.46% at a progress level of 50%.
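A minimal sketch of the two-layered recognition scheme described above: layer 1 labels each frame with its temporal phase, and layer 2 classifies the resulting label string. The frame features, the phase classifier, and the histogram summary of the label string are illustrative stand-ins; the thesis's second-layer temporal analysis may use a richer sequence model:

```python
import numpy as np

def video_to_label_string(frame_features, phase_clf):
    """Layer 1: label each frame with the temporal phase of its spatial context
    (labels are assumed to be non-negative integers)."""
    return np.array([phase_clf.predict(f.reshape(1, -1))[0]
                     for f in frame_features])

def label_histogram(labels, n_phases):
    """One simple length-independent summary of the label string."""
    return np.bincount(labels, minlength=n_phases) / len(labels)

def classify_video(frame_features, phase_clf, seq_clf, n_phases):
    """Layer 2: classify the label-string summary into an emotion/action state."""
    labels = video_to_label_string(frame_features, phase_clf)
    return seq_clf.predict(label_histogram(labels, n_phases).reshape(1, -1))[0]
```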
Fall Detection for Elderly from Partially Observed Depth-Map Video Sequences Based on View-Invariant Human Activity Representation
This paper presents a new approach for fall detection from partially observed depth-map video sequences. The proposed approach utilizes the 3D skeletal joint positions obtained from the Microsoft Kinect sensor to build a view-invariant descriptor for human activity representation, called the motion-pose geometric descriptor (MPGD). Furthermore, we have developed a histogram-based representation (HBR) based on the MPGD to construct a length-independent representation of the observed video subsequences. Using the constructed HBR, we formulate the fall detection problem as a posterior-maximization problem in which the posterior probability for each observed video subsequence is estimated using a multi-class SVM (support vector machine) classifier. Then, we combine the computed posterior probabilities from all of the observed subsequences to obtain an overall class posterior probability for the entire partially observed depth-map video sequence. To evaluate the performance of the proposed approach, we utilized the Kinect sensor to record a dataset of depth-map video sequences that simulates four fall-related activities of elderly people: walking, sitting, falling from standing, and falling from sitting. Then, using the collected dataset, we developed three evaluation scenarios based on the number of unobserved video subsequences in the testing videos: a fully observed video sequence, a single unobserved video subsequence of random length, and two unobserved video subsequences of random lengths. Experimental results show that the proposed approach achieved average recognition accuracies of 93.6%, 77.6%, and 65.1% in recognizing the activities under the first, second, and third evaluation scenarios, respectively. These results demonstrate the feasibility of the proposed approach for detecting falls from partially observed videos.
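A minimal sketch of the posterior-combination step: per-subsequence class posteriors from a probabilistic multi-class SVM are fused into one decision for the whole sequence. The product rule in log space is one common fusion choice and is an assumption here, as is the interface of the classifier:

```python
import numpy as np

def fuse_subsequence_posteriors(hbr_vectors, clf):
    """hbr_vectors: one HBR feature vector per observed subsequence;
    clf: a multi-class SVM trained with probability estimates enabled,
    e.g. sklearn.svm.SVC(probability=True)."""
    log_post = np.zeros(len(clf.classes_))
    for h in hbr_vectors:
        p = clf.predict_proba(h.reshape(1, -1))[0]
        log_post += np.log(p + 1e-12)        # product rule in log space
    return clf.classes_[np.argmax(log_post)]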
EEG-Based Emotion Recognition Using Quadratic Time-Frequency Distribution
Accurate recognition and understanding of human emotions is an essential skill that can improve the collaboration between humans and machines. In this vein, electroencephalogram (EEG)-based emotion recognition is considered an active research field with challenging issues regarding the analysis of nonstationary EEG signals and the extraction of salient features that can be used to achieve accurate emotion recognition. In this paper, an EEG-based emotion recognition approach with a novel time-frequency feature extraction technique is presented. In particular, a quadratic time-frequency distribution (QTFD) is employed to construct a high-resolution time-frequency representation of the EEG signals and capture their spectral variations over time. To reduce the dimensionality of the constructed QTFD-based representation, a set of 13 time- and frequency-domain features is extended to the joint time-frequency domain and employed to quantify the QTFD-based time-frequency representation of the EEG signals. Moreover, to describe different emotion classes, we have utilized the 2D arousal-valence plane to develop four emotion labeling schemes for the EEG signals, such that each emotion labeling scheme defines a set of emotion classes. The extracted time-frequency features are used to construct a set of subject-specific support vector machine classifiers that classify the EEG signals of each subject into the emotion classes defined by each of the four emotion labeling schemes. The performance of the proposed approach is evaluated using a publicly available EEG dataset, namely the DEAP dataset. Moreover, we design three performance evaluation analyses, namely the channel-based analysis, the feature-based analysis, and the neutral class exclusion analysis, to quantify the effects of utilizing different groups of EEG channels that cover various regions of the brain, reducing the dimensionality of the extracted time-frequency features, and excluding the EEG signals that correspond to the neutral class on the capability of the proposed approach to discriminate between different emotion classes. The results reported in the current study demonstrate the efficacy of the proposed QTFD-based approach in recognizing different emotion classes. In particular, the average classification accuracies obtained in differentiating between the emotion classes defined by each of the four labeling schemes are within the range of 73.8%–86.2%. Moreover, the emotion classification accuracies achieved by our proposed approach are higher than the results reported in several existing state-of-the-art EEG-based emotion recognition studies.
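A minimal sketch of extracting joint time-frequency features from one EEG channel. A SciPy spectrogram stands in here for the paper's quadratic TFD, and the five statistics shown are an illustrative subset of the 13 features extended to the joint time-frequency domain:

```python
import numpy as np
from scipy.signal import spectrogram
from scipy.stats import skew, kurtosis

def tf_features(eeg_channel, fs=128):
    """A few joint time-frequency statistics of one EEG channel."""
    _, _, S = spectrogram(eeg_channel, fs=fs, nperseg=fs)  # TF energy map
    v = S.ravel()
    p = v / (v.sum() + 1e-12)                              # normalized TF energies
    entropy = -(p * np.log(p + 1e-12)).sum()               # TF entropy
    return np.array([v.mean(), v.var(), skew(v), kurtosis(v), entropy])
```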
Accurate Needle Localization Using Two-Dimensional Power Doppler and B-Mode Ultrasound Image Analyses: A Feasibility Study
Curvilinear ultrasound transducers are commonly used in various needle insertion interventions, but localizing the needle in curvilinear ultrasound images is usually challenging. In this paper, a new method is proposed to localize the needle in curvilinear ultrasound images by exciting the needle using a piezoelectric buzzer and imaging the excited needle with a curvilinear ultrasound transducer to acquire a power Doppler image and a B-mode image. The needle-induced Doppler responses that appear in the power Doppler image are analyzed to obtain an initial estimate of the needle axis and to identify the candidate regions that are expected to include the needle. The candidate needle regions in the B-mode image are then analyzed to improve the localization of the needle axis. The needle tip is determined by analyzing the intensity variations of the power Doppler and B-mode images around the needle axis. The proposed method is employed to localize different needles inserted in three ex vivo animal tissue types at various insertion angles, and the results demonstrate the capability of the method to achieve automatic, reliable, and accurate needle localization. Furthermore, the proposed method outperformed two existing needle localization methods.
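A minimal sketch of the initial axis estimate from the power Doppler image: threshold the needle-induced Doppler responses and fit a line through them. The relative threshold and the least-squares fit are simplifications of the paper's analysis, and the parameterization breaks down for near-vertical needles:

```python
import numpy as np

def estimate_needle_axis(doppler_img, rel_thresh=0.5):
    """Fit a line y = a*x + b through strong Doppler responses; candidate
    needle regions are then taken around this line."""
    ys, xs = np.nonzero(doppler_img > rel_thresh * doppler_img.max())
    a, b = np.polyfit(xs, ys, deg=1)
    return a, b
```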
A Game-Based Rehabilitation System for Upper-Limb Cerebral Palsy: A Feasibility Study
Game-based rehabilitation systems provide an effective tool to engage cerebral palsy patients in physical exercises within an exciting and entertaining environment. A crucial factor in ensuring the effectiveness of game-based rehabilitation systems is assessing the correctness of the movements performed by the patient during the game-playing sessions. In this study, we propose a game-based rehabilitation system for upper-limb cerebral palsy that includes three game-based exercises and a computerized assessment method. The game-based exercises aim to engage the participant in shoulder flexion, shoulder horizontal abduction/adduction, and shoulder adduction physical exercises that target the right arm. Human interaction with the game-based rehabilitation system is achieved using a Kinect sensor that tracks the skeleton joints of the participant. The computerized assessment method aims to assess the correctness of the right arm movements during each game-playing session by analyzing the tracking data acquired by the Kinect sensor. To evaluate the performance of the computerized assessment method, two groups of participants volunteered to take part in the game-based exercises: the first group included six children with cerebral palsy, and the second group included twenty typically developing subjects. For every participant, the computerized assessment method was employed to assess the correctness of the right arm movements in each game-playing session, and these computer-based assessments were compared with matching gold-standard evaluations provided by an experienced physiotherapist. The results reported in this study suggest the feasibility of employing the computerized assessment method to evaluate the correctness of the right arm movements during the game-playing sessions.
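A minimal sketch of one possible movement-correctness check on the Kinect tracking data: compute the angle between the upper arm and the downward trunk direction from 3D skeleton joints and compare it against the range the flexion exercise targets. The joint choice and the 150-degree threshold are assumptions, not parameters from the study:

```python
import numpy as np

def angle_deg(u, v):
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def shoulder_flexion_ok(shoulder, elbow, hip, target_deg=150.0):
    """True if the upper arm is raised at least `target_deg` away from the
    downward trunk direction during the flexion exercise."""
    upper_arm = elbow - shoulder
    trunk_down = hip - shoulder
    return angle_deg(upper_arm, trunk_down) >= target_deg

# Example with made-up 3D joint positions (meters, Kinect camera space):
ok = shoulder_flexion_ok(np.array([0.2, 0.5, 2.0]),
                         np.array([0.2, 0.8, 2.0]),
                         np.array([0.2, 0.0, 2.0]))
```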
EEG-Based Brain-Computer Interface for Decoding Motor Imagery Tasks within the Same Hand Using Choi-Williams Time-Frequency Distribution
This paper presents an EEG-based brain-computer interface system for classifying eleven motor imagery (MI) tasks within the same hand. The proposed system utilizes the Choi-Williams time-frequency distribution (CWD) to construct a time-frequency representation (TFR) of the EEG signals. The constructed TFR is used to extract five categories of time-frequency features (TFFs). The TFFs are processed using a hierarchical classification model to identify the MI task encapsulated within the EEG signals. To evaluate the performance of the proposed approach, EEG data were recorded for eighteen intact subjects and four amputated subjects while they imagined performing each of the eleven hand MI tasks. Two performance evaluation analyses, namely channel- and TFF-based analyses, are conducted to identify the subset of EEG channels and the TFF category, respectively, that enable the highest classification accuracy between the MI tasks. In each evaluation analysis, the hierarchical classification model is trained using two training procedures, namely subject-dependent and subject-independent procedures. These two training procedures quantify the capability of the proposed approach to capture both intra- and inter-personal variations in the EEG signals for different MI tasks within the same hand. The results demonstrate the efficacy of the approach for classifying the MI tasks within the same hand. In particular, the classification accuracies obtained for the intact and amputated subjects are as high as 88.8% and 90.2%, respectively, for the subject-dependent training procedure, and 80.8% and 87.8%, respectively, for the subject-independent training procedure. These results suggest the feasibility of applying the proposed approach to control dexterous prosthetic hands, which can be of great benefit for individuals suffering from hand amputations.
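A minimal sketch of a hierarchical classification model over the TFFs: a top-level classifier picks a task group, then a per-group classifier picks the task within that group. The grouping, the SVM choice, and the two-level structure are assumptions; the paper's hierarchy may be organized differently:

```python
import numpy as np
from sklearn.svm import SVC

class HierarchicalMI:
    """Level 1 picks a task group; a per-group SVM picks the task.
    Each group is assumed to contain at least two tasks."""
    def __init__(self, groups):
        self.groups = groups                       # {group_id: [task labels]}
        self.top = SVC()
        self.sub = {g: SVC() for g in groups}

    def fit(self, X, y):
        to_group = {t: g for g, ts in self.groups.items() for t in ts}
        g = np.array([to_group[t] for t in y])
        self.top.fit(X, g)                         # level 1: group classifier
        for gid, tasks in self.groups.items():     # level 2: within-group SVMs
            m = np.isin(y, tasks)
            self.sub[gid].fit(X[m], y[m])
        return self

    def predict(self, X):
        g = self.top.predict(X)
        return np.array([self.sub[gi].predict(x.reshape(1, -1))[0]
                         for gi, x in zip(g, X)])
```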
A CSI-Based Multi-Environment Human Activity Recognition Framework
Passive human activity recognition (HAR) systems, in which no sensors are attached to the subject, offer great potential compared with conventional systems. One recently adopted technique showing tremendous promise is the channel state information (CSI)-based HAR system. In this work, we present a multi-environment human activity recognition system based on observing the changes in the CSI values of the exchanged wireless packets carried by OFDM subcarriers. In essence, we introduce a five-stage CSI-based human activity recognition approach. First, the acquired CSI values associated with each recorded activity instance are processed to remove the existing noise from the recorded data. A novel segmentation algorithm is then presented to identify and extract the portion of the signal that contains the activity. Next, the extracted activity segment is processed using the denoising procedure of the first stage. After that, the relevant features are extracted, and the most important features are selected. Finally, the selected features are used to train a support vector machine (SVM) classifier to identify the different performed activities. To validate the performance of the proposed approach, we collected data in two different environments. In each environment, several activities were performed by multiple subjects. The performed experiments showed that our proposed approach achieved an average activity recognition accuracy of 91.27%.
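A minimal sketch of the first two stages of such a CSI pipeline: moving-average denoising of per-subcarrier CSI amplitudes, then a variance-based sliding window to locate the activity segment. Both are common choices standing in for the paper's specific filters and its novel segmentation algorithm; the window sizes and threshold are assumptions:

```python
import numpy as np

def denoise(csi, win=11):
    """Moving-average filter along time for each OFDM subcarrier.
    csi: array of shape (time, subcarriers) of CSI amplitudes."""
    kernel = np.ones(win) / win
    return np.apply_along_axis(lambda s: np.convolve(s, kernel, mode="same"),
                               0, csi)

def segment_activity(csi, win=64, thresh=2.0):
    """Return (start, end) sample indices where the windowed variance, summed
    over subcarriers, exceeds `thresh` times its median (assumed criterion)."""
    var = np.array([csi[i:i + win].var(axis=0).sum()
                    for i in range(len(csi) - win)])
    active = np.nonzero(var > thresh * np.median(var))[0]
    return (active[0], active[-1] + win) if active.size else (0, len(csi))
```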
A comprehensive review of the deep learning-based tumor analysis approaches in histopathological images: segmentation, classification and multi-learning tasks
Medical imaging has become a vital technique in the diagnosis and treatment of cancer. Histopathological slides, which allow microscopic examination of suspicious tissue, are considered the gold standard for tumor prognosis and diagnosis. This central role has driven a growing interest in digitizing these slides to generate Whole Slide Images (WSI). However, analyzing WSI is a very challenging task due to the multi-resolution, large-scale nature of these images. Therefore, WSI-based Computer-Aided Diagnosis (CAD) analysis has gained increasing attention as a secondary decision support tool to enhance healthcare by alleviating pathologists’ workload and reducing misdiagnosis rates. Recent deep learning techniques are promising and have the potential to achieve efficient automatic representation of WSI features in a data-driven manner. Thus, in this survey, we focus mainly on deep learning-based CAD systems in the context of tumor analysis in histopathological images, i.e., segmentation and classification of tumor regions. We present a visual taxonomy of deep learning approaches that provides a systematic structure for the vast number of diverse models proposed to date. We identify the challenges facing the automation of histopathological analysis, describe the commonly used public datasets and evaluation metrics, and discuss recent methodologies for addressing these challenges through a systematic examination of the presented deep learning solutions. The survey aims to highlight the existing gaps and limitations of recent deep learning-based WSI approaches and to explore possible avenues for potential enhancements.