
    A hybrid camera- and ultrasound-based approach for needle localization and tracking using a 3D motorized curvilinear ultrasound probe

    Three-dimensional (3D) motorized curvilinear ultrasound probes provide an effective, low-cost tool to guide needle interventions, but localizing and tracking the needle in 3D ultrasound volumes is often challenging. In this study, a new method is introduced to localize and track the needle using 3D motorized curvilinear ultrasound probes. In particular, a low-cost camera mounted on the probe is employed to estimate the needle axis. The camera-estimated axis is used to identify a volume of interest (VOI) in the ultrasound volume that enables high needle visibility. This VOI is analyzed using local phase analysis and the random sample consensus (RANSAC) algorithm to refine the camera-estimated needle axis. The needle tip is determined by searching along the localized needle axis using a probabilistic approach. Dynamic needle tracking in a sequence of 3D ultrasound volumes is enabled by iteratively applying a Kalman filter to estimate the VOI that includes the needle in the subsequent ultrasound volume and limiting the localization analysis to this VOI. A series of ex vivo animal experiments is conducted to evaluate the accuracy of needle localization and tracking. The results show that the proposed method can localize the needle in individual ultrasound volumes with maximum errors of 0.7 mm for the needle axis, 1.7° for the needle angle, and 1.2 mm for the needle tip. Moreover, the proposed method can track the needle in a sequence of ultrasound volumes with maximum errors of 1.0 mm for the needle axis, 2.0° for the needle angle, and 1.7 mm for the needle tip. These results suggest the feasibility of applying the proposed method to localize and track the needle using 3D motorized curvilinear ultrasound probes.
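
    The tracking step lends itself to a compact illustration. Below is a minimal sketch of a constant-velocity Kalman filter that forecasts the needle tip one volume ahead so the search can be restricted to a VOI centered on the forecast; the state layout, noise covariances, and inter-volume interval are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Minimal constant-velocity Kalman filter for needle tip tracking across a
# sequence of 3D ultrasound volumes. The one-step-ahead forecast of the tip
# position defines the center of the volume of interest (VOI) searched in
# the next volume. State x = [px, py, pz, vx, vy, vz] (mm, mm/volume);
# all noise covariances below are illustrative assumptions.

dt = 1.0                                      # one volume per time step (assumed)
F = np.eye(6)                                 # constant-velocity transition
F[:3, 3:] = dt * np.eye(3)
H = np.hstack([np.eye(3), np.zeros((3, 3))])  # only the 3D tip position is measured
Q = 0.01 * np.eye(6)                          # process noise (assumed)
R = 0.25 * np.eye(3)                          # measurement noise, mm^2 (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle given a tip position z localized in the VOI."""
    x_pred = F @ x                            # predict
    P_pred = F @ P @ F.T + Q
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)     # update with the localized tip
    P_new = (np.eye(6) - K @ H) @ P_pred
    voi_center = (F @ x_new)[:3]              # forecast centers the next VOI
    return x_new, P_new, voi_center

# Example: feed tip positions localized in successive volumes (made-up
# coordinates here) and read off where to search next.
x, P = np.zeros(6), np.eye(6)
for z in [np.array([10.0, 5.0, 20.0]), np.array([10.5, 5.1, 20.4])]:
    x, P, voi_center = kalman_step(x, P, z)
print(voi_center)
```

    In the paper's pipeline, the measurement fed to the update step would come from the local-phase/RANSAC localization performed inside the predicted VOI.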

    A Fusion-Based Approach for Breast Ultrasound Image Classification Using Multiple-ROI Texture and Morphological Analyses

    Ultrasound imaging is commonly used for breast cancer diagnosis, but accurate interpretation of breast ultrasound (BUS) images is often challenging and operator-dependent. Computer-aided diagnosis (CAD) systems can be employed to provide radiologists with a second opinion to improve diagnosis accuracy. In this study, a new CAD system is developed to enable accurate BUS image classification. In particular, an improved texture analysis is introduced, in which the tumor is divided into a set of non-overlapping regions of interest (ROIs). Each ROI is analyzed using gray-level co-occurrence matrix features and a support vector machine classifier to estimate its tumor class indicator. The tumor class indicators of all ROIs are combined using a voting mechanism to estimate the tumor class. In addition, morphological analysis is employed to classify the tumor. A probabilistic approach is used to fuse the classification results of the multiple-ROI texture analysis and the morphological analysis. The proposed approach is applied to classify 110 BUS images that include 64 benign and 46 malignant tumors. The accuracy, specificity, and sensitivity obtained using the proposed approach are 98.2%, 98.4%, and 97.8%, respectively. These results demonstrate that the proposed approach can effectively be used to differentiate benign and malignant tumors.
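
    As a rough illustration of the multiple-ROI texture stage, the sketch below extracts gray-level co-occurrence matrix (GLCM) features per ROI with scikit-image, classifies each ROI with an SVM, and combines the per-ROI indicators by majority vote. The GLCM distances, angles, and feature set are assumptions; the abstract does not specify the exact features or SVM configuration.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def glcm_features(roi):
    """GLCM texture features for one ROI (distances/angles/props assumed)."""
    glcm = graycomatrix(roi, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

def classify_tumor(rois, clf):
    """Majority vote over per-ROI predictions (0 = benign, 1 = malignant)."""
    votes = [clf.predict(glcm_features(r).reshape(1, -1))[0] for r in rois]
    return int(np.mean(votes) >= 0.5)

# Tiny synthetic demo so the sketch runs end-to-end (random patches stand
# in for ROIs extracted from real BUS tumors):
rng = np.random.default_rng(0)
train_rois = [rng.integers(0, 256, (32, 32), dtype=np.uint8) for _ in range(20)]
y_train = np.array([0, 1] * 10)
X_train = np.vstack([glcm_features(r) for r in train_rois])
clf = SVC(kernel="rbf").fit(X_train, y_train)
test_rois = [rng.integers(0, 256, (32, 32), dtype=np.uint8) for _ in range(5)]
print("tumor class:", classify_tumor(test_rois, clf))
```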

    On human emotion and activity analysis

    This thesis investigates methodologies for recognizing human emotions, single-human daily-life activities, and human-human interactions from non-verbal behavioral signals such as facial expressions, body postures, actions, and interactions captured in video. Two recognition schemes are investigated and developed for recognizing human activity and emotion from an input video.

    In the first recognition scheme, we propose to decouple the spatial context from the temporal dynamics of human body and facial expressions instead of dealing with both modalities simultaneously. To achieve this decoupling, we developed two techniques for automatically identifying the temporal phases of human actions and emotions in input videos, such that each temporal phase is completely characterized by its spatial context. We then developed a two-layered framework for recognizing human emotional and action states based on analyzing the temporal dynamics of emotions/actions using the decoupled spatial context. In the first layer, the decoupled spatial context is used to convert the input video into a string of labels, where each label represents the class of the temporal phase to which its corresponding frame belongs. In the second layer, a temporal analysis classifies the sequence of labels generated by the first layer into one of the emotional/action states.

    In the second scheme, we propose a video classification platform based on a Nonlinear AutoRegressive with eXogenous input (NARX) model with a recurrent neural network realization, which we name the recurrent NARX network model. This model is used to recognize human emotions and actions from input videos by formulating video classification as a parametric temporal sequence regression problem and solving it in a temporal-spatial fashion.

    Computer simulations and experiments using publicly available databases were conducted to evaluate both recognition schemes. Using the decoupling scheme, the average recognition rates for human activities and emotions were 98.89% and 93.53%, respectively, outperforming the average recognition rates obtained with the recurrent NARX network model by approximately 4%.

    Unlike activities performed by a single human, recognizing human-human interactions is more challenging and requires taking into account the semantic meanings and inter-relations of the moving body parts of each person. For this purpose, we developed a view-invariant geometric representation that uses the 3D joint poses of human body parts to capture the semantic meaning of different spatiotemporal configurations of two interacting persons from RGB-D data acquired with a Kinect sensor. The representation uses the concept of anatomical planes to construct motion and pose profiles for each interacting person; the two profiles are then concatenated to form a geometric descriptor of the two interacting humans. Using this representation, we developed frameworks for human-human interaction analysis from two perspectives: classification and prediction from an input video.

    The classification and prediction frameworks were evaluated on a human-human interaction dataset collected in our lab, which consists of 27,500 frames of 12 individuals performing 12 different interactions. Using the proposed geometric descriptor, the classification framework achieved an average recognition rate of 94.86%, while the prediction framework achieved an average prediction accuracy of 82.46% at a progress level of 50%.
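
    The two-layer scheme can be sketched compactly: layer 1 maps each frame to a temporal-phase label (a trained frame classifier is assumed), and layer 2 classifies the resulting label string. The abstract does not name the sequence classifier, so a nearest-template matcher under edit distance stands in below purely for illustration; the phase labels and templates are made up.

```python
import numpy as np

def edit_distance(a, b):
    """Levenshtein distance between two label strings."""
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i, j] = min(d[i - 1, j] + 1,                           # deletion
                          d[i, j - 1] + 1,                           # insertion
                          d[i - 1, j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return int(d[-1, -1])

def classify_sequence(label_string, templates):
    """Layer 2: pick the action/emotion whose template string is closest."""
    return min(templates, key=lambda name: edit_distance(label_string, templates[name]))

# Layer 1 is assumed to have labeled each frame with its temporal phase,
# e.g. N(eutral)/O(nset)/P(eak); labels and templates here are hypothetical.
frame_labels = "NNOOOPPPPOONN"
templates = {"smile": "NNOOPPPPOONN", "frown": "NNOPPONN"}
print(classify_sequence(frame_labels, templates))
```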

    EEG-based brain-computer interface for decoding motor imagery tasks within the same hand using Choi-Williams time-frequency distribution

    This paper presents an EEG-based brain-computer interface system for classifying eleven motor imagery (MI) tasks within the same hand. The proposed system utilizes the Choi-Williams time-frequency distribution (CWD) to construct a time-frequency representation (TFR) of the EEG signals. The constructed TFR is used to extract five categories of time-frequency features (TFFs). The TFFs are processed using a hierarchical classification model to identify the MI task encapsulated within the EEG signals. To evaluate the performance of the proposed approach, EEG data were recorded from eighteen intact subjects and four amputated subjects while they imagined performing each of the eleven hand MI tasks. Two performance evaluation analyses, namely channel- and TFF-based analyses, are conducted to identify the subset of EEG channels and the TFF category, respectively, that enable the highest classification accuracy across the MI tasks. In each analysis, the hierarchical classification model is trained using two procedures, namely subject-dependent and subject-independent training, which quantify the capability of the proposed approach to capture intra- and inter-personal variations in the EEG signals for different MI tasks within the same hand. The results demonstrate the efficacy of the approach for classifying MI tasks within the same hand. In particular, the classification accuracies obtained for the intact and amputated subjects are as high as 88.8% and 90.2%, respectively, for the subject-dependent training procedure, and 80.8% and 87.8%, respectively, for the subject-independent training procedure. These results suggest the feasibility of applying the proposed approach to control dexterous prosthetic hands, which can be of great benefit for individuals with hand amputations.
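
    A hedged sketch of the feature-extraction and hierarchical-classification pipeline is shown below. A scipy spectrogram stands in for the CWD, which is not available in scipy, and the two-stage hierarchy (coarse task group, then task within group) is an illustrative guess at the structure; the paper's actual TFR, five TFF categories, and hierarchy may differ.

```python
import numpy as np
from scipy.signal import spectrogram
from sklearn.svm import SVC

FS = 256  # EEG sampling rate in Hz (assumed)

def tf_features(signal):
    """Illustrative TFFs: statistics of mu/beta band power over time."""
    f, t, S = spectrogram(signal, fs=FS, nperseg=64, noverlap=48)
    mu = S[(f >= 8) & (f <= 12)].mean(axis=0)      # mu-band envelope
    beta = S[(f >= 13) & (f <= 30)].mean(axis=0)   # beta-band envelope
    return np.array([mu.mean(), mu.std(), beta.mean(), beta.std()])

class HierarchicalMI:
    """Two-stage hierarchy: coarse task group, then task within the group."""
    def __init__(self):
        self.coarse = SVC()
        self.fine = {}

    def fit(self, X, y_group, y_task):
        self.coarse.fit(X, y_group)
        for g in np.unique(y_group):
            m = y_group == g
            self.fine[g] = SVC().fit(X[m], y_task[m])
        return self

    def predict(self, X):
        groups = self.coarse.predict(X)
        return np.array([self.fine[g].predict(x[None])[0]
                         for g, x in zip(groups, X)])

# Synthetic demo (random signals stand in for real EEG trials):
rng = np.random.default_rng(1)
X = np.vstack([tf_features(rng.standard_normal(2 * FS)) for _ in range(40)])
y_group = np.repeat([0, 1], 20)             # e.g. two coarse families of MI tasks
y_task = np.tile(np.repeat([0, 1], 10), 2)  # two tasks within each family
model = HierarchicalMI().fit(X, y_group, y_task)
print(model.predict(X[:5]))
```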

    A Context-Based Personalization for Mobile Applications’ Network Access


    A dataset for Wi-Fi-based human-to-human interaction recognition

    This dataset contains Wi-Fi signals recorded from 40 different pairs of subjects while performing twelve different human-to-human interactions in an indoor environment. Each pair of subjects performed ten trials of each of the twelve interactions, for a total of 4800 recorded trials (40 pairs of subjects × 12 interactions × 10 trials). The publicly available CSI tool is used to record the Wi-Fi signals transmitted from a commercial off-the-shelf access point, namely the Sagemcom 2704, to a desktop computer equipped with an Intel 5300 network interface card. The recorded Wi-Fi signals consist of the Received Signal Strength Indicator (RSSI) values and the Channel State Information (CSI) values. This dataset is archived at DANS/EASY and is not directly accessible here; to view and access the files, follow the DOI link above.
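
    The trial structure described above (40 pairs × 12 interactions × 10 trials) is easy to enumerate in code. The sketch below iterates over every expected trial; the directory layout and file names are hypothetical, since the actual files are those archived at DANS/EASY under the dataset's DOI.

```python
from pathlib import Path

ROOT = Path("wifi_hhi_dataset")  # hypothetical local copy of the dataset

def trial_paths():
    """Yield (pair, interaction, trial, path) for every expected trial."""
    for pair in range(1, 41):                 # 40 pairs of subjects
        for interaction in range(1, 13):      # 12 interactions
            for trial in range(1, 11):        # 10 trials each
                path = (ROOT / f"pair{pair:02d}" /
                        f"interaction{interaction:02d}_trial{trial:02d}.npz")
                yield pair, interaction, trial, path

# Each trial file is assumed to hold one RSSI vector plus one CSI array
# (packets x subcarriers x antennas, as reported by the Intel 5300 NIC).
print(sum(1 for _ in trial_paths()))  # 4800 expected trials
```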