
    Learning spatio-temporal representations with a dual-stream 3-D residual network for nondriving activity recognition

    Accurate recognition of non-driving activity (NDA) is important for designing an intelligent human-machine interface that achieves a smooth and safe control transition in conditionally automated vehicles. However, some characteristics of such activities, like limited-extent movement and similar backgrounds, pose a challenge to existing 3D convolutional neural network (CNN) based action recognition methods. In this paper, we propose a dual-stream 3D residual network, named D3D ResNet, to enhance the learning of spatio-temporal representations and improve activity recognition performance. Specifically, a parallel 2-stream structure is introduced to focus on the learning of short-time spatial representation and small-region temporal representation. A 2-feed driver behaviour monitoring framework is further built to classify 4 types of NDAs and 2 types of driving behaviour based on the driver's head and hand movement. A novel NDA dataset has been constructed for the evaluation, where the proposed D3D ResNet achieves 83.35% average accuracy, at least 5% above three selected state-of-the-art methods. Furthermore, this study investigates the spatio-temporal features learned in the hidden layer through the saliency map, which explains the superiority of the proposed model on the selected NDAs.

    An adaptive pig face recognition approach using convolutional neural networks

    The evolution of agriculture towards intensive farming leads to an increasing demand for animal identification associated with high traceability, driven by the need for quality control and welfare management in agricultural animals. Automatic identification of individual animals is an important step towards individualised care in terms of disease detection and control, and improvement of food quality. For example, as feeding patterns can differ amongst pigs in the same pen, even in homogeneous groups, automatic registration shows the most potential when applied to an individual pig. In the EU, for instance, this capability is required for certification purposes. Although RFID technology has been gradually developed and widely applied for this task, chip implanting might still be time-consuming and costly for current practical applications. In this paper, a novel framework composed of computer vision algorithms, machine learning and deep learning techniques is proposed to offer a relatively low-cost and scalable solution for pig recognition. Firstly, pig faces and eyes are detected automatically by two Haar feature-based cascade classifiers and one shallow convolutional neural network to extract high-quality images. Secondly, face recognition is performed by employing a deep convolutional neural network. Additionally, class activation maps generated by grad-CAM and saliency maps are utilised to visually understand how the discriminating parameters have been learned by the neural network. Applied to 10 randomly selected pigs filmed in farm conditions, the proposed method demonstrates superior performance over the state-of-the-art method, with an accuracy of 83% over 320 testing images. The outcome of this study will facilitate the real-world application of AI-based animal identification in swine production.

    A refined non-driving activity classification using a two-stream convolutional neural network

    It is of great importance to monitor the driver’s status to achieve an intelligent and safe take-over transition in a level 3 automated driving vehicle. We present a camera-based system to recognise the non-driving activities (NDAs) which may lead to different cognitive capabilities for take-over, based on a fusion of spatial and temporal information. The region of interest (ROI) is automatically selected based on the extracted masks of the driver and the object/device being interacted with. Then, the RGB image of the ROI (the spatial stream) and its associated current and historical optical flow frames (the temporal stream) are fed into a two-stream convolutional neural network (CNN) for the classification of NDAs. Such an approach is able to identify not only the object/device but also the interaction mode between the object and the driver, which enables a refined NDA classification. In this paper, we evaluated the performance of classifying 10 NDAs with two types of devices (tablet and phone) and 5 types of tasks (emailing, reading, watching videos, web-browsing and gaming) for 10 participants. Results show that the proposed system improves the averaged classification accuracy from 61.0% when using a single spatial stream to 90.5

    Ultra-high-resolution time-frequency analysis of EEG to characterise brain functional connectivity with the application in Alzheimer's disease

    Objective. This study aims to explore the potential of high-resolution brain functional connectivity based on electroencephalogram, a non-invasive low-cost technique, to be translated into a long-overdue biomarker and a diagnostic method for Alzheimer's disease (AD). Approach. The paper proposes a novel ultra-high-resolution time-frequency nonlinear cross-spectrum method to construct a promising biomarker of AD pathophysiology. Specifically, using the peak frequency estimated from a revised Hilbert–Huang transformation (RHHT) cross-spectrum as a biomarker, the support vector machine classifier is used to distinguish AD from healthy controls (HCs). Main results. With the combinations of the proposed biomarker and machine learning, we achieved a promising accuracy of 89%. The proposed method performs better than the wavelet cross-spectrum and other functional connectivity measures in the temporal or frequency domain, particularly in the Full, Delta and Alpha bands. Besides, a novel visualisation approach developed from topography is introduced to represent the brain functional connectivity, with which the difference between AD and HCs can be clearly displayed. The interconnections between posterior and other brain regions are obviously affected in AD. Significance. Those findings imply that the proposed RHHT approach could better track dynamic and nonlinear functional connectivity information, paving the way for the development of a novel diagnostic approach

    Using interictal seizure-free EEG data to recognise patients with epilepsy based on machine learning of brain functional connectivity

    Most seizures in adults with epilepsy occur rather infrequently, and as a result the interictal EEG plays a crucial role in the diagnosis and classification of epilepsy. However, empirical interpretation of a first EEG in adult patients has a very low sensitivity, ranging from 29% to 55%. Useful EEG information remains buried within the signals in seizure-free EEG epochs, far beyond the observational capabilities of any specialised physician in this field. Unlike most existing works, which focus on either seizure data or single-variate methods, we introduce a multi-variate method to characterise sensor-level brain functional connectivity from interictal EEG data to identify patients with generalised epilepsy. A total of 9 connectivity features based on 5 different measures in the time, frequency and time-frequency domains have been tested. The solution has been validated by the K-Nearest Neighbour algorithm, classifying an epilepsy group (EG) vs healthy controls (HC) and subsequently with another cohort of patients characterised by non-epileptic attacks (NEAD), a psychogenic type of disorder. A high classification accuracy (97%) was achieved for EG vs HC while revealing significant spatio-temporal deficits in the frontocentral areas in the beta frequency band. For EG vs NEAD, the classification accuracy was only about 73%, which might be a reflection of the well-described coexistence of NEAD with epileptic attacks. Our work demonstrates that seizure-free interictal EEG data can be used to accurately classify patients with generalised epilepsy from HC and that more systematic work is required in this direction, aiming to produce a clinically useful diagnostic method.
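The K-Nearest Neighbour step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each subject is already summarised by a short vector of connectivity features; the function name `knn_predict`, the value of `k`, the Euclidean distance and the toy feature values are all hypothetical and are not taken from the study.

```python
from collections import Counter
import math


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up connectivity features (two values per subject), for illustration only.
train_features = [
    (0.80, 0.75), (0.85, 0.70), (0.78, 0.82),  # labelled HC
    (0.40, 0.35), (0.45, 0.30), (0.38, 0.42),  # labelled EG
]
train_labels = ["HC", "HC", "HC", "EG", "EG", "EG"]

print(knn_predict(train_features, train_labels, (0.42, 0.36)))
print(knn_predict(train_features, train_labels, (0.79, 0.77)))
```

In practice the features would come from the connectivity measures described in the abstract, and `k` would be chosen by cross-validation.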

    Spectral Decomposition and a Waveform Cluster to Characterize Strongly Heterogeneous Paleokarst Reservoirs in the Tarim Basin, China

    The main components of the Ordovician carbonate reservoirs in the Tahe Oilfield are paleokarst fracture-cavity paleo-channel systems formed by karstification. Detailed characterization of these paleokarst reservoirs is challenging because of heterogeneities in characteristics and strong vertical and lateral non-uniformities. Traditional seismic analysis methods are not able to solve the identification problem of such strongly heterogeneous reservoirs. Recent developments in seismic interpretation have heightened the need to describe the fracture-cavity structure of a paleo-channel with more accuracy. We propose a new prediction model for fracture-cavity carbonate reservoirs based on spectral decomposition and a waveform cluster. By the Matching Pursuit decomposition algorithm, the single-frequency data volumes are obtained. The specific frequency data volume that is the most sensitive to the reservoir is chosen based on seismic synthesis traces of well-logging data and geological interpretability. The waveform cluster is then applied to delineate the complex paleokarst systems, particularly the fracture-caves in the runoff zone. This method was applied to the area around Well T615 in the Tahe oilfield, and a paleokarst fracture-cavity system with strong heterogeneity in the runoff zone was delineated and characterized. The findings of this research provide insights for predicting other similar karst systems, such as karstic groundwater and karst hydrogeological systems

    Application of Geologically Constrained Machine Learning Method in Characterizing Paleokarst Reservoirs of Tarim Basin, China

    Deep carbonate fracture-cavity paleokarst reservoirs are deeply buried and highly heterogeneous, and their seismic responses have weak amplitudes and low signal-to-noise ratios. Machine learning in seismic exploration, which is developing rapidly with compelling results, provides a new perspective on these problems. However, applying machine learning algorithms directly to deep seismic signals or seismic attributes of deep carbonate fracture-cavity reservoirs without any prior knowledge constraints wastes computation and reduces accuracy. We propose a method that combines geological constraints and machine learning to describe deep carbonate fracture-cavity paleokarst reservoirs. The time–frequency features of the seismic data are obtained by empirical mode decomposition, and a sensitive frequency is then selected using geological prior constraints and input to fuzzy C-means clustering to characterise the reservoir distribution. Application to Tahe oilfield data shows the potential of highlighting subtle geologic structures that might otherwise go unnoticed when machine learning is applied directly.
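The fuzzy C-means clustering step named in this abstract can be sketched in a few lines. This is a generic textbook version on one-dimensional toy values, not the authors' implementation; the function names, the fuzzifier `m = 2`, the evenly spaced initial centres and the sample values are assumptions made for illustration.

```python
def _memberships(values, centres, m):
    """Fuzzy membership of every value in every cluster (each row sums to 1)."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centres]
        if 0.0 in dists:
            # The value sits exactly on one or more centres: share membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            rows.append([h / total for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50):
    """Cluster 1-D values; returns (centres, memberships). Needs n_clusters >= 2."""
    lo, hi = min(values), max(values)
    # Deterministic start (an assumption): centres spread evenly over the data range.
    centres = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    for _ in range(n_iter):
        u = _memberships(values, centres, m)
        # Each centre becomes the mean of the values weighted by membership**m.
        centres = [
            sum(row[i] ** m * x for row, x in zip(u, values))
            / sum(row[i] ** m for row in u)
            for i in range(n_clusters)
        ]
    return centres, _memberships(values, centres, m)


# Made-up attribute values standing in for a single-frequency seismic attribute.
values = [0.10, 0.20, 0.15, 0.90, 1.00, 0.95]
centres, memberships = fuzzy_c_means(values)
print(centres)
print(memberships[0])
```

Unlike hard clustering, each sample keeps a graded membership in every cluster, which is one reason the technique suits gradual transitions between classes.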

    Spatial–temporal graph convolutional network for Alzheimer classification based on brain functional connectivity imaging of electroencephalogram

    Functional connectivity of the human brain, representing statistical dependence of information flow between cortical regions, significantly contributes to the study of the intrinsic brain network and its functional mechanism. To fully explore its potential in the early diagnosis of Alzheimer's disease (AD) using electroencephalogram (EEG) recordings, this article introduces a novel dynamical spatial–temporal graph convolutional neural network (ST-GCN) for better classification performance. Different from existing studies that are based on either topological brain function characteristics or temporal features of EEG, the proposed ST-GCN considers both the adjacency matrix of functional connectivity from multiple EEG channels and corresponding dynamics of signal EEG channel simultaneously. Different from the traditional graph convolutional neural networks, the proposed ST-GCN makes full use of the constrained spatial topology of functional connectivity and the discriminative dynamic temporal information represented by the 1D convolution. We conducted extensive experiments on the clinical EEG data set of AD patients and Healthy Controls. The results demonstrate that the proposed method achieves better classification performance (92.3%) than the state-of-the-art methods. This approach can not only help diagnose AD but also better understand the effect of normal ageing on brain network characteristics before we can accurately diagnose the condition based on resting-state EEG
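The idea of convolving channel features over a functional-connectivity adjacency matrix can be illustrated with a generic graph convolution layer. The sketch below uses the widely used symmetric normalisation with self-loops, followed by a ReLU; it is an assumption for illustration, not the ST-GCN described in the abstract, and it omits the temporal 1D convolution. All names and the toy 3-channel graph are hypothetical.

```python
import math


def normalise_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for a non-negative adjacency matrix A."""
    n = len(adj)
    # Add self-loops so every node keeps its own features and has non-zero degree.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def graph_conv(adj, features, weights):
    """One graph convolution layer: ReLU(normalised_adjacency @ features @ weights)."""
    propagated = matmul(normalise_adjacency(adj), features)
    projected = matmul(propagated, weights)
    return [[max(0.0, v) for v in row] for row in projected]


# Toy graph: three EEG channels in a chain, one feature per channel (made-up values).
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[1.0], [2.0], [3.0]]
weights = [[1.0]]

output = graph_conv(adj, features, weights)
print(output)
```

With a real connectivity matrix the entries of `adj` would be weighted (for example, coherence values) rather than binary, and `weights` would be learned during training.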