13 research outputs found
Recommended from our members
CEPS: An Open Access MATLAB Graphical User Interface (GUI) for the Analysis of Complexity and Entropy in Physiological Signals
Background: We developed CEPS as an open access MATLAB® GUI (graphical user interface) for the analysis of Complexity and Entropy in Physiological Signals (CEPS), and demonstrate its use with an example data set that shows the effects of paced breathing (PB) on variability of heart, pulse and respiration rates. CEPS is also sufficiently adaptable to be used for other time series physiological data such as EEG (electroencephalography), postural sway or temperature measurements. Methods: Data were collected from a convenience sample of nine healthy adults in a pilot for a larger study investigating the effects on vagal tone of breathing paced at various different rates, part of a development programme for a home training stress reduction system. Results: The current version of CEPS focuses on those complexity and entropy measures that appear most frequently in the literature, together with some recently introduced entropy measures which may have advantages over those that are more established. Ten methods of estimating data complexity are currently included, and some 28 entropy measures. The GUI also includes a section for data pre-processing and standard ancillary methods to enable parameter estimation of embedding dimension m and time delay τ (‘tau’) where required. The software is freely available under version 3 of the GNU Lesser General Public License (LGPLv3) for non-commercial users. CEPS can be downloaded at https://bitbucket.org/deepak_panday/ceps/src/pipeline_v2/. In our illustration on PB, most complexity and entropy measures decreased significantly in response to breathing at 7 breaths per minute, differentiating more clearly than conventional linear, time- and frequency-domain measures between breathing states. In contrast, Higuchi fractal dimension increased during paced breathing. Conclusions: We have developed CEPS software as a physiological data visualiser able to integrate state of the art techniques. 
The interface is designed for clinical research and has a structure designed for integrating new tools. The aim is to strengthen collaboration between clinicians and the biomedical community, as demonstrated here by using CEPS to analyse various physiological responses to paced breathing.
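As a rough illustration of the kind of entropy measure such a toolbox computes, the following is a minimal pure-Python sketch of sample entropy. It is not taken from CEPS; the function name, the defaults `m=2` and `r=0.2`, and the brute-force counting are assumptions made for this example, and CEPS's own implementation may differ.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Brute-force sample entropy of a sequence (illustrative sketch only).

    m is the embedding dimension; r is the tolerance as a fraction of the
    signal's standard deviation. Both defaults are assumptions.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    tol = r * sd

    def count_matches(length):
        # Count template pairs (i < j, no self-matches) whose Chebyshev
        # distance is within the tolerance.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= tol:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined for this sketch; reported as infinity
    return math.log(b / a)
```

A perfectly periodic series yields a value of zero here, while an irregular series yields a larger value, which matches the intuition that lower entropy reflects more regular signals.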
Advanced random forest approaches for outlier detection
Outlier Detection (OD) is a Pattern Recognition task that consists of finding those patterns in a set of data which are likely to have been generated by a different mechanism than the one underlying the rest of the data. The importance of OD is visible in everyday life. Indeed, fast and accurate detection of outliers is crucial: for example, in the electrocardiogram of a patient, an abnormality in the heart rhythm can cause severe health problems. Due to the large number of fields in which OD is needed, several approaches have been designed. Among them, Random Forest-based techniques have raised great interest in the research community: a Random Forest (RF) is an ensemble of Decision Trees where each tree is diverse and independent. RFs are characterized by a high degree of flexibility, robustness, and high generalization capabilities. Even though they were originally designed for classification and regression, in recent years, due to their success, there has been increased development of RF-based approaches for other learning tasks, including OD. The forerunner of several RF methods for OD is Isolation Forest (iForest), a technique whose main principle is isolation, i.e. the separation of each object from the rest of the data. Since outliers are different from the rest of the data and thus easier to separate, we can easily identify them as those objects isolated after few splits in the tree. iForests have been employed in a great variety of application fields, showing excellent performance. This thesis fits into the above scenario: even though some extensions of basic RF-based approaches for OD have been proposed, their potential has not been fully exploited and there is ample room for improvement. In this thesis, we introduce some advanced RF-based techniques for OD, investigating both methodological issues and alternative uses of these flexible approaches. In detail, we moved along four research directions. 
The starting point of the first one is the absence of RF methods for OD able to work with non-vectorial data: here we propose ProxIForest, an approach which works with all types of data for which a distance measure can be defined, thus including non-vectorial data as well. Indeed, for the latter, many powerful distances have been proposed. The second direction focuses on how to measure the outlierness degree of an object in an RF, i.e. the anomaly score, since most extensions of iForest concern only the tree building procedure. In detail, we propose two novel classes of methods: the first class exploits the information contained within a tree. The second one focuses on the ensemble aspect of RFs: the aggregation of the anomaly scores extracted from each tree is crucial to correctly identify outliers. As to the third research direction, we took a different perspective, exploiting the fact that each tree in a forest is a space partitioner encoding relations, i.e. distances, between objects. Whereas this aspect has been widely researched in the clustering field, it has never been investigated for OD: we extract from an iForest a distance measure and input it to an outlier detector. As the last research direction, we designed a new variant of iForest to characterize multiple sclerosis given a brain connectivity network: we cast the problem as an OD task, by making an analogy between disconnected brain regions, the hallmark of the disease, and outliers. All proposals have been thoroughly empirically validated on either classical or ad hoc datasets: we performed several analyses, including comparisons to state-of-the-art approaches and statistical tests. This thesis proves the suitability of RF-based approaches for OD from different perspectives: not only can they be successfully used for the task, but we can also use them to extract distances or features. 
Further, by contributing to this field, this thesis proves that there are still many aspects requiring further investigation.
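The isolation principle described in this abstract can be sketched in a few lines of pure Python. This is a simplified illustration, not the thesis's method or the reference iForest algorithm: the function names, the depth cap, the subsample size and the omission of the usual leaf-size correction and score normalisation are all assumptions made for brevity.

```python
import random


def path_length(point, data, rng, depth=0, max_depth=10):
    """Number of random axis-parallel splits before `point` is isolated.

    The tree is grown lazily along the path followed by `point` only.
    """
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))          # random feature
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:
        return depth                         # cannot split further
    split = rng.uniform(lo, hi)              # random split value
    if point[dim] < split:
        subset = [row for row in data if row[dim] < split]
    else:
        subset = [row for row in data if row[dim] >= split]
    return path_length(point, subset, rng, depth + 1, max_depth)


def mean_path_length(point, data, n_trees=100, sample_size=64, seed=0):
    """Average path length of `point` over several random subsamples."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        total += path_length(point, sample, rng)
    return total / n_trees
```

Under these assumptions, a point far from a dense cluster should obtain a shorter average path than a point at the cluster's centre, which is the behaviour an isolation-based anomaly score relies on.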
Acoustic analysis of the knee joint in the study of osteoarthritis detection during walking
This thesis investigates the potential of non-invasive detection of knee Osteoarthritis (OA) using the sounds emitted by the knee joint during walking and captured by a single microphone. This is a novel application since, to date, no other methods have considered this type of signal. Clinical detection of knee OA relies on imaging techniques such as X-radiology and Magnetic Resonance Imaging. Some of these methods are expensive and impractical while others pose health risks due to radiation. Knee sounds, on the other hand, may offer a quick, practical and cost-effective alternative for the detection of the disease.
In this thesis, the knee sound signal structure is investigated using signal processing methods for information extraction from the time, frequency, cepstral and modulation domains. Feature representations are obtained and their discriminant properties are studied using statistical methods such as the Bhattacharyya distance and supervised learning techniques such as Support Vector Machine. From this work, a statistical feature parameterisation is proposed and its efficacy for the task of healthy vs OA knee condition classification is investigated using a comprehensive experimental framework proposed in this thesis.
Feature-based representations that incorporate spatiotemporal information using gait pattern variables were also investigated for classification. Using the waveform characteristics of the acoustic pulse events detected in the signal, such representations are proposed and evaluated. This approach utilised a novel stride detection and segmentation algorithm that is based on dynamic programming and is also proposed in the thesis. This algorithm opens up potential applications in other research fields such as gait analysis.
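To make the Bhattacharyya distance mentioned in this abstract concrete, here is a small sketch for the special case of a single feature modelled as a univariate Gaussian in each class. The Gaussian assumption and the function name are choices made for this example; the thesis may estimate the distance differently.

```python
import math


def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians.

    mu1, mu2 are the class means and var1, var2 the class variances
    (both variances must be positive).
    """
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term
```

A larger distance suggests that the feature separates the two classes (for example healthy vs OA) more clearly; identical distributions give a distance of zero.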
Data Analytics in Test: Recognizing and Reducing Subjectivity
Applying data analytics in production test has become a widely adopted industrial practice in recent years. As the complexity of semiconductor devices scales and the amounts of available test data continue to grow, the research direction in this field is forced to shift away from solving specific problems with ad hoc approaches and demands a deeper understanding of the fundamental issues. Two data-driven test applications where this shift is apparent are production yield optimization and defect screening, where the respective underlying data analytics approaches are correlation analysis and outlier analysis. A core issue present in these two approaches stems from the subjectivity that is inherent to data analytics. This dissertation delves into how subjectivity manifests itself and what can be done to reduce it with respect to the two test applications.
Outlier analysis is an approach used for identifying anomalies. The main goal of outlier analysis in test is to capture statistically outlying parts with the hope that their abnormal behavior is attributed to some defectivity. During creation of an outlier model, the decisions about outlying behavior in the existing data are made by utilizing known failures and the test engineer's best judgment. In practice, outlier screening methods are simply used for transforming data into an outlier score space. Even if outlier analysis techniques are able to successfully classify a dataset into inliers and outliers, outlier models require thresholds to be decided. A concept called Consistency is introduced to provide an objective data-driven way to evaluate outlier models by utilizing all available data. The key observation underlying this concept is that outlier analysis should be immune to noise introduced by sources of systematic variation.
Correlation analysis is a process comprising a search for related variables. 
The application of production yield optimization involves searching for correlation between the yield and various controllable parameters. The goal of this process is to uncover parameters that, when adjusted, can result in yield improvement. This analytics process depends on the perspective of the analyst, and the quality of the result is highly dependent on the analyst’s previous experiences. In order to reduce the subjectivity in this application, a process mining methodology is introduced to learn from the experiences of analysts. The key advantage of this methodology is that in addition to having the capability to record and reproduce these analyses, it can also generalize to analytics processes not contained in the learned experiences.
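The point that outlier models map data into a score space and still need a threshold can be illustrated with a small sketch. It uses a median/MAD robust score, which is a generic technique chosen for this example and not the dissertation's Consistency concept; the function names and the sample measurements are hypothetical.

```python
import statistics


def robust_scores(values):
    """Map measurements to outlier scores using the median and MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; scores are undefined for this sketch")
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [abs(v - med) / scale for v in values]


def flag_outliers(values, threshold):
    """Return indices whose score exceeds the analyst-chosen threshold."""
    return [i for i, s in enumerate(robust_scores(values)) if s > threshold]


# Hypothetical test measurements for six parts.
measurements = [1.00, 1.02, 0.98, 1.01, 0.99, 1.50]
strict = flag_outliers(measurements, threshold=3.0)
loose = flag_outliers(measurements, threshold=1.0)
```

With these hypothetical values the stricter threshold flags only the last part, while the looser one also flags a part that is merely at the edge of the main population, which shows how the outcome depends on a subjective threshold choice.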
Feature Extraction and Selection in Automatic Sleep Stage Classification
Sleep stage classification is vital for diagnosing many sleep related
disorders and Polysomnography (PSG) is an important tool in this regard.
The visual process of sleep stage classification is time consuming, subjective
and costly. To improve the accuracy and efficiency of the sleep stage
classification, researchers have been trying to develop automatic
classification algorithms.
The automatic sleep stage classification mainly consists of three steps:
pre-processing, feature extraction and classification. In this research work,
we focused on feature extraction and selection steps. The main goal of this
thesis was identifying a robust and reliable feature set that can lead to
efficient classification of sleep stages. For achieving this goal, three types of
contributions were introduced in feature selection, feature extraction and
feature vector quality enhancement.
Several feature ranking and rank aggregation methods were evaluated and
compared for finding the best feature set. Evaluation results indicated that
the decision on the precise feature selection method depends on the system
design requirements such as low computational complexity, high stability
or high classification accuracy. In addition to conventional feature ranking
methods, in this thesis, novel methods such as Stacked Sparse AutoEncoder
(SSAE) were used for dimensionality reduction.
In the feature extraction area, new and effective features such as distance-based
features were utilized for the first time in sleep stage classification.
The results showed that these features contribute positively to the
classification performance. For signal quality enhancement, a loss-less EEG
artefact removal algorithm was proposed. The proposed adaptive algorithm
led to a significant enhancement in the overall classification accuracy.
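Rank aggregation of the kind evaluated in this abstract can be sketched with a Borda count, one common aggregation scheme. The abstract does not say which aggregation methods were used, so the choice of Borda count, the function name and the feature names below are all assumptions made for illustration.

```python
def borda_aggregate(rankings):
    """Combine several feature rankings (best first) into one ranking.

    Each ranking awards n points to its top feature, n - 1 to the next,
    and so on; ties in total score are broken alphabetically.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] = scores.get(feature, 0) + (n - position)
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings produced by three different ranking methods.
rankings = [
    ["delta_power", "spindle_count", "entropy"],
    ["spindle_count", "delta_power", "entropy"],
    ["delta_power", "entropy", "spindle_count"],
]
combined = borda_aggregate(rankings)
```

Here the aggregated order is `delta_power`, `spindle_count`, `entropy`, since the first feature is ranked highly by all three hypothetical methods.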
25th Annual Computational Neuroscience Meeting: CNS-2016
Abstracts of the 25th Annual Computational Neuroscience
Meeting: CNS-2016
Seogwipo City, Jeju-do, South Korea. 2–7 July 201