Partially Synthesised Dataset to Improve Prediction Accuracy (Case Study: Prediction of Heart Diseases)
Real-world data sources, such as statistical agencies, library data banks and research institutes, are the major data sources for researchers. Using this type of data has several advantages: it improves the credibility and validity of an experiment and, more importantly, it relates to real-world problems and is typically unbiased. However, such data is often unavailable or inaccessible, for the following reasons. First, privacy and confidentiality concerns: the data must be protected on legal and ethical grounds. Second, collecting real-world data is costly and time-consuming. Third, the data may simply not exist, particularly in newly arising research subjects. Many studies have therefore advocated the use of fully and/or partially synthesised data instead of real-world data, because it is simple to create, requires relatively little time, and can be generated in whatever quantity the requirements demand. In this context, this study introduces the use of partially synthesised data to improve the prediction of heart diseases from risk factors. We propose generating partially synthetic data from agreed principles using a rule-based method, in which an extra risk factor is added to the real-world data. In the conducted experiment, more than 85% of the data was derived from observed values (i.e., real-world data), while the remaining data was synthetically generated using a rule-based method in accordance with World Health Organisation criteria. The analysis revealed an improvement in the variance captured by the first two principal components of the partially synthesised data. A further evaluation was conducted using five popular supervised machine-learning classifiers, in which partially synthesised data considerably improved the prediction of heart diseases: the majority of classifiers approximately doubled their predictive performance when the extra risk factor was used.
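The rule-based augmentation described in this abstract can be sketched as follows; the patient fields and the derived risk factor are illustrative assumptions (a BMI class following WHO weight-status thresholds), not the study's actual schema or rule:

```python
# Sketch of rule-based augmentation: derive one extra risk factor from
# observed fields using fixed clinical rules. The BMI thresholds follow
# the WHO weight-status categories; the record fields are hypothetical.

def bmi_category(weight_kg, height_m):
    """Map a patient's BMI to a WHO weight-status category."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"

def add_synthetic_risk_factor(records):
    """Append a rule-derived column to each real-world record."""
    return [dict(r, bmi_class=bmi_category(r["weight_kg"], r["height_m"]))
            for r in records]

patients = [{"weight_kg": 70.0, "height_m": 1.75},
            {"weight_kg": 95.0, "height_m": 1.70}]
augmented = add_synthetic_risk_factor(patients)
```

The observed fields remain untouched; only the derived column is synthetic, which mirrors the paper's "partially synthesised" setup.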
Flowing ConvNets for Human Pose Estimation in Videos
The objective of this work is human pose estimation in videos, where multiple
frames are available. We investigate a ConvNet architecture that is able to
benefit from temporal context by combining information across the multiple
frames using optical flow.
To this end we propose a network architecture with the following novelties:
(i) a deeper network than previously investigated for regressing heatmaps; (ii)
spatial fusion layers that learn an implicit spatial model; (iii) optical flow
is used to align heatmap predictions from neighbouring frames; and (iv) a final
parametric pooling layer which learns to combine the aligned heatmaps into a
pooled confidence map.
We show that this architecture outperforms a number of others, including one
that uses optical flow solely at the input layers, one that regresses joint
coordinates directly, and one that predicts heatmaps without spatial fusion.
The new architecture outperforms the state of the art by a large margin on
three video pose estimation datasets, including the very challenging Poses in
the Wild dataset, and outperforms other deep methods that don't use a graphical
model on the single-image FLIC benchmark (and also Chen & Yuille and Tompson et
al. in the high precision region).
Comment: ICCV'1
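The flow-alignment step described above can be illustrated with a minimal numpy sketch: heatmaps predicted for neighbouring frames are warped to the current frame along optical flow, then combined with per-frame pooling weights. The nearest-neighbour warp and fixed weights are simplifications; in the paper the pooling weights are learned:

```python
# Minimal sketch of flow-aligned heatmap pooling. A (H, W) heatmap from a
# neighbouring frame is backward-warped by a (H, W, 2) flow field, then the
# aligned maps are combined into a pooled confidence map.
import numpy as np

def warp_heatmap(heatmap, flow):
    """Nearest-neighbour backward warp of a (H, W) heatmap by a (H, W, 2) flow."""
    H, W = heatmap.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(np.round(ys - flow[..., 1]).astype(int), 0, H - 1)
    src_x = np.clip(np.round(xs - flow[..., 0]).astype(int), 0, W - 1)
    return heatmap[src_y, src_x]

def pooled_confidence(heatmaps, flows, weights):
    """Weighted sum of flow-aligned neighbour heatmaps (weights fixed here)."""
    aligned = [warp_heatmap(h, f) for h, f in zip(heatmaps, flows)]
    return sum(w * a for w, a in zip(weights, aligned))
```

A heatmap peak at (2, 2) with a uniform flow of +1 pixel in x is moved to (2, 3) after warping, which is what lets predictions from neighbouring frames reinforce the current frame's estimate.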
Learning to Personalize in Appearance-Based Gaze Tracking
Personal variations severely limit the performance of appearance-based gaze
tracking. Adapting to these variations using standard neural network model
adaptation methods is difficult. The problems range from overfitting, due to
small amounts of training data, to underfitting, due to restrictive model
architectures. We tackle these problems by introducing the SPatial Adaptive
GaZe Estimator (SPAZE). By modeling personal variations as a low-dimensional
latent parameter space, SPAZE provides just enough adaptability to capture the
range of personal variations without being prone to overfitting. Calibrating
SPAZE for a new person reduces to solving a small optimization problem. SPAZE
achieves an error of 2.70 degrees with 9 calibration samples on MPIIGaze,
improving on the state-of-the-art by 14 %. We contribute to gaze tracking
research by empirically showing that personal variations are well-modeled as a
3-dimensional latent parameter space for each eye. We show that this
low-dimensionality is expected by examining model-based approaches to gaze
tracking. We also show that accurate head pose-free gaze tracking is possible.
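The "small optimization problem" of calibration can be sketched as follows. The setup is a toy stand-in, not SPAZE itself: the network is held fixed, each calibration sample supplies a person-agnostic gaze prediction and a local (hypothetical) Jacobian of the output with respect to the 3-D personal latent, and the latent is fitted by least squares:

```python
# Toy sketch of few-shot calibration: only a low-dimensional personal latent
# is fitted to a handful of calibration samples, while the network weights
# stay fixed. Jacobians and base predictions here are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(0)
N = 9                                    # calibration samples, as in the abstract
J = rng.normal(size=(N, 2, 3))           # per-sample Jacobian d(gaze)/d(latent)
base = rng.normal(size=(N, 2))           # person-agnostic gaze predictions

def calibrate(jacobians, base_preds, targets):
    """Least-squares fit of the 3-D personal latent from calibration pairs."""
    A = jacobians.reshape(-1, 3)         # stack Jacobian rows: (2N, 3)
    b = (targets - base_preds).ravel()   # stacked residuals: (2N,)
    z, *_ = np.linalg.lstsq(A, b, rcond=None)
    return z

z_true = np.array([0.5, -0.2, 0.1])      # ground-truth latent for the test person
targets = base + np.einsum('nij,j->ni', J, z_true)
z_hat = calibrate(J, base, targets)
```

With only three free parameters per eye, nine samples over-determine the fit, which is why so few calibration points suffice and overfitting is avoided.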
Autonomous Fault Detection in Self-Healing Systems using Restricted Boltzmann Machines
Autonomously detecting and recovering from faults is one approach for
reducing the operational complexity and costs associated with managing
computing environments. We present a novel methodology for autonomously
generating investigation leads that help identify system faults, extending
our previous work in this area by leveraging Restricted Boltzmann Machines
(RBMs) and contrastive divergence learning to analyse changes in historical
feature data. This allows us to heuristically identify the root cause of a
fault, and we demonstrate an improvement over the state of the art by showing
that feature data can be predicted heuristically beyond a single instance, to
include entire sequences of information.
Comment: Published and presented in the 11th IEEE International Conference and
Workshops on Engineering of Autonomic and Autonomous Systems (EASe 2014)
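The contrastive divergence learning mentioned above is a standard procedure; a minimal CD-1 update for a binary RBM can be sketched in numpy. Dimensions and data are illustrative, and bias terms are omitted for brevity:

```python
# One CD-1 weight update for a binary RBM: positive phase on the data,
# one Gibbs step to get a reconstruction, negative phase on the
# reconstruction, and a gradient step on the difference. Biases omitted.
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, lr=0.1):
    """Update W (visible x hidden) from a batch of visible vectors v0 (N, V)."""
    h0_prob = sigmoid(v0 @ W)                          # positive phase
    h0 = (rng.random(h0_prob.shape) < h0_prob) * 1.0   # sample hidden units
    v1_prob = sigmoid(h0 @ W.T)                        # reconstruction
    h1_prob = sigmoid(v1_prob @ W)                     # negative phase
    grad = (v0.T @ h0_prob - v1_prob.T @ h1_prob) / len(v0)
    return W + lr * grad

data = (rng.random((32, 6)) < 0.5) * 1.0               # toy binary feature data
W = rng.normal(scale=0.01, size=(6, 4))
for _ in range(10):
    W = cd1_step(data, W)
```

In the fault-detection setting, the reconstruction error of historical feature vectors under the trained RBM is what would flag anomalous system states.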
The use of a quantitative structure-activity relationship (QSAR) model to predict GABA-A receptor binding of newly emerging benzodiazepines
The illicit market for new psychoactive substances is forever expanding. Benzodiazepines and their derivatives form one such group, and thus far their number has grown year upon year. For both forensic and clinical purposes it is important to be able to rapidly understand these emerging substances. However, as a consequence of the illicit nature of these compounds, there is a deficiency in the pharmacological data available for these ‘new’ benzodiazepines. To further understand their pharmacology we utilised a quantitative structure-activity relationship (QSAR) approach. A set of 69 benzodiazepine-based compounds was analysed to develop a QSAR training set with respect to published binding values to GABA-A receptors. The QSAR model returned an R2 value of 0.90. The most influential factors were found to be the positioning of two H-bond acceptors, two aromatic rings and a hydrophobic group. A test set of nine randomly selected compounds was then used for internal validation to determine the predictive ability of the model, giving an R2 value of 0.86 when comparing the predicted binding values with the experimental data. The QSAR model was then used to predict the binding of 22 benzodiazepines that are classed as new psychoactive substances. The model allows rapid and economical prediction of the binding activity of emerging benzodiazepines, compared with lengthy and expensive in vitro/in vivo analysis. This will enable forensic chemists and toxicologists to better understand both recently developed compounds and substances likely to emerge in the future.
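The core of a QSAR workflow like the one above is a regression from molecular descriptors to binding affinity, validated by R2. A minimal sketch under stated assumptions: the descriptor matrix and affinity values below are synthetic placeholders, not the paper's 69-compound benzodiazepine set or its pharmacophore features:

```python
# Illustrative QSAR-style fit: ordinary least squares from descriptors to a
# binding-affinity-like response, with R^2 as the validation statistic.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(69, 5))                       # rows: compounds, cols: descriptors
true_w = np.array([1.2, -0.8, 0.5, 0.0, 0.3])      # hypothetical structure-activity weights
y = X @ true_w + rng.normal(scale=0.1, size=69)    # pKi-like response with noise

def fit_qsar(X, y):
    """Least-squares fit with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r_squared(X, y, coef):
    """Coefficient of determination of the fitted model."""
    pred = np.column_stack([np.ones(len(X)), X]) @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

coef = fit_qsar(X, y)
r2 = r_squared(X, y, coef)
```

Once fitted, the same coefficients can score descriptor vectors of compounds with no published binding data, which is the mechanism behind the paper's predictions for the 22 new psychoactive substances.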
Spatio-temporal bivariate statistical models for atmospheric trace-gas inversion
Atmospheric trace-gas inversion refers to any technique used to predict
spatial and temporal fluxes using mole-fraction measurements and atmospheric
simulations obtained from computer models. Studies to date are most often of a
data-assimilation flavour, which implicitly consider univariate statistical
models with the flux as the variate of interest. This univariate approach
typically assumes that the flux field is either a spatially correlated Gaussian
process or a spatially uncorrelated non-Gaussian process with prior expectation
fixed using flux inventories (e.g., NAEI or EDGAR in Europe). Here, we extend
this approach in three ways. First, we develop a bivariate model for the
mole-fraction field and the flux field. The bivariate approach allows optimal
prediction of both the flux field and the mole-fraction field, and it leads to
significant computational savings over the univariate approach. Second, we
employ a lognormal spatial process for the flux field that captures both the
lognormal characteristics of the flux field (when appropriate) and its spatial
dependence. Third, we propose a new, geostatistical approach to incorporate the
flux inventories in our updates, such that the posterior spatial distribution
of the flux field is predominantly data-driven. The approach is illustrated on
a case study of methane (CH4) emissions in the United Kingdom and Ireland.
Comment: 39 pages, 8 figures
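The lognormal spatial process used for the flux field can be illustrated in one dimension: draw a Gaussian process with a spatially decaying covariance and exponentiate it, giving a positive, spatially correlated field. The grid, covariance form and parameters below are illustrative, not the paper's actual model:

```python
# Sketch of a lognormal spatial process: exponentiate a draw from a
# Gaussian process with exponential covariance on a 1-D grid, so the
# resulting field is positive and spatially dependent.
import numpy as np

rng = np.random.default_rng(3)

def lognormal_flux(n=50, length_scale=5.0, sigma=0.5):
    """Sample a positive, spatially correlated flux field on n grid points."""
    s = np.arange(n, dtype=float)
    dist = np.abs(s[:, None] - s[None, :])
    cov = sigma ** 2 * np.exp(-dist / length_scale)   # exponential covariance
    log_flux = rng.multivariate_normal(np.zeros(n), cov)
    return np.exp(log_flux)                           # lognormal marginals

flux = lognormal_flux()
```

Positivity is the point: fluxes cannot be negative, and the lognormal prior encodes that while still allowing spatial dependence.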
OFSET_mine:an integrated framework for cardiovascular diseases risk prediction based on retinal vascular function
As cardiovascular disease (CVD) represents a spectrum of disorders that often manifest for the first time through an acute life-threatening event, early identification of seemingly healthy subjects with various degrees of risk is a priority. More recently, traditional scores used for early identification of CVD risk are slowly being replaced by more sensitive biomarkers that assess individual, rather than population, risks for CVD. Among these, retinal vascular function, as assessed by the retinal vessel analysis (RVA) method, has been proven an accurate reflection of subclinical CVD in groups of participants without overt disease but with certain inherited or acquired risk factors. Furthermore, in order to correctly detect individual risk at an early stage, specialised machine learning methods and feature selection techniques that can cope with the characteristics of the data need to be devised. The main contribution of this thesis is an integrated framework, OFSET_mine, that combines novel machine learning methods to produce a bespoke solution for cardiovascular risk prediction based on RVA data that is also applicable to other medical datasets with similar characteristics. The three identified essential characteristics are 1) an imbalanced dataset, 2) high dimensionality and 3) overlapping feature ranges with the possibility of acquiring new samples. The thesis proposes FiltADASYN as an oversampling method that deals with imbalance, DD_Rank as a feature selection method that handles high dimensionality, and GCO_mine as a method for individual-based classification, all three integrated within the OFSET_mine framework. The new oversampling method FiltADASYN extends Adaptive Synthetic Oversampling (ADASYN) with an additional step to filter the generated samples and improve the reliability of the resultant sample set. The feature selection method DD_Rank is based on the Restricted Boltzmann Machine (RBM) and ranks features according to their stability and discrimination power.
GCO_mine is a lazy learning method based on Graph Cut Optimization (GCO), which considers both the local arrangements and the global structure of the data. OFSET_mine compares favourably to well-established composite techniques, exhibiting high classification performance when applied to a wide range of benchmark medical datasets with variable sample sizes, dimensionality and imbalance ratios. When applying OFSET_mine to our RVA data, an accuracy of 99.52% is achieved. In addition, using OFSET, the hybrid solution of FiltADASYN and DD_Rank, with Random Forest on our RVA data produces risk group classifications with accuracy 99.68%. This not only reflects the success of the framework but also establishes RVA as a valuable cardiovascular risk predictor.
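The ADASYN-with-filtering idea behind FiltADASYN can be sketched as follows. The generation step follows the standard ADASYN recipe (more synthetic points near minority samples surrounded by majority neighbours); the filtering criterion used here (discard synthetic points whose nearest real neighbours are mostly majority-class) is our assumption for illustration, not the thesis's exact rule:

```python
# Sketch of ADASYN-style oversampling with a post-hoc filter in the spirit
# of FiltADASYN. Class 1 is the minority class; the filter keeps only
# synthetic points lying in minority-dominated neighbourhoods (assumed rule).
import numpy as np

rng = np.random.default_rng(4)

def knn_indices(point, X, k):
    """Indices of the k nearest rows of X to the given point."""
    return np.argsort(np.linalg.norm(X - point, axis=1))[:k]

def adasyn_filtered(X, y, k=5, n_new=20):
    X_min = X[y == 1]
    # Weight each minority sample by the majority share of its neighbourhood.
    ratios = np.array([np.mean(y[knn_indices(x, X, k)] == 0) for x in X_min])
    weights = ratios + 1e-6                       # avoid a degenerate all-zero case
    probs = weights / weights.sum()
    synthetic = []
    for i in rng.choice(len(X_min), size=n_new, p=probs):
        j = rng.choice(len(X_min))                # interpolate towards a minority peer
        lam = rng.random()
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    synthetic = np.array(synthetic)
    # Filter step: keep points whose real neighbourhood is minority-dominated.
    keep = [np.mean(y[knn_indices(s, X, k)] == 1) >= 0.5 for s in synthetic]
    return synthetic[np.array(keep)]

X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 40 + [1] * 10)
new_samples = adasyn_filtered(X, y)
```

The filter is what distinguishes the approach from plain ADASYN: interpolated points that land deep in majority territory are discarded rather than added to the training set.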
A review of digital video tampering: from simple editing to full synthesis.
Video tampering methods have witnessed considerable progress in recent years. This is partly due to the rapid development of advanced deep learning methods, and also due to the large volume of video footage that is now in the public domain. Historically, convincing video tampering has been too labour-intensive to achieve on a large scale. However, recent developments in deep learning-based methods have made it possible not only to produce convincing forged video but also to fully synthesize video content. Such advancements provide new means to improve visual content itself, but at the same time they raise new challenges for state-of-the-art tampering detection methods. Video tampering detection has been an active field of research for some time, with periodic reviews of the subject. However, little attention has been paid to video tampering techniques themselves. This paper provides an objective and in-depth examination of current techniques related to digital video manipulation. We thoroughly examine their development, and show how current evaluation techniques provide opportunities for the advancement of video tampering detection. A critical and extensive review of photo-realistic video synthesis is provided, with emphasis on deep learning-based methods. Existing tampered video datasets are also qualitatively reviewed and critically discussed. Finally, conclusions are drawn from an exhaustive review of tampering methods, with discussion of future research directions aimed at improving detection methods.