6,818 research outputs found
Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for {exploratory
data analysis} are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19
Improving Malware Detection Accuracy by Extracting Icon Information
Detecting PE malware files is now commonly approached using statistical and
machine learning models. While these models commonly use features extracted
from the structure of PE files, we propose that icons from these files can also
help better predict malware. We propose an innovative machine learning approach
to extract information from icons. Our proposed approach consists of two steps:
1) extracting icon features using summary statics, histogram of gradients
(HOG), and a convolutional autoencoder, 2) clustering icons based on the
extracted icon features. Using publicly available data and by using machine
learning experiments, we show our proposed icon clusters significantly boost
the efficacy of malware prediction models. In particular, our experiments show
an average accuracy increase of 10% when icon clusters are used in the
prediction model.Comment: Full version. IEEE MIPR 201
Unsupervised routine discovery in egocentric photo-streams
The routine of a person is defined by the occurrence of activities throughout
different days, and can directly affect the person's health. In this work, we
address the recognition of routine related days. To do so, we rely on
egocentric images, which are recorded by a wearable camera and allow to monitor
the life of the user from a first-person view perspective. We propose an
unsupervised model that identifies routine related days, following an outlier
detection approach. We test the proposed framework over a total of 72 days in
the form of photo-streams covering around 2 weeks of the life of 5 different
camera wearers. Our model achieves an average of 76% Accuracy and 68% Weighted
F-Score for all the users. Thus, we show that our framework is able to
recognise routine related days and opens the door to the understanding of the
behaviour of people
- …