Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features
One-class support vector machine (OC-SVM) has long been one of the
most effective anomaly detection methods and is extensively adopted in both
research and industrial applications. The biggest issue for OC-SVM
remains its ability to operate on large, high-dimensional datasets, owing to
optimization complexity. These problems might be mitigated via dimensionality
reduction techniques such as manifold learning or autoencoders. However,
previous work often treats representation learning and anomaly prediction
separately. In this paper, we propose the autoencoder-based one-class support
vector machine (AE-1SVM), which brings OC-SVM, with the aid of random Fourier
features to approximate the radial basis kernel, into the deep learning context by
combining it with a representation learning architecture and jointly exploiting
stochastic gradient descent to obtain end-to-end training. Interestingly, this
also opens up the possible use of gradient-based attribution methods to explain
the decision making for anomaly detection, which has long been challenging as a
result of the implicit mappings between the input space and the kernel space.
To the best of our knowledge, this is the first work to study the
interpretability of deep learning in anomaly detection. We evaluate our method
on a wide range of unsupervised anomaly detection tasks in which our end-to-end
training architecture achieves a performance significantly better than the
previous work using separate training.
Comment: Accepted at European Conference on Machine Learning and Principles
and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
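The abstract above relies on random Fourier features to approximate the radial basis kernel with an explicit feature map, which is what allows a kernel method to be trained with stochastic gradient descent. The following is a minimal NumPy sketch of that general technique only, not the authors' AE-1SVM implementation. The function names `rff_map` and `rbf_kernel`, and the values chosen for `n_features` and `gamma`, are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def rff_map(X, n_features=2000, gamma=0.5, seed=0):
    """Map X of shape (n_samples, d) to random Fourier features whose inner
    products approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2).
    n_features and gamma are illustrative values, not taken from the paper."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phase offsets, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, gamma=0.5):
    """Exact RBF kernel matrix, used here only as a reference for comparison."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

# Synthetic data, purely for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))

Z = rff_map(X)
max_err = np.max(np.abs(Z @ Z.T - rbf_kernel(X)))
print(f"max abs kernel approximation error: {max_err:.3f}")
```

Because `Z @ Z.T` stands in for the kernel matrix, a linear one-class objective on `Z` can be optimized by gradient descent together with an encoder. The approximation error shrinks as `n_features` grows.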
Remembering history : the work of the information services sub-committee of the Joint Information Services Committee in the UK
The paper seeks to record the work of the committee and its interaction with the much better known Electronic Libraries (eLib) Programme. It also examines the principles that underlay the development of content acquisition and supporting infrastructure in UK university libraries in the 1990s.
Attributes of Big Data Analytics for Data-Driven Decision Making in Cyber-Physical Power Systems
Big data analytics is a relatively new term in power system terminology. The concept covers the way a massive volume of data is acquired, processed, and analyzed to extract insight. In particular, big data analytics refers to applications of artificial intelligence, machine learning techniques, data mining techniques, and time-series forecasting methods. Decision-makers in power systems have long been plagued by the weakness of classical methods in dealing with large-scale practical cases, owing to the existence of thousands or millions of variables, long run times, a high computational burden, divergent results, unjustifiable errors, and poor model accuracy. Big data analytics is an active topic concerned with how to extract insights from these large data sets. This article enumerates the applications of big data analytics in future power systems across several layers, from grid scale to local scale. Big data analytics has many applications in the areas of smart grid implementation, electricity markets, execution of collaborative operation schemes, enhancement of microgrid operation autonomy, management of electric vehicle operations in smart grids, active distribution network control, district hub system management, multi-agent energy systems, electricity theft detection, stability and security assessment by PMUs, and better exploitation of renewable energy sources. The employment of big data analytics entails some prerequisites, such as the proliferation of IoT-enabled devices, easily accessible cloud space, blockchain, etc. This paper presents an extensive review of the applications of big data analytics along with the prevailing challenges and solutions.
Predicting customer's gender and age depending on mobile phone data
In the age of data-driven solutions, customer demographic attributes, such
as gender and age, play a core role that may enable companies to enhance
their service offers and target the right customer at the right time and
place. In marketing campaigns, companies want to target the real user of
the GSM (global system for mobile communications) line rather than the line
owner, as the two are sometimes not the same. This work proposes a method that predicts
users' gender and age based on their behavior, services and contract
information. We used call detail records (CDRs), customer relationship
management (CRM) and billing information as a data source to analyze telecom
customer behavior, and applied different types of machine learning algorithms
to provide marketing campaigns with more accurate information about customer
demographic attributes. The model was built using a reliable data set of 18,000
users provided by SyriaTel Telecom Company, for training and testing. The model
was applied using big data technology and achieved 85.6% accuracy in
user gender prediction and 65.5% in user age prediction. The main contributions
of this work are the improvement in the accuracy of user gender
prediction and user age prediction based on mobile phone data, and an end-to-end
solution that approaches customer data from multiple aspects in the telecom
domain.