11,600 research outputs found
Power System Parameters Forecasting Using Hilbert-Huang Transform and Machine Learning
A novel hybrid data-driven approach is developed for forecasting power system
parameters with the goal of increasing the efficiency of short-term forecasting
studies for non-stationary time-series. The proposed approach is based on mode
decomposition and a feature analysis of initial retrospective data using the
Hilbert-Huang transform and machine learning algorithms. The random forests and
gradient boosting trees learning techniques were examined. The decision tree
techniques were used to rank the importance of variables employed in the
forecasting models. The Mean Decrease Gini index is employed as an impurity
function. The resulting hybrid forecasting models employ the radial basis
function neural network and support vector regression. Apart from introduction
and references the paper is organized as follows. The section 2 presents the
background and the review of several approaches for short-term forecasting of
power system parameters. In the third section a hybrid machine learning-based
algorithm using Hilbert-Huang transform is developed for short-term forecasting
of power system parameters. Fourth section describes the decision tree learning
algorithms used for the issue of variables importance. Finally in section six
the experimental results in the following electric power problems are
presented: active power flow forecasting, electricity price forecasting and for
the wind speed and direction forecasting
Proximal Methods for Hierarchical Sparse Coding
Sparse coding consists in representing signals as sparse linear combinations
of atoms selected from a dictionary. We consider an extension of this framework
where the atoms are further assumed to be embedded in a tree. This is achieved
using a recently introduced tree-structured sparse regularization norm, which
has proven useful in several applications. This norm leads to regularized
problems that are difficult to optimize, and we propose in this paper efficient
algorithms for solving them. More precisely, we show that the proximal operator
associated with this norm is computable exactly via a dual approach that can be
viewed as the composition of elementary proximal operators. Our procedure has a
complexity linear, or close to linear, in the number of atoms, and allows the
use of accelerated gradient techniques to solve the tree-structured sparse
approximation problem at the same computational cost as traditional ones using
the L1-norm. Our method is efficient and scales gracefully to millions of
variables, which we illustrate in two types of applications: first, we consider
fixed hierarchical dictionaries of wavelets to denoise natural images. Then, we
apply our optimization tools in the context of dictionary learning, where
learned dictionary elements naturally organize in a prespecified arborescent
structure, leading to a better performance in reconstruction of natural image
patches. When applied to text documents, our method learns hierarchies of
topics, thus providing a competitive alternative to probabilistic topic models
Classification of sporting activities using smartphone accelerometers
In this paper we present a framework that allows for the automatic identification of sporting activities using commonly available smartphones. We extract discriminative informational features from smartphone accelerometers using the Discrete Wavelet Transform (DWT). Despite the poor quality of their accelerometers, smartphones were used as capture devices due to their prevalence in today’s society. Successful classification on this basis potentially makes the technology accessible to both elite and non-elite athletes. Extracted features are used to train different categories of classifiers. No one classifier family has a reportable direct advantage in activity classification problems to date; thus we examine classifiers from each of the most widely used classifier families. We investigate three classification approaches; a commonly used SVM-based approach, an optimized classification model and a fusion of classifiers. We also investigate the effect of changing several of the DWT input parameters, including mother wavelets, window lengths and DWT decomposition levels. During the course of this work we created a challenging
sports activity analysis dataset, comprised of soccer and field-hockey activities. The average maximum F-measure accuracy of 87% was achieved using a fusion of classifiers, which was 6% better than a single classifier model and 23% better than a standard SVM approach
Algorithms for Approximate Minimization of the Difference Between Submodular Functions, with Applications
We extend the work of Narasimhan and Bilmes [30] for minimizing set functions
representable as a difference between submodular functions. Similar to [30],
our new algorithms are guaranteed to monotonically reduce the objective
function at every step. We empirically and theoretically show that the
per-iteration cost of our algorithms is much less than [30], and our algorithms
can be used to efficiently minimize a difference between submodular functions
under various combinatorial constraints, a problem not previously addressed. We
provide computational bounds and a hardness result on the mul- tiplicative
inapproximability of minimizing the difference between submodular functions. We
show, however, that it is possible to give worst-case additive bounds by
providing a polynomial time computable lower-bound on the minima. Finally we
show how a number of machine learning problems can be modeled as minimizing the
difference between submodular functions. We experimentally show the validity of
our algorithms by testing them on the problem of feature selection with
submodular cost features.Comment: 17 pages, 8 figures. A shorter version of this appeared in Proc.
Uncertainty in Artificial Intelligence (UAI), Catalina Islands, 201
An ontology enhanced parallel SVM for scalable spam filter training
This is the post-print version of the final paper published in Neurocomputing. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright @ 2013 Elsevier B.V.Spam, under a variety of shapes and forms, continues to inflict increased damage. Varying approaches including Support Vector Machine (SVM) techniques have been proposed for spam filter training and classification. However, SVM training is a computationally intensive process. This paper presents a MapReduce based parallel SVM algorithm for scalable spam filter training. By distributing, processing and optimizing the subsets of the training data across multiple participating computer nodes, the parallel SVM reduces the training time significantly. Ontology semantics are employed to minimize the impact of accuracy degradation when distributing the training data among a number of SVM classifiers. Experimental results show that ontology based augmentation improves the accuracy level of the parallel SVM beyond the original sequential counterpart
- …