11,427 research outputs found
Fast Non-Parametric Learning to Accelerate Mixed-Integer Programming for Online Hybrid Model Predictive Control
Today's fast linear algebra and numerical optimization tools have pushed the
frontier of model predictive control (MPC) forward, to the efficient control of
highly nonlinear and hybrid systems. The field of hybrid MPC has demonstrated
that exact optimal control law can be computed, e.g., by mixed-integer
programming (MIP) under piecewise-affine (PWA) system models. Despite the
elegant theory, online solving hybrid MPC is still out of reach for many
applications. We aim to speed up MIP by combining geometric insights from
hybrid MPC, a simple-yet-effective learning algorithm, and MIP warm start
techniques. Following a line of work in approximate explicit MPC, the proposed
learning-control algorithm, LNMS, gains computational advantage over MIP at
little cost and is straightforward for practitioners to implement
Classification hardness for supervised learners on 20 years of intrusion detection data
This article consolidates analysis of established (NSL-KDD) and new intrusion detection datasets (ISCXIDS2012, CICIDS2017, CICIDS2018) through the use of supervised machine learning (ML) algorithms. The uniformity in analysis procedure opens up the option to compare the obtained results. It also provides a stronger foundation for the conclusions about the efficacy of supervised learners on the main classification task in network security. This research is motivated in part to address the lack of adoption of these modern datasets. Starting with a broad scope that includes classification by algorithms from different families on both established and new datasets has been done to expand the existing foundation and reveal the most opportune avenues for further inquiry. After obtaining baseline results, the classification task was increased in difficulty, by reducing the available data to learn from, both horizontally and vertically. The data reduction has been included as a stress-test to verify if the very high baseline results hold up under increasingly harsh constraints. Ultimately, this work contains the most comprehensive set of results on the topic of intrusion detection through supervised machine learning. Researchers working on algorithmic improvements can compare their results to this collection, knowing that all results reported here were gathered through a uniform framework. This work's main contributions are the outstanding classification results on the current state of the art datasets for intrusion detection and the conclusion that these methods show remarkable resilience in classification performance even when aggressively reducing the amount of data to learn from
A fine-grained approach to scene text script identification
This paper focuses on the problem of script identification in unconstrained
scenarios. Script identification is an important prerequisite to recognition,
and an indispensable condition for automatic text understanding systems
designed for multi-language environments. Although widely studied for document
images and handwritten documents, it remains an almost unexplored territory for
scene text images.
We detail a novel method for script identification in natural images that
combines convolutional features and the Naive-Bayes Nearest Neighbor
classifier. The proposed framework efficiently exploits the discriminative
power of small stroke-parts, in a fine-grained classification framework.
In addition, we propose a new public benchmark dataset for the evaluation of
joint text detection and script identification in natural scenes. Experiments
done in this new dataset demonstrate that the proposed method yields state of
the art results, while it generalizes well to different datasets and variable
number of scripts. The evidence provided shows that multi-lingual scene text
recognition in the wild is a viable proposition. Source code of the proposed
method is made available online
Simple and Effective Visual Models for Gene Expression Cancer Diagnostics
In the paper we show that diagnostic classes in cancer gene expression data sets, which most often include thousands of features (genes), may be effectively separated with simple two-dimensional plots such as scatterplot and radviz graph. The principal innovation proposed in the paper is a method called VizRank, which is able to score and identify the best among possibly millions of candidate projections for visualizations. Compared to recently much applied techniques in the field of cancer genomics that include neural networks, support vector machines and various ensemble-based approaches, VizRank is fast and finds visualization models that can be easily examined and interpreted by domain experts. Our experiments on a number of gene expression data sets show that VizRank was always able to find data visualizations with a small number of (two to seven) genes and excellent class separation. In addition to providing grounds for gene expression cancer diagnosis, VizRank and its visualizations also identify small sets of relevant genes, uncover interesting gene interactions and point to outliers and potential misclassifications in cancer data sets
- …