The Importance of Generalizability to Anomaly Detection
In security-related areas there is concern over novel “zero-day” attacks that penetrate system defenses and wreak havoc. The best methods for countering these threats are recognizing “nonself”, as in an Artificial Immune System, or recognizing “self” through clustering. In either case, the concern remains that something that appears similar to self could be missed. Given this situation, one could incorrectly assume that preferring a tighter fit to self over generalizability is important for false positive reduction in this type of learning problem. This article confirms that in anomaly detection, as in other forms of classification, a tight fit, although important, does not supersede model generality. This is shown using three systems, each with a different geometric bias in the decision space. The first two use spherical and ellipsoidal clusters with a k-means algorithm modified to work on the one-class/blind classification problem. The third is based on wrapping the self points with a multidimensional convex hull (polytope) algorithm capable of learning disjunctive concepts via a thresholding constant. All three algorithms are tested using the Voting dataset from the UCI Machine Learning Repository, the MIT Lincoln Labs intrusion detection dataset, and the lossy-compressed steganalysis domain.
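To make the spherical-cluster idea concrete, the following is a minimal sketch and not the authors' modified k-means: it runs plain k-means on self-only training data and then wraps each cluster in a hypersphere. The names `fit_self_model`, `is_anomaly`, and the `slack` parameter are hypothetical, introduced only for illustration; `slack` stands in for the trade-off between a tight fit (1.0) and a more general model (larger values).

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def fit_self_model(self_points, k, slack=1.1, iters=100, seed=0):
    """Cluster self-only data with plain k-means, then give each cluster a radius.

    self_points is a list of equal-length tuples. Returns a list of
    (centroid, radius) hyperspheres. `slack` is a hypothetical knob:
    1.0 is the tightest fit, larger values generalize more.
    """
    rng = random.Random(seed)
    centroids = rng.sample(self_points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in self_points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its members; keep it if it has none.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    # Final assignment against the final centroids, then one radius per cluster.
    groups = [[] for _ in range(k)]
    for p in self_points:
        groups[nearest(p, centroids)].append(p)
    return [
        (c, slack * max(math.dist(p, c) for p in g))
        for c, g in zip(centroids, groups)
        if g
    ]


def is_anomaly(model, point):
    """A point is nonself if it lies outside every cluster hypersphere."""
    return all(math.dist(point, c) > r for c, r in model)
```

With `slack` at or above 1.0, every training point is accepted as self by construction; raising it widens the spheres, which accepts more unseen self points at the cost of also accepting more nearby nonself points.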
Soft computing and non-parametric techniques for effective video surveillance systems
This thesis proposes several interconnected objectives for the design of a video surveillance system intended to operate under a wide range of conditions. First, it proposes an evaluation metric for the detector and tracking system that relies on minimal reference data (ground truth). This technique responds to the demand for fast and easy adjustment of the system to different environments. Second, it proposes an optimization technique based on Evolution Strategies and the combination of fitness functions in several steps. The objective is to obtain the tuning parameters of the detector and tracking system that give the best performance across a wide range of possible situations. Finally, it proposes building a classifier based on non-parametric techniques that can model the distribution of the input data regardless of the source that generated it. The activities chosen are detectable in the short term and follow a temporal pattern that can easily be modeled by Hidden Markov Models (HMMs). The proposal is a modification of the Baum-Welch algorithm that models the emission probabilities of the HMM with a non-parametric technique based on kernel density estimation (KDE).
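As an illustration of what a kernel-based emission probability can look like, here is a minimal sketch under stated assumptions; it is not the thesis's actual modification of Baum-Welch. It assumes one-dimensional observations, a Gaussian kernel, and that each training observation is weighted by the state-occupancy probability an E-step would supply. The function names and the `responsibilities` and `bandwidth` parameters are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde_emission(x, observations, responsibilities, bandwidth):
    """Weighted KDE estimate of the emission density of one HMM state at x.

    observations     -- training observations (floats)
    responsibilities -- assumed per-observation state-occupancy weights,
                        one per observation, as an E-step would produce
    bandwidth        -- kernel width h > 0 (assumed fixed, not learned here)
    """
    total = sum(responsibilities)
    if total <= 0 or bandwidth <= 0:
        raise ValueError("need positive total weight and positive bandwidth")
    weighted = sum(
        w * gaussian_kernel((x - o) / bandwidth)
        for o, w in zip(observations, responsibilities)
    )
    return weighted / (total * bandwidth)
```

Because the weights are normalized by their sum, scaling all responsibilities by a constant leaves the density unchanged, and observations with zero weight for a state do not influence that state's emission density.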
Generalization and Generalizability Measures
In this paper, we define the generalization problem, summarize various approaches to generalization, identify the credit assignment problem, and present the problem of measuring generalizability along with some solutions. We discuss anomalies in the ordering of hypotheses in a subdomain when performance is normalized and averaged, and show conditions under which anomalies can be eliminated. To generalize performance across subdomains, we present a measure called probability of win, which measures the probability that one hypothesis is better than another. Finally, we discuss some limitations in using probabilities of win and illustrate their application in finding new parameter values for TimberWolf, a package for VLSI cell placement and routing. 1 Introduction Generalization in psychology is the tendency to respond in the same way to different but similar stimuli [6]. Such transfer of tendency may be based on temporal stimuli, spatial cues, or other physical characteristics. Learning, on the..
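The abstract does not give the formula for probability of win, so the following is only a rough sketch of the general idea under explicit assumptions: performance values are paired per test case, higher is better, and the mean paired difference is treated as approximately normal. The function name `probability_of_win` and this particular estimator are assumptions for illustration and may differ from the paper's definition.

```python
import math
import statistics


def probability_of_win(perf_a, perf_b):
    """Estimate the probability that hypothesis A is truly better than B.

    perf_a, perf_b -- paired performance values on the same test cases
                      (assumed: higher is better). Uses a normal
                      approximation to the mean paired difference.
    """
    diffs = [a - b for a, b in zip(perf_a, perf_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired test cases")
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        # No spread in the differences: the comparison is decided or tied.
        return 1.0 if mean > 0 else 0.0 if mean < 0 else 0.5
    z = mean / (sd / math.sqrt(n))
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A value near 0.5 means the test cases do not separate the two hypotheses, and values near 0 or 1 indicate a consistent loser or winner. Because the result lies in [0, 1] regardless of the performance scale, it can be compared across subdomains without normalizing raw performance.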