9 research outputs found

    A new method for anomaly detection based on non-convex boundaries with random two-dimensional projections

    [Abstract] The implementation of anomaly detection systems is a key problem that has attracted considerable effort from the scientific community. In this context, using one-class techniques to model a training set of non-anomalous objects can play a significant role. One common approach to the one-class problem is to determine the geometric boundaries of the target set. More specifically, the convex hull combined with random projections offers good results but performs poorly when applied to non-convex sets. This work therefore proposes a new method that addresses this issue by implementing non-convex boundaries over each projection. The proposal was assessed and compared with the most common one-class techniques over different sets, obtaining successful results.
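
As a rough illustration of the baseline this abstract refers to (a convex hull fitted on each of several random two-dimensional projections), the sketch below flags a point as anomalous when it falls outside the hull in at least one projection. It is not the authors' non-convex method. The function names, the Gaussian projection matrices, the number of projections, and the tolerance `eps` are all assumptions made for illustration.

```python
import random


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, q, eps=1e-9):
    """True if q lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    for i in range(n):
        a, b = hull[i], hull[(i + 1) % n]
        # A negative cross product means q is to the right of edge a->b (outside).
        if (b[0] - a[0]) * (q[1] - a[1]) - (b[1] - a[1]) * (q[0] - a[0]) < -eps:
            return False
    return True


def project(x, matrix):
    """Map a d-dimensional point to 2-D with a 2 x d projection matrix."""
    return tuple(sum(w * v for w, v in zip(row, x)) for row in matrix)


def fit(train, n_projections=20, seed=0):
    """Fit one 2-D convex hull per random projection of the target (normal) set."""
    rng = random.Random(seed)
    d = len(train[0])
    model = []
    for _ in range(n_projections):
        matrix = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(2)]
        hull = convex_hull([project(x, matrix) for x in train])
        model.append((matrix, hull))
    return model


def is_anomaly(model, x):
    """Flag x if it falls outside the hull in at least one projection."""
    return any(not inside_hull(hull, project(x, m)) for m, hull in model)
```

Calling `fit` on a list of equal-length numeric vectors returns the list of (projection, hull) pairs, and `is_anomaly(model, x)` then tests a new vector. Because each hull is convex, a target set with holes or concave regions would be over-covered in every projection, which is the limitation the abstract describes.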

    Intelligent One-Class Classifiers for the Development of an Intrusion Detection System: The MQTT Case Study

    [EN] The ever-increasing number of smart devices connected to the internet poses an unprecedented security challenge. This article presents the implementation of an Intrusion Detection System (IDS) based on the deployment of different one-class classifiers to prevent attacks over the Internet of Things (IoT) protocol Message Queuing Telemetry Transport (MQTT). The utilization of real data sets has allowed us to train the one-class algorithms, showing a remarkable performance in detecting attacks. Instituto Nacional de Ciberseguridad. Instituto de Ciencias Aplicadas a la Cibersegurida

    Intelligent one-class classifiers for the development of an intrusion detection system: the MQTT case study

    [Abstract] The ever-increasing number of smart devices connected to the internet poses an unprecedented security challenge. This article presents the implementation of an Intrusion Detection System (IDS) based on the deployment of different one-class classifiers to prevent attacks over the Internet of Things (IoT) protocol Message Queuing Telemetry Transport (MQTT). The utilization of real data sets has allowed us to train the one-class algorithms, showing a remarkable performance in detecting attacks.

    Classifier-based constraint acquisition

    Modeling a combinatorial problem is a hard and error-prone task requiring significant expertise. Constraint acquisition methods attempt to automate this process by learning constraints from examples of solutions and (usually) non-solutions. Active methods query an oracle while passive methods do not. We propose a known but not widely used application of machine learning to constraint acquisition: training a classifier to discriminate between solutions and non-solutions, then deriving a constraint model from the trained classifier. We discuss a wide range of possible new acquisition methods with useful properties inherited from classifiers. We also show the potential of this approach using a Naive Bayes classifier, obtaining a new passive acquisition algorithm that is considerably faster than existing methods, scalable to large constraint sets, and robust under errors.
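
One simple way to picture the idea in this abstract is sketched below. It is not the paper's algorithm, only a toy reading of "derive constraints from a Naive Bayes style classifier". Each candidate constraint is treated as a binary feature (violated or not), its Laplace-smoothed violation rate is estimated separately on solutions and non-solutions, and a candidate is kept when a violation is much more likely among non-solutions. That likelihood ratio is the per-feature evidence a Bernoulli Naive Bayes classifier would use. The candidate set (`candidate_bias`), the smoothing, and the threshold `min_ratio` are hypothetical choices.

```python
from itertools import combinations, product


def candidate_bias(n_vars):
    """Hypothetical bias: binary candidates x_i != x_j and x_i < x_j."""
    bias = []
    for i, j in combinations(range(n_vars), 2):
        bias.append((f"x{i} != x{j}", lambda a, i=i, j=j: a[i] != a[j]))
        bias.append((f"x{i} < x{j}", lambda a, i=i, j=j: a[i] < a[j]))
    return bias


def violation_rate(holds, examples):
    """Laplace-smoothed estimate of P(constraint violated | class)."""
    violated = sum(1 for a in examples if not holds(a))
    return (violated + 1) / (len(examples) + 2)


def acquire(solutions, non_solutions, bias, min_ratio=2.0):
    """Keep candidates whose violation is strong evidence of a non-solution."""
    learned = []
    for name, holds in bias:
        ratio = violation_rate(holds, non_solutions) / violation_rate(holds, solutions)
        if ratio >= min_ratio:
            learned.append(name)
    return learned


# Toy target: three variables over {0, 1, 2} that must all differ.
assignments = list(product(range(3), repeat=3))
solutions = [a for a in assignments if len(set(a)) == 3]
non_solutions = [a for a in assignments if len(set(a)) < 3]
learned = acquire(solutions, non_solutions, candidate_bias(3))
```

On this toy problem the three disequalities are kept and the ordering candidates are rejected, because half of the solutions violate each `x_i < x_j`. The smoothing means a few mislabeled examples shift the ratio without zeroing it, which loosely mirrors the robustness to errors mentioned in the abstract.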

    Anomaly detection based on intelligent techniques over a bicomponent production plant used on wind generator blades manufacturing

    [EN] Technological advances, especially in the industrial field, have led to the development and optimization of the activities that take place in it. To achieve this goal, early detection of any kind of anomaly is very important, as it can contribute to energy and economic savings and to a reduced environmental impact. In a context where reducing the emission of polluting gases is promoted, alternative energies, especially wind energy, play a key role. Wind generator blades are usually manufactured from bicomponent material, obtained by mixing two different primary components. The present research assesses different one-class intelligent techniques to perform anomaly detection on a bicomponent mixing system used in wind generator blade manufacturing. To perform the anomaly detection, the intelligent models were obtained from a real dataset recorded during the correct operation of a bicomponent mixing plant.
    The classifiers for each technique were validated using art…
    Jove, E.; Casteleiro-Roca, J.; Quintián, H.; Méndez-Pérez, JA.; Calvo-Rolle, JL. (2020). Detección de anomalías basada en técnicas inteligentes de una planta de obtención de material bicomponente empleado en la fabricación de palas de aerogenerador. Revista Iberoamericana de Automática e Informática industrial. 17(1):84-93. https://doi.org/10.4995/riai.2019.11055

    Computational learning algorithms for large-scale datasets

    Programa Oficial de Doutoramento en Computación. 5009V01
    [Abstract] Nowadays we are engulfed in a flood of data. This fact has fundamentally changed the ways that information is shared, and has made it clear that efficient methods for processing and storing vast amounts of data should be put forward. Computational learning theory is the area of artificial intelligence devoted to studying algorithms that aim at learning from data, building accurate models based on observations. In this context, where data has grown faster than the speed of processors, the capabilities of traditional machine learning algorithms are limited by computational time rather than by sample size. Besides, when dealing with large quantities of data, the performance of learning algorithms can degrade due to over-fitting, and their efficiency declines in accordance with size. Therefore, the scalability of learning algorithms has turned from a desirable property into a crucial one when very large datasets are envisioned. There exist, basically, three intersecting approaches to ensure algorithm scalability as datasets continue to grow in size and complexity: online learning, non-iterative learning and distributed learning. This thesis develops new efficient and scalable machine learning methods following the three previous approaches. Specifically, four new algorithms are developed: (1) The first one performs online feature selection and classification at the same time, by the adaptation of a classical filter method and the modification of an online learning algorithm for one-layer neural networks. (2) The next one is a new fast and efficient one-class classifier based on a non-iterative cost function for autoassociative neural networks that performs dimensionality reduction in the hidden layer by means of Singular Value Decomposition. (3) The third method is a new one-class convex hull-based classifier for distributed environments that reduces the dimensionality of the problem, and hence the complexity, by means of Random Projections. (4) Finally, an online version of the previous one-class classification algorithm is presented.