101 research outputs found

Improving the Generation of Labeled Network Traffic Datasets Through Machine Learning Techniques

Author: Catania Carlos
Guerra Jorge
Publication date: 01/10/2017

Detecting malicious behavior in network traffic has become an extremely difficult challenge for the security community. Consequently, several intelligence-based tools have been proposed to generate models capable of understanding the information traveling through the network and to help identify suspicious connections as early as possible. However, the lack of high-quality datasets has been one of the main obstacles to the development of reliable intelligence-based tools. A well-labeled dataset is fundamental not only for automatically learning models but also for testing their performance. Recently, RiskID emerged with the goal of providing the network security community with a collaborative tool to help with the labeling process. Through visual and statistical techniques, RiskID helps the user generate labeled datasets from real connections. In this article, we present a machine learning extension for RiskID to help the user in the malware identification process. A preliminary study shows that, as the size of the labeled data increases, machine learning models can be a valuable tool during the labeling of future traffic connections. VI Workshop de Seguridad Informática (WSI). Red de Universidades con Carreras en Informática (RedUNCI).

Servicio de Difusión de la Creación Intelectual

Towards efficient intrusion detection systems based on machine learning techniques

Author: Catania Carlos
García Garino Carlos
Vallés Mariano
Publication date: 01/10/2010

Intrusion Detection Systems (IDSs) have been key in the network manager's daily fight against continuous attacks. However, with the growth of the Internet, network security issues have become more difficult to handle. At the same time, Machine Learning (ML) techniques for traffic classification have been successful in terms of classification performance. Unfortunately, most of these techniques are extremely CPU-intensive, making the whole approach unsuitable for real traffic situations. This work describes a simple software architecture for an ML-based IDS, together with the first steps towards improving algorithm efficiency in some of the proposed modules. A set of experiments on the 199 DARPA dataset is conducted to evaluate two attribute selection algorithms, considering not only classification performance but also the required CPU time. Preliminary results show that computational effort can be reduced by 50% while maintaining similar accuracy levels, progressing towards a real-world implementation of an ML-based IDS. Presented at the V Workshop Arquitectura, Redes y Sistemas Operativos (WARSO). Red de Universidades con Carreras en Informática (RedUNCI).

An autonomous labeling approach to support vector machines algorithms for network traffic anomaly detection

Author: Bromberg Facundo
Catania Carlos Adrian
Garcia Garino Carlos Gabriel
Publication venue: Pergamon-Elsevier Science Ltd
Publication date: 01/02/2012

In recent years, several support vector machine (SVM) novelty detection approaches have been applied in the network intrusion detection field. The main advantage of these approaches is that they can characterize normal traffic even when trained with datasets containing not only normal traffic but also a number of attacks. Unfortunately, these algorithms seem to be accurate only when normal traffic vastly outnumbers the attacks present in the dataset, a condition that does not always hold. This work presents an approach for autonomous labeling of normal traffic as a way of dealing with situations where the class distribution does not present the imbalance required by SVM algorithms. In this case, the autonomous labeling process is performed by SNORT, a misuse-based intrusion detection system. Experiments conducted on the 1998 DARPA dataset show that the proposed autonomous labeling approach not only outperforms existing SVM alternatives but also, under some attack distributions, obtains improvements over SNORT itself. Affiliations: Catania, Carlos Adrian: Universidad Nacional de Cuyo, Argentina. Bromberg, Facundo: Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - Mendoza, Argentina; Universidad Tecnológica Nacional, Facultad Regional Mendoza, Departamento de Sistemas de Información, Laboratorio DHARMA, Argentina. Garcia Garino, Carlos Gabriel: Universidad Nacional de Cuyo, Facultad de Ingeniería, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - Mendoza, Argentina.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Behavior Classification of A Grazing Goat in the Argentine Monte Desert by Using Inertial Sensors

Author: Catania Carlos
González Rodrigo
Páez Lama Sebastián
Publication date: 01/09/2019

The knowledge generated by animal behavior studies has been gaining importance because it can be used to improve the efficiency of animal production systems. In recent years, sensor-based approaches for animal behavior classification have emerged as a promising alternative for analyzing animals' grazing patterns. This article proposes a classification system based on inertial sensors for identifying a goat's grazing behavior in the Argentine Monte Desert. The data acquisition system is based on commercial off-the-shelf devices and is used to create a reliable dataset for performing the animal behavior predictions. By fixing the system to the head of a goat, it was possible to log its movements while it was grazing in a natural pasture. A preliminary version of the dataset is evaluated using a classical statistical learning algorithm. Results show that goat activities can be predicted with an average precision above 85% and a recall of 84%. Sociedad Argentina de Informática e Investigación Operativa.

Servicio de Difusión de la Creación Intelectual

LLM in the Shell: Generative Honeypots

Author: Catania Carlos
Garcia Sebastian
Sladić Muris
Valeros Veronica
Publication date: 31/08/2023

Honeypots are essential tools in cybersecurity. However, most of them (even the high-interaction ones) lack the required realism to engage and fool human attackers. This limitation makes them easily discernible, hindering their effectiveness. This work introduces a novel method to create dynamic and realistic software honeypots based on Large Language Models. Preliminary results indicate that LLMs can create credible and dynamic honeypots capable of addressing important limitations of previous honeypots, such as deterministic responses, lack of adaptability, etc. We evaluated the realism of each command by conducting an experiment with human attackers who needed to say if the answer from the honeypot was fake or not. Our proposed honeypot, called shelLM, reached an accuracy rate of 0.92. Comment: 5 pages, 1 figure, 1 table.

arXiv.org e-Print Archive

An application of Deep Neural Networks for automatic detection of randomly generated Domain Names

Author: Catania Carlos

In the context of data network security, a domain generation algorithm (DGA) is used by malicious software (malware) to dynamically generate a large number of pseudo-random domain names and then select a small subset of these domains for the Command and Control (C&C) communication channel. The idea behind the dynamic nature of DGAs was to avoid the inclusion of hard-coded domain names inside malware binaries, complicating the extraction of this information by reverse engineering. The C&C channel can be used to instruct the botnet to take different malicious actions such as spam, click campaigns, DDoS, etc. The present project proposes the development of an algorithm for DGA detection based on machine learning algorithms in general and Deep Neural Networks in particular. In the last 10 years, deep learning techniques have been the cause behind the major advances in the automatic recognition of images, audio, video and text. We expect that the ability of deep neural networks to recognize the patterns common to DGAs will facilitate the development of a detection tool that operates not only with a low false positive rate but also in real time. Both requirements are fundamental for dealing with today's security threats.

Repositorio OAI Biblioteca Digital Universidad Nacional de Cuyo

Predicting Harbertson-Adams Assay Phenolic Parameters In Red Wines Using Visible Spectra

Author: Catania Aníbal
Catania Carlos
Fanzone Martín
Sari Santiago
Publication date: 19/03/2021

The Harbertson-Adams phenolic parameter assay is a well-known method to measure a panel of phenolic compounds in red wines. However, the multistep analyses required by the method cannot produce results on multiple parameters rapidly. In the present article, we analyze the benefits of applying a statistical model based on Principal Component Analysis (PCA) and a statistical learning technique known as Support Vector Regression (SVR) for correlating sample spectra data to the Harbertson-Adams assay, on each of the phenolic components. The resulting model showed a high correlation between the measured and predicted values for each of the phenolic parameters despite the multicollinearity and high dimensionality of the dataset. Sociedad Argentina de Informática e Investigación Operativa.

Servicio de Difusión de la Creación Intelectual

LibreSense: software para análisis sensorial de alimentos (LibreSense: software for the sensory analysis of food)

Author: Catania Aníbal
Catania Carlos
Fanzone Martín
Sari Santiago
Publication date: 06/02/2019

The time spent on data collection and statistical processing is a serious limitation for the sensory analysis of food. Although several commercial programs perform this work automatically, they are costly because of their high annual license fees. Organizations that cannot afford that cost usually work with paper forms that they decode manually, with the consequent expense of time and resources. LibreSense is an application developed in the R language using the Shiny and SensoMineR packages. It allows sensory data capture and "in situ" statistical analysis of both the results of the different treatments and the performance of the panelists. The application captures data through any device with a wireless connection and then performs the statistical processing. To evaluate the different samples, panelists connect through their devices to a server running Shiny Server. Although LibreSense is still at a preliminary stage of development, it is currently an indispensable tool for the wine sensory panel of INTA EEAA Mendoza, and several organizations in the field have shown interest in using and acquiring it. Sociedad Argentina de Informática e Investigación Operativa.

Servicio de Difusión de la Creación Intelectual

Improving the Generation of Labeled Network Traffic Datasets Through Machine Learning Techniques

Author: Catania Carlos
Guerra Jorge
Publication date: 05/12/2017

Servicio de Difusión de la Creación Intelectual