5 research outputs found

    Agrupamiento jerárquico para la detección de condiciones de tráfico anómalo en subestaciones de energía

    The IEC 61850 standard has contributed significantly to substation management and automation by bringing the advantages of communication networks into the operation of power substations. However, this modernization also raises new challenges in other areas. In security, for example, several academic works have shown that the same attacks used against computer networks (DoS, sniffing, tampering, and spoofing, among others) can also compromise the operation of a substation. This article evaluates the applicability of hierarchical unsupervised clustering algorithms and statistical descriptors (averages) to the identification of anomalous traffic patterns in communication networks for power substations based on the IEC 61850 standard. The results show that a hierarchical algorithm using Euclidean distance as the proximity criterion and single linkage as the grouping method correctly classifies the following operation scenarios: 1) normal traffic, 2) IED disconnection, 3) network discovery attack, 4) DoS attack, 5) IED spoofing attack, and 6) failure on the high-voltage line. In addition, the descriptors used for the classification proved robust: identical results were obtained with other unsupervised clustering techniques, such as the partitional K-means and the fuzzy LAMDA (Learning Algorithm Multivariable and Data Analysis).

    Hierarchical Clustering for Anomalous Traffic Conditions Detection in Power Substations

    The IEC 61850 standard has contributed significantly to substation management and automation by bringing the advantages of communication networks into the operation of power substations. However, this modernization also raises new challenges in other areas. In security, for example, several academic works have shown that the same attacks used against computer networks (DoS, sniffing, tampering, and spoofing, among others) can also compromise the operation of a substation. This article evaluates the applicability of hierarchical clustering algorithms and statistical descriptors (averages) to the identification of anomalous traffic patterns in communication networks for power substations based on the IEC 61850 standard. The results show that a hierarchical algorithm using Euclidean distance as the proximity criterion and single linkage as the grouping method correctly classifies the following operation scenarios: 1) normal traffic, 2) IED disconnection, 3) network discovery attack, 4) DoS attack, 5) IED spoofing attack, and 6) failure on the high-voltage line. In addition, the descriptors used for the classification proved equally effective with other unsupervised clustering techniques such as K-means (partitional clustering) or LAMDA (fuzzy clustering).
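
    The clustering configuration named in this abstract (Euclidean distance, single linkage) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function name `single_linkage`, the two descriptors (mean packets per second, mean packet size), and the sample values are all hypothetical, and real descriptors would normally be scaled before distances are computed.

```python
import math

def single_linkage(points, n_clusters):
    """Naive agglomerative clustering with Euclidean distance and single linkage.

    Starts with one cluster per point and repeatedly merges the two clusters
    whose closest members are nearest, until n_clusters remain.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance = smallest pairwise distance.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical per-window averages: (mean packets/s, mean packet size in bytes).
windows = [(100.0, 120.0), (102.0, 118.0), (98.0, 121.0),   # baseline-like
           (900.0, 64.0), (905.0, 66.0),                    # flood-like
           (10.0, 119.0), (12.0, 122.0)]                    # near-silent
groups = single_linkage(windows, 3)
print(groups)
```

    The naive search is cubic in the number of windows, which is acceptable for a sketch but not for large captures.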

    P2P traffic identification and optimization using fuzzy c-means clustering

    Accurate identification of P2P traffic is critical for efficient network management and reasonable utilization of network resources, as P2P applications have been growing dramatically. Fuzzy clustering is more flexible than hard clustering and is practical for P2P traffic identification because it treats the data in a natural way. Fuzzy c-means clustering (FCM) is an iterative optimization algorithm, normally based on the least-squares method, for partitioning data sets, and it has high computational overhead. This paper proposes modifications to the objective function and the distance function that greatly reduce the computational complexity of FCM while keeping the clustering accurate. The proposed FCM clustering technique can be incorporated into a Fuzzy Inference System (FIS) to implement real-time network traffic classification by updating the training data set continuously and efficiently.
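
    For orientation, the textbook fuzzy c-means iteration that the paper sets out to speed up can be sketched as follows. This is the standard algorithm with Euclidean distance, not the modified objective or distance function the paper proposes (the abstract does not specify them); the function name, parameters, and sample data are assumptions made for illustration.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard FCM: alternate centre and membership updates until stable.

    Returns (centers, u) where u[i][k] is the membership of point i in
    cluster k and every row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Membership update from relative distances to every centre.
        shift = 0.0
        for i in range(n):
            dists = [max(math.dist(points[i], centers[k]), 1e-12)
                     for k in range(c)]
            new_row = [1.0 / sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0))
                                 for j in range(c))
                       for k in range(c)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u

# Hypothetical two-feature flow descriptors forming two loose groups.
flows = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centers, u = fuzzy_c_means(flows, c=2)
labels = [row.index(max(row)) for row in u]
print(labels)
```

    Each iteration costs on the order of n·c² distance ratios, which is the kind of overhead the abstract refers to.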

    Análise de agrupamento pelos métodos hierárquico aglomerativo e particional fuzzy utilizados para educational data mining em dados de educação a distância

    Undergraduate thesis (Trabalho de Conclusão de Curso) submitted for the degree of Bachelor in Computer Science at the Universidade do Extremo Sul Catarinense, UNESC. The growth of technology increases the amount of data held in repositories, making analysis by traditional methods infeasible and giving rise to data mining, applied through knowledge discovery. Education generates data about students, especially distance education, where the data come from a virtual learning environment, which has made it an area of interest for educational researchers. This gave rise to educational data mining, which uses data mining methods. Among the mining techniques and tasks is clustering, which is divided into agglomerative hierarchical clustering and partitional clustering. This research compares the AGNES algorithm for agglomerative hierarchical clustering with the fuzzy c-means algorithm for partitional clustering, with the aim of identifying which method performs better on educational data. The data come from the distance-taught course Introdução a Engenharia de Segurança do Trabalho (Introduction to Occupational Safety Engineering) at the Universidade do Extremo Sul Catarinense. R was used, as free software, to implement the algorithms and validation methods. At the start of mining, the distance for the similarity matrix must be defined: the Manhattan and Euclidean distances were applied in AGNES, and the Manhattan, Euclidean, correlation, and seuclidean distances in fuzzy c-means. AGNES also requires a linkage method to produce results, and tests were run with Ward, average-distance, maximum-distance (complete), and minimum-distance (single) linkage. The results of the algorithms were verified with quality measures, applying validation indices.
The final model chosen for fuzzy c-means uses the seuclidean similarity matrix, and the final model for AGNES uses the Manhattan similarity matrix with average-distance linkage. Comparing the results of the silhouette index, the partitional clustering was chosen as the final clustering model for the educational data.
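
    The silhouette index used for the final comparison can be sketched in a few lines. The thesis worked in R; the Python below is only an illustration with Euclidean distance, and the function name, sample points, and the two candidate partitions are hypothetical rather than taken from the thesis data.

```python
import math

def silhouette(points, labels):
    """Mean silhouette width of a hard partition, using Euclidean distance.

    For each point, a is the mean distance to the other members of its own
    cluster and b is the smallest mean distance to any other cluster; the
    point's score is (b - a) / max(a, b). Singletons score 0 by convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster contributes 0
        a = sum(math.dist(points[i], points[j])
                for j in own if j != i) / (len(own) - 1)
        b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical student descriptors and two candidate partitions to compare.
students = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
partition_a = [0, 0, 0, 1, 1, 1]
partition_b = [0, 1, 0, 1, 0, 1]
score_a = silhouette(students, partition_a)
score_b = silhouette(students, partition_b)
print(score_a, score_b)
```

    The partition with the higher mean silhouette is the better separated one. Fuzzy memberships have to be hardened first, for example by taking each point's highest membership.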

    Profiling and Identification of Web Applications in Computer Network

    Characterising network traffic is a critical step for detecting network intrusion or misuse. The traditional way to identify the application associated with a set of traffic flows uses port numbers and DPI (Deep Packet Inspection), but it is affected by the use of dynamic ports and encryption. The research community has proposed models for traffic classification that determined the most important requirements and recommendations for a successful approach. The suggested alternatives can be categorised into four techniques: port-based, packet-payload-based, host-behavioural, and statistical. The traditional way of identifying traffic flows typically relies on IANA-assigned port numbers and deep packet inspection (DPI). However, the increasing number of Internet applications that use dynamic port assignments and encrypted traffic renders these techniques ineffective for real-time traffic identification. In recent years, two other techniques have been introduced, focusing on host behaviour and statistical methods, to avoid these limitations. The former is based on the idea that hosts generate different communication patterns at the transport layer; by extracting these behavioural patterns, activities and applications can be classified. However, it cannot correctly identify application names, classifying both Yahoo and Gmail as email. Studies have therefore focused on statistical features for identifying the traffic associated with applications using machine learning algorithms. This method relies on characteristics of IP flows, minimising the overhead limitations associated with other schemes. The classification accuracy of statistical flow-based approaches, however, depends on the discriminative ability of the traffic features used. NetFlow is the de facto standard for monitoring and analysing network traffic, but the information it provides is not enough to describe application behaviour.
The primary challenge is to describe fully the activity within and among network flows in order to understand application usage and user behaviour. This thesis proposes novel features that precisely describe the behaviour of a web application in order to segregate various user activities. Extracting the most discriminative features that characterise web applications is key to gaining higher accuracy without being biased by either users or network circumstances. This work investigates novel and superior features that characterise the behaviour of an application based on the arrival timing of packets and flows. As part of describing application behaviour, the research considered on/off data transfer, defining characteristics for many typical applications, and the amount of data transferred or exchanged. Furthermore, the research considered the timing and patterns of user events as part of a network application session. Using an extended set of traffic features extracted from traffic captures, a supervised machine learning classifier was developed. To this end, the present work customised the popular tcptrace utility to generate classification features based on traffic burstiness and periods of inactivity for everyday Internet usage. A C5.0 decision tree classifier was applied using the proposed features to eleven different Internet applications, with traffic generated by ten users. Overall, the newly proposed features achieved a high level of accuracy (~98%) in classifying the respective applications. Afterwards, uncontrolled data collected in a real environment from a group of 20 users accessing different applications was used to evaluate the proposed features. The evaluation tests indicated that the method has an accuracy of 87% in identifying the correct network application. Iraqi cultural Attach
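
    The idea of features based on burstiness and periods of inactivity can be sketched as follows. This is not the customised tcptrace output described in the thesis: the function name `on_off_features`, the 1-second idle threshold, the feature names, and the sample timestamps are all assumptions chosen only to illustrate the on/off view of a flow.

```python
def on_off_features(timestamps, idle_threshold=1.0):
    """Split packet arrival times (seconds) into bursts separated by idle gaps.

    A gap longer than idle_threshold ends the current burst ("on" period) and
    is recorded as an idle ("off") period. Returns simple summary features.
    """
    if not timestamps:
        return {"bursts": 0, "mean_burst_s": 0.0,
                "mean_idle_s": 0.0, "mean_pkts_per_burst": 0.0}
    ts = sorted(timestamps)
    bursts = [[ts[0], ts[0], 1]]  # each burst: [start, end, packet count]
    idles = []
    for prev, cur in zip(ts, ts[1:]):
        gap = cur - prev
        if gap > idle_threshold:
            idles.append(gap)
            bursts.append([cur, cur, 1])
        else:
            bursts[-1][1] = cur
            bursts[-1][2] += 1
    n = len(bursts)
    return {
        "bursts": n,
        "mean_burst_s": sum(end - start for start, end, _ in bursts) / n,
        "mean_idle_s": sum(idles) / len(idles) if idles else 0.0,
        "mean_pkts_per_burst": sum(count for _, _, count in bursts) / n,
    }

# Hypothetical arrival times for one flow: two short bursts and a late packet.
arrivals = [0.0, 0.1, 0.2, 5.0, 5.1, 12.0]
features = on_off_features(arrivals)
print(features)
```

    A feature dictionary like this, computed per flow or per session, is the kind of input a supervised classifier such as a decision tree could be trained on.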