Stochastic Analysis of Self-Sustainability in Peer-Assisted VoD Systems
We consider a peer-assisted Video-on-demand system, in which video distribution is supported both by peers caching the whole video and by peers concurrently downloading it. We propose a stochastic fluid framework that allows us to characterize the additional bandwidth requested from the servers to satisfy all users watching a given video. We obtain analytical upper bounds on the server bandwidth needed when users download the video content sequentially. We also present a methodology to obtain exact solutions for special cases of the peer upload bandwidth distribution. Our bounds tightly characterize the performance of peer-assisted VoD systems as the number of users increases, for both sequential and nonsequential delivery schemes. In particular, we rigorously prove that the simple sequential scheme is asymptotically optimal in both the bandwidth surplus and the bandwidth deficit modes, and that peer-assisted systems become totally self-sustaining in the surplus mode as the number of users grows large.
Peer-to-peer television for the IP multimedia subsystem
Peer-to-peer (P2P) video streaming has generated a significant amount of interest in both the research community and the industry, which find it a cost-effective solution to the user scalability problem. However, despite the success of Internet-based applications, adoption has been limited for commercial services such as Internet Protocol Television (IPTV). With the advent of next-generation networks (NGN) based on the IP Multimedia Subsystem (IMS), which advocates an open and interoperable architecture, P2P emerges as a possible alternative in situations where the traditional mechanisms are not possible or economically feasible. This work proposes a P2P IPTV architecture for an IMS-based NGN, called P2PTV, which allows one or more service providers to use a common P2P infrastructure for streaming TV channels to their subscribers. Instead of using servers, we rely on the uploading capabilities of the user equipment, such as set-top boxes, located at the customers' premises. We comply with the existing IMS and IPTV standards from the 3rd Generation Partnership Project (3GPP) and the Telecommunications and Internet converged Services and Protocols for Advanced Networking (TISPAN) bodies, where a centralized P2PTV application server (AS) manages customer access to the service and peer participation. Because watching TV is a complex and demanding user activity, we face two significant challenges. The first is to accommodate the mandatory IMS signaling, which reserves the necessary QoS resources in the network during every channel change, establishing a multimedia session between communicating peers. The second is the streaming interruptions, or churn, that occur when the uploading peer turns off or changes its current TV channel. To tackle these problems, we propose two enhancements. The first is a fast signaling method, which uses inactive uploading sessions with reserved but unused QoS to improve the tuning delay for new channel users.
At every moment, the AS uses a feedback-based algorithm to compute the number of sessions needed to accommodate the demand while preventing the over-reservation of resources. We treat mobility situations with special care: there, a proactive transfer of the multimedia session context using the IEEE 802.21 standard offers the best alternative to current methods. The second enhancement addresses peer churn during channel changes. With every TV channel divided into a number of streams, we enable peers to download and upload streams different from their current channel, increasing the stability of their participation. Unlike similar work, we benefit from our estimation of the user demand and propose a decentralized method for a balanced assignment of peer bandwidth. We evaluate the performance of P2PTV through modeling and large-scale computer simulations. A simpler experimental setting, with pure P2P streaming, indicates improvements in delay and peer churn. In more complex scenarios, especially with resource-poor peers having a limited upload capacity, we envision P2P as a complementary solution to traditional approaches like IP multicast. Reserving P2P for unpopular TV channels exploits the peer capacity and avoids the need for a large number of sparsely used multicast trees. Future work may refine the AS algorithms, address different experimental scenarios, and extend the lessons learned to non-IMS networks.
Video-on-Demand over Internet: a survey of existing systems and solutions
Video-on-Demand is a service where movies are delivered to distributed users with low delay and free interactivity. The traditional client/server architecture experiences scalability issues when providing video streaming services, so many systems have been proposed to solve this issue, mostly based on a peer-to-peer or a hybrid server/peer-to-peer solution. This work presents a survey of the currently existing or proposed systems and solutions, based upon a subset of representative systems, and defines selection criteria that allow these systems to be classified. These criteria are based on common questions such as: is it video-on-demand or live streaming; is the architecture based on a content delivery network, peer-to-peer, or both; is the delivery overlay tree-based or mesh-based; is the system push-based or pull-based, single-stream or multi-stream; does it use data coding; and how do the clients choose their peers. Representative systems are briefly described to give a summarized overview of the proposed solutions, and four of them are analyzed in detail. Finally, an attempt is made to evaluate the most promising solutions for future experiments.
Robust peer-to-peer systems
Peer-to-peer (p2p) approaches are an increasingly effective way to deploy services. Popular examples include BitTorrent, Skype, and KaZaA. These approaches are attractive because they can be highly fault-tolerant, scalable, adaptive, and less expensive than a more centralized solution. Cooperation lies at the heart of these strengths. Yet, in settings where working together is crucial, a natural question is: "What if users stop cooperating?" After all, cooperative services are typically deployed over multiple administrative domains, and are thus vulnerable to Byzantine failures and to users who may act selfishly. This dissertation explores how to construct p2p systems to tolerate Byzantine participants while also incentivizing selfish participants to contribute resources. We describe how to balance obedience against choice in building a robust p2p live streaming system. Imposing obedience is desirable as it leaves little room for peers to attack or cheat the system. However, providing choice is also attractive as it allows us to engineer flexible and efficient solutions. We first focus on obedience by using Nash equilibria to drive the design of BAR Gossip, the first gossip protocol that is resilient to Byzantine and selfish nodes. BAR Gossip relies on verifiable pseudo-random partner selection to eliminate non-determinism, which can be used to game the system, while maintaining the robustness and rapid convergence of traditional gossip. A novel fair enough exchange primitive entices cooperation among selfish peers on short timescales, thereby avoiding the need for distributed reputation schemes. We next focus on tempering obedience with choice by using approximate equilibria to guide the construction of a novel p2p live streaming system. These equilibria allow us to design incentives to limit selfish behavior rigorously, yet provide sufficient flexibility to build practical systems.
We show the advantages of using an ε-Nash equilibrium, instead of an exact Nash equilibrium, to design and implement FlightPath, our live streaming system that uses bandwidth efficiently, absorbs flash crowds, adapts to sudden peer departures, handles churn, and tolerates malicious activity.
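For readers unfamiliar with the term, an ε-Nash equilibrium relaxes the exact Nash condition: no player can gain more than ε by deviating unilaterally. The following is a minimal sketch of that textbook definition for a two-player game with pure strategies. The payoff matrices and the function name are illustrative assumptions and are not taken from BAR Gossip or FlightPath.

```python
def is_epsilon_nash(row_payoff, col_payoff, i, j, epsilon):
    """Check whether the pure profile (i, j) is an epsilon-Nash equilibrium.

    row_payoff[a][b] / col_payoff[a][b]: payoffs when the row player picks a
    and the column player picks b (hypothetical example game, not from the source).
    """
    # Best payoff the row player could reach by deviating while column stays at j.
    best_row = max(row_payoff[a][j] for a in range(len(row_payoff)))
    # Best payoff the column player could reach by deviating while row stays at i.
    best_col = max(col_payoff[i][b] for b in range(len(col_payoff[i])))
    row_gain = best_row - row_payoff[i][j]
    col_gain = best_col - col_payoff[i][j]
    # Epsilon-Nash: neither unilateral deviation gains more than epsilon.
    return row_gain <= epsilon and col_gain <= epsilon


# Prisoner's-dilemma-style payoffs (strategy 0 = cooperate, 1 = defect).
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]
print(is_epsilon_nash(A, B, 1, 1, 0.0))  # mutual defection: exact Nash
print(is_epsilon_nash(A, B, 0, 0, 2.0))  # mutual cooperation: tolerated if epsilon >= 2
```

With ε = 0 the check reduces to the exact Nash condition; a larger ε admits profiles in which a small gain from deviating is tolerated, which is the flexibility the abstract refers to.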
Performance Analysis of Network Coding based P2P Live Video Streaming Systems
Peer-to-peer (P2P) video streaming is a scalable and cost-effective technology to stream video content to a large population of users and has attracted a lot of research for over a decade now. Recently, network coding has been introduced to improve the efficiency of these systems and to simplify the protocol design. There are already some successful commercial applications that utilize network coding. However, previous analytical studies of network-coding based P2P streaming systems mainly focused on fundamental properties of the system and ignored the influence of the protocol details. In this study, a unique stochastic model is developed to reveal how segments of the video stream evolve over their lifetime in the buffer before they go into playback. Different strategies for segment selection have been studied with the model and their performance has been compared. A new approximation of the probability of linear independence of coded blocks has been proposed to study the redundancy of network coding. Finally, extensive numerical results and simulations have been provided to validate our model. From these results, in-depth insights into how system parameters and segment selection strategies affect the performance of the system have been obtained.
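As background for the linear-independence question, the sketch below computes the standard exact probability that k coded blocks with coefficients drawn uniformly at random from a finite field GF(q) are linearly independent, i.e. that a random k×k matrix over GF(q) has full rank: the product of (1 − q^(−i)) for i = 1..k. This is the well-known closed form, not the new approximation proposed in the study; the function name is a hypothetical choice.

```python
def prob_full_rank(k, q):
    """Probability that a uniformly random k x k matrix over GF(q) is invertible.

    Equivalent to k random coded blocks (over k source blocks) being
    linearly independent. Standard result: prod_{i=1..k} (1 - q**-i).
    """
    prob = 1.0
    for i in range(1, k + 1):
        prob *= 1.0 - q ** (-i)
    return prob


# Larger fields make redundant (linearly dependent) blocks much less likely.
for field_size in (2, 16, 256):
    print(field_size, prob_full_rank(32, field_size))
```

The dependence on the field size illustrates why the redundancy of network coding is sensitive to protocol parameters: with GF(2) a noticeable fraction of received blocks is useless, while with GF(256) almost every block is innovative.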
Quality-centered design for Peer-to-Peer live video broadcasting (Diseño centrado en calidad para la difusión Peer-to-Peer de video en vivo)
Peer-to-Peer (P2P) networks are a scalable way to offer video services over the Internet. This document focuses on the definition, development, and evaluation of a P2P architecture for distributing live video. The overall design of the network is guided by the quality of experience (QoE), whose main component in this case is the video quality perceived by end users, rather than by the traditional quality of service (QoS) based design of most systems. To measure perceived video quality automatically and in real time, we extended the recently proposed Pseudo-Subjective Quality Assessment (PSQA) methodology. Two main lines of research are developed. First, we propose a technique for distributing video from multiple sources that can be optimized to maximize perceived quality in contexts with many failures and that has very low signaling overhead (unlike existing systems). We develop a PSQA-based methodology that gives us fine-grained control over how the video signal is split into parts and how much redundancy is added, as a function of the dynamics of the network's users. In this way the robustness of the system can be improved as much as desired, subject to the communication capacity limit. Second, we present a structured mechanism to control the network topology. The selection of which users will serve which others is important for the robustness of the network, especially when users are heterogeneous in their capacities and connection times. Our design maximizes the expected global quality (evaluated using PSQA) by selecting a topology that improves the robustness of the system. We also study how to extend the network with two complementary services: Video on Demand (VoD) and the MyTV service.
The challenge in these services is how to perform efficient searches over the video library, given the highly dynamic nature of the content. We present a caching strategy for searches in these services that maximizes the total number of correct answers to queries, considering particular content dynamics and bandwidth constraints. Our overall design considers real scenarios, where the test cases and configuration parameters come from real data of a reference service in production. Our prototype is fully functional, free to use, and based on well-tested open-source technologies.
Recent Trends in Communication Networks
In recent years there have been many developments in communication technology. These have greatly enhanced the computing power of small, handheld, resource-constrained mobile devices. Different generations of communication technology have evolved. This has led to new research on communicating large volumes of data over different transmission media and on the design of different communication protocols. Another direction of research concerns secure and error-free communication between the sender and receiver despite the risk of an eavesdropper being present. To meet the communication requirements of huge amounts of multimedia streaming data, a lot of research has been carried out on the design of suitable overlay networks. The book addresses new research techniques that have evolved to handle these challenges.
Enhanced Multimedia Exchanges over the Internet
Although the Internet was not originally designed for exchanging multimedia streams, consumers heavily depend on it for audiovisual data delivery. The intermittent nature of multimedia traffic, the unguaranteed underlying communication infrastructure, and dynamic user behavior collectively result in the degradation of Quality-of-Service (QoS) and Quality-of-Experience (QoE) perceived by end-users. Consequently, the volume of signalling messages is inevitably increased to compensate for the degradation of the desired service qualities. Improved multimedia services could leverage adaptive streaming as well as blockchain-based solutions to enhance media-rich experiences over the Internet at the cost of increased signalling volume. Many recent studies in the literature provide signalling reduction and blockchain-based methods for authenticated media access over the Internet while utilizing resources quasi-efficiently. To further increase the efficiency of multimedia communications, novel signalling overhead and content access latency reduction solutions are investigated in this dissertation: (1) the first two research topics utilize steganography to reduce signalling bandwidth utilization while increasing the capacity of the multimedia network; and (2) the third research topic utilizes multimedia content access request management schemes to guarantee throughput values for servicing users, end-devices, and the network. Signalling of multimedia streaming is generated at every layer of the communication protocol stack: at the highest layer, segment requests are generated, and at the lower layers, byte tracking messages are exchanged. By leveraging steganography, essential signalling information is encoded within multimedia payloads to reduce the amount of resources consumed by non-payload data. The first steganographic solution hides signalling messages within multimedia payloads, thereby freeing intermediate node buffers from queuing non-payload packets.
Consequently, source nodes are capable of delivering control information to receiving nodes at no additional network overhead. A utility function is designed to minimize the volume of overhead exchanged while minimizing visual artifacts. The proposed scheme is therefore designed to leverage the fidelity of the multimedia stream to remove the largest amount of control overhead with the lowest negative visual impact. The second steganographic solution enables protocol translation by embedding packet header information within payload data so that lightweight headers can be used instead. The protocol translator leverages a proposed utility function to enable the maximum number of translations while maintaining QoS and QoE requirements in terms of packet throughput and playback bit-rate. As the number of multimedia users and sources increases, decentralized content access and management over a blockchain-based system is inevitable. Blockchain technologies suffer from large processing latencies, which reduce the throughput of a multimedia network. Reducing blockchain-based access latencies is therefore essential to maintaining a decentralized, scalable model with seamless functionality and efficient utilization of resources. Adapting blockchains to feeless applications will then port the utility of ledger-based networks to audiovisual applications in a faultless manner. The proposed transaction processing scheme will enable ledger maintainers to sustain the throughputs necessary for delivering the expected QoS and QoE values for decentralized audiovisual platforms. A block slicing algorithm is designed to ensure that the ledger maintenance strategy benefits the operations of the blockchain-based multimedia network. Using the proposed algorithm, the throughput and latency of operations within the multimedia network are then maintained at a desired level.
Timely Classification of Encrypted or Protocol-Obfuscated Internet Traffic Using Statistical Methods
Internet traffic classification aims to identify the type of application or protocol that generated
a particular packet or stream of packets on the network. Through traffic classification,
Internet Service Providers (ISPs), governments, and network administrators can
access basic functions and several solutions, including network management, advanced
network monitoring, network auditing, and anomaly detection. Traffic classification is
essential as it ensures the Quality of Service (QoS) of the network, as well as allowing
efficient resource planning.
With the increase of encrypted or obfuscated protocol traffic on the Internet and multilayer
data encapsulation, some classical classification methods have lost the interest of the
scientific community. The limitations of traditional classification methods based on port
numbers and payload inspection when classifying encrypted or obfuscated Internet traffic have
led to significant research efforts focused on Machine Learning (ML) based classification
approaches using statistical features from the transport layer. In an attempt to increase
classification performance, Machine Learning strategies have gained interest from the scientific
community and have shown promise for the future of traffic classification, especially
for recognizing encrypted traffic.
However, the ML approach also has its own limitations, as some of these methods have a
high computational resource consumption, which limits their application when classifying
large traffic volumes or real-time flows. The limitations of ML have led to the investigation
of alternative approaches, including feature-based procedures and statistical methods. In
this sense, statistical analysis methods, such as distances and divergences, have been used
to classify traffic in large flows and in real time.
The main objective of statistical distance is to differentiate flows and find a pattern in
traffic characteristics through statistical properties, which enable classification. Divergences
are functional expressions often related to information theory, which measure the
degree of discrepancy between any two distributions.
This thesis focuses on proposing a new methodological approach to classify encrypted
or obfuscated Internet traffic based on statistical methods that enable the evaluation of
network traffic classification performance, including the use of computational resources
in terms of CPU and memory. A set of traffic classifiers based on Kullback-Leibler and
Jensen-Shannon divergences, and Euclidean, Hellinger, Bhattacharyya, and Wootters distances
was proposed. The following are the four main contributions to the advancement
of scientific knowledge reported in this thesis.
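To make the named measures concrete, here is a minimal sketch using their standard textbook definitions on two discrete probability distributions (for example, normalized packet-size histograms). The function names and the `eps` guard against division by zero are assumptions made for illustration; the thesis' own implementation is not shown in the abstract.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    # D_KL(p || q) = sum_i p_i * log(p_i / q_i); eps guards against q_i == 0.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # Jensen-Shannon: symmetrized KL against the mixture m = (p + q) / 2.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def bhattacharyya_coefficient(p, q):
    # BC = sum_i sqrt(p_i * q_i); 1 for identical, 0 for disjoint distributions.
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def euclidean_distance(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def hellinger_distance(p, q):
    # H = sqrt(1 - BC); max() absorbs tiny negative rounding errors.
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))

def bhattacharyya_distance(p, q, eps=1e-12):
    # D_B = -ln(BC); eps avoids log(0) for disjoint distributions.
    return -math.log(max(bhattacharyya_coefficient(p, q), eps))

def wootters_distance(p, q):
    # Wootters statistical distance = arccos(BC); min() guards against BC > 1.
    return math.acos(min(1.0, bhattacharyya_coefficient(p, q)))


p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
for fn in (kl_divergence, js_divergence, euclidean_distance,
           hellinger_distance, bhattacharyya_distance, wootters_distance):
    print(fn.__name__, fn(p, q))
```

All six are cheap, closed-form computations over histogram bins, which is consistent with the abstract's emphasis on low CPU and memory cost.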
First, an extensive literature review on the classification of encrypted and obfuscated
Internet traffic was conducted. The results suggest that port-based and payload-based
methods are becoming obsolete due to the increasing use of traffic encryption and multilayer
data encapsulation. ML-based methods are also becoming limited due to their computational
complexity. As an alternative, Support Vector Machine (SVM), which is also
an ML method, and the Kolmogorov-Smirnov and Chi-squared tests can be used as references
for statistical classification. In parallel, the possibility of using statistical methods
for Internet traffic classification has emerged in the literature, with the potential for good
classification results without the need for large computational resources. The candidate
statistical methods are Euclidean Distance, Hellinger Distance, Bhattacharyya Distance,
Wootters Distance, as well as Kullback-Leibler (KL) and Jensen-Shannon divergences.
Second, we present a proposal and implementation of a classifier based on SVM for P2P
multimedia traffic, comparing the results with the Kolmogorov-Smirnov (KS) and Chi-square
tests. The results suggest that SVM classification with a Linear kernel leads to better
classification performance than the KS and Chi-square tests, depending on the value assigned to
the Self C parameter. The SVM method with a Linear kernel and suitable values for the Self
C parameter may be a good choice to identify encrypted P2P multimedia traffic on the
Internet.
Third, we present a proposal and implementation of two classifiers based on KL Divergence
and Euclidean Distance, which are compared to SVM with a Linear kernel configured
with the standard Self C parameter. The SVM showed a reduced ability to classify flows based
solely on packet sizes compared to the KL and Euclidean Distance methods. The KL and Euclidean
methods were able to classify all tested applications, particularly streaming and P2P,
which in almost all cases they identified with high accuracy and with reduced
consumption of computational resources. Based on the obtained results, it can be
concluded that the KL and Euclidean Distance methods are an alternative to SVM, as these
statistical approaches can operate in real time and do not require retraining every time a
new type of traffic emerges.
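One way such a packet-size classifier could work is sketched below: each known application is represented by a reference packet-size histogram, and a flow is assigned to the application whose reference has the smallest KL divergence from the flow's own histogram. The bin edges, the additive smoothing, the application labels, and the sample packet sizes are all hypothetical; the abstract does not describe the thesis' actual features or decision rule.

```python
import math
from bisect import bisect_right

# Hypothetical packet-size bin edges in bytes (an assumption, not from the thesis).
BIN_EDGES = [64, 128, 256, 512, 1024, 1500]

def packet_size_histogram(sizes, edges=BIN_EDGES, smoothing=1.0):
    """Normalized histogram of packet sizes with additive smoothing.

    Smoothing keeps every bin strictly positive, so KL divergence is finite.
    """
    counts = [smoothing] * (len(edges) + 1)
    for size in sizes:
        counts[bisect_right(edges, size)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    # D_KL(p || q); all entries are positive thanks to the smoothing above.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def classify(flow_sizes, references):
    """Return the label whose reference histogram is closest in KL divergence."""
    p = packet_size_histogram(flow_sizes)
    return min(references, key=lambda label: kl_divergence(p, references[label]))


# Made-up reference traffic: large packets for streaming, small ones for chat.
references = {
    "streaming": packet_size_histogram([1400] * 40 + [1200] * 10),
    "chat": packet_size_histogram([70] * 30 + [100] * 20),
}
print(classify([1350, 1420, 1380, 1300], references))
```

In a scheme of this shape, supporting a new traffic type only means adding one more reference histogram to the dictionary, which illustrates why no retraining step is needed.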
Fourth, we present a proposal and implementation of a set of classifiers for encrypted
Internet traffic, based on Jensen-Shannon Divergence and Hellinger, Bhattacharyya, and
Wootters Distances, with their respective results compared to those obtained with methods
based on Euclidean Distance, KL, KS, and Chi-square. Additionally, we present a comparative
qualitative analysis of the tested methods based on Kappa values and Receiver
Operating Characteristic (ROC) curves. The results suggest average accuracy values above
90% for all statistical methods, classified as "almost perfect reliability" in terms of Kappa
values, with the exception of KS. This result indicates that these methods are viable options
to classify encrypted Internet traffic, especially Hellinger Distance, which showed
the best Kappa values compared to other classifiers. We conclude that the considered
statistical methods can be accurate and cost-effective in terms of computational resource
consumption to classify network traffic.
Our approach was based on the classification of Internet network traffic, focusing on statistical
distances and divergences. We have shown that it is possible to classify and obtain
good results with statistical methods, balancing classification performance and the
use of computational resources in terms of CPU and memory. The validation of the proposal
supports the argument of this thesis, which proposes the implementation of statistical
methods as a viable alternative to Internet traffic classification compared to methods
based on port numbers, payload inspection, and ML.A classificação de tráfego Internet visa identificar o tipo de aplicação ou protocolo que
gerou um determinado pacote ou fluxo de pacotes na rede. Através da classificação de
tráfego, Fornecedores de Serviços de Internet (ISP), governos e administradores de rede
podem ter acesso às funções básicas e várias soluções, incluindo gestão da rede, monitoramento
avançado de rede, auditoria de rede e deteção de anomalias. Classificar o tráfego é
essencial, pois assegura a Qualidade de Serviço (QoS) da rede, além de permitir planear
com eficiência o uso de recursos.
Com o aumento de tráfego cifrado ou protocolo ofuscado na Internet e do encapsulamento
de dados multicamadas, alguns métodos clássicos da classificação perderam interesse de
investigação da comunidade científica. As limitações dos métodos tradicionais da classificação
com base no número da porta e na inspeção de carga útil payload para classificar
o tráfego de Internet cifrado ou ofuscado levaram a esforços significativos de investigação
com foco em abordagens da classificação baseadas em técnicas de Aprendizagem
Automática (ML) usando recursos estatísticos da camada de transporte. Na tentativa
de aumentar o desempenho da classificação, as estratégias de Aprendizagem Automática
ganharam o interesse da comunidade científica e se mostraram promissoras no futuro da
classificação de tráfego, principalmente no reconhecimento de tráfego cifrado.
No entanto, a abordagem em ML também têm as suas próprias limitações,
pois alguns
desses métodos possuem um elevado consumo de recursos computacionais, o que limita
a sua aplicação para classificação de grandes fluxos de tráfego ou em tempo real. As limitações
no âmbito da aplicação de ML levaram à investigação de abordagens alternativas,
incluindo procedimentos baseados em características e métodos estatísticos. Neste sentido,
os métodos de análise estatística, tais como distâncias e divergências, têm sido utilizados
para classificar tráfego em grandes fluxos e em tempo real.
A distância estatística possui como objetivo principal diferenciar os fluxos e permite encontrar
um padrão nas características de tráfego através de propriedades estatísticas, que
possibilitam a classificação. As divergências são expressões funcionais frequentemente
relacionadas com a teoria da informação, que mede o grau de discrepância entre duas
distribuições quaisquer.
Esta tese focase
na proposta de uma nova abordagem metodológica para classificação de
tráfego cifrado ou ofuscado da Internet com base em métodos estatísticos que possibilite
avaliar o desempenho da classificação de tráfego de rede, incluindo a utilização de recursos
computacionais, em termos de CPU e memória. Foi proposto um conjunto de classificadores
de tráfego baseados nas Divergências de KullbackLeibler
e JensenShannon
e Distâncias Euclidiana, Hellinger, Bhattacharyya e Wootters. A seguir resumemse
os tese.
First, we carried out a broad literature review on the classification of encrypted and
obfuscated Internet traffic. The results suggest that port-based and payload-based methods
are becoming obsolete owing to the growing use of traffic encryption and multilayer data
encapsulation. ML-based methods are also becoming limited because of their computational
complexity. As an alternative, the Support Vector Machine (SVM), which is also an ML
method, and the Kolmogorov-Smirnov and Chi-squared tests can be used as a baseline for
comparison with statistical classification. In parallel, the literature has raised the
possibility of using statistical methods for Internet traffic classification, with the
potential for good classification results without requiring large computational
resources. The candidate statistical methods are the Euclidean, Hellinger, Bhattacharyya
and Wootters distances, together with the Kullback-Leibler (KL) and Jensen-Shannon
divergences.
Second, we present the proposal and implementation of a classifier based on the Support
Vector Machine (SVM) for P2P (Peer-to-Peer) multimedia traffic, comparing its results
with the Kolmogorov-Smirnov (KS) and Chi-squared tests. The results suggest that SVM
classification with a Linear kernel leads to better classification performance than the
KS and Chi-squared tests, depending on the value assigned to the Self C parameter. The
SVM method with a Linear kernel and suitable values for the Self C parameter may be a
good choice for identifying encrypted P2P multimedia traffic on the Internet.
Third, we present the proposal and implementation of two classifiers based on the
Kullback-Leibler (KL) divergence and the Euclidean distance, which are compared with the
SVM with a Linear kernel. Configured with the default Self C parameter, the SVM shows a
reduced ability to classify flows based only on packet sizes compared with the KL and
Euclidean distance methods. The KL and Euclidean methods were able to classify all the
applications tested, notably streaming and P2P, which in almost all cases were identified
efficiently and with high precision, with low consumption of computational resources.
Based on the results obtained, it can be concluded that the KL and Euclidean distance
methods are an alternative to the SVM, because these statistical approaches can operate
in real time and do not need retraining every time a new type of traffic emerges.
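To make the idea of classifying flows from packet sizes alone more concrete, here is a
minimal Python sketch of a nearest-signature classifier: it builds a packet-size
histogram for an observed flow and assigns the label whose reference histogram has the
smallest KL divergence from it. This is a sketch under stated assumptions, not the
procedure used in the thesis: the bin width, the smoothing constant, the function names
and the synthetic packet sizes are all hypothetical.

```python
import math
from collections import Counter

BIN_WIDTH = 100   # bytes per histogram bin (illustrative choice)
MAX_SIZE = 1500   # larger packets are clamped into the last bin (illustrative bound)

def packet_size_distribution(sizes, smoothing=1e-6):
    # Normalized histogram of packet sizes; smoothing keeps every bin positive
    # so that the KL divergence below is always finite.
    n_bins = MAX_SIZE // BIN_WIDTH + 1
    counts = Counter(min(s, MAX_SIZE) // BIN_WIDTH for s in sizes)
    raw = [counts.get(i, 0) + smoothing for i in range(n_bins)]
    total = sum(raw)
    return [c / total for c in raw]

def kl_divergence(p, q):
    # KL(p || q) in nats; both inputs are strictly positive after smoothing.
    return sum(a * math.log(a / b) for a, b in zip(p, q))

def classify(sizes, signatures):
    # Return the label whose reference distribution is closest to the observed flow.
    observed = packet_size_distribution(sizes)
    return min(signatures, key=lambda label: kl_divergence(observed, signatures[label]))

# Synthetic reference flows, invented purely for illustration.
signatures = {
    "streaming": packet_size_distribution([1400, 1450, 1500, 1380, 1420] * 20),
    "p2p": packet_size_distribution([60, 80, 1500, 120, 90, 70] * 20),
}

print(classify([1410, 1490, 1395, 1500], signatures))
print(classify([64, 75, 110, 1500, 88], signatures))
```

Adding a new traffic class in this sketch only requires adding one more reference
histogram to signatures, which is in the spirit of the remark above that the statistical
approaches do not need retraining when a new type of traffic appears.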
Fourth, we present the proposal and implementation of a set of classifiers for encrypted
Internet traffic, based on the Jensen-Shannon divergence and the Hellinger, Bhattacharyya
and Wootters distances, whose results are compared with those obtained with the methods
based on the Euclidean distance, KL, KS and Chi-squared. In addition, we present a
comparative qualitative analysis of the tested methods based on Kappa values and Receiver
Operating Characteristic (ROC) curves. The results suggest average precision values above
90% for all the statistical methods, rated as “almost perfect reliability” in terms of
Kappa values, with the exception of KS. This result indicates that these methods are
viable options for classifying encrypted Internet traffic, in particular the Hellinger
distance, which achieved the best Kappa values among the classifiers. We conclude that
the statistical methods considered can be accurate and economical in terms of
computational resource consumption for classifying network traffic.
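For readers unfamiliar with the Kappa statistic used in this comparison, the sketch below
computes Cohen's kappa from a confusion matrix. It assumes the standard definition,
kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement
expected by chance; the function name and the example matrix are hypothetical and are not
results from the thesis. The “almost perfect” label quoted above appears to correspond to
the commonly used Landis and Koch scale (kappa above 0.80), which is an assumption here.

```python
def cohen_kappa(confusion):
    # confusion[i][j] = number of flows of true class i predicted as class j.
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of flows on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / total ** 2
    # Undefined (division by zero) when expected == 1, i.e. only one class occurs.
    return (observed - expected) / (1 - expected)

# Hypothetical two-class example: 90 of 100 flows classified correctly.
example = [[45, 5],
           [5, 45]]
print(cohen_kappa(example))
```

In this example the observed agreement is 0.9 and the chance agreement is 0.5, so kappa
is about 0.8, noticeably lower than the raw 90% figure because it discounts agreement
that would be expected by chance.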
Our approach was based on the classification of Internet network traffic, focusing on
statistical distances and divergences. We showed that it is possible to classify traffic
and obtain good results with statistical methods, balancing classification performance
and the use of computational resources in terms of CPU and memory. The validation of the
proposal supports the argument of this thesis, which proposes the implementation of
statistical methods as a viable alternative for Internet traffic classification compared
with methods based on port number, payload inspection and ML.
Thesis prepared at Instituto de Telecomunicações, Delegação da Covilhã, and at the
Department of Computer Science of the University of Beira Interior, and submitted to the
University of Beira Interior for discussion in public session to obtain the Ph.D. Degree in
Computer Science and Engineering.
This work has been funded by Portuguese FCT/MCTES through national funds and, when
applicable, co-funded by EU funds under the project UIDB/50008/2020, and by operation
Centro-01-0145-FEDER-000019 - C4 - Centro de Competências em Cloud Computing,
co-funded by the European Regional Development Fund (ERDF/FEDER) through
the Programa Operacional Regional do Centro (Centro 2020). This work has also been
funded by CAPES (Brazilian Federal Agency for Support and Evaluation of Graduate Education)
within the Ministry of Education of Brazil under a scholarship supported by the
International Cooperation Program CAPES/COFECUB Project 9090134/2013 at the
University of Beira Interior
- …