End-to-end informed VM selection in compute clouds
The selection of resources, particularly VMs, in current public IaaS clouds is usually done blindly, as cloud users do not have much information about resource consumption by co-tenant third-party tasks. In particular, communication patterns can play a significant part in cloud application performance and responsiveness, especially for the novel latency-sensitive applications increasingly common in today's clouds. Thus, herein we propose an end-to-end approach to the VM allocation problem using policies based solely on round-trip time measurements between VMs. These measurements feed a user-level "Recommender Service" that receives VM allocation requests with certain network-related demands and matches them to a suitable subset of the VMs available to the user within the cloud. We propose and implement end-to-end algorithms for VM selection that cover desirable profiles of communication between VMs in distributed applications in a cloud setting, such as profiles with prevailing pair-wise, hub-and-spokes, or clustered communication patterns between constituent VMs. We quantify the expected benefits of deploying our Recommender Service by comparing our informed VM allocation approaches to conventional, random allocation methods, based on real measurements of latencies between Amazon EC2 instances. We also show that our approach is completely independent of cloud architecture details, adaptable to different types of applications and workloads, and lightweight and transparent to cloud providers. This work is supported in part by the National Science Foundation under grant CNS-0963974.
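The clustered profile mentioned above can be made concrete with a small sketch: given pairwise RTT measurements, pick the k VMs whose worst pairwise round-trip time is smallest. The exhaustive search below is purely illustrative and is not the paper's algorithm; the function name and data layout are assumptions.

```python
import itertools

def select_clustered_vms(rtt, k):
    """Pick k VMs whose worst pairwise round-trip time is smallest.

    rtt: dict mapping frozenset({a, b}) -> RTT in ms (symmetric).
    Exhaustive search, workable only for the small candidate pools a
    user typically holds; shown to illustrate the objective, not the
    paper's own selection algorithms.
    """
    vms = sorted({v for pair in rtt for v in pair})
    best, best_cost = None, float("inf")
    for subset in itertools.combinations(vms, k):
        cost = max(rtt[frozenset(p)]
                   for p in itertools.combinations(subset, 2))
        if cost < best_cost:
            best, best_cost = subset, cost
    return list(best), best_cost
```

A hub-and-spokes variant would instead minimize the maximum RTT from one designated hub VM to each spoke.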
Multi-capacity bin packing with dependent items and its application to the packing of brokered workloads in virtualized environments
Providing resource allocation with performance predictability guarantees is increasingly important in cloud platforms, especially for data-intensive applications, in which performance depends greatly on the available rates of data transfer between the various computing/storage hosts underlying the virtualized resources assigned to the application. Existing resource allocation solutions assume either that applications manage the data transfer between their virtualized resources, or that cloud providers manage their internal networking resources. With the increased prevalence of brokerage services in cloud platforms, there is a need for resource allocation solutions that provide predictability guarantees in settings in which neither application scheduling nor cloud provider resources can be managed or controlled by the broker. This paper addresses this problem: we define the Network-Constrained Packing (NCP) problem of finding the optimal mapping of brokered resources to applications with guaranteed performance predictability. We prove that NCP is NP-hard, and we define two special instances of the problem for which exact solutions can be found efficiently. We develop a greedy heuristic to solve the general instance of the NCP problem, and we evaluate its efficiency using simulations on various application workloads and network models. This work was done while the author was at Boston University. It was partially supported by NSF CISE awards #1430145, #1414119, #1239021, and #1012798.
Network-constrained packing of brokered workloads in virtualized environments
Providing resource allocation with performance predictability guarantees is increasingly important in cloud platforms, especially for data-intensive applications, in which performance depends greatly on the available rates of data transfer between the various computing/storage hosts underlying the virtualized resources assigned to the application. Existing resource allocation solutions assume either that applications manage the data transfer between their virtualized resources, or that cloud providers manage their internal networking resources. With the increased prevalence of brokerage services in cloud platforms, there is a need for resource allocation solutions that provide predictability guarantees in settings in which neither application scheduling nor cloud provider resources can be managed or controlled by the broker. This paper addresses this problem: we define the Network-Constrained Packing (NCP) problem of finding the optimal mapping of brokered resources to applications with guaranteed performance predictability. We prove that NCP is NP-hard, and we define two special instances of the problem for which exact solutions can be found efficiently. We develop a greedy heuristic to solve the general instance of the NCP problem, and we evaluate its efficiency using simulations on various application workloads and network models. This work is supported by NSF CISE CNS Awards #1347522, #1239021, and #1012798.
Pathfinder: Application-Aware Distributed Path Computation in Clouds
Path computation in a network depends on the network's processes and resource usage pattern. While distributed traffic control methods improve the scalability of a system, their topology and link-state conditions may lead to sub-optimal path computation. Herein, we present Pathfinder, an application-aware distributed path computation model. The proposed framework can improve path computation functions through software-defined network controls. In the paper, we first analyse the key issues in distributed path computation functions and then present Pathfinder's system architecture, followed by its design principles and orchestration environment. Furthermore, we evaluate our system's performance by comparing it with the FreeFlow and Prune-Dijk techniques. Our results demonstrate that Pathfinder outperforms these two techniques and delivers significant improvement in the system's resource utilisation behaviour.
ClouDiA: a deployment advisor for public clouds
An increasing number of distributed data-driven applications are moving into shared public clouds. By sharing resources and operating at scale, public clouds promise higher utilization and lower costs than private clusters. To achieve high utilization, however, cloud providers inevitably allocate virtual machine instances non-contiguously, i.e., instances of a given application may end up in physically distant machines in the cloud. This allocation strategy can lead to large differences in average latency between instances. For a large class of applications, this difference can result in significant performance degradation unless care is taken in how application components are mapped to instances. In this paper, we propose ClouDiA, a general deployment advisor that selects application node deployments minimizing either (i) the largest latency between application nodes, or (ii) the longest critical path among all application nodes. ClouDiA employs mixed-integer programming and constraint programming techniques to efficiently search the space of possible mappings of application nodes to instances. Through experiments with synthetic and real applications in Amazon EC2, we show that our techniques yield a 15% to 55% reduction in time-to-solution or service response time, without any need for modifying application code.
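To make objective (i) concrete, a brute-force version of the "largest latency" deployment search is sketched below. The paper uses mixed-integer and constraint programming to search this space at scale; the exhaustive form and all names here are illustrative assumptions only.

```python
import itertools

def best_deployment(app_edges, latency, instances, nodes):
    """Map each application node to a distinct instance so that the
    worst latency over communicating node pairs is minimized.

    app_edges: list of (u, v) node pairs that communicate.
    latency:   nested dict latency[i][j] -> measured latency between
               instances i and j.
    Brute force over all injective mappings; only feasible for tiny
    inputs, shown to make the objective concrete.
    """
    best, best_cost = None, float("inf")
    for perm in itertools.permutations(instances, len(nodes)):
        assign = dict(zip(nodes, perm))
        cost = max(latency[assign[u]][assign[v]] for u, v in app_edges)
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost
```

Objective (ii), the longest critical path, would replace the `max` over single edges with a sum of latencies along each application path.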
Improving the Performance of Cloud-based Scientific Services
Cloud computing provides access to a large scale set of readily available computing resources at the click of a button. The cloud paradigm has commoditised computing capacity and is often touted as a low-cost model for executing and scaling applications. However, there are significant technical challenges associated with selecting, acquiring, configuring, and managing cloud resources which can restrict the efficient utilisation of cloud capabilities.
Scientific computing is increasingly hosted on cloud infrastructure, in which scientific capabilities are delivered to the broad scientific community via Internet-accessible services. This migration from on-premise to on-demand cloud infrastructure is motivated by the sporadic usage patterns of scientific workloads and the associated potential cost savings, without the need to purchase, operate, and manage compute infrastructure, a task that few scientific users are trained to perform. However, cloud platforms are not an automatic solution. Their flexibility is derived from an enormous number of services and configuration options, which in turn result in significant complexity for the user. In fact, naïve cloud usage can result in poor performance and excessive costs, which are then directly passed on to researchers.
This thesis presents methods for developing efficient cloud-based scientific services. Three real-world scientific services are analysed and a set of common requirements is derived. To address these requirements, this thesis explores automated and scalable methods for inferring network performance, considers various trade-offs (e.g., cost and performance) when provisioning instances, and profiles application performance, all in heterogeneous and dynamic cloud environments. Specifically, network tomography provides the mechanisms to infer network performance in dynamic and opaque cloud networks; cost-aware automated provisioning approaches enable services to consider, in real time, various trade-offs such as cost, performance, and reliability; and automated application profiling allows a huge search space of applications, instance types, and configurations to be analysed to determine resource requirements and application performance. Finally, these contributions are integrated into an extensible and modular cloud provisioning and resource management service called SCRIMP. Cloud-based scientific applications and services can subscribe to SCRIMP to outsource their provisioning, usage, and management of cloud infrastructures. Collectively, the approaches presented in this thesis are shown to provide order-of-magnitude cost savings and significant performance improvements when employed by production scientific services.
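One of the trade-offs mentioned, cost versus performance when provisioning instances, can be sketched minimally. This is not SCRIMP's actual logic; the candidate data and function name are hypothetical.

```python
def choose_instance(candidates, min_perf):
    """Cost-aware selection sketch: among (name, hourly_price,
    measured_perf) candidates, pick the cheapest instance type whose
    profiled performance meets the floor, or None if none qualifies.

    A real provisioner would also weigh reliability (e.g., spot
    revocation risk) and refresh the measurements continuously.
    """
    ok = [c for c in candidates if c[2] >= min_perf]
    return min(ok, key=lambda c: c[1]) if ok else None
```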
Detection of Power Line Supporting Towers via Interpretable Semantic Segmentation of 3D Point Clouds
The inspection and maintenance of energy transmission networks are demanding and crucial tasks for any transmission system operator. They rely on a combination of on-the-ground staff and costly low-flying helicopters to visually inspect the power grid structure. Recently, LiDAR-based inspections have shown the potential to accelerate inspection and increase its precision. These high-resolution sensors allow one to scan an environment and store it in a 3D point cloud format for further processing and analysis by maintenance specialists to prevent fires and damage to the electrical system. However, this task is especially demanding to complete on time given the extensive area that the transmission network covers. Nonetheless, the transition to point cloud data allows us to take advantage of Deep Learning to automate these inspections by detecting collisions between the grid and the surrounding scene.
Deep Learning is a recent and powerful tool that has been successfully applied to a myriad of real-life problems, such as image recognition and speech generation. With the introduction of affordable LiDAR sensors, the application of Deep Learning to 3D data emerged, with numerous methods being proposed every day to address difficult problems, from 3D object detection to 3D point cloud segmentation. Alas, state-of-the-art methods are remarkably complex, composed of millions of trainable parameters, and take several weeks, if not months, to train on specific hardware, which makes it difficult for traditional companies, like utilities, to employ them.
Therefore, we explore a novel mathematical framework that allows us to define tailored operators that incorporate prior knowledge about our problem. These operators are then integrated into a learning agent, called SCENE-Net, that detects power line supporting towers in 3D point clouds. SCENE-Net allows for the interpretability of its results, which is not possible in conventional models, and shows efficient training and inference times of 85 min and 20 ms on a regular laptop. Our model is composed of 11 trainable geometrical parameters, like the height of a cylinder, and achieves a Precision gain of 24% against a comparable CNN with 2190 parameters.
[Abstract in Portuguese, translated:] The inspection and maintenance of energy transmission networks are crucial tasks for grid operators. Recently, inspections using LiDAR sensors have been adopted to accelerate this process and increase its precision. These high-precision sensors can scan environments and store them in a 3D point cloud format, to be later analysed by specialists who seek to prevent forest fires and damage to the electrical structure. However, this task is very difficult to complete in useful time because the transmission network is extremely vast. We can therefore take advantage of the transition to LiDAR data and use deep learning to automate grid inspections.
Deep learning is a recent and rapidly developing field, applied to many everyday problems and often surpassing human performance, as in image recognition and speech generation, among others. With the development of affordable LiDAR sensors, the use of deep learning on 3D data grew quickly, with new methodologies appearing every day that address complex problems such as 3D object detection. However, state-of-the-art models are incredibly complex, composed of millions of parameters, and take several weeks, if not months, to train on powerful GPUs, which hinders their use in traditional companies, such as EDP.
Therefore, we explore a new mathematical theory that allows us to define specific operators that incorporate knowledge about our problem. These operators are integrated into a deep learning model, called SCENE-Net, which detects transmission line supporting towers in point clouds. SCENE-Net allows the interpretation of its results, which is not possible with conventional models, and demonstrates efficient training of 85 minutes and an inference time of 20 milliseconds on a conventional computer. Our model contains only 11 geometric parameters, such as the height of a cylinder, and shows a Precision gain of 24% compared with a CNN with 2190 parameters.
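The abstract's tailored operators score point clouds against parametric shapes whose few parameters (e.g., a cylinder's height) are trainable. A minimal, hypothetical membership test for such a cylinder, shown only to illustrate the geometry involved, could be:

```python
import math

def inside_cylinder(p, base, height, radius):
    """Test whether 3D point p lies inside a vertical cylinder with
    its base centre at `base`, the given height along z, and radius.

    In a SCENE-Net-style operator, height and radius would be
    trainable parameters that shape a tower-like prior; this plain
    membership test only illustrates the underlying geometry.
    """
    dx, dy = p[0] - base[0], p[1] - base[1]
    in_radius = math.hypot(dx, dy) <= radius
    in_height = base[2] <= p[2] <= base[2] + height
    return in_radius and in_height
```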
Revealing and Characterizing MPLS Networks
The Internet is a wide network of computers in constant evolution. Each year, more and more organizations are connected to this worldwide network. Each of them has its own structure and administration that are not publicly revealed for economical, political, and security reasons. Consequently, our perception of the Internet structure, and more specifically its topology, is incomplete. In order to compensate for this lack of knowledge, the research community relies on network measurements. Most of the time, they are performed with the well-known tool traceroute. However, in practice, an operator may favour technologies other than IP to forward packets inside its network. MultiProtocol Label Switching (MPLS) is one of them. Even though it is heavily deployed by operators, it has not been thoroughly investigated by researchers. Prior to this thesis, only two studies focused on the identification of MPLS tunnels in traceroute data. Moreover, while one of them does not take all possible scenarios into account, the other lacks precision in some of its models. In addition, MPLS tunnels may hide their content from traceroute. Topologies inferred from such data may thus contain false links or nodes with an artificially high degree, thus leading to biases in standard graph metrics used to model the network. Even if some researchers have already tried to tackle this issue, the revelation of hidden MPLS devices in traceroute data is still an open question.
This thesis aims at characterizing MPLS in two different ways. On the one hand, at an architectural level, we will analyze in detail its deployment and use in both IPv4 and IPv6 networks in order to improve its state-of-the-art view. We will show that, in practice, more than one IPv4 trace out of two crosses at least one MPLS tunnel. We will also see that, even if this protocol can simplify the internal architecture of transit networks, it also allows some operators to perform traffic engineering in their domain. On the other hand, MPLS will be studied from a measurement point of view. We will see that routers from different manufacturers may have distinct default behaviors regarding MPLS, and that these specific behaviors can be exploited to identify MPLS tunnels during traceroute measurements. More precisely, we will focus on new methods able to infer the presence of tunnels that are invisible in traceroute outputs, as well as on mechanisms to reveal their content. We will also show that they can be used to improve the inference of Internet graph properties, such as path lengths and node degrees. Finally, these techniques will be integrated into Trace the Naughty Tunnels (TNT), a traceroute extension able to identify all types of MPLS tunnels along a path towards a destination. We will prove that this tool can be used to obtain a detailed quantification of MPLS tunnels in the worldwide network. TNT is publicly available, and can therefore be part of many future studies conducted by the research community.
[Abstract in French, translated:] The Internet is an immense computer network in constant evolution. Each year, more and more organizations connect to it. Each of them is managed and administered independently of the others. In practice, the internal architecture of their networks is not made public for political, economic, or security reasons.
Consequently, our perception of the structure of the Internet, and more particularly of its topology, is incomplete. To compensate for this lack of knowledge, the research community relies on network measurements. Most of the time, they are carried out with the traceroute tool. However, technologies other than IP may be favoured to forward packets within a network. MultiProtocol Label Switching (MPLS) is one of them. Even though this technology is widely deployed on the Internet, it is not well studied by researchers. Before this thesis, only two works addressed the identification of MPLS in data collected with traceroute. While the first does not take all possible scenarios into account, the second proposes models that lack precision. Moreover, MPLS tunnels can hide their content from traceroute. Topologies inferred from such data may therefore contain false links, or nodes with an abnormally high degree. The resulting models of the Internet may then be biased. Today, the question of revealing MPLS routers that are invisible in measurement data remains unresolved, even though some researchers have already proposed methods to achieve it.
This thesis aims to characterize MPLS in two different ways. First, at the architectural level, we will analyze in detail its deployment and use in IPv4 and IPv6 networks in order to improve the state of the art. We will show that, in practice, more than one IPv4 trace out of two crosses at least one MPLS tunnel. We will also find that, although this protocol can be used to simplify the internal architecture of transit networks, it can also be deployed to implement traffic engineering solutions. Second, MPLS will be studied from a measurement point of view. We will see that the protocol's default behaviours vary from one router manufacturer to another, and that they can be exploited to identify MPLS tunnels in traceroute data. More precisely, we will introduce new methods capable of inferring the presence of tunnels invisible to traceroute, as well as new techniques to reveal their content. We will also show that they can be used to improve the modelling of the Internet. Finally, these techniques will be integrated into Trace the Naughty Tunnels (TNT), a traceroute extension that identifies all types of MPLS tunnels along the path towards a destination. We will prove that this tool can be used to obtain detailed statistics on MPLS deployment on the Internet. TNT is publicly available, and can therefore be freely exploited by the research community for many future studies.
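For the explicit case, where routers echo their MPLS label stacks in ICMP messages (per RFC 4950), tunnel identification from traceroute output can be sketched as grouping consecutive label-carrying hops. TNT's actual inference of implicit and invisible tunnels is far more involved; this helper and its input format are hypothetical.

```python
def classify_hops(hops):
    """Group consecutive label-carrying hops into explicit MPLS tunnels.

    hops: list of (ip, labels) pairs in path order, where `labels` is
    the MPLS label stack the router echoed via RFC 4950 ICMP
    extensions (an empty list if none).  Returns a list of tunnels,
    each a list of hop IPs.  Only the simplest, explicit case is
    covered; implicit and invisible tunnels need TTL-based inference.
    """
    tunnels, current = [], []
    for ip, labels in hops:
        if labels:
            current.append(ip)
        elif current:
            tunnels.append(current)
            current = []
    if current:
        tunnels.append(current)
    return tunnels
```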