232 research outputs found
Peer to Peer Information Retrieval: An Overview
Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these have seen widespread real-world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom.
Development of a system compliant with the Application-Layer Traffic Optimization Protocol
Integrated master's dissertation in Informatics Engineering.
With the ever-increasing Internet usage that accompanies the start of the new decade, optimizing this world-scale network of computers becomes a major priority in a technological sphere where the number of users is rising, as are the Quality of Service (QoS) demands of applications in domains such as media streaming and virtual reality.
In the face of rising traffic and stricter application demands, a better understanding of how Internet Service Providers (ISPs) should manage their assets is needed. An important concern is how applications use the underlying network infrastructure over which they reside. Most of these applications act with little regard for ISP preferences, as exemplified by their lack of care in achieving traffic locality during operation, a feature that network administrators would prefer and that could also improve application performance. However, even a best-effort attempt by applications to cooperate will hardly succeed if ISP policies are not clearly communicated to them. A system that bridges the interests of both layers therefore has much potential to help achieve a mutually beneficial scenario.
The main focus of this thesis is the Application-Layer Traffic Optimization (ALTO) working group, which was formed by the Internet Engineering Task Force (IETF) to explore standardizations for network information retrieval. This group specified a request-response protocol in which authoritative entities provide resources containing network status information and administrative preferences. Infrastructural insight is shared with the intent of enabling a cooperative environment between the network overlay and underlay during application operation, to obtain better use of infrastructure resources and the consequent minimization of the associated operational costs.
This work gives an overview of the historical network tussle between applications and service providers, presents the ALTO working group's project as a solution, implements an extended system built upon its ideas, and finally verifies the developed system's efficiency, in a simulation, against classical alternatives.
With the increased use of the Internet that accompanies the start of the new decade, the need to optimize this global network of computers becomes a major priority in a technological sphere that sees its number of users growing, along with applications' demand for new Quality of Service (QoS) standards, as seen in domains such as real-time multimedia content streaming and virtual reality experiences.
Given the increase in traffic and stricter application requirements, it is necessary to better understand how Internet Service Providers (ISPs) should manage their resources. A key point is how applications use network resources: many of them disregard ISP preferences, as exemplified by their lack of effort to localize traffic, whereas the opposite would be preferred by network administrators and could improve application performance. A best-effort attempt by applications to solve this problem will not succeed if administrative preferences are not clearly communicated. A system that serves as a communication bridge between layers can therefore foster a mutually beneficial scenario.
The main focus of this thesis is the Application-Layer Traffic Optimization (ALTO) working group, formed by the Internet Engineering Task Force (IETF) to explore standardizations for collecting network information. This group specified a protocol in which authoritative entities provide resources with network state information and administrative preferences. Infrastructural knowledge is shared to enable a cooperative environment between overlay and underlay networks, for more efficient use of resources and the consequent minimization of operational costs.
The aim is to give an overview of the historical dispute between applications and ISPs, to present the ALTO working group's project as a solution, to implement and improve on its ideas, and finally to verify the system's efficiency in a simulation, compared with classical alternatives.
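To make the cooperation described in this abstract concrete, the following is a minimal sketch of how an application might rank candidate peers using provider-supplied routing costs, in the spirit of an ALTO cost map. The map contents, the region identifiers (`pid-a` and so on), the peer names and the `rank_peers` helper are all illustrative assumptions; they are not taken from the thesis or from a real ALTO server.

```python
# Hypothetical provider-supplied costs from a source region to destination
# regions (lower means the provider prefers that traffic path).
COST_MAP = {
    "pid-a": {"pid-a": 1, "pid-b": 5, "pid-c": 10},
}

# Hypothetical mapping of candidate peers to provider-defined regions.
PEER_PID = {"peer1": "pid-c", "peer2": "pid-a", "peer3": "pid-b"}


def rank_peers(source_pid, candidates, cost_map, peer_pid):
    """Order candidate peers by the provider's cost from source_pid.

    Peers whose region is unknown or has no cost entry are ranked last.
    Ties are broken by peer name so the result is deterministic.
    """
    costs = cost_map[source_pid]
    return sorted(
        candidates,
        key=lambda p: (costs.get(peer_pid.get(p), float("inf")), p),
    )


ranked = rank_peers("pid-a", ["peer1", "peer2", "peer3"], COST_MAP, PEER_PID)
print(ranked)
```

An application that would otherwise pick peers at random could instead contact them in this order, which keeps traffic local where the provider has signalled a preference for it.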
Towards efficient distributed search in a peer-to-peer network.
Cheng Chun Kong. Thesis submitted in: November 2006. Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 62-64). Abstracts in English and Chinese.
Contents:
Abstract
槪要 (Abstract in Chinese)
Acknowledgement
Chapter 1. Introduction
Chapter 2. Literature Review
Chapter 3. Design: A. Overview; B. Basic idea; C. Follow-up design; D. Summary
Chapter 4. Experimental Findings: A. Goal; B. Analysis Methodology; C. Validation; D. Results
Chapter 5. Deployment: A. Limitations; B. Miscellaneous Design Issues
Chapter 6. Future Directions and Conclusions
Reference
Appendix
Advanced methods for query routing in peer-to-peer information retrieval
One of the most challenging problems in peer-to-peer networks is query routing: effectively and efficiently identifying peers that can return high-quality local results for a given query. Existing methods from the areas of distributed information retrieval and metasearch engines do not adequately address the peculiarities of a peer-to-peer network. The main contributions of this thesis are as follows: 1. Methods for query routing that take into account the mutual overlap of different peers' collections, 2. Methods for query routing that take into account the correlations between multiple terms, 3. Comparative evaluation of different query routing methods. Our experiments confirm the superiority of our novel query routing methods over the prior state-of-the-art, in particular in the context of peer-to-peer Web search.
One of the most pressing problems in peer-to-peer networks is query routing: effectively and efficiently identifying those peers that can deliver high-quality local results for a given query. The previously known methods from distributed information retrieval and metasearch engines do not do justice to the particularities of peer-to-peer networks. The main contributions of this work fall into the following areas: 1. Query routing that takes into account the mutual overlap of different peers' collections, 2. Query routing that takes into account the correlations between different terms, 3. Comparative evaluation of different query routing methods. Our experiments confirm the superiority of the methods developed in this work over the previously known methods, particularly in the context of peer-to-peer Web search.
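The idea of overlap-aware query routing in the first contribution can be illustrated with a small greedy sketch: rather than picking the peers with the most matching documents, pick the peers that add the most documents not already covered. This is only an illustration under simplifying assumptions, not the thesis's method. In particular, it assumes each peer's matching document IDs are known exactly, whereas a real system would have to estimate overlap from compact summaries; the function name `route_query` and the sample data are hypothetical.

```python
def route_query(peer_results, k):
    """Greedily choose up to k peers, maximizing newly covered documents.

    peer_results maps a peer name to the set of document IDs it would
    return for the query (assumed known here for simplicity).
    """
    covered = set()
    chosen = []
    remaining = dict(peer_results)
    for _ in range(min(k, len(remaining))):
        # Novelty = documents this peer adds beyond what is already covered.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda p: len(remaining[p] - covered))
        if not remaining[best] - covered:
            break  # no remaining peer contributes anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


peers = {"A": {1, 2, 3, 4}, "B": {1, 2, 3}, "C": {5, 6}}
chosen, covered = route_query(peers, k=2)
print(chosen, sorted(covered))
```

In this example a ranking by result count alone would select A and B, whose collections overlap almost entirely, while the overlap-aware selection picks A and C and covers all six documents.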
Data Storage and Dissemination in Pervasive Edge Computing Environments
Nowadays, smart mobile devices generate huge amounts of data in all sorts of gatherings. Much of that data has localized and ephemeral interest, but can be of great use if shared among co-located devices. However, mobile devices often experience poor connectivity, leading to availability issues if application storage and logic are fully delegated to a remote cloud infrastructure. In turn, the edge computing paradigm pushes computations and storage beyond the data center, closer to end-user devices where data is generated and consumed. This enables the execution of certain components of edge-enabled systems directly and cooperatively on edge devices.
This thesis focuses on the design and evaluation of resilient and efficient data storage and dissemination solutions for pervasive edge computing environments, operating with or without access to the network infrastructure. In line with this dichotomy, our goal can be divided into two specific scenarios. The first one is related to the absence of network infrastructure and the provision of a transient data storage and dissemination system for networks of co-located mobile devices. The second one relates to the existence of network infrastructure access and the corresponding edge computing capabilities.
First, the thesis presents time-aware reactive storage (TARS), a reactive data storage and dissemination model with intrinsic time-awareness that exploits synergies between the storage substrate and the publish/subscribe paradigm, and allows queries within a specific time scope. Next, it describes in more detail: i) Thyme, a data storage and dissemination system for wireless edge environments, implementing TARS; ii) Parsley, a flexible and resilient group-based distributed hash table with preemptive peer relocation and a dynamic data sharding mechanism; and iii) Thyme GardenBed, a framework for data storage and dissemination across multi-region edge networks that makes use of both device-to-device and edge interactions.
The developed solutions present low overheads, while providing adequate response times for interactive usage and low energy consumption, proving to be practical in a variety of situations. They also display good load balancing and fault tolerance properties.
Abstract
Nowadays, smart mobile devices generate large amounts of data at all kinds of gatherings of people. Much of this data is of localized and ephemeral interest, but can be very useful if shared among co-located devices. However, mobile devices often experience poor connectivity, leading to availability problems if application storage and logic are fully delegated to a remote cloud infrastructure. In turn, the edge computing paradigm takes computation and storage beyond the data centers, closer to the end-user devices where data is generated and consumed, thus allowing certain system components to run directly and cooperatively on devices at the network edge.
This thesis focuses on the design and evaluation of resilient and efficient solutions for data storage and dissemination in pervasive edge computing environments, operating with or without access to the network infrastructure. In line with this dichotomy, our goal can be divided into two specific scenarios. The first relates to the absence of network infrastructure and the provision of an ephemeral data storage and dissemination system for networks of co-located mobile devices. The second concerns the existence of access to the network infrastructure and the corresponding edge computing resources.
First, the thesis presents time-aware reactive storage (TARS; ARCT in Portuguese), a reactive data storage and dissemination model with intrinsic time awareness, which exploits synergies between the storage substrate and the publish/subscribe paradigm and allows queries within a specific time scope. It then describes in more detail: i) Thyme, a data storage and dissemination system for wireless edge environments, which implements TARS; ii) Parsley, a flexible and resilient group-based distributed hash table with preemptive node relocation and a dynamic data partitioning mechanism; and iii) Thyme GardenBed, a system for data storage and dissemination in multi-region edge networks, which makes use of device-to-device and edge interactions.
The developed solutions have low overheads, providing adequate response times for interactive use and low energy consumption, and prove practical in a wide range of situations. These solutions also exhibit good load balancing and fault tolerance properties.
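The combination of storage and publish/subscribe with a time scope, as described for TARS above, can be sketched in a few lines. This is an illustrative single-process toy under stated assumptions, not the Thyme API: the class name `TimeScopedStore`, its methods, and the tag-based matching are all hypothetical. The point it shows is that one subscription can match both items already stored (the past part of its time scope) and items published later (the future part).

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    tag: str
    start: float  # beginning of the time scope (inclusive)
    end: float    # end of the time scope (inclusive)
    received: list = field(default_factory=list)


class TimeScopedStore:
    """Toy store where a subscription doubles as a time-scoped query."""

    def __init__(self):
        self.items = []  # stored (tag, timestamp, payload) tuples
        self.subs = []   # registered subscriptions

    def publish(self, tag, timestamp, payload):
        self.items.append((tag, timestamp, payload))
        # Notify subscriptions whose tag and time scope match the new item.
        for sub in self.subs:
            if sub.tag == tag and sub.start <= timestamp <= sub.end:
                sub.received.append(payload)

    def subscribe(self, tag, start, end):
        sub = Subscription(tag, start, end)
        # Deliver already-stored items that fall inside the time scope.
        for item_tag, ts, payload in self.items:
            if item_tag == tag and start <= ts <= end:
                sub.received.append(payload)
        self.subs.append(sub)
        return sub


store = TimeScopedStore()
store.publish("concert", 10.0, "photo-1")    # published before subscribing
sub = store.subscribe("concert", 5.0, 20.0)
store.publish("concert", 15.0, "photo-2")    # inside the time scope
store.publish("concert", 25.0, "photo-3")    # outside the time scope
store.publish("food", 12.0, "menu")          # different tag
print(sub.received)
```

A distributed system would additionally have to decide where items and subscriptions are stored and how they find each other; the sketch deliberately leaves that out.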
A Study on Load Management for Hierarchical Peer-to-Peer File Search (階層型ピア・ツー・ピアファイル検索のための負荷管理の研究)
In a Peer-to-Peer (P2P) system, multiple interconnected peers or nodes contribute a portion of their resources (e.g., files, disk storage, network bandwidth) in order to inexpensively handle tasks that would normally require powerful servers. Since the emergence of P2P file sharing, load balancing has been considered a primary concern, alongside other issues such as autonomy, fault tolerance and security. During file search, a heavily loaded peer may incur long latency or failure in query forwarding or responding. If there are many such peers in a system, this may cause link congestion or path congestion, and consequently degrade the performance of the overall system. To avoid such situations, some general techniques used in Web systems, such as caching and paging, have been adopted in P2P systems. However, these are highly insufficient for load balancing, since peers in P2P systems often exhibit high heterogeneity and dynamicity. To overcome this difficulty, the use of super-peers is currently the most promising approach to optimizing the allocation of system load to peers, i.e., it allocates more system load to high-capacity and stable super-peers by assigning the tasks of index maintenance and retrieval to them.
In this thesis, we focused on two kinds of super-peer based hierarchical architectures of P2P systems, which are distinguished by the organization of super-peers. In each of them, we discussed system load allocation, and proposed novel load balancing algorithms for alleviating load imbalance of super-peers, aiming to decrease average and variation of query response time during index retrieval process.
More concretely, the contributions of this thesis to load management solutions for hierarchical P2P file search are the following:
• In Qin’s hierarchical architecture, indices of files held by the user peers in the bottom layer are stored at the super-peers in the middle layer, and the correlation of those two lower layers is controlled by the central server(s) in the top layer using the notion of tags. In Qin’s system, a heavily loaded super-peer can move excess load to a lightly loaded super-peer through task migration. However, task migration alone is not sufficient to balance the load of super-peers if task sizes are highly imbalanced. To overcome this issue, we propose two task migration schemes for this architecture, aiming to ensure an even load distribution over the super-peers. The first scheme controls the load of each task in order to decrease the total cost of task migration. The second scheme directly balances the load over tasks by reordering the priority of tags used in the query forwarding step. The effectiveness of the proposed schemes is evaluated by simulation. The simulation results indicate that the schemes can work in coordination to alleviate bottlenecks at super-peers.
• In the DHT-based super-peer architecture, indices of files held by the user peers in the lower layer are stored at the DHT-connected super-peers in the upper layer. In DHT-based super-peer systems, the skew in users’ preferences for the keywords contained in multi-keyword queries causes an imbalance in the query load of super-peers, which combines both routing and response load. Although index replication has great potential for alleviating this problem, existing schemes either did not explicitly address it or incurred high cost. To overcome this issue, we propose an integrated solution that consists of three replication schemes to alleviate query load imbalance while minimizing the cost. The first scheme is an active index replication that decreases routing load in the super-peer layer and distributes the response load of an index among the super-peers that store a replica. The second scheme is a proactive pointer replication that places location information of an index, reducing the maintenance cost between the index and its replicas. The third scheme is a passive index replication that guarantees the maximum query load of super-peers. The simulation results indicate that the proposed schemes help alleviate the query load imbalance of super-peers. Moreover, a comparison found that our schemes place replicas more cost-effectively than other approaches.
Hiroshima University (広島大学), Doctor of Engineering (博士(工学)) in Information Engineering, doctora
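The limitation of plain task migration noted in the first contribution (it cannot balance super-peers when task sizes are highly imbalanced) can be seen in a small sketch. The greedy rule below is a generic illustration, not the schemes proposed in the thesis: it repeatedly moves one task from the most loaded to the least loaded super-peer, and only when the task is smaller than the load gap, so that every move strictly reduces imbalance. The function name `balance`, the super-peer names and the task loads are hypothetical, and task loads are assumed to be positive.

```python
def balance(loads):
    """Greedy task migration between super-peers.

    loads maps a super-peer name to a list of positive task loads.
    Returns the final assignment and the list of (source, target, task) moves.
    """
    peers = {p: list(tasks) for p, tasks in loads.items()}
    moves = []
    while True:
        # Sorted iteration keeps the choice deterministic when loads tie.
        heavy = max(sorted(peers), key=lambda p: sum(peers[p]))
        light = min(sorted(peers), key=lambda p: sum(peers[p]))
        gap = sum(peers[heavy]) - sum(peers[light])
        # Only tasks smaller than the gap reduce imbalance when moved.
        movable = [t for t in peers[heavy] if t < gap]
        if not movable:
            break
        # Prefer the task closest to half the gap (best single move).
        task = min(movable, key=lambda t: abs(gap / 2 - t))
        peers[heavy].remove(task)
        peers[light].append(task)
        moves.append((heavy, light, task))
    return peers, moves


final, moves = balance({"sp1": [8, 1, 1], "sp2": [2], "sp3": [3]})
totals = {p: sum(tasks) for p, tasks in final.items()}
print(totals, moves)
```

In this example one task of load 8 dominates `sp1`. After the two small tasks have been migrated, no further move helps, and `sp1` still carries twice the load of the next super-peer. This is the kind of situation in which controlling the load of individual tasks, rather than only moving them, becomes necessary.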
Semantic search and composition in unstructured peer-to-peer networks
This dissertation focuses on several research questions in the area of semantic search and composition in unstructured peer-to-peer (P2P) networks. Going beyond the state of the art, the proposed semantic-based search strategy S2P2P offers a novel path-suggestion based query routing mechanism, providing a reasonable tradeoff between search performance and network traffic overhead. In addition, the first semantic-based data replication scheme DSDR is proposed. It enables peers to use semantic information to select replica numbers and target peers to address predicted future demands. With DSDR, k-random search can achieve better precision and recall than it can with a near-optimal non-semantic replication strategy. Further, this thesis introduces a functional automatic semantic service composition method, SPSC. Distinctively, it enables peers to jointly compose complex workflows with high cumulative recall but low network traffic overhead, using heuristic-based bidirectional chaining and service memorization mechanisms. Its query branching method helps to handle dead-ends in a pruned search space. SPSC is proved to be sound and a lower bound of its completeness is given. Finally, this thesis presents iRep3D for semantic-index based 3D scene selection in P2P search. Its efficient retrieval scales to answer hybrid queries involving conceptual, functional and geometric aspects. iRep3D outperforms previous representative efforts in terms of search precision and efficiency.
This dissertation addresses research questions on semantic search and composition in unstructured peer-to-peer (P2P) networks. The semantic search strategy S2P2P uses a novel query forwarding method based on path suggestions, which goes beyond the state of the art. It offers a reasonable balance between search performance and communication load in the network. In addition, the first semantic data replication system, called DSDR, is presented, which takes semantic information into account in order to optimally cover predicted future demand in the P2P network. With it, k-random search achieves better precision and recall than with near-optimal non-semantic replication. SPSC, an automatic method for the functionally correct composition of semantic services, enables peers to jointly compose complex workflows. Mechanisms for heuristic bidirectional chaining and memorization of services enable high recall at low network load. A query branching method avoids getting stuck in dead ends in the pruned search space. Proofs of the soundness of SPSC and of a lower bound on its completeness are given. iRep3D is a new semantic selection mechanism for 3D models in P2P. iRep3D efficiently answers hybrid queries that take conceptual, functional and geometric aspects into account. The approach outperforms previous work in terms of precision and efficiency.
A Persistent Publish/Subscribe System for Mobile Edge Computing
In recent times, we have seen an incredible growth of users adopting mobile devices and wearables, and while the hardware capabilities of these devices have greatly increased year after year, mobile communications still remain a bottleneck for most applications. This is partially caused by the companies’ cloud infrastructure, which effectively represents a large-scale communication hub where all kinds of platforms compete with each other for the servers’ processing power and channel throughput. Additionally, wireless technologies used in mobile environments are by nature unreliable, slow and congestion-prone when compared to their wired counterparts.
To reduce the back-and-forth mobile communication overhead, the “Edge” paradigm has recently been introduced with the aim of bringing cloud services closer to the customers, by providing an intermediate layer between the end devices and the actual cloud infrastructure, resulting in faster response times. Publish/Subscribe systems, such as Thyme, have also been proposed and proven effective for data dissemination at edge networks, due to the loosely coupled nature and scalability of their interactions. Nonetheless, relying solely on P2P interactions is not feasible in every scenario due to the range limitations of wireless protocols.
In this thesis we propose and develop Thyme-Infrastructure, an extension to the Thyme framework that utilizes available stationary nodes within the edge infrastructure not only to improve the performance of mobile clients within a BSS, by offloading a portion of the requests to be processed by the infrastructure, but also to connect multiple clusters of users within the same venue, with the goal of creating a persistent and global end-to-end storage network. Our experimental results, in both simulated and real-world scenarios, show adequate response times for interactive usage and low energy consumption, allowing the application to be used in a variety of events without excessive battery drain. In fact, when compared to the previous version of Thyme, our framework was generally able to improve on all of these metrics. On top of that, we evaluated our system’s latencies against a full-fledged cloud solution and verified that our proposal yielded a considerable speedup across the board.
- …