Considering Complex Search Techniques in DHTs Under Churn
Traditionally, complex queries have been performed over unstructured P2P networks by means of flooding, which is inherently inefficient due to the large number of redundant messages generated. While Distributed Hash Tables (DHTs) can provide very efficient look-up operations, they traditionally do not provide any methods for complex queries. By exploiting the structure inherent in DHTs, we can perform complex querying over structured P2P networks by efficiently broadcasting the search query. This allows every node in the network to process the query locally, and hence is as powerful and flexible as flooding in unstructured networks, but without the inefficiency of redundant messages. While various approaches have been proposed for broadcasting search queries over DHTs, the focus has not been on validation under churn. Comparing blind search methods for DHTs through simulation, we see that churn, in particular nodes leaving the network, has a large impact on query success rate. In this paper we present novel results comparing blind search over Chord and Pastry under varying levels of churn. We further consider how different data replication strategies can be used to enhance the query success rate.
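The idea of broadcasting a query over a DHT without redundant messages can be illustrated with a minimal sketch. It assumes a Chord-style ring and a generic finger-table delegation scheme, in which each node forwards the query to its fingers and hands each one a disjoint arc of the ring to cover. The abstract does not say which broadcast algorithms the paper evaluates, so this is not the authors' method. The ring size `M`, the node identifiers, and all function names are hypothetical, and the sketch ignores churn entirely (every node is assumed alive with a correct finger table).

```python
from bisect import bisect_left

M = 6            # identifier bits (assumed for illustration)
RING = 2 ** M    # size of the identifier space

def successor(nodes, key):
    """First node id at or clockwise after key; nodes is a sorted list."""
    i = bisect_left(nodes, key % RING)
    return nodes[i % len(nodes)]

def fingers(nodes, n):
    """Distinct finger targets of node n, ordered by clockwise distance."""
    out = []
    for i in range(M):
        f = successor(nodes, n + 2 ** i)
        if f != n and f not in out:
            out.append(f)
    return out

def in_open_interval(x, a, b):
    """True if x lies strictly between a and b going clockwise.
    The interval (a, a) is treated as the whole ring except a."""
    if a == b:
        return x != a
    return 0 < (x - a) % RING < (b - a) % RING

def broadcast(nodes, n, limit, deliveries):
    """Node n processes the query, then delegates the arc (n, limit):
    each finger inside the arc receives the query together with a
    smaller limit, so no node is contacted twice."""
    deliveries.append(n)
    fs = fingers(nodes, n)
    for idx, f in enumerate(fs):
        if not in_open_interval(f, n, limit):
            break  # remaining fingers lie outside this node's arc
        nxt = fs[idx + 1] if idx + 1 < len(fs) else limit
        new_limit = nxt if in_open_interval(nxt, n, limit) else limit
        broadcast(nodes, f, new_limit, deliveries)

# Hypothetical 10-node ring; node 8 initiates the broadcast.
nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
deliveries = []
broadcast(nodes, 8, 8, deliveries)
```

In a static ring, every node appears in `deliveries` exactly once, so the broadcast costs one message per node other than the initiator. Under churn, a departed finger would silently drop its whole delegated arc, which is one way to picture why nodes leaving the network can hurt query success rate.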
Democratizing Web Automation: Programming for Social Scientists and Other Domain Experts
We have promised social scientists a data revolution, but it has not arrived. What stands between practitioners and the data-driven insights they want? Acquiring the data. In particular, acquiring the social media, online forum, and other web data that was supposed to help them produce big, rich, ecologically valid datasets. Web automation programming is resistant to high-level abstractions, so end-user programmers end up stymied by the need to reverse engineer website internals: DOM, JavaScript, AJAX. Programming by Demonstration (PBD) offered one promising avenue towards democratizing web automation. Unfortunately, as the web matured, the programs became too complex for PBD tools to synthesize, and web PBD progress stalled. This dissertation describes how I reformulated traditional web PBD around the insight that demonstrations are not always the easiest way for non-programmers to communicate their intent. By shifting from a purely Programming-By-Demonstration view to a Programming-By-X view that accepts a variety of user-friendly inputs, we can dramatically broaden the class of programs that come in reach for end-user programmers. Our Helena ecosystem combines (i) usable PBD-based program drafting tools, (ii) learnable programming languages, and (iii) novel programming environment interactions. The end result: non-coders write Helena programs in 10 minutes that can handle the complexity of modern webpages, while coders attempt the same task and time out in an hour. I conclude with a discussion of the abstraction-resistant domains that will fall next and how hybrid PL-HCI breakthroughs will vastly expand access to programming.
Data Storage and Dissemination in Pervasive Edge Computing Environments
Nowadays, smart mobile devices generate huge amounts of data in all sorts of gatherings.
Much of that data has localized and ephemeral interest, but can be of great use if shared
among co-located devices. However, mobile devices often experience poor connectivity,
leading to availability issues if application storage and logic are fully delegated to a
remote cloud infrastructure. In turn, the edge computing paradigm pushes computations
and storage beyond the data center, closer to end-user devices where data is generated
and consumed, thus enabling the execution of certain components of edge-enabled
systems directly and cooperatively on edge devices.
This thesis focuses on the design and evaluation of resilient and efficient data storage
and dissemination solutions for pervasive edge computing environments, operating with
or without access to the network infrastructure. In line with this dichotomy, our goal can
be divided into two specific scenarios. The first one is related to the absence of network
infrastructure and the provision of a transient data storage and dissemination system
for networks of co-located mobile devices. The second one relates to the existence of
network infrastructure access and the corresponding edge computing capabilities.
First, the thesis presents time-aware reactive storage (TARS), a reactive data storage
and dissemination model with intrinsic time-awareness, that exploits synergies between
the storage substrate and the publish/subscribe paradigm, and allows queries within a
specific time scope. Next, it describes in more detail: i) Thyme, a data storage and
dissemination system for wireless edge environments, implementing TARS; ii) Parsley, a
flexible and resilient group-based distributed hash table with preemptive peer relocation
and a dynamic data sharding mechanism; and iii) Thyme GardenBed, a framework
for data storage and dissemination across multi-region edge networks, that makes use of
both device-to-device and edge interactions.
The developed solutions present low overheads, while providing adequate response
times for interactive usage and low energy consumption, proving to be practical in a
variety of situations. They also display good load balancing and fault tolerance properties.
Study of Peer-to-Peer Network Based Cybercrime Investigation: Application on Botnet Technologies
The scalable, low overhead attributes of Peer-to-Peer (P2P) Internet
protocols and networks lend themselves well to being exploited by criminals to
execute a large range of cybercrimes. The types of crimes aided by P2P
technology include copyright infringement, sharing of illicit images of
children, fraud, hacking/cracking, denial of service attacks and virus/malware
propagation through the use of a variety of worms, botnets, malware, viruses
and P2P file sharing. This project is focused on the study of active P2P nodes
along with the analysis of the undocumented communication methods employed in
many of these large unstructured networks. This is achieved through the design
and implementation of an efficient P2P monitoring and crawling toolset. The
requirement for investigating P2P based systems is not limited to the more
obvious cybercrimes listed above, as many legitimate P2P based applications may
also be pertinent to a digital forensic investigation, e.g., voice over IP,
instant messaging, etc. Investigating these networks has become increasingly
difficult due to the broad range of network topologies and the ever increasing
and evolving range of P2P based applications. In this work we introduce the
Universal P2P Network Investigation Framework (UP2PNIF), a framework which
enables significantly faster and less labour intensive investigation of newly
discovered P2P networks through the exploitation of the commonalities in P2P
network functionality. In combination with a reference database of known
network characteristics, it is envisioned that any known P2P network can be
instantly investigated using the framework, which can intelligently determine
the best investigation methodology and greatly expedite the evidence gathering
process. A proof of concept tool was developed for conducting investigations on
the BitTorrent network. Comment: This is a thesis submitted in fulfilment of a PhD in Digital
Forensics and Cybercrime Investigation in the School of Computer Science,
University College Dublin in October 201
Spartan Daily, May 9, 1934
Volume 22, Issue 122
https://scholarworks.sjsu.edu/spartandaily/2158/thumbnail.jp
A formal analysis of blockchain consensus
In this thesis, we analyse these protocols using PRISM+, our extension of the probabilistic model checker PRISM with blockchain types and operations upon them. This allows us to model the behaviour of key participants in the protocols and describe the protocols as a parallel composition of PRISM+ processes.
Through our analysis of the Bitcoin model, we are able to understand how forks (where different nodes have different versions of the blockchain) occur and how they depend on specific parameters of the protocol, such as the difficulty of the cryptopuzzle and network communication delays. Our results corroborate the statement that treating transactions in blocks at depth larger than 5 as confirmed is reasonable, because the majority of miners have consistent blockchains up to that depth with probability of almost 1. We also study the behaviour of the Bitcoin network with churn miners (nodes that leave and rejoin the network) and with different topologies (linear topology, ring topology, tree topology and fully connected topology).
PRISM+ is therefore used to analyse the resilience of Hybrid Casper when changing various basic parameters of the protocol, such as block creation rates and penalty determination strategies. We also study the robustness of Hybrid Casper against two known attacks: the Eclipse attack (where an attacker controls a significant portion of the network's nodes and can prevent other nodes from receiving new transactions) and the majority attack (where an attacker controls a majority of the network's nodes and can manipulate the blockchain to their advantage).
Twitter Analysis to Predict the Satisfaction of Saudi Telecommunication Companies' Customers
The flexibility in mobile communications allows customers to quickly switch from one service provider to
another, making customer churn one of the most critical challenges for the data and voice telecommunication
service industry. In 2019, the percentage of post-paid telecommunication customers in Saudi Arabia
decreased; this represents a great deal of customer dissatisfaction and subsequent corporate fiscal losses.
Many studies correlate customer satisfaction with customer churn. Telecom companies have depended
on historical customer data to measure customer churn. However, historical data does not reveal current
customer satisfaction or the future likelihood of switching between telecom companies. Current methods of analysing
churn rates are inadequate and face several issues, particularly in the Saudi market.
This research was conducted to understand the relationship between customer satisfaction and customer churn
and how to use social media mining to measure customer satisfaction and predict customer churn.
This research conducted a systematic review to address the problems of churn prediction models and their
relation to Arabic Sentiment Analysis. The findings show that the current churn models do not integrate
structural data frameworks with real-time analytics to target customers in real-time. In addition, the findings
show that the specific issues in the existing churn prediction models in Saudi Arabia relate to the Arabic
language itself, its complexity, and lack of resources.
As a result, I have constructed the first gold standard corpus of Saudi tweets related to telecom companies,
comprising 20,000 manually annotated tweets. It has been generated as a dialect sentiment lexicon extracted
from a larger Twitter dataset collected by me to capture text characteristics in social media. I developed a
new ASA prediction model for telecommunication that fills the detected gaps in the ASA literature and fits
the telecommunication field. The proposed model proved its effectiveness for Arabic sentiment analysis and
churn prediction. This is the first work using Twitter mining to predict potential customer loss (churn) in
Saudi telecom companies, which has not been attempted before. Other fields, such as education, have
different features, which makes applying the proposed model to them interesting because it is based on text mining.
Randomness, Age, Work: Ingredients for Secure Distributed Hash Tables
Distributed Hash Tables (DHTs) are a popular and natural choice when dealing with dynamic resource location and routing. DHTs basically provide two main functions: saving (key, value) records in a network environment and, given a key, finding the node responsible for it, optionally retrieving the associated value. However, all predominant DHT designs suffer from a number of security flaws that expose nodes and stored data to a number of malicious attacks, ranging from disrupting correct DHT routing to corrupting data or making it unavailable. Thus, even if DHTs are a standard layer for some mainstream systems (like BitTorrent or KAD clients), said vulnerabilities may prevent more security-aware systems from taking advantage of the ease of indexing and publishing on DHTs.
Through the years, a variety of solutions to the security flaws of DHTs have been proposed by both academia and practitioners, ranging from authentication via Central Authorities to social-network based ones. These solutions are often tailored to specific DHT implementations, and simply try to mitigate, without eliminating, hostile actions aimed at resources or nodes. Moreover, all these solutions often have serious limitations or make strong assumptions about the underlying network.
We present, after providing a useful abstract model of the DHT protocol and infrastructure, two new primitives. We extend a "standard" proof-of-work primitive, turning it also into a "proof of age" primitive (informally, allowing a node to prove it is "sufficiently old") and a "shared random seed" primitive (informally, producing a new, shared seed that was completely unpredictable in a "sufficiently remote" past). These primitives are then integrated into the basic DHT model, obtaining an "enhanced" DHT design, resilient to many common attacks. This work also shows how to adapt a Block Chain scheme (a continuously growing list of records, or blocks, protected from alteration or forgery) to provide a possible infrastructure for our proposed secure design.
Finally, a working proof-of-concept software implementing an "enhanced" Kademlia-based DHT is presented, together with some experimental results showing that, in practice, the performance overhead of the additional security layer is more than tolerable.
Therefore this work provides a threefold contribution. It describes a general set of new primitives (adaptable to any DHT matching our basic model) achieving a secure DHT; it proposes an actionable design to attain said primitives; it makes public a proof-of-concept implementation of a full "enhanced" DHT system, which a preliminary performance evaluation shows to be actually usable in practice.
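For readers unfamiliar with the baseline that the abstract extends, a "standard" proof-of-work primitive can be sketched as a hash puzzle: find a nonce such that the hash of a challenge plus the nonce falls below a difficulty target. The sketch below is a generic hashcash-style example under assumed parameters. It does not implement the thesis's "proof of age" or "shared random seed" primitives, and the function names, challenge string, and difficulty value are hypothetical.

```python
import hashlib

def _digest_value(challenge: bytes, nonce: int) -> int:
    """SHA-256 of challenge || nonce, read as a 256-bit integer."""
    data = challenge + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check: the digest must have difficulty_bits leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    return _digest_value(challenge, nonce) < target

def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Expensive search: try nonces in order until one verifies."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce

# Hypothetical usage: a node proves it spent work bound to its identifier.
challenge = b"example-node-id"
nonce = solve(challenge, 12)  # about 2**12 hash evaluations on average
```

The asymmetry is the point of the primitive: solving takes many hash evaluations, while any peer can verify the result with a single one.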
System support for keyword-based search in structured Peer-to-Peer systems
In this dissertation, we present protocols for building a distributed search infrastructure over structured Peer-to-Peer systems. Unlike existing search engines which consist of large server farms managed by a centralized authority, our approach makes use of a distributed set of end-hosts built out of commodity hardware. These end-hosts cooperatively construct and maintain the search infrastructure.
The main challenges with distributing such a system include node failures, churn, and data migration. Localities inherent in query patterns also cause load imbalances and hot spots that severely impair performance. Users of search systems want their results returned quickly, and in ranked order. Our main contribution is to show that a scalable, robust, and distributed search infrastructure can be built over existing Peer-to-Peer systems through the use of techniques that address these problems. We present a decentralized scheme for ranking search results without prohibitive network or storage overhead. We show that caching allows for efficient query evaluation and present a distributed data structure, called the View Tree, that enables efficient storage and retrieval of cached results. We also present a lightweight adaptive replication protocol, called LAR, that can adapt to different kinds of query streams and is extremely effective at eliminating hot spots. Finally, we present techniques for storing indexes reliably. Our approach is to use an adaptive partitioning protocol to store large indexes and employ efficient redundancy techniques to handle failures. Through detailed analysis and experiments we show that our techniques are efficient and scalable, and that they make distributed search feasible.