20,730 research outputs found
Processing Rank-Aware Queries in Schema-Based P2P Systems
Effiziente Anfragebearbeitung in Datenintegrationssystemen sowie in
P2P-Systemen ist bereits seit einigen Jahren ein Aspekt aktueller
Forschung. Konventionelle Datenintegrationssysteme bestehen aus mehreren
Datenquellen mit ggf. unterschiedlichen Schemata, sind hierarchisch
aufgebaut und besitzen eine zentrale Komponente: den Mediator, der ein
globales Schema verwaltet. Anfragen an das System werden auf diesem
globalen Schema formuliert und vom Mediator bearbeitet, indem relevante
Daten von den Datenquellen transparent für den Benutzer angefragt werden.
Aufbauend auf diesen Systemen entstanden schließlich
Peer-Daten-Management-Systeme (PDMSs) bzw. schemabasierte P2P-Systeme. An
einem PDMS teilnehmende Knoten (Peers) können einerseits als Mediatoren
agieren andererseits jedoch ebenso als Datenquellen. Darüber hinaus sind
diese Peers autonom und können das Netzwerk jederzeit verlassen bzw.
betreten. Die potentiell riesige Datenmenge, die in einem derartigen
Netzwerk verfügbar ist, führt zudem in der Regel zu sehr großen
Anfrageergebnissen, die nur schwer zu bewältigen sind. Daher ist das
Bestimmen einer vollständigen Ergebnismenge in vielen Fällen äußerst
aufwändig oder sogar unmöglich. In diesen Fällen bietet sich die
Anwendung von Top-N- und Skyline-Operatoren, ggf. in Verbindung mit
Approximationstechniken, an, da diese Operatoren lediglich diejenigen
Datensätze als Ergebnis ausgeben, die aufgrund nutzerdefinierter
Ranking-Funktionen am relevantesten für den Benutzer sind. Da durch die
Anwendung dieser Operatoren zumeist nur ein kleiner Teil des Ergebnisses
tatsächlich dem Benutzer ausgegeben wird, muss nicht zwangsläufig die
vollständige Ergebnismenge berechnet werden sondern nur der Teil, der
tatsächlich relevant für das Endergebnis ist.
Die Frage ist nun, wie man derartige Anfragen durch die Ausnutzung dieser
Erkenntnis effizient in PDMSs bearbeiten kann. Die Beantwortung dieser
Frage ist das Hauptanliegen dieser Dissertation. Zur Lösung dieser
Problemstellung stellen wir effiziente Anfragebearbeitungsstrategien in
PDMSs vor, die die charakteristischen Eigenschaften ranking-basierter
Operatoren sowie Approximationstechniken ausnutzen. Peers werden dabei
sowohl auf Schema- als auch auf Datenebene hinsichtlich der Relevanz ihrer
Daten geprüft und dementsprechend in die Anfragebearbeitung einbezogen
oder ausgeschlossen. Durch die Heterogenität der Peers werden Techniken
zum Umschreiben einer Anfrage von einem Schema in ein anderes nötig. Da
existierende Techniken zum Umschreiben von Anfragen zumeist nur konjunktive
Anfragen betrachten, stellen wir eine Erweiterung dieser Techniken vor, die
Anfragen mit ranking-basierten Anfrageoperatoren berücksichtigt. Da PDMSs
dynamische Systeme sind und teilnehmende Peers jederzeit ihre Daten ändern
können, betrachten wir in dieser Dissertation nicht nur wie Routing-Indexe
verwendet werden, um die Relevanz eines Peers auf Datenebene zu bestimmen,
sondern auch wie sie gepflegt werden können. Schließlich stellen wir
SmurfPDMS (SiMUlating enviRonment For Peer Data Management Systems) vor,
ein System, welches im Rahmen dieser Dissertation entwickelt wurde und alle
vorgestellten Techniken implementiert.In recent years, there has been considerable research with respect to query
processing in data integration and P2P systems. Conventional data
integration systems consist of multiple sources with possibly different
schemas, adhere to a hierarchical structure, and have a central component
(mediator) that manages a global schema. Queries are formulated against
this global schema and the mediator processes them by retrieving relevant
data from the sources transparently to the user. Arising from these
systems, eventually Peer Data Management Systems (PDMSs), or schema-based
P2P systems respectively, have attracted attention. Peers participating in
a PDMS can act both as a mediator and as a data source, are autonomous, and
might leave or join the network at will. Due to these reasons peers often
hold incomplete or erroneous data sets and mappings. The possibly huge
amount of data available in such a network often results in large query
result sets that are hard to manage. Due to these reasons, retrieving the
complete result set is in most cases difficult or even impossible. Applying
rank-aware query operators such as top-N and skyline, possibly in
conjunction with approximation techniques, is a remedy to these problems as
these operators select only those result records that are most relevant to
the user. Being aware that in most cases only a small fraction of the
complete result set is actually output to the user, retrieving the complete
set before evaluating such operators is obviously inefficient.
Therefore, the questions we want to answer in this dissertation are how to
compute such queries in PDMSs and how to do that efficiently. We propose
strategies for efficient query processing in PDMSs that exploit the
characteristics of rank-aware queries and optionally apply approximation
techniques. A peer's relevance is determined on two levels: on schema-level
and on data-level. According to its relevance a peer is either considered
for query processing or not. Because of heterogeneity queries need to be
rewritten, enabling cooperation between peers that use different schemas.
As existing query rewriting techniques mostly consider conjunctive queries
only, we present an extension that allows for rewriting queries involving
rank-aware query operators. As PDMSs are dynamic systems and peers might
update their local data, this dissertation addresses not only the problem
of considering such structures within a query processing strategy but also
the problem of keeping them up-to-date. Finally, we provide a system-level
evaluation by presenting SmurfPDMS (SiMUlating enviRonment For Peer Data
Management Systems) -- a system created in the context of this dissertation
implementing all presented techniques
The role of homophily in the emergence of opinion controversies
Understanding the emergence of strong controversial issues in modern
societies is a key issue in opinion studies. A commonly diffused idea is the
fact that the increasing of homophily in social networks, due to the modern
ICT, can be a driving force for opinion polariation. In this paper we address
the problem with a modelling approach following three basic steps. We first
introduce a network morphogenesis model to reconstruct network structures where
homophily can be tuned with a parameter. We show that as homophily increases
the emergence of marked topological community structures in the networks
raises. Secondly, we perform an opinion dynamics process on homophily dependent
networks and we show that, contrary to the common idea, homophily helps
consensus formation. Finally, we introduce a tunable external media pressure
and we show that, actually, the combination of homophily and media makes the
media effect less effective and leads to strongly polarized opinion clusters.Comment: 24 pages, 10 figure
Monte Carlo optimization of decentralized estimation networks over directed acyclic graphs under communication constraints
Motivated by the vision of sensor networks, we consider decentralized estimation networks over bandwidth–limited communication links, and are particularly interested in the tradeoff between the estimation accuracy and the cost of communications due to, e.g., energy consumption. We employ a class of in–network processing strategies that admits directed acyclic graph representations and yields a tractable Bayesian risk that comprises the cost of communications and estimation error penalty. This perspective captures a broad range of possibilities for processing under network constraints and enables a rigorous design problem in the form of constrained optimization. A similar scheme and the structures exhibited by the solutions have been previously studied in the context of decentralized detection. Under reasonable assumptions, the optimization can be carried out in a message passing fashion. We adopt
this framework for estimation, however, the corresponding optimization scheme involves integral operators that cannot be evaluated exactly in general. We develop an approximation framework using Monte Carlo methods and obtain
particle representations and approximate computational schemes for both the in–network processing strategies and their optimization. The proposed Monte Carlo optimization procedure operates in a scalable and efficient fashion and,
owing to the non-parametric nature, can produce results for any distributions provided that samples can be produced from the marginals. In addition, this approach exhibits graceful degradation of the estimation accuracy asymptotically
as the communication becomes more costly, through a parameterized Bayesian risk
DCDIDP: A distributed, collaborative, and data-driven intrusion detection and prevention framework for cloud computing environments
With the growing popularity of cloud computing, the exploitation of possible vulnerabilities grows at the same pace; the distributed nature of the cloud makes it an attractive target for potential intruders. Despite security issues delaying its adoption, cloud computing has already become an unstoppable force; thus, security mechanisms to ensure its secure adoption are an immediate need. Here, we focus on intrusion detection and prevention systems (IDPSs) to defend against the intruders. In this paper, we propose a Distributed, Collaborative, and Data-driven Intrusion Detection and Prevention system (DCDIDP). Its goal is to make use of the resources in the cloud and provide a holistic IDPS for all cloud service providers which collaborate with other peers in a distributed manner at different architectural levels to respond to attacks. We present the DCDIDP framework, whose infrastructure level is composed of three logical layers: network, host, and global as well as platform and software levels. Then, we review its components and discuss some existing approaches to be used for the modules in our proposed framework. Furthermore, we discuss developing a comprehensive trust management framework to support the establishment and evolution of trust among different cloud service providers. © 2011 ICST
- …