Big Data: Metadatos y su uso para la vigilancia global
[IT] My thesis consists, first of all, of a description of the world of Big Data:
defining it, briefly summarizing its history, and analyzing some of the
problems that have arisen in this field.
I then focus on describing metadata, defining it and showing the importance
it has taken on today. I also develop some of its applications. The central
theme of my work, therefore, is the relationship between Big Data and
metadata, and the use that governments, especially that of the United States,
make of them, collecting mass data in order to create a global surveillance
network over the population. In Chapter 5 I present numerous revelations
about these surveillance networks published through various media outlets.
[ES] Within the world of Big Data, to examine in depth and show what metadata is, how it is collected, and how the government uses it for global surveillance.
To do so, I will describe it through the Snowden case: a former employee of the NSA (the US National Security Agency) who revealed US methods for surveilling its citizens.
My work will be structured as follows:
1. Introduction
1.1 Definition of Big Data
1.2 Evolution of Big Data
1.3 Definition and importance of metadata
1.4 Metadata collection and Big Data storage
1.5 Use of metadata: espionage
2. Objectives
2.1 The Snowden case
2.2 Show how metadata is collected and how security agencies study it, and its possible use for worldwide surveillance.
3. Current situation
3.1 Sources of data and metadata collection
3.2 International laws on data protection and personal privacy
3.3 The debate between national security and privacy: which should take priority?
4. Conclusions
Arguisuelas León, JA. (2017). Big Data: Metadati e il suo uso per la sorveglianza. http://hdl.handle.net/10251/89496 TFG
Cost-Aware Resource Management for Decentralized Internet Services
Decentralized network services, such as naming systems, content
distribution networks, and publish-subscribe systems, play an
increasingly critical role and are required to provide high
performance, low latency service, achieve high availability in the
presence of network and node failures, and handle a large volume
of users. Judicious utilization of expensive system resources,
such as memory space, network bandwidth, and number of machines,
is fundamental to achieving the above properties. Yet, current
network services typically rely on less-informed, heuristic-based
techniques to manage scarce resources, and often fall short of
expectations.
This thesis presents a principled approach for building high
performance, robust, and scalable network services. The key
contribution of this thesis is to show that resolving the
fundamental cost-benefit tradeoff between resource consumption and
performance through mathematical optimization is practical in
large-scale distributed systems, and enables decentralized network
services to efficiently meet system-wide performance goals. This
thesis presents a practical approach for resource management in
three stages: analytically model the cost-benefit tradeoff as a
constrained optimization problem, determine a near-optimal
resource allocation strategy on the fly, and enforce the derived
strategy through light-weight, decentralized mechanisms. It
builds on self-organizing structured overlays, which provide
failure resilience and scalability, and complements them with
stronger performance guarantees and robustness under sudden
changes in workload. This work enables applications to meet
system-wide performance targets, such as low average response
times, high cache hit rates, and small update dissemination times
with low resource consumption. Alternatively, applications can
make the maximum use of available resources, such as storage and
bandwidth, and derive large gains in performance.
I have implemented an extensible framework called Honeycomb to
perform cost-aware resource management on structured overlays
based on the above approach and built three critical network
services using it. These services consist of a new name system for
the Internet called CoDoNS that distributes data associated with
domain names, an open-access content distribution network called
CobWeb that caches web content for faster access by users, and an
online information monitoring system called Corona that notifies
users about changes to web pages. Simulations and performance
measurements from a planetary-scale deployment show that these
services provide unprecedented performance improvement over the
current state of the art.
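The first two stages described in this abstract (model the cost-benefit tradeoff as a constrained optimization problem, then derive a near-optimal allocation) can be illustrated with a small sketch. This is not Honeycomb's actual algorithm: the latency model `latency`, the function `allocate`, and the inputs `rates`, `sizes`, and `budget` are all hypothetical names chosen for illustration, and the sketch is centralized, whereas the thesis enforces the strategy through decentralized mechanisms. It greedily gives the next replica to whichever object yields the largest reduction in rate-weighted latency per unit of storage, until the budget is spent.

```python
import heapq


def latency(replicas: int) -> float:
    """Hypothetical latency model: each extra replica helps less than the last."""
    return 1.0 / (1 + replicas)


def allocate(rates, sizes, budget):
    """Choose replica counts that reduce rate-weighted latency within a storage budget.

    rates[i]  : request rate of object i (assumed known)
    sizes[i]  : storage cost of one replica of object i (must be positive)
    budget    : total storage available for replicas
    """
    replicas = [0] * len(rates)

    def gain(i):
        # Marginal latency reduction per unit of storage for one more replica.
        r = replicas[i]
        return rates[i] * (latency(r) - latency(r + 1)) / sizes[i]

    # Max-heap on marginal gain (heapq is a min-heap, so negate the key).
    heap = [(-gain(i), i) for i in range(len(rates))]
    heapq.heapify(heap)
    remaining = budget
    while heap:
        _, i = heapq.heappop(heap)
        if sizes[i] > remaining:
            continue  # object no longer fits; drop it from consideration
        replicas[i] += 1
        remaining -= sizes[i]
        heapq.heappush(heap, (-gain(i), i))
    return replicas


def total_cost(rates, replicas):
    """Rate-weighted latency of an allocation under the model above."""
    return sum(q * latency(r) for q, r in zip(rates, replicas))


if __name__ == "__main__":
    rates, sizes = [100, 40, 1], [1, 1, 1]
    plan = allocate(rates, sizes, budget=4)
    print(plan, total_cost(rates, plan))
```

Because the assumed latency model has diminishing returns, the greedy choice tends to concentrate replicas on popular objects while still serving moderately popular ones, which is the kind of tradeoff the abstract describes.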
Replication of non-deterministic objects
This thesis discusses replication of non-deterministic objects in distributed systems to achieve fault tolerance against crash failures. The objects replicated are the virtual nodes of a distributed application. Replication is viewed as an issue that is to be dealt with only during the configuration of a distributed application and that should not affect the development of the application. Hence, replication of virtual nodes should be transparent to the application. Like all measures to achieve fault tolerance, replication introduces redundancy in the system. Not surprisingly, the main difficulty is guaranteeing the consistency of all replicas such that they behave in the same way as if the object were not replicated (replication transparency). This is further complicated if active objects (like virtual nodes) are replicated, and these objects themselves can be clients of still further objects in the distributed application.
The problems of replication of active non-deterministic objects are analyzed in the context of distributed Ada 95 applications. The ISO standard for Ada 95 defines a model for distributed execution based on remote procedure calls (RPC). Virtual nodes in Ada 95 use this as their sole communication paradigm, but they may contain tasks to execute activities concurrently, thus making the execution potentially non-deterministic due to implicit timing dependencies. Such non-determinism cannot be avoided by choosing deterministic tasking policies. I present two different approaches to maintain replica consistency despite this non-determinism.
In a first approach, I consider the run-time support of Ada 95 as a black box (except for the part handling remote communications). This corresponds to a non-deterministic computation model. I show that replication of non-deterministic virtual nodes requires that remote procedure calls be implemented as nested transactions. Unfortunately, effects of failures are not local to the replicas of a virtual node: when a failure occurs, nested remote calls made to other virtual nodes must be undone. Also, using transactional semantics for RPCs necessitates a compromise regarding transparency: the application must identify global state, for it cannot be determined reliably in an automatic way. Further study reveals that this approach cannot be implemented in a transparent way at all because the consistency criterion of Ada 95 (linearizability) is much weaker than that of transactions (serializability). An execution of remote procedure calls as transactions may thus lead to incompatibilities with the semantics of the programming language. If remotely called subprograms on a replicated virtual node perform partial operations, i.e., entry calls on global protected objects, deadlocks that cannot be broken can occur in certain cases. Such deadlocks do not occur when the virtual node is not replicated. The transactional semantics of RPCs must therefore be exposed to the application.
A second approach is based on a piecewise deterministic computation model, i.e., the execution of a virtual node is seen as a sequence of deterministic state intervals. Whenever a non-deterministic event occurs, a new state interval is started. I study replica organization under this computation model (semi-active replication). In this model, all non-deterministic decisions are made on one distinguished replica (the leader), while all other replicas (the followers) are forced to follow the same sequence of non-deterministic events. I show that it suffices to synchronize the followers with the leader upon each observable event, i.e., when the leader sends a message to some other virtual node. It is not necessary to synchronize upon each and every non-deterministic event, which would incur a prohibitively high overhead. Non-deterministic events occurring on the leader between observable events are logged and sent to the followers just before the leader executes an observable event. Consequently, it is guaranteed that the followers will reach the same state as the leader, and thus the effects of failures remain mostly local to the replicas.
A prototype implementation called RAPIDS (Replicated Ada Partitions In Distributed Systems) serves as a proof of concept for this second approach, demonstrating its feasibility. RAPIDS is a replication manager for semi-active replication, implemented in Ada 95 for the GNAT development system. It is entirely contained within the run-time support and hence largely transparent for the application.
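The leader/follower scheme of the second approach can be sketched in a few lines. This is only an illustration under stated assumptions, written in Python rather than Ada 95, and it is not the RAPIDS implementation: the classes `Replica`, `Leader`, and `Follower`, the random choice standing in for a non-deterministic event, and the in-process "message" to followers are all hypothetical. The point it shows is that the leader logs non-deterministic events locally and ships the log to the followers only just before an observable event.

```python
import random


class Replica:
    """A virtual node whose state depends on non-deterministic choices."""

    def __init__(self):
        self.state = 0

    def apply(self, choice: int) -> None:
        # Deterministic state transition, given the outcome of an event.
        self.state = self.state * 31 + choice


class Follower(Replica):
    def replay(self, log) -> None:
        # Follow the leader's sequence of non-deterministic events.
        for choice in log:
            self.apply(choice)


class Leader(Replica):
    def __init__(self, followers, seed=None):
        super().__init__()
        self.followers = followers
        self.pending = []  # events logged since the last observable event
        self.sent = []     # observable events (messages to other nodes)
        self.rng = random.Random(seed)

    def nondeterministic_step(self) -> None:
        # Stand-in for, e.g., a timing-dependent tasking decision.
        choice = self.rng.randrange(100)
        self.apply(choice)
        self.pending.append(choice)

    def send(self, message) -> None:
        # Synchronize followers just before the observable event.
        for follower in self.followers:
            follower.replay(self.pending)
        self.pending = []
        self.sent.append(message)


if __name__ == "__main__":
    followers = [Follower(), Follower()]
    leader = Leader(followers, seed=1)
    for _ in range(5):
        leader.nondeterministic_step()
    leader.send("reply to another virtual node")
    print(all(f.state == leader.state for f in followers))
```

Batching the log until `send` is what keeps the synchronization cost proportional to the number of observable events rather than to the number of non-deterministic events.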
Protocolos de pertenencia a grupos para entornos dinámicos
Distributed systems are of fundamental importance among today's information systems because of their potential for fault tolerance and scalability, which makes them suitable for current, increasingly demanding applications. On the other hand, developing distributed applications also presents specific difficulties, precisely in delivering the scalability, fault tolerance, and high availability that constitute their advantages. It is therefore very useful to have distributed components specifically designed to provide, at a lower level, a set of well-defined services on which higher-level applications can more easily build their own semantics.
This is the case of group-oriented services, which are widely used by distributed applications and allow them to abstract away from the details of communication. Such services provide basic primitives for communication between two group members or, above all, for transmitting messages to the whole group, with specific guarantees. A particular case of group-oriented service is that of group membership services, on which this thesis focuses. Group membership services provide their users with a view of the set of processes or machines in the system that remain simultaneously connected and correct. Moreover, the various participants receive this information with specific consistency guarantees. Membership services are thus a fundamental component for developing group communication systems and other distributed applications.
The group membership problem has been widely addressed in the literature from both a theoretical and a practical point of view, and there are multiple usable implementations of membership services. Nevertheless, the definition of the problem is not unique.
On the contrary, depending...
Bañuls Polo, MDC. (2006). Protocolos de pertenencia a grupos para entornos dinámicos [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/1886 Palanci
Open Multithreaded Transactions: A Transaction Model for Concurrent Object-Oriented Programming
To read the abstract, please go to my PhD home page