Search CORE

92 research outputs found

Trade-off among timeliness, messages and accuracy for large-Ssale information management

Author: Brunner René
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2011
Field of study

The increasing amount of data and the number of nodes in large-scale environments require new techniques for information management. Examples of such environments are the decentralized infrastructures of Computational Grid and Computational Cloud applications. These large-scale applications need different kinds of aggregated information such as resource monitoring, resource discovery or economic information. The challenge of providing timely and accurate information in large scale environments arise from the distribution of the information. Reasons for delays in distributed information system are a long information transmission time due to the distribution, churn and failures. A problem of large applications such as peer-to-peer (P2P) systems is the increasing retrieval time of the information due to the decentralization of the data and the failure proneness. However, many applications need a timely information provision. Another problem is an increasing network consumption when the application scales to millions of users and data. Using approximation techniques allows reducing the retrieval time and the network consumption. However, the usage of approximation techniques decreases the accuracy of the results. Thus, the remaining problem is to offer a trade-off in order to solve the conflicting requirements of fast information retrieval, accurate results and low messaging cost. Our goal is to reach a self-adaptive decision mechanism to offer a trade-off among the retrieval time, the network consumption and the accuracy of the result. Self-adaption enables distributed software to modify its behavior based on changes in the operating environment. In large-scale information systems that use hierarchical data aggregation, we apply self-adaptation to control the approximation used for the information retrieval and reduces the network consumption and the retrieval time. The hypothesis of the thesis is that approximation techniquescan reduce the retrieval time and the network consumption while guaranteeing an accuracy of the results, while considering user’s defined priorities. First, this presented research addresses the problem of a trade-off among a timely information retrieval, accurate results and low messaging cost by proposing a summarization algorithm for resource discovery in P2P-content networks. After identifying how summarization can improve the discovery process, we propose an algorithm which uses a precision-recall metric to compare the accuracy and to offer a user-driven trade-off. Second, we propose an algorithm that applies a self-adaptive decision making on each node. The decision is about the pruning of the query and returning the result instead of continuing the query. The pruning reduces the retrieval time and the network consumption at the cost of a lower accuracy in contrast to continuing the query. The algorithm uses an analytic hierarchy process to assess the user’s priorities and to propose a trade-off in order to satisfy the accuracy requirements with a low message cost and a short delay. A quantitative analysis evaluates our presented algorithms with a simulator, which is fed with real data of a network topology and the nodes’ attributes. The usage of a simulator instead of the prototype allows the evaluation in a large scale of several thousands of nodes. The algorithm for content summarization is evaluated with half a million of resources and with different query types. The selfadaptive algorithm is evaluated with a simulator of several thousands of nodes that are created from real data. A qualitative analysis addresses the integration of the simulator’s components in existing market frameworks for Computational Grid and Cloud applications. The proposed content summarization algorithm reduces the information retrieval time from a logarithmic increase to a constant factor. Furthermore, the message size is reduced significantly by applying the summarization technique. For the user, a precision-recall metric allows defining the relation between the retrieval time and the accuracy. The self-adaptive algorithm reduces the number of messages needed from an exponential increase to a constant factor. At the same time, the retrieval time is reduced to a constant factor under an increasing number of nodes. Finally, the algorithm delivers the data with the required accuracy adjusting the depth of the query according to the network conditions.La gestió de la informació exigeix noves tècniques que tractin amb la creixent quantitat de dades i nodes en entorns a gran escala. Alguns exemples d’aquests entorns són les infraestructures descentralitzades de Computacional Grid i Cloud. Les aplicacions a gran escala necessiten diferents classes d’informació agregada com monitorització de recursos i informació econòmica. El desafiament de proporcionar una provisió ràpida i acurada d’informació en ambients de grans escala sorgeix de la distribució de la informació. Una raó és que el sistema d’informació ha de tractar amb l’adaptabilitat i fracassos d’aquests ambients. Un problema amb aplicacions molt grans com en sistemes peer-to-peer (P2P) és el creixent temps de recuperació de l’informació a causa de la descentralització de les dades i la facilitat al fracàs. No obstant això, moltes aplicacions necessiten una provisió d’informació puntual. A més, alguns usuaris i aplicacions accepten inexactituds dels resultats si la informació es reparteix a temps. A més i més, el consum de xarxa creixent fa que sorgeixi un altre problema per l’escalabilitat del sistema. La utilització de tècniques d’aproximació permet reduir el temps de recuperació i el consum de xarxa. No obstant això, l’ús de tècniques d’aproximació disminueix la precisió dels resultats. Així, el problema restant és oferir un compromís per resoldre els requisits en conflicte d’extracció de la informació ràpida, resultats acurats i cost d’enviament baix. El nostre objectiu és obtenir un mecanisme de decisió completament autoadaptatiu per tal d’oferir el compromís entre temps de recuperació, consum de xarxa i precisió del resultat. Autoadaptacío permet al programari distribuït modificar el seu comportament en funció dels canvis a l’entorn d’operació. En sistemes d’informació de gran escala que utilitzen agregació de dades jeràrquica, l’auto-adaptació permet controlar l’aproximació utilitzada per a l’extracció de la informació i redueixen el consum de xarxa i el temps de recuperació. La hipòtesi principal d’aquesta tesi és que els tècniques d’aproximació permeten reduir el temps de recuperació i el consum de xarxa mentre es garanteix una precisió adequada definida per l’usari. La recerca que es presenta, introdueix un algoritme de sumarització de continguts per a la descoberta de recursos a xarxes de contingut P2P. Després d’identificar com sumarització pot millorar el procés de descoberta, proposem una mètrica que s’utilitza per comparar la precisió i oferir un compromís definit per l’usuari. Després, introduïm un algoritme nou que aplica l’auto-adaptació a un ordre per satisfer els requisits de precisió amb un cost de missatge baix i un retard curt. Basat en les prioritats d’usuari, l’algoritme troba automàticament un compromís. L’anàlisi quantitativa avalua els algoritmes presentats amb un simulador per permetre l’evacuació d’uns quants milers de nodes. El simulador s’alimenta amb dades d’una topologia de xarxa i uns atributs dels nodes reals. L’algoritme de sumarització de contingut s’avalua amb mig milió de recursos i amb diferents tipus de sol·licituds. L’anàlisi qualitativa avalua la integració del components del simulador en estructures de mercat existents per a aplicacions de Computacional Grid i Cloud. Així, la funcionalitat implementada del simulador (com el procés d’agregació i la query language) és comprovada per la integració de prototips. L’algoritme de sumarització de contingut proposat redueix el temps d’extracció de l’informació d’un augment logarítmic a un factor constant. A més, també permet que la mida del missatge es redueix significativament. Per a l’usuari, una precision-recall mètric permet definir la relació entre el nivell de precisió i el temps d’extracció de la informació. Alhora, el temps de recuperació es redueix a un factor constant sota un nombre creixent de nodes. Finalment, l’algoritme reparteix les dades amb la precisió exigida i ajusta la profunditat de la sol·licitud segons les condicions de xarxa. Els algoritmes introduïts són prometedors per ser utilitzats per l’agregació d’informació en nous sistemes de gestió de la informació de gran escala en el futur

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Secretaría de Estado de Cultura

Data mining and fusion

Author: Addis M. J.
Choi F.
Taylor S. J.
Upstill C.
Watkins E. R.
Publication venue: s.n.
Publication date: 01/04/2006
Field of study

Southampton (e-Prints Soton)

Proceedings of the ECIR2010 workshop on information access for personal media archives (IAPMA2010), Milton Keynes, UK, 28 March 2010

Author
Publication venue: Dublin City University
Publication date: 28/03/2010
Field of study

Towards e-Memories: challenges of capturing, summarising, presenting, understanding, using, and retrieving relevant information from heterogeneous data contained in personal media archives. This is the proceedings of the inaugural workshop on “Information Access for Personal Media Archives”. It is now possible to archive much of our life experiences in digital form using a variety of sources, e.g. blogs written, tweets made, social network status updates, photographs taken, videos seen, music heard, physiological monitoring, locations visited and environmentally sensed data of those places, details of people met, etc. Information can be captured from a myriad of personal information devices including desktop computers, PDAs, digital cameras, video and audio recorders, and various sensors, including GPS, Bluetooth, and biometric devices. In this workshop research from diverse disciplines was presented on how we can advance towards the goal of effective capture, retrieval and exploration of e-memories

DCU Online Research Access Service

Command & Control: Understanding, Denying and Detecting - A review of malware C2 techniques, detection and defences

Author: Cova Marco
Gardiner Joseph
Nagaraja Shishir
Publication venue
Publication date: 05/08/2014
Field of study

In this survey, we first briefly review the current state of cyber attacks, highlighting significant recent changes in how and why such attacks are performed. We then investigate the mechanics of malware command and control (C2) establishment: we provide a comprehensive review of the techniques used by attackers to set up such a channel and to hide its presence from the attacked parties and the security tools they use. We then switch to the defensive side of the problem, and review approaches that have been proposed for the detection and disruption of C2 channels. We also map such techniques to widely-adopted security controls, emphasizing gaps or limitations (and success stories) in current best practices.Comment: Work commissioned by CPNI, available at c2report.org. 38 pages. Listing abstract compressed from version appearing in repor

arXiv.org e-Print Archive

Explore Bristol Research

CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

Author: Bardeli Rolf
Boujemaa Nozha
Compañó Ramón
Doch Christoph
Geurts Joost
Gouraud Henri
Joly Alexis
Karlgren Jussi
King Paul
Kompatsiaris Yiannis
Köhler Joachim
Le Moine Jean-Yves
Ortgies Robert
Point Jean-Charles
Rotenberg Boris
Rudström Åsa
Schreer Oliver
Sebe Nicu
Snoek Cees
Publication venue: Chorus Project Consortium
Publication date: 01/01/2008
Field of study

After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio- economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

A proactive personalised mobile recommendation system using analytic hierarchy process and Bayesian network

Author
Publication venue: Springer
Publication date
Field of study

Springer - Publisher Connector

Trade-off among timeliness, messages and accuracy for large-Ssale information management

Author: Brunner René
Publication venue: Universitat Politècnica de Catalunya
Publication date: 18/11/2011
Field of study

UPCommons. Portal del coneixement obert de la UPC

CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

Author: Boujemaa Nozha
Compañó Ramón
Dosch Christoph
Geurts Joost
Karlgren Jussi
King Paul
Kompatsiaris Yiannis
Köhler Joachim
Le Moine Jean-Yves
Ortgies Robert
Point Jean-Charles
Rotenberg Boris
Rudström Åsa
Sebe Nicu
Publication venue: Chorus Project Consortium
Publication date: 01/01/2007
Field of study

Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Recommended from our members

HARD: Hybrid Adaptive Resource Discovery for Jungle Computing

Author: Aguiar R. L.
Barraca J. P.
Zarrin J.
Publication venue: 'Elsevier BV'
Publication date: 15/07/2017
Field of study

In recent years, Jungle Computing has emerged as a distributed computing paradigm based on simultaneous combination of various hierarchical and distributed computing environments which are composed by large number of heterogeneous resources. In such a computing environment, the resources and the underlying computation and communication infrastructures are highly-hierarchical and heterogeneous. This creates a lot of difficulty and complexity for finding the proper resources in a precise way in order to run a particular job on the system efficiently. This paper proposes Hybrid Adaptive Resource Discovery (HARD), a novel efficient and highly scalable resource-discovery approach which is built upon a virtual hierarchical overlay based on self-organization and self-adaptation of processing resources in the system, where the computing resources are organized into distributed hierarchies according to a proposed hierarchical multi-layered resource description model. The proposed approach supports distributed query processing within and across hierarchical layers by deploying various distributed resource discovery services and functionalities in the system which are implemented using different adapted algorithms and mechanisms in each level of hierarchy. The proposed approach addresses the requirements for resource discovery in Jungle Computing environments such as high-hierarchy, high-heterogeneity, high-scalability and dynamicity. Simulation results show significant scalability and efficiency of the proposed approach over highly heterogeneous, hierarchical and dynamic computing environments

City Research Online

Repositório Institucional da Universidade de Aveiro

Anglia Ruskin Research

Federated and autonomic management of multimedia services

Author: Famaey Jeroen
Publication venue: Ghent University, Department of Information technology
Publication date: 01/01/2012
Field of study

Ghent University Academic Bibliography