27,950 research outputs found

Using Information Filtering in Web Data Mining Process

Author: Bruza Peter
Lau Raymond
Li Yuefeng
Wu Shengtang
Xu Yue
Zhou Xujuan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

Web service-oriented Grid is becoming a standard for achieving loosely coupled distributed computing. Grid services could easily be specified with web-service based interfaces. In this paper we first envisage a realistic Grid market with players such as end-users, brokers and service providers participating co-operatively with an aim to meet requirements and earn profit. End-users wish to use functionality of Grid services by paying the minimum possible price or price confined within a specified budget, brokers aim to maximise profit whilst establishing a SLA (Service Level Agreement) and satisfying end-user needs and at the same time resisting the volatility of service execution time and availability. Service providers aim to develop price models based on end-user or broker demands that will maximise their profit. In this paper we focus on developing stochastic approaches to end-user workflow scheduling that provides QoS guarantees by establishing a SLA. We also develop a novel 2-stage stochastic programming technique that aims at establishing a SLA with end-users regarding satisfying their workflow QoS requirements. We develop a scheduling (workload allocation) technique based on linear programming that embeds the negotiated workflow QoS into the program and model Grid services as generalised queues. This technique is shown to outperform existing scheduling techniques that don't rely on real-time performance information

Crossref

Queensland University of Technology ePrints Archive

Macquarie University ResearchOnline

University of Southern Queensland ePrints

Distributed resource discovery using a context sensitive infrastructure

Author: Eliassen F.
Ferguson I.
Fongen A.
Stobart S.
Tait J.
Publication venue: 'IOS Press'
Publication date: 01/01/2001

Distributed Resource Discovery in a World Wide Web environment using full-text indices will never scale. The distinct properties of WWW information (volume, rate of change, topical diversity) limit the scaleability of traditional approaches to distributed Resource Discovery. An approach combining metadata clustering and query routing can, on the other hand, be proven to scale much better. This paper presents the Content-Sensitive Infrastructure, which is a design building on these results. We also present an analytical framework for comparing the scaleability of different distribution strategies.

CiteSeerX

University of Strathclyde Institutional Repository

Porqpine: a peer-to-peer search engine

Author: Bermúdez Juanjo
Pujol Josep Maria
Sangüesa i Sole Ramon
Publication date: 01/01/2003

In this paper, we present a fully distributed and collaborative search engine for web pages: Porqpine. This system uses a novel query-based model and collaborative filtering techniques in order to obtain user-customized results. All knowledge about users and profiles is stored in each user node's application. Overall the system is a multi-agent system that runs on the computers of the user community. The nodes interact in a peer-to-peer fashion in order to create a real distributed search engine where information is completely distributed among all the nodes in the network. Moreover, the system preserves the privacy of user queries and results by maintaining the anonymity of the queries' consumers and results' producers. The knowledge required by the system to work is implicitly caught through the monitoring of users' actions, not only within the system's interface but also within one of the most popular web browsers. Thus, users are not required to explicitly feed knowledge about their interests into the system since this process is done automatically. In this manner, users obtain the benefits of a personalized search engine just by installing the application on their computer. Porqpine does not intend to completely shun conventional centralized search engines but to complement them by issuing more accurate and personalized results.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

On content-based recommendation and user privacy in social-tagging systems

Author: Forné Muñoz Jorge
Parra-Arnau Javier
Puglisi Silvia
Rebollo Monedero David
Publication venue: 'Elsevier BV'
Publication date: 01/09/2015

Recommendation systems and content filtering approaches based on annotations and ratings essentially rely on users expressing their preferences and interests through their actions, in order to provide personalised content. This activity, in which users engage collectively, has been named social tagging. It is one of the most popular activities in which users engage online, and although it has opened new possibilities for application interoperability on the semantic web, it is also posing new privacy threats. It, in fact, consists of describing online or offline resources by using free-text labels (i.e. tags), therefore exposing the user profile and activity to privacy attacks. Users, as a result, may wish to adopt a privacy-enhancing strategy in order not to reveal their interests completely. Tag forgery is a privacy-enhancing technology consisting of generating tags for categories or resources that do not reflect the user's actual preferences. By modifying the user's profile, tag forgery may have a negative impact on the quality of the recommendation system, thus protecting user privacy to a certain extent but at the expense of utility. The impact of tag forgery on content-based recommendation is, therefore, investigated in a real-world application scenario where different forgery strategies are evaluated, and the consequent loss in utility is measured and compared.

arXiv.org e-Print Archive

Elsevier - Publisher Connector

UPCommons. Portal del coneixement obert de la UPC

Proceedings of the 2nd Computer Science Student Workshop: Microsoft Istanbul, Turkey, April 9, 2011

Publication venue: 'Sabanci University Information Center'
Publication date: 01/01/2011

Sabanci University Research Database

Pytrec_eval: An Extremely Fast Python Interface to trec_eval

Author: Koepke H.
ST.
Tague J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

We introduce pytrec_eval, a Python interface to the trec_eval information retrieval evaluation toolkit. pytrec_eval exposes the reference implementations of trec_eval within Python as a native extension. We show that pytrec_eval is around one order of magnitude faster than invoking trec_eval as a subprocess from within Python. Compared to a native Python implementation of NDCG, pytrec_eval is twice as fast for practically-sized rankings. Finally, we demonstrate its effectiveness in an application where pytrec_eval is combined with Pyndri and the OpenAI Gym, where query expansion is learned using Q-learning. Comment: SIGIR '18. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval.

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications
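
For readers unfamiliar with the NDCG measure named in the pytrec_eval abstract above, the following is a minimal pure-Python sketch of the kind of "native Python implementation" the abstract compares against. It is not pytrec_eval's or trec_eval's code. The function names `dcg` and `ndcg`, the log2 discount, and the use of raw relevance grades as gains are assumptions made for illustration.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1),
    with ranks starting at 1 (enumerate starts at 0, hence the + 2)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg(ranking, relevance):
    """NDCG of a ranked list of document ids.

    ranking   -- document ids in ranked order (hypothetical input format)
    relevance -- dict mapping document id to a graded relevance judgement
    """
    # Gains of the system ranking; unjudged documents count as non-relevant.
    gains = [relevance.get(doc, 0) for doc in ranking]
    # The ideal ranking orders all judged documents by decreasing relevance.
    ideal_dcg = dcg(sorted(relevance.values(), reverse=True))
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Calling `ndcg(["d1", "d2"], {"d1": 1, "d2": 0})` scores a ranking that places the only relevant document first. Swapping the two documents lowers the score because the relevant document is then discounted by its rank.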

Keyword based categorisation of diary entries to support personal Internet content pre-caching on mobile devices

Author: Dunlop M.D.
Komninos A.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2004

This paper presents a study into the effectiveness of our algorithm for automatic categorisation of real users' diary entries, as a first step towards personal Internet content pre-caching on mobile devices. The study reports an experiment comparing trial subjects' allocations of 99 diary entries to those predicted by a keyword-based algorithm. While leaving considerable grounds for improvement, results are positive and pave the way for supporting mobile services based on categorising users' diary entries.

University of Strathclyde Institutional Repository
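
The abstract above does not describe the keyword-based algorithm itself, so the following is only a generic sketch of keyword-overlap categorisation, not the authors' method. The category names, the keyword sets in `CATEGORY_KEYWORDS`, and the function `categorise` are all hypothetical.

```python
import re

# Hypothetical categories and keywords, chosen purely for illustration.
CATEGORY_KEYWORDS = {
    "travel": {"flight", "train", "airport", "hotel"},
    "meeting": {"meeting", "agenda", "review", "call"},
    "leisure": {"cinema", "dinner", "football", "concert"},
}


def categorise(entry, keywords=CATEGORY_KEYWORDS):
    """Return the category whose keyword set overlaps most with the words
    of the diary entry, or None when no keyword matches at all."""
    tokens = set(re.findall(r"[a-z]+", entry.lower()))
    scores = {cat: len(tokens & words) for cat, words in keywords.items()}
    best = max(scores, key=scores.get)  # ties go to the first category listed
    return best if scores[best] > 0 else None
```

A predicted category such as `categorise("Flight to Glasgow, check hotel")` could then be compared with the category a trial subject assigned to the same entry, which is the style of comparison the abstract reports.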

Calendar based contextual information as an Internet content pre-caching tool

Author: Dunlop M.D.
Komninos A.
Publication venue: ACM Press
Publication date: 01/01/2005

Motivated by the need to access internet content on mobile devices with expensive or non-existent network access, this paper discusses the possibility for contextual information extracted from electronic calendars to be used as sources for Internet content predictive retrieval (pre-caching). Our results show that calendar based contextual information is useful for this purpose and that calendar based information can produce web queries that are relevant to the users' task supportive information needs

University of Strathclyde Institutional Repository