Probabilistic learning for selective dissemination of information
New methods and new systems are needed to filter or selectively distribute the increasing volume of electronic information being produced nowadays. An effective information filtering system is one that provides exactly the information that fulfills the user's interests, with minimum effort by the user to describe them. Such a system must also adapt to the user's changing interests. In this paper we describe and evaluate a learning model for information filtering that is an adaptation of the generalized probabilistic model of information retrieval. The model is based on the concept of 'uncertainty sampling', a technique that allows for relevance feedback on both relevant and nonrelevant documents. The proposed learning model is the core of a prototype information filtering system called ProFile.
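The "uncertainty sampling" idea can be illustrated with a short sketch: the filter asks the user for relevance feedback on the documents it is least certain about, i.e. those whose predicted relevance probability is closest to 0.5. The toy scoring function and profile below are illustrative assumptions, not ProFile's actual model.

```python
# Sketch of uncertainty sampling for an adaptive information filter.
# The probability model here is a hypothetical stand-in, not ProFile's.

def uncertainty_sample(documents, predict_relevance, k=3):
    """Return the k documents whose predicted relevance is closest to 0.5,
    i.e. those the filter is least certain about, to query for feedback."""
    return sorted(documents, key=lambda d: abs(predict_relevance(d) - 0.5))[:k]

# Toy relevance estimate: fraction of profile terms present in the document.
profile = {"filtering", "probabilistic", "retrieval"}

def predict_relevance(doc):
    words = set(doc.lower().split())
    return len(words & profile) / len(profile)

docs = [
    "probabilistic information filtering and retrieval",  # clearly relevant
    "genetic algorithms for routing",                     # clearly irrelevant
    "a probabilistic approach to ranking",                # uncertain case
]
queried = uncertainty_sample(docs, predict_relevance, k=1)
# The "uncertain" document is selected for user feedback.
```

Feedback on these borderline documents is the most informative for updating the user profile, which is why the technique supports learning from both relevant and nonrelevant judgments.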
Dissimilarity metric based on local neighboring information and genetic programming for data dissemination in vehicular ad hoc networks (VANETs)
This paper presents a novel dissimilarity metric based on local neighboring information
and a genetic programming approach for efficient data dissemination in Vehicular Ad Hoc Networks
(VANETs). The primary aim of the dissimilarity metric is to replace the Euclidean distance in
probabilistic data dissemination schemes, which use the relative Euclidean distance among vehicles
to determine the retransmission probability. The novel dissimilarity metric is obtained by applying a
metaheuristic genetic programming approach, which provides a formula that maximizes the Pearson
Correlation Coefficient between the novel dissimilarity metric and the Euclidean metric in several
representative VANET scenarios. Findings show that the obtained dissimilarity metric correlates with
the Euclidean distance up to 8.9% better than classical dissimilarity metrics. Moreover, the obtained
dissimilarity metric is evaluated when used in well-known data dissemination schemes, such as
p-persistence, polynomial, and irresponsible algorithms. The obtained dissimilarity metric achieves
significant improvements in terms of reachability in comparison with the classical dissimilarity
metrics and the Euclidean metric-based schemes in the studied VANET urban scenarios.
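For context, the weighted p-persistence rule that such probabilistic schemes build on sets a vehicle's retransmission probability in proportion to its distance from the sender. A minimal sketch follows; the `p = d / R` form is the common baseline formulation (the paper's contribution is to replace the Euclidean distance `d` in rules like this with its learned dissimilarity metric):

```python
import random

def p_persistence_prob(distance, radio_range):
    """Retransmission probability under weighted p-persistence: vehicles
    farther from the sender rebroadcast with higher probability (p = d / R,
    capped at 1), so the message propagates outward with few redundant sends."""
    return min(distance / radio_range, 1.0)

def should_rebroadcast(distance, radio_range, rng=random.random):
    """Bernoulli trial deciding whether this vehicle rebroadcasts."""
    return rng() < p_persistence_prob(distance, radio_range)

# A vehicle 200 m from the sender, with a 250 m radio range,
# rebroadcasts with probability 0.8.
p = p_persistence_prob(200.0, 250.0)
```

Distant receivers cover the most new area per rebroadcast, which is why the probability grows with distance; the learned metric aims to preserve this behavior while being cheaper or more robust to compute from local neighboring information.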
Formal models, usability and related work in IR (editorial for special edition)
The Glasgow IR group has carried out both theoretical and empirical work, aimed at giving end users efficient and effective access to large collections of multimedia data.
Toward Reliable Contention-aware Data Dissemination in Multi-hop Cognitive Radio Ad Hoc Networks
This paper introduces a new channel selection strategy, SURF, for reliable
contention-aware data dissemination in multi-hop cognitive radio networks. The
key challenge here is to select channels providing a good tradeoff between
connectivity and contention. In other words, channels with good opportunities
for communication due to (1) low primary radio (PR) node activity, and (2)
limited contention among cognitive radio (CR) nodes accessing that channel, have
to be selected. Thus, by dynamically exploring residual resources on channels
and by monitoring the number of CRs on a particular channel, SURF allows
building a connected network with limited contention where reliable
communication can take place. Through simulations, we study the performance of
SURF when compared with three other related approaches. Simulation results
confirm that our approach is effective in selecting the best channels for
efficient and reliable multi-hop data dissemination.
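The connectivity/contention tradeoff can be sketched as a channel-scoring function that rewards low primary-radio activity and few contending cognitive-radio nodes. The weighting and functional form below are a hypothetical illustration, not SURF's actual formula:

```python
def channel_score(pr_activity, num_crs, weight=0.5):
    """Hypothetical channel score: prefer channels with low primary radio (PR)
    activity and few contending cognitive radio (CR) nodes. This is an
    illustrative combination, not SURF's actual selection metric."""
    availability = 1.0 - pr_activity      # chance the channel is free of PRs
    contention = 1.0 / (1 + num_crs)      # degrades as more CRs contend
    return weight * availability + (1 - weight) * contention

# (pr_activity, number of contending CRs) observed per channel:
channels = {
    "ch1": (0.9, 2),    # mostly occupied by PRs
    "ch2": (0.2, 10),   # free of PRs but heavily contended
    "ch3": (0.3, 3),    # balanced: fairly free, lightly contended
}
best = max(channels, key=lambda c: channel_score(*channels[c]))
```

The balanced channel wins over both the PR-occupied one and the contention-heavy one, mirroring the tradeoff the paper describes: a channel is only useful if it is both available and not saturated by other CRs.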
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting of the manual
definition of a classifier by domain experts) are very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.
Comment: Accepted for publication in ACM Computing Surveys.
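The inductive process the survey describes — automatically learning the characteristics of categories from a set of preclassified documents — can be sketched with a minimal multinomial Naive Bayes classifier, one of the standard techniques such surveys cover. The toy corpus and Laplace smoothing below are illustrative choices:

```python
import math
from collections import Counter, defaultdict

def train_nb(labelled_docs):
    """Learn per-category word counts from preclassified documents
    (minimal multinomial Naive Bayes; not tied to any specific system)."""
    word_counts = defaultdict(Counter)
    cat_counts = Counter()
    vocab = set()
    for text, cat in labelled_docs:
        words = text.lower().split()
        word_counts[cat].update(words)
        cat_counts[cat] += 1
        vocab.update(words)
    return word_counts, cat_counts, vocab

def classify(text, word_counts, cat_counts, vocab):
    """Assign the category maximizing log P(cat) + sum of log P(word|cat)."""
    total_docs = sum(cat_counts.values())
    best_cat, best_lp = None, float("-inf")
    for cat in cat_counts:
        lp = math.log(cat_counts[cat] / total_docs)
        total_words = sum(word_counts[cat].values())
        for w in text.lower().split():
            # Laplace smoothing so unseen words don't zero the probability.
            lp += math.log((word_counts[cat][w] + 1) / (total_words + len(vocab)))
        if lp > best_lp:
            best_cat, best_lp = cat, lp
    return best_cat

# Tiny preclassified training set (illustrative data):
train = [
    ("stock market shares fall", "finance"),
    ("team wins the league match", "sport"),
]
model = train_nb(train)
label = classify("shares fall on the market", *model)
```

The classifier is built entirely from the labelled examples, with no hand-written rules — the portability and manpower advantages the survey attributes to the machine learning approach.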