Semantic Similarity of Spatial Scenes
The formalization of similarity in spatial information systems can unleash their functionality and contribute technology that is not only useful but also desirable to broad groups of users. As a paradigm for information retrieval, similarity supersedes tedious querying techniques and unveils novel ways for user-system interaction by naturally supporting modalities such as speech and sketching. As a tool within the scope of a broader objective, it can facilitate such diverse tasks as data integration, landmark determination, and prediction making. This potential motivated the development of several similarity models within the geospatial and computer science communities. Despite the merit of these studies, their cognitive plausibility can be limited due to neglect of well-established psychological principles about the properties and behaviors of similarity. Moreover, such approaches are typically guided by experience, intuition, and observation, thereby often relying on narrower perspectives or restrictive assumptions that produce inflexible and incompatible measures. This thesis consolidates such fragmentary efforts and integrates them, along with novel formalisms, into a scalable, comprehensive, and cognitively-sensitive framework for similarity queries in spatial information systems. Three conceptually different similarity queries at the levels of attributes, objects, and scenes are distinguished. An analysis of the relationship between similarity and change provides a unifying basis for the approach and a theoretical foundation for measures satisfying important similarity properties such as asymmetry and context dependence. The classification of attributes into categories with common structural and cognitive characteristics drives the implementation of a small core of generic functions, able to perform any type of attribute value assessment.
Appropriate techniques combine such atomic assessments to compute similarities at the object level and to handle more complex inquiries with multiple constraints. These techniques, along with a solid graph-theoretical methodology adapted to the particularities of the geospatial domain, provide the foundation for reasoning about scene similarity queries. Provisions are made so that all methods comply with major psychological findings about people’s perceptions of similarity. An experimental evaluation supplies the main result of this thesis, which separates psychological findings with a major impact on the results from those that can be safely incorporated into the framework through computationally simpler alternatives.
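The asymmetry property mentioned above is exemplified by Tversky's ratio model of similarity, in which comparing a variant to a prototype need not yield the same score as the reverse comparison. A minimal sketch over feature sets (the weights alpha and beta are illustrative choices, not values taken from the thesis):

```python
def tversky_similarity(a, b, alpha=0.8, beta=0.2):
    """Tversky's ratio model over feature sets: common features raise
    similarity, distinctive features lower it. Asymmetric whenever
    alpha != beta, so s(a, b) != s(b, a) in general -- one of the
    psychological properties such frameworks aim to support."""
    common = len(a & b)
    return common / (common + alpha * len(a - b) + beta * len(b - a))
```

Swapping the roles of the two compared scenes changes the score whenever alpha differs from beta, which is how set-theoretic measures accommodate the finding that human similarity judgments are directional.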
Kernel-Based Ranking: Methods for Learning and Performance Estimation
Machine learning provides tools for automated construction of predictive
models in data intensive areas of engineering and science. The family of
regularized kernel methods has in recent years become one of the mainstream
approaches to machine learning, due to a number of advantages these
methods share. The approach provides theoretically well-founded solutions
to the problems of under- and overfitting, allows learning from structured
data, and has been empirically demonstrated to yield high predictive performance
on a wide range of application domains. Historically, the problems
of classification and regression have gained the majority of attention in the
field. In this thesis we focus on another type of learning problem, that of
learning to rank.
In learning to rank, the aim is to learn, from a set of past observations,
a ranking function that can order new objects according to how well they
match some underlying criterion of goodness. As an important special case
of the setting, we can recover the bipartite ranking problem, corresponding
to maximizing the area under the ROC curve (AUC) in binary classification.
Ranking applications appear in a large variety of settings; examples
encountered in this thesis include document retrieval in web search, recommender
systems, information extraction and automated parsing of natural
language. We consider the pairwise approach to learning to rank, where
ranking models are learned by minimizing the expected probability of ranking
any two randomly drawn test examples incorrectly. The development
of computationally efficient kernel methods, based on this approach, has in
the past proven to be challenging. Moreover, it is not clear what techniques
for estimating the predictive performance of learned models are the most
reliable in the ranking setting, and how the techniques can be implemented
efficiently.
The contributions of this thesis are as follows. First, we develop
RankRLS, a computationally efficient kernel method for learning to rank,
that is based on minimizing a regularized pairwise least-squares loss. In
addition to training methods, we introduce a variety of algorithms for tasks
such as model selection, multi-output learning, and cross-validation, based
on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm,
which is one of the most well established methods for learning to
rank. Third, we study the combination of the empirical kernel map and reduced
set approximation, which allows the large-scale training of kernel machines
using linear solvers, and propose computationally efficient solutions
to cross-validation when using the approach. Next, we explore the problem
of reliable cross-validation when using AUC as a performance criterion,
through an extensive simulation study. We demonstrate that the proposed
leave-pair-out cross-validation approach leads to more reliable performance
estimation than commonly used alternative approaches. Finally, we present
a case study on applying machine learning to information extraction from
biomedical literature, which combines several of the approaches considered
in the thesis. The thesis is divided into two parts. Part I provides the background
for the research work and summarizes the most central results; Part
II consists of the five original research articles that are the main contribution
of this thesis.
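The leave-pair-out cross-validation scheme studied in the thesis can be illustrated independently of the kernel machinery: every positive-negative pair is held out in turn, the model is trained on the remaining data, and the held-out pair is checked for correct ordering. A minimal sketch, with a plain least-squares scorer standing in for the actual learner (the function names here are illustrative, not from the thesis):

```python
import numpy as np

def leave_pair_out_auc(X, y, fit, predict):
    """Leave-pair-out CV estimate of AUC: hold out each
    positive-negative pair, train on the rest, and count how often
    the positive example is scored above the negative (ties count 1/2)."""
    pos = np.where(y == 1)[0]
    neg = np.where(y == 0)[0]
    wins = 0.0
    for i in pos:
        for j in neg:
            keep = np.ones(len(y), dtype=bool)
            keep[[i, j]] = False          # hold out the pair (i, j)
            model = fit(X[keep], y[keep])
            s_i, s_j = predict(model, X[[i, j]])
            wins += 1.0 if s_i > s_j else (0.5 if s_i == s_j else 0.0)
    return wins / (len(pos) * len(neg))

# A plain least-squares scorer as a stand-in learner.
ols_fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
ols_predict = lambda w, X: X @ w
```

The estimate averages over all positive-negative pairs, mirroring the pairwise definition of AUC itself; this naive version retrains once per pair, whereas the thesis derives matrix-algebra shortcuts that avoid the retraining.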
Global and Preference-based Optimization with Mixed Variables using Piecewise Affine Surrogates
Optimization problems involving mixed variables, i.e., variables of numerical
and categorical nature, can be challenging to solve, especially in the presence
of complex constraints. Moreover, when the objective function is the result of
a complicated simulation or experiment, it may be expensive to evaluate. This
paper proposes a novel surrogate-based global optimization algorithm to solve
linearly constrained mixed-variable problems up to medium-large size (around
100 variables after encoding and 20 constraints) based on constructing a
piecewise affine surrogate of the objective function over feasible samples. We
introduce two types of exploration functions to efficiently search the feasible
domain via mixed-integer linear programming solvers. We also provide a
preference-based version of the algorithm, which can be used when only pairwise
comparisons between samples can be acquired while the underlying objective
function to minimize remains unquantified. The two algorithms are tested on
mixed-variable benchmark problems with and without constraints. The results
show that, within a small number of acquisitions, the proposed algorithms can
often achieve results that are better than or comparable to those of other existing methods.
Comment: code available at https://github.com/mjzhu-p/PWA
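The core idea of a piecewise affine surrogate, partitioning the sampled points and fitting one affine model per region, can be sketched in a few lines. This toy version takes the region assignment as given; the paper's algorithm additionally derives the partition, handles mixed variables, and enforces constraints, all of which is omitted here:

```python
import numpy as np

def fit_pwa(X, y, assign):
    """Fit one affine model (coefficients plus intercept) per region;
    assign[i] is the region index of sample i."""
    models = {}
    for r in np.unique(assign):
        m = assign == r
        A = np.hstack([X[m], np.ones((m.sum(), 1))])  # augment with intercept
        coef, *_ = np.linalg.lstsq(A, y[m], rcond=None)
        models[r] = coef
    return models

def predict_pwa(models, X, assign):
    """Evaluate the surrogate: each sample uses its region's affine model."""
    A = np.hstack([X, np.ones((len(X), 1))])
    return np.array([A[i] @ models[assign[i]] for i in range(len(X))])
```

On a function that is genuinely piecewise affine over the chosen regions (such as the absolute value split at zero), this surrogate is exact; in the paper's setting the surrogate is only an approximation that guides where to evaluate the expensive objective next.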
Approximation Algorithms for Envy-Free Cake Division with Connected Pieces
Cake cutting is a classic model for studying fair division of a heterogeneous, divisible resource among agents with individual preferences. Addressing cake division under a typical requirement that each agent must receive a connected piece of the cake, we develop approximation algorithms for finding envy-free (fair) cake divisions. In particular, this work improves the state-of-the-art additive approximation bound for this fundamental problem. Our results hold for general cake division instances in which the agents' valuations satisfy basic assumptions and are normalized (to have value 1 for the cake). Furthermore, the developed algorithms execute in polynomial time under the standard Robertson-Webb query model.
Prior work has shown that one can efficiently compute a cake division (with connected pieces) in which the additive envy of any agent is at most 1/3. An efficient algorithm is also known for finding connected cake divisions that are (almost) 1/2-multiplicatively envy-free. Improving the additive approximation guarantee and maintaining the multiplicative one, we develop a polynomial-time algorithm that computes a connected cake division that is both (1/4 +o(1))-additively envy-free and (1/2 - o(1))-multiplicatively envy-free. Our algorithm is based on the ideas of interval growing and envy-cycle elimination.
In addition, we study cake division instances in which the number of distinct valuations across the agents is parametrically bounded. We show that such cake division instances admit a fully polynomial-time approximation scheme for connected envy-free cake division.
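For valuations that are piecewise constant over a common set of intervals, the additive envy of a connected division is straightforward to evaluate: each agent compares the value of its own piece against every other contiguous piece. A small sketch (the representation choices here are ours, not the paper's):

```python
import numpy as np

def max_additive_envy(densities, blocks):
    """densities: (n_agents, m) piecewise-constant value densities,
    each row summing to 1 (normalized value for the whole cake).
    blocks: list of (start, end) interval ranges; blocks[k] is the
    connected piece given to agent k.
    Returns max over i, j of v_i(piece_j) - v_i(own piece); a division
    is alpha-additively envy-free when this is at most alpha."""
    n = densities.shape[0]
    vals = np.array([[densities[i, s:e].sum() for (s, e) in blocks]
                     for i in range(n)])
    own = np.diag(vals)
    return float(np.max(vals - own[:, None]))
```

Since each agent trivially does not envy its own piece, the returned value is always nonnegative, and zero exactly when the division is envy-free.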
Fair Allocation of Goods and Chores -- Tutorial and Survey of Recent Results
Fair resource allocation is an important problem in many real-world
scenarios, where resources such as goods and chores must be allocated among
agents. In this survey, we delve into the intricacies of fair allocation,
focusing specifically on the challenges associated with indivisible resources.
We define fairness and efficiency within this context and thoroughly survey
existential results, algorithms, and approximations that satisfy various
fairness criteria, including envy-freeness, proportionality, MMS, and their
relaxations. Additionally, we discuss algorithms that achieve fairness together
with efficiency notions such as Pareto optimality and utilitarian welfare. We also study
the computational complexity of these algorithms, the likelihood of finding
fair allocations, and the price of fairness for each fairness notion. We also
cover mixed instances of indivisible and divisible items and investigate
different valuation and allocation settings. By summarizing the
state-of-the-art research, this survey provides valuable insights into fair
resource allocation of indivisible goods and chores, highlighting computational
complexities, fairness guarantees, and trade-offs between fairness and
efficiency. It serves as a foundation for future advancements in this vital
field.
An efficient algorithm for learning to rank from preference graphs
In this paper, we introduce a framework for regularized least-squares (RLS)-type ranking cost functions, and we propose three such cost functions. Further, we propose a kernel-based preference learning algorithm, which we call RankRLS, for minimizing these functions. It is shown that RankRLS has many computational advantages compared to ranking algorithms that are based on minimizing other types of costs, such as the hinge cost. In particular, we present efficient algorithms for training, parameter selection, multiple output learning, cross-validation, and large-scale learning. Circumstances under which these computational benefits make RankRLS preferable to RankSVM are considered. We evaluate RankRLS on four different types of ranking tasks using RankSVM and standard RLS regression as the baselines. RankRLS outperforms standard RLS regression, and its performance is very similar to that of RankSVM, while RankRLS has several computational benefits over RankSVM.
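For the linear case, the pairwise regularized least-squares objective minimized by RankRLS can be written down directly: regress the pairwise score differences onto the pairwise label differences, with a ridge penalty. A naive sketch that materializes all pairs explicitly; the computational shortcuts contributed by these works exist precisely to avoid this O(n^2) blow-up:

```python
import numpy as np

def rank_rls_linear(X, y, lam=1.0):
    """Naive linear pairwise RLS ranking: minimize
    sum_{i<j} (w.(x_i - x_j) - (y_i - y_j))^2 + lam * ||w||^2
    by solving the corresponding regularized normal equations."""
    n, d = X.shape
    ii, jj = np.triu_indices(n, k=1)
    D = X[ii] - X[jj]          # all pairwise feature differences
    t = y[ii] - y[jj]          # all pairwise label differences
    return np.linalg.solve(D.T @ D + lam * np.eye(d), D.T @ t)
```

A model trained this way is used only through the ordering of its scores X @ w; absolute score values carry no meaning, which is what distinguishes the ranking setting from regression.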
Data analytics
This study guide substantiates the nature, role, and importance of data, information, and analytical work; explains its basic principles within the modern information environment; and considers the main approaches and basic tools used by specialists in political analytics and social work when performing analytical tasks.
Multi-Agent Systems for Computational Economics and Finance
In this article we survey the main research topics of our group at the University of Essex. Our research interests lie at the intersection of theoretical computer science, artificial intelligence, and economic theory. In particular, we focus on the design and analysis of mechanisms for systems involving multiple strategic agents, both from a theoretical and an applied perspective. We present an overview of our group’s activities, as well as its members, and then discuss in detail past, present, and future work in multi-agent systems.
A Few Queries Go a Long Way: Information-Distortion Tradeoffs in Matching
We consider the One-Sided Matching problem, where n agents have preferences over n items, and these preferences are induced by underlying cardinal valuation functions. The goal is to match every agent to a single item so as to maximize the social welfare. Most of the related literature, however, assumes that the values of the agents are not a priori known, and only access to the ordinal preferences of the agents over the items is provided. Consequently, this incomplete information leads to loss of efficiency, which is measured by the notion of distortion. In this paper, we further assume that the agents can answer a small number of queries, allowing us partial access to their values. We study the interplay between elicited cardinal information (measured by the number of queries per agent) and distortion for One-Sided Matching, as well as a wide range of well-studied related problems. Qualitatively, our results show that with a limited number of queries, it is possible to obtain significant improvements over the classic setting, where only access to ordinal information is given.
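The full-information benchmark against which distortion is measured is the welfare-maximizing matching over the cardinal values. For small n it can be computed by brute force over permutations (illustrative only; the paper's subject is how closely ordinal preferences plus a few cardinal queries can approximate this benchmark):

```python
from itertools import permutations

def max_welfare_matching(values):
    """values[i][j] = agent i's cardinal value for item j.
    Returns the assignment (item given to each agent) that maximizes
    social welfare, together with that welfare."""
    n = len(values)
    best = max(permutations(range(n)),
               key=lambda p: sum(values[i][p[i]] for i in range(n)))
    return best, sum(values[i][best[i]] for i in range(n))
```

The distortion of an ordinal mechanism is then the worst-case ratio between this optimal welfare and the welfare of the matching the mechanism produces from rankings alone.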