
    Semantic Similarity of Spatial Scenes

    The formalization of similarity in spatial information systems can unleash their functionality and contribute technology that is not only useful but also desirable to broad groups of users. As a paradigm for information retrieval, similarity supersedes tedious querying techniques and unveils novel ways for user-system interaction by naturally supporting modalities such as speech and sketching. As a tool within the scope of a broader objective, it can facilitate such diverse tasks as data integration, landmark determination, and prediction making. This potential motivated the development of several similarity models within the geospatial and computer science communities. Despite the merit of these studies, their cognitive plausibility can be limited due to neglect of well-established psychological principles about the properties and behaviors of similarity. Moreover, such approaches are typically guided by experience, intuition, and observation, thereby often relying on narrow perspectives or restrictive assumptions that produce inflexible and incompatible measures. This thesis consolidates such fragmentary efforts and integrates them along with novel formalisms into a scalable, comprehensive, and cognitively sensitive framework for similarity queries in spatial information systems. Three conceptually different similarity queries at the levels of attributes, objects, and scenes are distinguished. An analysis of the relationship between similarity and change provides a unifying basis for the approach and a theoretical foundation for measures satisfying important similarity properties such as asymmetry and context dependence. The classification of attributes into categories with common structural and cognitive characteristics drives the implementation of a small core of generic functions, able to perform any type of attribute value assessment. Appropriate techniques combine such atomic assessments to compute similarities at the object level and to handle more complex inquiries with multiple constraints. These techniques, along with a solid graph-theoretical methodology adapted to the particularities of the geospatial domain, provide the foundation for reasoning about scene similarity queries. Provisions are made so that all methods comply with major psychological findings about people’s perceptions of similarity. An experimental evaluation supplies the main result of this thesis, which separates psychological findings with a major impact on the results from those that can be safely incorporated into the framework through computationally simpler alternatives.
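    The asymmetry and context dependence mentioned above are well illustrated by Tversky's feature-based similarity models from the psychological literature. The sketch below is an illustration of that general idea in Python, not the model developed in the thesis; the feature sets and weights are invented for the example.

        # A minimal sketch (not this thesis's model) of Tversky's ratio model over
        # feature sets: similarity is asymmetric because the weights alpha and beta
        # treat the distinctive features of the two compared scenes differently.
        def tversky_similarity(a: set, b: set, alpha: float = 0.8, beta: float = 0.2) -> float:
            """Similarity of a to b; swapping the arguments generally changes the value."""
            common = len(a & b)
            only_a = len(a - b)   # features of a not shared by b
            only_b = len(b - a)   # features of b not shared by a
            return common / (common + alpha * only_a + beta * only_b)

        # Hypothetical feature sets for a sparsely and a richly described scene.
        scene_a = {"lake", "road"}
        scene_b = {"lake", "road", "forest", "bridge", "house"}
        print(tversky_similarity(scene_a, scene_b))  # ~0.77
        print(tversky_similarity(scene_b, scene_a))  # ~0.45, illustrating asymmetry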

    Kernel-Based Ranking. Methods for Learning and Performance Estimation

    Machine learning provides tools for the automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction, and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how these techniques can be implemented efficiently. The contributions of this thesis are as follows. First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts: Part I provides the background for the research work and summarizes the most central results, while Part II consists of the five original research articles that are the main contribution of this thesis.
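    As an illustration of the leave-pair-out idea discussed above, the following naive Python sketch estimates AUC by holding out each positive-negative pair, refitting a simple classifier on the remaining data, and checking whether the held-out positive is scored above the held-out negative. The quadratic refitting loop and the use of scikit-learn's LogisticRegression are choices made for this toy example only; the thesis instead develops efficient computational shortcuts.

        # Naive leave-pair-out cross-validation for AUC (illustration only; the
        # thesis derives computational shortcuts that avoid refitting per pair).
        import numpy as np
        from itertools import product
        from sklearn.linear_model import LogisticRegression

        def leave_pair_out_auc(X, y):
            pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
            pairs = list(product(pos, neg))
            wins = 0.0
            for i, j in pairs:
                keep = np.setdiff1d(np.arange(len(y)), [i, j])
                model = LogisticRegression().fit(X[keep], y[keep])
                s_pos, s_neg = model.decision_function(X[[i, j]])
                wins += 1.0 if s_pos > s_neg else 0.5 if s_pos == s_neg else 0.0
            return wins / len(pairs)

        # Toy data: the first feature carries the signal.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(30, 5))
        y = (X[:, 0] + 0.5 * rng.normal(size=30) > 0).astype(int)
        print(leave_pair_out_auc(X, y))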

    Global and Preference-based Optimization with Mixed Variables using Piecewise Affine Surrogates

    Optimization problems involving mixed variables, i.e., variables of numerical and categorical nature, can be challenging to solve, especially in the presence of complex constraints. Moreover, when the objective function is the result of a complicated simulation or experiment, it may be expensive to evaluate. This paper proposes a novel surrogate-based global optimization algorithm to solve linearly constrained mixed-variable problems up to medium-large size (around 100 variables after encoding and 20 constraints), based on constructing a piecewise affine surrogate of the objective function over feasible samples. We introduce two types of exploration functions to efficiently search the feasible domain via mixed-integer linear programming solvers. We also provide a preference-based version of the algorithm, which can be used when only pairwise comparisons between samples can be acquired while the underlying objective function to minimize remains unquantified. The two algorithms are tested on mixed-variable benchmark problems with and without constraints. The results show that, within a small number of acquisitions, the proposed algorithms often achieve results that are better than or comparable to those of other existing methods. Code available at https://github.com/mjzhu-p/PWA
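    The following toy Python sketch illustrates the general surrogate-plus-exploration loop on a one-dimensional continuous problem: a piecewise linear (affine) surrogate is fit to the sampled objective values, and a distance-based exploration bonus determines the next sample. The 1-D objective, grid-search acquisition step, and exploration weight are invented for this illustration; the paper's algorithm handles mixed variables and linear constraints and optimizes its exploration functions with mixed-integer linear programming.

        # Toy surrogate-based loop: piecewise linear surrogate + distance-based
        # exploration, next sample chosen by grid search (the paper instead builds
        # piecewise affine surrogates over mixed variables and uses MILP solvers).
        import numpy as np

        def expensive_objective(x):            # stand-in for a costly simulation
            return np.sin(3 * x) + 0.1 * x ** 2

        X = list(np.linspace(-2.0, 2.0, 4))    # initial samples
        Y = [expensive_objective(x) for x in X]
        grid = np.linspace(-2.0, 2.0, 401)

        for _ in range(10):
            order = np.argsort(X)
            xs, ys = np.array(X)[order], np.array(Y)[order]
            surrogate = np.interp(grid, xs, ys)               # piecewise affine fit
            dist = np.min(np.abs(grid[:, None] - xs[None, :]), axis=1)
            acquisition = surrogate - 0.5 * dist              # exploit minus explore
            x_next = grid[np.argmin(acquisition)]
            X.append(x_next)
            Y.append(expensive_objective(x_next))

        print("best sample:", X[int(np.argmin(Y))], "best value:", min(Y))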

    Approximation Algorithms for Envy-Free Cake Division with Connected Pieces

    Cake cutting is a classic model for studying fair division of a heterogeneous, divisible resource among agents with individual preferences. Addressing cake division under a typical requirement that each agent must receive a connected piece of the cake, we develop approximation algorithms for finding envy-free (fair) cake divisions. In particular, this work improves the state-of-the-art additive approximation bound for this fundamental problem. Our results hold for general cake division instances in which the agents' valuations satisfy basic assumptions and are normalized (to have value 1 for the cake). Furthermore, the developed algorithms execute in polynomial time under the standard Robertson-Webb query model. Prior work has shown that one can efficiently compute a cake division (with connected pieces) in which the additive envy of any agent is at most 1/3. An efficient algorithm is also known for finding connected cake divisions that are (almost) 1/2-multiplicatively envy-free. Improving the additive approximation guarantee and maintaining the multiplicative one, we develop a polynomial-time algorithm that computes a connected cake division that is both (1/4 + o(1))-additively envy-free and (1/2 - o(1))-multiplicatively envy-free. Our algorithm is based on the ideas of interval growing and envy-cycle elimination. In addition, we study cake division instances in which the number of distinct valuations across the agents is parametrically bounded. We show that such cake division instances admit a fully polynomial-time approximation scheme for connected envy-free cake division.
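    To make the notion of additive envy concrete, the following small Python helper (illustrative only, not the paper's algorithm) measures the maximum additive envy of a connected division, given each agent's normalized interval-evaluation queries in the spirit of the Robertson-Webb model. The valuation densities, cut points, and assignment are invented for the example.

        # Illustrative helper (not the paper's algorithm): maximum additive envy of
        # a connected division of the cake [0, 1], using normalized interval-value
        # queries in the spirit of the Robertson-Webb model.
        from scipy.integrate import quad

        def make_eval(density):
            # Turn a value-density function into a normalized interval evaluator.
            total, _ = quad(density, 0.0, 1.0)
            return lambda a, b: quad(density, a, b)[0] / total

        def max_additive_envy(evals, cuts, assignment):
            """cuts: sorted cut points including 0 and 1; assignment[i]: piece index of agent i."""
            pieces = list(zip(cuts[:-1], cuts[1:]))
            envy = 0.0
            for i, ev in enumerate(evals):
                own = ev(*pieces[assignment[i]])
                envy = max(envy, max(ev(a, b) for a, b in pieces) - own)
            return envy

        # Two invented agents: one uniform, one that values the right end more.
        evals = [make_eval(lambda x: 1.0), make_eval(lambda x: 2 * x)]
        print(max_additive_envy(evals, [0.0, 0.5, 1.0], assignment=[0, 1]))  # 0.0: envy-free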

    Fair Allocation of goods and chores -- Tutorial and Survey of Recent Results

    Fair resource allocation is an important problem in many real-world scenarios, where resources such as goods and chores must be allocated among agents. In this survey, we delve into the intricacies of fair allocation, focusing specifically on the challenges associated with indivisible resources. We define fairness and efficiency within this context and thoroughly survey existential results, algorithms, and approximations that satisfy various fairness criteria, including envy-freeness, proportionality, MMS, and their relaxations. Additionally, we discuss algorithms that achieve fairness together with efficiency notions such as Pareto optimality and utilitarian welfare. We also study the computational complexity of these algorithms, the likelihood of finding fair allocations, and the price of fairness for each fairness notion. We further cover mixed instances of indivisible and divisible items and investigate different valuation and allocation settings. By summarizing the state-of-the-art research, this survey provides valuable insights into fair resource allocation of indivisible goods and chores, highlighting computational complexities, fairness guarantees, and trade-offs between fairness and efficiency. It serves as a foundation for future advancements in this vital field.
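    As a concrete example of the kind of algorithm surveyed in this line of work, the Python sketch below implements the classic round-robin procedure, which for additive valuations over indivisible goods produces an allocation that is envy-free up to one good (EF1). The example valuations are invented.

        # Round-robin allocation of indivisible goods: for additive valuations this
        # classic procedure is envy-free up to one good (EF1). Valuations invented.
        def round_robin(valuations):
            """valuations[i][g] is agent i's additive value for good g."""
            n, m = len(valuations), len(valuations[0])
            remaining = set(range(m))
            bundles = [[] for _ in range(n)]
            turn = 0
            while remaining:
                i = turn % n
                g = max(remaining, key=lambda g: valuations[i][g])  # favorite good left
                bundles[i].append(g)
                remaining.remove(g)
                turn += 1
            return bundles

        vals = [[8, 1, 5, 3],
                [2, 9, 4, 6]]
        print(round_robin(vals))  # [[0, 2], [1, 3]]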

    An efficient algorithm for learning to rank from preference graphs

    In this paper, we introduce a framework for regularized least-squares (RLS) type ranking cost functions and propose three such cost functions. Further, we propose a kernel-based preference learning algorithm, which we call RankRLS, for minimizing these functions. It is shown that RankRLS has many computational advantages compared to ranking algorithms that are based on minimizing other types of costs, such as the hinge cost. In particular, we present efficient algorithms for training, parameter selection, multiple output learning, cross-validation, and large-scale learning. Circumstances under which these computational benefits make RankRLS preferable to RankSVM are considered. We evaluate RankRLS on four different types of ranking tasks using RankSVM and standard RLS regression as the baselines. RankRLS outperforms standard RLS regression, and its performance is very similar to that of RankSVM, while RankRLS has several computational benefits over RankSVM.
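    A minimal linear sketch of the pairwise regularized least-squares idea behind RankRLS is shown below; it uses the fact that the all-pairs squared ranking loss can be written with the Laplacian of the complete pair graph. This toy version, with invented data, handles only the linear all-pairs case, whereas the paper covers kernels, general preference graphs, and efficient training and cross-validation shortcuts.

        # Minimal linear RankRLS-style sketch: the all-pairs squared ranking loss
        #   sum_{i,j} ((y_i - y_j) - (f(x_i) - f(x_j)))^2
        # equals 2 (Xw - y)^T L (Xw - y) with Laplacian L = n I - 1 1^T, so the
        # regularized minimizer has a closed form.
        import numpy as np

        def rank_rls_linear(X, y, lam=1.0):
            n = len(y)
            L = n * np.eye(n) - np.ones((n, n))           # Laplacian of the full pair graph
            A = X.T @ L @ X + lam * np.eye(X.shape[1])    # regularized normal equations
            return np.linalg.solve(A, X.T @ L @ y)

        rng = np.random.default_rng(1)
        X = rng.normal(size=(50, 3))
        y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)
        w = rank_rls_linear(X, y, lam=0.1)
        scores = X @ w                                     # larger score = ranked higher
        print(np.corrcoef(scores, y)[0, 1])                # close to 1 on this toy data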

    Data analytics

    This study guide substantiates the nature, role, and importance of data, information, and analytical work; explains their basic principles within the modern information environment; and considers the main approaches and basic tools used by specialists in political analytics and social work when performing analytical tasks.

    Multi-Agent Systems for Computational Economics and Finance

    In this article we survey the main research topics of our group at the University of Essex. Our research interests lie at the intersection of theoretical computer science, artificial intelligence, and economic theory. In particular, we focus on the design and analysis of mechanisms for systems involving multiple strategic agents, both from a theoretical and an applied perspective. We present an overview of our group’s activities and members, and then discuss in detail past, present, and future work in multi-agent systems.

    A Few Queries Go a Long Way: Information-Distortion Tradeoffs in Matching

    We consider the One-Sided Matching problem, where n agents have preferences over n items, and these preferences are induced by underlying cardinal valuation functions. The goal is to match every agent to a single item so as to maximize the social welfare. Most of the related literature, however, assumes that the values of the agents are not a priori known, and only access to the ordinal preferences of the agents over the items is provided. Consequently, this incomplete information leads to loss of efficiency, which is measured by the notion of distortion. In this paper, we further assume that the agents can answer a small number of queries, allowing us partial access to their values. We study the interplay between elicited cardinal information (measured by the number of queries per agent) and distortion for One-Sided Matching, as well as a wide range of well-studied related problems. Qualitatively, our results show that with a limited number of queries, it is possible to obtain significant improvements over the classic setting, where only access to ordinal information is given.
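    The following toy Python example illustrates how distortion is measured; the mechanism and instance are invented for illustration and are not those studied in the paper. An ordinal-only mechanism, here serial dictatorship, is run on hidden cardinal values, and its social welfare is compared with that of the welfare-maximizing matching.

        # Toy distortion computation (mechanism and instance invented): run an
        # ordinal-only mechanism (serial dictatorship) on hidden cardinal values
        # and compare its social welfare with the welfare-maximizing matching.
        import numpy as np
        from scipy.optimize import linear_sum_assignment

        def serial_dictatorship(values):
            """Each agent in turn takes its favorite remaining item (uses only ordinal info)."""
            n = len(values)
            taken, match = set(), [None] * n
            for i in range(n):
                ranking = np.argsort(-values[i])           # agent i's ordinal preference order
                match[i] = next(j for j in ranking if j not in taken)
                taken.add(match[i])
            return match

        values = np.array([[0.5, 0.3, 0.2],                # hidden cardinal values,
                           [0.9, 0.05, 0.05],              # each row sums to 1
                           [0.4, 0.35, 0.25]])

        match = serial_dictatorship(values)
        welfare = sum(values[i, match[i]] for i in range(len(values)))
        rows, cols = linear_sum_assignment(-values)        # welfare-maximizing matching
        print("distortion on this instance:", values[rows, cols].sum() / welfare)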