
    Enabling Quality Control for Entity Resolution: A Human and Machine Cooperation Framework

    Even though many machine algorithms have been proposed for entity resolution, it remains very challenging to find a solution with quality guarantees. In this paper, we propose a novel HUman and Machine cOoperation (HUMO) framework for entity resolution (ER), which divides an ER workload between the machine and the human. HUMO enables a mechanism for quality control that can flexibly enforce both precision and recall levels. We introduce the optimization problem of HUMO, minimizing human cost given a quality requirement, and then present three optimization approaches: a conservative baseline one purely based on the monotonicity assumption of precision, a more aggressive one based on sampling, and a hybrid one that can take advantage of the strengths of both previous approaches. Finally, we demonstrate by extensive experiments on real and synthetic datasets that HUMO can achieve high-quality results with reasonable return on investment (ROI) in terms of human cost, and it performs considerably better than the state-of-the-art alternatives in quality control. Comment: 12 pages, 11 figures. Camera-ready version of the paper submitted to ICDE 2018, in Proceedings of the 34th IEEE International Conference on Data Engineering (ICDE 2018)

    Distributed Selfish Caching

    Although cooperation generally increases the amount of resources available to a community of nodes, thus improving individual and collective performance, it also allows for the appearance of potential mistreatment problems through the exposure of one node's resources to others. We study such concerns by considering a group of independent, rational, self-aware nodes that cooperate using on-line caching algorithms, where the exposed resource is the storage at each node. Motivated by content networking applications (including web caching, CDNs, and P2P), this paper extends our previous work on the off-line version of the problem, which was conducted under a game-theoretic framework and limited to object replication. We identify and investigate two causes of mistreatment: (1) cache state interactions (due to the cooperative servicing of requests) and (2) the adoption of a common scheme for cache management policies. Using analytic models, numerical solutions of these models, as well as simulation experiments, we show that on-line cooperation schemes using caching are fairly robust to mistreatment caused by state interactions. To appear in a substantial manner, the interaction through the exchange of miss-streams has to be very intense, making it feasible for the mistreated nodes to detect and react to exploitation. This robustness ceases to exist when nodes fetch and store objects in response to remote requests, i.e., when they operate as Level-2 caches (or proxies) for other nodes. Regarding mistreatment due to a common scheme, we show that this can easily take place when the "outlier" characteristics of some of the nodes get overlooked. This finding underscores the importance of allowing cooperative caching nodes the flexibility of choosing from a diverse set of schemes to fit the peculiarities of individual nodes. To that end, we outline an emulation-based framework for the development of mistreatment-resilient distributed selfish caching schemes. 
Our framework utilizes a simple control-theoretic approach to dynamically parameterize the cache management scheme. We show performance evaluation results that quantify the benefits from instantiating such a framework, which could be substantial under skewed demand profiles. National Science Foundation (CNS Cybertrust 0524477, CNS NeTS 0520166, CNS ITR 0205294, EIA RI 0202067); EU IST (CASCADAS and E-NEXT); Marie Curie Outgoing International Fellowship of the EU (MOIF-CT-2005-007230)

    Variance-constrained dissipative observer-based control for a class of nonlinear stochastic systems with degraded measurements

    The official published version of the article can be obtained from the link below. This paper is concerned with the variance-constrained dissipative control problem for a class of stochastic nonlinear systems with multiple degraded measurements, where the degraded probability for each sensor is governed by an individual random variable satisfying a certain probabilistic distribution over a given interval. The purpose of the problem is to design an observer-based controller such that, for all possible degraded measurements, the closed-loop system is exponentially mean-square stable and strictly dissipative, while each individual steady-state variance does not exceed its pre-specified upper bound. A general framework is established so that the required exponential mean-square stability, dissipativity, and variance constraints can be easily enforced. A sufficient condition is given for the solvability of the addressed multiobjective control problem, and the desired observer and controller gains are characterized in terms of the solution to a convex optimization problem that can be easily solved by using the semi-definite programming method. Finally, a numerical example is presented to show the effectiveness and applicability of the proposed algorithm. This work was supported in part by the Distinguished Visiting Fellowship of the Royal Academy of Engineering of the UK, the Royal Society of the UK, the GRF HKU 7137/09E, the National Natural Science Foundation of China under Grant 61028008, the International Science and Technology Cooperation Project of China under Grant 2009DFA32050, and the Alexander von Humboldt Foundation of Germany.

    Web 2.0 and micro-businesses: An exploratory investigation

    This is the author's final version of the article. This article is (c) Emerald Group Publishing and permission has been granted for this version to appear here. Emerald does not grant permission for this article to be further copied/distributed or hosted elsewhere without the express permission from Emerald Group Publishing Limited. This article was chosen as a Highly Commended Award Winner at the Emerald Literati Network Awards for Excellence 2013. Purpose – The paper aims to report on an exploratory study into how small businesses use Web 2.0 information and communication technologies (ICT) to work collaboratively with other small businesses. The study had two aims: to investigate the benefits available from the use of Web 2.0 in small business collaborations, and to characterize the different types of such online collaborations. Design/methodology/approach – The research uses a qualitative case study methodology based on semi-structured interviews with the owner-managers of 12 UK-based small companies in the business services sector who are early adopters of Web 2.0 technologies. Findings – Benefits from the use of Web 2.0 are categorized as lifestyle benefits, internal operational efficiency, enhanced capability, external communications and enhanced service offerings. A 2×2 framework is developed to categorize small business collaborations using the dimensions of the basis for inter-organizational collaboration (control vs cooperation) and the level of Web 2.0 ICT use (simple vs sophisticated). Research limitations/implications – A small number of firms of similar size, sector and location were studied, which limits generalizability. Nonetheless, the results offer a pointer to the likely future use of Web 2.0 tools by other small businesses. Practical implications – The research provides evidence of the attraction and potential of Web 2.0 for collaborations between small businesses. 
Originality/value – The paper is one of the first to report on use of Web 2.0 ICT in collaborative working between small businesses. It will be of interest to those seeking a better understanding of the potential of Web 2.0 in the small business community. WestFocu

    Resource Management in Message Passing Environments

    This paper discusses the need for resource management support for parallel applications running on workstation clusters and communicating by message passing among tasks. Many resource management systems are only able to start a message passing runtime environment and parallel applications, but dynamic reconfiguration fails because of the missing cooperation between the resource manager and the runtime environment. In order to utilize computational resources in message passing environments efficiently, to control execution of parallel applications by rescheduling tasks at runtime, and to minimize their execution time, a resource management system has been developed and preliminary tests have been carried out. Most of our efforts in this regard have gone into designing an efficient approach to load measurement and process scheduling, and into implementing the resource management system so that it can easily be adapted to any message passing framework. Although our first version is based on the PVM system, we also intend to implement an MPI-based resource management system.

    Internal report cluster 1: Urban freight innovations and solutions for sustainable deliveries (2/4)

    Technical report about sustainable urban freight solutions, part 2 of

    Internal report cluster 1: Urban freight innovations and solutions for sustainable deliveries (1/4)

    Technical report about sustainable urban freight solutions, part 1 of