
    E-loyalty networks in online auctions

    Creating a loyal customer base is one of the most important, and at the same time most difficult, tasks a company faces. Creating loyalty online (e-loyalty) is especially difficult since customers can "switch" to a competitor with the click of a mouse. In this paper we investigate e-loyalty in online auctions. Using a unique data set of over 30,000 auctions from one of the main consumer-to-consumer online auction houses, we propose a novel measure of e-loyalty via the associated network of transactions between bidders and sellers. In a bipartite network of bidder and seller nodes, two nodes are linked when a bidder purchases from a seller, and the number of repeat purchases determines the strength of that link. We employ ideas from functional principal component analysis to derive, from this network, the loyalty distribution, which measures the perceived loyalty of every individual seller, and associated loyalty scores, which summarize this distribution in a parsimonious way. We then investigate the effect of loyalty on the outcome of an auction. In doing so, we are confronted with several statistical challenges: standard statistical models lead to a misrepresentation of the data and a violation of the model assumptions. The reason is that loyalty networks result in an extreme clustering of the data, with a few high-volume sellers accounting for most of the individual transactions. We investigate several remedies to the clustering problem and conclude that loyalty networks consist of very distinct segments that can best be understood individually. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/09-AOAS310
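
The network construction described in this abstract can be sketched roughly as follows. This is only a minimal illustration of the weighted bipartite links (bidder, seller, repeat-purchase count); it does not attempt the functional principal component analysis step, and the function names and toy transactions are hypothetical, not taken from the paper.

```python
from collections import Counter, defaultdict


def build_loyalty_network(transactions):
    """Map each (bidder, seller) link to its strength.

    `transactions` is an iterable of (bidder, seller) pairs, one per purchase.
    The link strength is the number of purchases between that pair.
    """
    return Counter(transactions)


def seller_link_strengths(network):
    """Group link strengths by seller: {seller: {bidder: strength}}."""
    by_seller = defaultdict(dict)
    for (bidder, seller), strength in network.items():
        by_seller[seller][bidder] = strength
    return dict(by_seller)


# Hypothetical toy data: bidder b1 buys from seller s1 three times.
transactions = [
    ("b1", "s1"),
    ("b1", "s1"),
    ("b2", "s1"),
    ("b2", "s2"),
    ("b1", "s1"),
]

network = build_loyalty_network(transactions)
strengths = seller_link_strengths(network)
```

Here `strengths["s1"]` holds the raw per-bidder purchase counts for one seller, which is the kind of input a per-seller loyalty distribution would be derived from.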

    Penalizing Unfairness in Binary Classification

    We present a new approach for mitigating unfairness in learned classifiers. In particular, we focus on binary classification tasks over individuals from two populations, where, as our criterion for fairness, we wish to achieve similar false positive rates in both populations, and similar false negative rates in both populations. As a proof of concept, we implement our approach and empirically evaluate its ability to achieve both fairness and accuracy, using datasets from the fields of criminal risk assessment, credit, lending, and college admissions.
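
The fairness criterion stated in this abstract (similar false positive rates and similar false negative rates across two populations) can be measured with a short sketch like the one below. The abstract does not specify the penalty the authors use, so this only computes the two gaps; the function names and toy labels are assumptions made for illustration.

```python
def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for 0/1 labels."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def fairness_gaps(y_true, y_pred, group):
    """Return (|FPR gap|, |FNR gap|) between exactly two populations."""
    labels = sorted(set(group))
    if len(labels) != 2:
        raise ValueError("expected exactly two populations")
    rates = []
    for g in labels:
        idx = [i for i, gi in enumerate(group) if gi == g]
        rates.append(error_rates([y_true[i] for i in idx],
                                 [y_pred[i] for i in idx]))
    (fpr_a, fnr_a), (fpr_b, fnr_b) = rates
    return abs(fpr_a - fpr_b), abs(fnr_a - fnr_b)


# Hypothetical toy data: the classifier errs only on population "a".
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

fpr_gap, fnr_gap = fairness_gaps(y_true, y_pred, group)
```

A classifier that meets the criterion would drive both gaps toward zero; a training-time penalty could, for instance, be built from these two quantities, though that design is a guess rather than the paper's stated method.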

    Faster Shortest Paths in Dense Distance Graphs, with Applications

    We show how to combine two techniques for efficiently computing shortest paths in directed planar graphs. The first is the linear-time shortest-path algorithm of Henzinger, Klein, Subramanian, and Rao [STOC'94]. The second is Fakcharoenphol and Rao's algorithm [FOCS'01] for emulating Dijkstra's algorithm on the dense distance graph (DDG). A DDG is defined for a decomposition of a planar graph $G$ into regions of at most $r$ vertices each, for some parameter $r < n$. The vertex set of the DDG is the set of $\Theta(n/\sqrt{r})$ vertices of $G$ that belong to more than one region (boundary vertices). The DDG has $\Theta(n)$ arcs, such that distances in the DDG are equal to the distances in $G$. Fakcharoenphol and Rao's implementation of Dijkstra's algorithm on the DDG (nicknamed FR-Dijkstra) runs in $O(n \log(n) r^{-1/2} \log r)$ time, and is a key component in many state-of-the-art planar graph algorithms for shortest paths, minimum cuts, and maximum flows. By combining these two techniques we remove the $\log n$ dependency in the running time of the shortest-path algorithm, making it $O(n r^{-1/2} \log^2 r)$. This work is part of a research agenda that aims to develop new techniques that would lead to faster, possibly linear-time, algorithms for problems such as minimum-cut, maximum-flow, and shortest paths with negative arc lengths. As immediate applications, we show how to compute maximum flow in directed weighted planar graphs in $O(n \log p)$ time, where $p$ is the minimum number of edges on any path from the source to the sink. We also show how to compute any part of the DDG that corresponds to a region with $r$ vertices and $k$ boundary vertices in $O(r \log k)$ time, which is faster than has been previously known for small values of $k$.
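
As a rough reading of the two running times quoted in this abstract, dividing the FR-Dijkstra bound by the improved bound gives the size of the saving. Constant factors are ignored, and the choices of $r$ in the second line are illustrative assumptions, not parameter settings taken from the abstract.

```latex
\[
\frac{n \log(n)\, r^{-1/2} \log r}{n\, r^{-1/2} \log^{2} r}
  = \frac{\log n}{\log r}
\]
\[
r = \log^{c} n \;\Rightarrow\; \frac{\log n}{\log r} = \frac{\log n}{c \log\log n},
\qquad
r = n^{\varepsilon} \;\Rightarrow\; \frac{\log n}{\log r} = \frac{1}{\varepsilon}
\]
```

So the improvement grows with $n$ when regions are small (polylogarithmic in $n$), and reduces to a constant factor when $r$ is polynomial in $n$.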