2,129 research outputs found

    Information scraps: how and why information eludes our personal information management tools

    No full text
    In this paper we describe information scraps -- a class of personal information whose content is scribbled on Post-it notes, scrawled on corners of random sheets of paper, buried inside the bodies of e-mail messages sent to ourselves, or typed haphazardly into text files. Information scraps hold our great ideas, sketches, notes, reminders, driving directions, and even our poetry. We define information scraps to be the body of personal information that is held outside of its natural or We have much still to learn about these loose forms of information capture. Why are they so often held outside of our traditional PIM locations and instead on Post-its or in text files? Why must we sometimes go around our traditional PIM applications to hold on to our scraps, such as by e-mailing ourselves? What are information scraps' role in the larger space of personal information management, and what do they uniquely offer that we find so appealing? If these unorganized bits truly indicate the failure of our PIM tools, how might we begin to build better tools? We have pursued these questions by undertaking a study of 27 knowledge workers. In our findings we describe information scraps from several angles: their content, their location, and the factors that lead to their use, which we identify as ease of capture, flexibility of content and organization, and avilability at the time of need. We also consider the personal emotive responses around scrap management. We present a set of design considerations that we have derived from the analysis of our study results. We present our work on an application platform, jourknow, to test some of these design and usability findings

    Assessment of the effectiveness of head only and back-of-the-head electrical stunning of chickens

    Get PDF
    The study assesses the effectiveness of reversible head-only and back-of-the-head electrical stunning of chickens using 130–950 mA per bird at 50 Hz AC

    Globally Optimal Crowdsourcing Quality Management

    Full text link
    We study crowdsourcing quality management, that is, given worker responses to a set of tasks, our goal is to jointly estimate the true answers for the tasks, as well as the quality of the workers. Prior work on this problem relies primarily on applying Expectation-Maximization (EM) on the underlying maximum likelihood problem to estimate true answers as well as worker quality. Unfortunately, EM only provides a locally optimal solution rather than a globally optimal one. Other solutions to the problem (that do not leverage EM) fail to provide global optimality guarantees as well. In this paper, we focus on filtering, where tasks require the evaluation of a yes/no predicate, and rating, where tasks elicit integer scores from a finite domain. We design algorithms for finding the global optimal estimates of correct task answers and worker quality for the underlying maximum likelihood problem, and characterize the complexity of these algorithms. Our algorithms conceptually consider all mappings from tasks to true answers (typically a very large number), leveraging two key ideas to reduce, by several orders of magnitude, the number of mappings under consideration, while preserving optimality. We also demonstrate that these algorithms often find more accurate estimates than EM-based algorithms. This paper makes an important contribution towards understanding the inherent complexity of globally optimal crowdsourcing quality management

    The Transcriptional Landscape of Marek’s Disease Virus in Primary Chicken B Cells Reveals Novel Splice Variants and Genes

    Get PDF
    Marek’s disease virus (MDV) is an oncogenic alphaherpesvirus that infects chickens and poses a serious threat to poultry health. In infected animals, MDV efficiently replicates in B cells in various lymphoid organs. Despite many years of research, the viral transcriptome in primary target cells of MDV remained unknown. In this study, we uncovered the transcriptional landscape of the very virulent RB1B strain and the attenuated CVI988/Rispens vaccine strain in primary chicken B cells using high-throughput RNA-sequencing. Our data confirmed the expression of known genes, but also identified a novel spliced MDV gene in the unique short region of the genome. Furthermore, de novo transcriptome assembly revealed extensive splicing of viral genes resulting in coding and non-coding RNA transcripts. A novel splicing isoform of MDV UL15 could also be confirmed by mass spectrometry and RT-PCR. In addition, we could demonstrate that the associated transcriptional motifs are highly conserved and closely resembled those of the host transcriptional machinery. Taken together, our data allow a comprehensive re-annotation of the MDV genome with novel genes and splice variants that could be targeted in further research on MDV replication and tumorigenesis

    Speeding up shortest path algorithms

    Full text link
    Given an arbitrary, non-negatively weighted, directed graph G=(V,E)G=(V,E) we present an algorithm that computes all pairs shortest paths in time O(mn+mlgn+nTψ(m,n))\mathcal{O}(m^* n + m \lg n + nT_\psi(m^*, n)), where mm^* is the number of different edges contained in shortest paths and Tψ(m,n)T_\psi(m^*, n) is a running time of an algorithm to solve a single-source shortest path problem (SSSP). This is a substantial improvement over a trivial nn times application of ψ\psi that runs in O(nTψ(m,n))\mathcal{O}(nT_\psi(m,n)). In our algorithm we use ψ\psi as a black box and hence any improvement on ψ\psi results also in improvement of our algorithm. Furthermore, a combination of our method, Johnson's reweighting technique and topological sorting results in an O(mn+mlgn)\mathcal{O}(m^*n + m \lg n) all-pairs shortest path algorithm for arbitrarily-weighted directed acyclic graphs. In addition, we also point out a connection between the complexity of a certain sorting problem defined on shortest paths and SSSP.Comment: 10 page

    Efficient crowdsourcing for multi-class labeling

    Get PDF
    Crowdsourcing systems like Amazon's Mechanical Turk have emerged as an effective large-scale human-powered platform for performing tasks in domains such as image classification, data entry, recommendation, and proofreading. Since workers are low-paid (a few cents per task) and tasks performed are monotonous, the answers obtained are noisy and hence unreliable. To obtain reliable estimates, it is essential to utilize appropriate inference algorithms (e.g. Majority voting) coupled with structured redundancy through task assignment. Our goal is to obtain the best possible trade-off between reliability and redundancy. In this paper, we consider a general probabilistic model for noisy observations for crowd-sourcing systems and pose the problem of minimizing the total price (i.e. redundancy) that must be paid to achieve a target overall reliability. Concretely, we show that it is possible to obtain an answer to each task correctly with probability 1-ε as long as the redundancy per task is O((K/q) log (K/ε)), where each task can have any of the KK distinct answers equally likely, q is the crowd-quality parameter that is defined through a probabilistic model. Further, effectively this is the best possible redundancy-accuracy trade-off any system design can achieve. Such a single-parameter crisp characterization of the (order-)optimal trade-off between redundancy and reliability has various useful operational consequences. Further, we analyze the robustness of our approach in the presence of adversarial workers and provide a bound on their influence on the redundancy-accuracy trade-off. Unlike recent prior work [GKM11, KOS11, KOS11], our result applies to non-binary (i.e. K>2) tasks. In effect, we utilize algorithms for binary tasks (with inhomogeneous error model unlike that in [GKM11, KOS11, KOS11]) as key subroutine to obtain answers for K-ary tasks. Technically, the algorithm is based on low-rank approximation of weighted adjacency matrix for a random regular bipartite graph, weighted according to the answers provided by the workers.National Science Foundation (U.S.

    Almost-Tight Distributed Minimum Cut Algorithms

    Full text link
    We study the problem of computing the minimum cut in a weighted distributed message-passing networks (the CONGEST model). Let λ\lambda be the minimum cut, nn be the number of nodes in the network, and DD be the network diameter. Our algorithm can compute λ\lambda exactly in O((nlogn+D)λ4log2n)O((\sqrt{n} \log^{*} n+D)\lambda^4 \log^2 n) time. To the best of our knowledge, this is the first paper that explicitly studies computing the exact minimum cut in the distributed setting. Previously, non-trivial sublinear time algorithms for this problem are known only for unweighted graphs when λ3\lambda\leq 3 due to Pritchard and Thurimella's O(D)O(D)-time and O(D+n1/2logn)O(D+n^{1/2}\log^* n)-time algorithms for computing 22-edge-connected and 33-edge-connected components. By using the edge sampling technique of Karger's, we can convert this algorithm into a (1+ϵ)(1+\epsilon)-approximation O((nlogn+D)ϵ5log3n)O((\sqrt{n}\log^{*} n+D)\epsilon^{-5}\log^3 n)-time algorithm for any ϵ>0\epsilon>0. This improves over the previous (2+ϵ)(2+\epsilon)-approximation O((nlogn+D)ϵ5log2nloglogn)O((\sqrt{n}\log^{*} n+D)\epsilon^{-5}\log^2 n\log\log n)-time algorithm and O(ϵ1)O(\epsilon^{-1})-approximation O(D+n12+ϵpolylogn)O(D+n^{\frac{1}{2}+\epsilon} \mathrm{poly}\log n)-time algorithm of Ghaffari and Kuhn. Due to the lower bound of Ω(D+n1/2/logn)\Omega(D+n^{1/2}/\log n) by Das Sarma et al. which holds for any approximation algorithm, this running time is tight up to a polylogn \mathrm{poly}\log n factor. To get the stated running time, we developed an approximation algorithm which combines the ideas of Thorup's algorithm and Matula's contraction algorithm. It saves an ϵ9log7n\epsilon^{-9}\log^{7} n factor as compared to applying Thorup's tree packing theorem directly. Then, we combine Kutten and Peleg's tree partitioning algorithm and Karger's dynamic programming to achieve an efficient distributed algorithm that finds the minimum cut when we are given a spanning tree that crosses the minimum cut exactly once

    End-users publishing structured information on the web: an observational study of what, why, and how

    Get PDF
    End-users are accustomed to filtering and browsing styled collections of data on professional web sites, but they have few ways to create and publish such information architectures for themselves. This paper presents a full-lifecycle analysis of the Exhibit framework - an end-user tool which provides such functionality - to understand the needs, capabilities, and practices of this class of users. We include interviews, as well as analysis of over 1,800 visualizations and 200,000 web interactions with these visualizations. Our analysis reveals important findings about this user population which generalize to the task of providing better end-user structured content publication tools.Intel Science & Technology Center for Big Dat
    corecore