
    Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

    Computer vision has great potential to help our daily lives by searching for lost keys, watering flowers or reminding us to take a pill. To succeed with such tasks, computer vision methods need to be trained from real and diverse examples of our daily dynamic scenes. While most of such scenes are not particularly exciting, they typically do not appear on YouTube, in movies or TV broadcasts. So how do we collect sufficiently many diverse but boring samples representing our lives? We propose a novel Hollywood in Homes approach to collect such data. Instead of shooting videos in the lab, we ensure diversity by distributing and crowdsourcing the whole process of video creation from script writing to video recording and annotation. Following this procedure we collect a new dataset, Charades, with hundreds of people recording videos in their own homes, acting out casual everyday activities. The dataset is composed of 9,848 annotated videos with an average length of 30 seconds, showing activities of 267 people from three continents. Each video is annotated by multiple free-text descriptions, action labels, action intervals and classes of interacted objects. In total, Charades provides 27,847 video descriptions, 66,500 temporally localized intervals for 157 action classes and 41,104 labels for 46 object classes. Using this rich data, we evaluate and provide baseline results for several tasks including action recognition and automatic description generation. We believe that the realism, diversity, and casual nature of this dataset will present unique challenges and new opportunities for the computer vision community.

    Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection

    Linguistically diverse datasets are critical for training and evaluating robust machine learning systems, but data collection is a costly process that often requires experts. Crowdsourcing the process of paraphrase generation is an effective means of expanding natural language datasets, but there has been limited analysis of the trade-offs that arise when designing tasks. In this paper, we present the first systematic study of the key factors in crowdsourcing paraphrase collection. We consider variations in instructions, incentives, data domains, and workflows. We manually analyzed paraphrases for correctness, grammaticality, and linguistic diversity. Our observations provide new insight into the trade-offs between accuracy and diversity in crowd responses that arise as a result of task design, providing guidance for future paraphrase generation procedures. Comment: Published at ACL 201

    Data collection framework for understanding UFT within city logistics solutions

    Urban Freight Transport (UFT) is a fundamental component of city life. It involves a vast range of activities resulting from relationships among different actors with conflicting needs and goals. Manufacturers are interested in fast and on-time deliveries, retailers require complete assortment and frequent deliveries, citizens wish to have easy access to goods while not losing their quality of life and City Authorities have to face negative externalities related to UFT (i.e. congestion, air and noise pollution, and safety). Concretely, few cities have a well-developed and comprehensive city logistics strategy because authorities generally focus their attention on passenger transport. When city logistics measures have been conceived and implemented, frequently private requirements have not been considered sufficiently. The European Commission includes the lack of data and understanding of freight flows among the main obstacles to the improvement of operational efficiency and planning process for a sustainable UFT in economic, social and environmental terms. Also, the research community raises the issue of the unavailability or the low quality of data on urban freight and the need to identify effective data collection methods in order to understand processes and actors' behavior and then define appropriate city logistics solutions. The NOVELOG EU project is providing city authorities and practitioners with a new framework aimed at systematizing all data to be collected, directly or indirectly, and to be elaborated in order to understand and represent the different aspects of the UFT sector. In order to achieve a complete knowledge, the framework approaches this sector according to four main thematic pillars: 1) profile of major supply chains served in the urban area under study; 2) mapping of urban freight and service trips activity; 3) organizational and legal framework; 4) procedural and technological methods and innovations. 
The present paper introduces the framework and the guidance it provides to its target audience.

    Sustaining Collection Value: Managing Collection/Item Metadata Relationships

    Many aspects of managing collection/item metadata relationships are critical to sustaining collection value over time. Metadata at the collection-level not only provides context for finding, understanding, and using the items in the collection, but is often essential to the particular research and scholarly activities the collection is designed to support. Contemporary retrieval systems, which search across collections, usually ignore collection-level metadata. Alternative approaches, informed by collection-level information, will require an understanding of the various kinds of relationships that can obtain between collection-level and item-level metadata. This paper outlines the problem and describes a project that is developing a logic-based framework for classifying collection-level/item-level metadata relationships. This framework will support (i) metadata specification developers defining metadata elements, (ii) metadata librarians describing objects, and (iii) system designers implementing systems that help users take advantage of collection-level metadata. Institute for Museum and Library Services (Grant #LG06070020). Published or submitted for publication; peer reviewed.

    Reliable online social network data collection

    Large quantities of information are shared through online social networks, making them attractive sources of data for social network research. When studying the usage of online social networks, these data may not properly describe users’ behaviours. For instance, the data collected often include content shared by the users only, or content accessible to the researchers, hence obfuscating a large amount of data that would help in understanding users’ behaviours and privacy concerns. Moreover, the data collection methods employed in experiments may also have an effect on data reliability when participants self-report inaccurate information or are observed while using a simulated application. Understanding the effects of these collection methods on data reliability is paramount for the study of social networks; for understanding user behaviour; for designing socially-aware applications and services; and for mining data collected from such social networks and applications. This chapter reviews previous research which has looked at social network data collection and user behaviour in these networks. We highlight shortcomings in the methods used in these studies, and introduce our own methodology and user study based on the Experience Sampling Method; we claim our methodology leads to the collection of more reliable data by capturing both those data which are shared and not shared. We conclude with suggestions for collecting and mining data from online social networks. Postprint

    Collection understanding

    Collection understanding shifts the traditional focus of retrieval in large collections from locating specific artifacts to gaining a comprehensive view of the collection. Visualization tools are critical to the process of efficient collection understanding. Simple visual interfaces and intuitive methods of interacting with a collection help users understand the essence of the collection by focusing on the artifacts. This thesis discusses a practical approach for enhancing collection understanding in image collections.

    [Review of] Ambrose Y.C. King and Rance P.L. Lee, eds. Social Life and Development in Hong Kong

    This book is a collection of research papers on the political and social conditions of Hong Kong sponsored by the Social Research Centre of The Chinese University of Hong Kong. The collection is not a comprehensive coverage of such conditions in Hong Kong. It is a selective report with the purpose of updating existing information. The new information will provide a better understanding of Hong Kong's problems and serve as a resource in coping with these problems.

    Tax Collection Methods: Understanding Business Tax Collection and the Psyche of Evasion

    “Taxes are the life-blood of government, and their prompt and certain availability an imperious need (Justice Owen J Roberts, Bull V US 295 U. S. 247 (1935))” (Scharf). Tax collection is necessary to ensure revenues are collected to fund governmental services. States are losing tax revenue for a variety of reasons; this paper explores some of the major factors causing states to lose out on tax revenue. It addresses the tax gap, or unpaid taxes due, and the economic inefficiencies caused by tax evasion. It analyzes the psyche of noncompliance in an attempt to discover the most efficient manner of collecting taxes. This understanding of noncompliant behavior is used to help identify the most effective tax collection methods available. This study focuses on the Kentucky Department of Revenue, Division of Collections, Corporation/Limited Liability Company Branch. The CP/LLC Branch focuses on enforced collection activities against corporations and limited liability companies. The enforced collection tools available to the branch are: jeopardy assessments, corporate officer notice of assessments, limited liability company member notice of assessments, final notice before seizures, liens, bank levies, and wage levies. Data from the CP/LLC Branch from July 2006-June 2009 were used to determine which of the enforced collection activities listed above have the most effect on total collections revenue. A lag regression analysis model was used to account for mail float and response time. This analysis found a positive relationship between collections revenue and only one of the enforced collection activities: the officer Notice of Assessment (NOA). A positive relationship was also discovered between total collections revenue and both the total number of cases in a previous month and incoming calls in the current month. 
These three factors (officer NOAs, number of collection cases, and incoming phone calls) all affect total collections in a statistically significant, positive manner. There are other collection activities that a priori might seem to be related to these factors as well, but this analysis showed no connection. First, though the number of cases in collections has a positive effect on collections revenue, the total amount of accounts receivable for collections does not have the same relationship. This analysis showed that more cases entering collection lead to more collections revenue, independent of accounts receivable. This relationship could mean that cases new to collections are more likely to make payments. With this knowledge, management could decide to have collection officers focus on cases new to collections. Second, incoming calls have a positive influence on collections revenue. This was the least surprising part of the analysis. An incoming call from a business ensures that someone is contacting DOR concerning the case. When taxpayers are contacting the CP/LLC branch they are usually trying to work toward resolution. Another possible explanation for the correlation could be that taxpayers are calling in due to a refund offset. If a taxpayer owes tax liability for a business or for their individual income and he or she has been properly assessed as an officer of the business, then any tax refund due to that person, state or federal, will be offset and applied to the tax debt. These offsets often prompt an incoming call. This information emphasizes the importance of adequate phone coverage to CP/LLC Collections. The only enforced collection action shown to have a statistically significant effect on collections revenue is the officer NOA. Currently, collectors are instructed to wait until there is a total trust tax due of at least $1,000 (including tax, penalty, and interest) before starting officer NOA action. 
I recommend that management decide to lower this threshold, at least for the purpose of corporate collections (the same relationship was not found between LLC member NOAs and collections revenue).