
    Fine-Grained Car Detection for Visual Census Estimation

    Targeted socioeconomic policies require an accurate understanding of a country's demographic makeup. To that end, the United States spends more than 1 billion dollars a year gathering census data such as race, gender, education, occupation and unemployment rates. Compared to the traditional method of collecting surveys across many years, which is costly and labor intensive, data-driven, machine-learning approaches are cheaper and faster, with the potential to detect trends in close to real time. In this work, we leverage the ubiquity of Google Street View images and develop a computer vision pipeline to predict income, per capita carbon emissions, crime rates and other city attributes from a single source of publicly available visual data. We first detect cars in 50 million images across 200 of the largest US cities and train a model to predict demographic attributes using the detected cars. To facilitate our work, we have collected the largest and most challenging fine-grained dataset reported to date, consisting of over 2,600 classes of cars drawn from Google Street View and other web sources and classified by car experts to account for even the most subtle visual differences. We use this data to construct the largest-scale fine-grained detection system reported to date. Our prediction results correlate well with ground-truth income data (r = 0.82), data from the Massachusetts department of vehicle registration, and sources investigating crime rates, income segregation, per capita carbon emissions, and other market research. Finally, we learn interesting relationships between cars and neighborhoods, allowing us to perform the first large-scale sociological analysis of cities using computer vision techniques. (Comment: AAAI 201)
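    As a rough illustration of the correlation analysis this abstract reports (not the authors' actual pipeline), the sketch below aggregates per-region counts of detected car classes into features, fits a simple regression, and computes the Pearson r between predicted and ground-truth income. The file names, column names, and choice of ridge regression are assumptions made for the example.

```python
# Minimal sketch, assuming a hypothetical detections/income data layout:
# per-region car-class counts as features, ridge regression as the predictor.
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict
from scipy.stats import pearsonr

# detections.csv: one row per detected car (columns: region_id, car_class)
# income.csv: ground truth per region (columns: region_id, median_income)
detections = pd.read_csv("detections.csv")
income = pd.read_csv("income.csv")

# Feature matrix: normalized counts of each car class per region.
counts = pd.crosstab(detections["region_id"], detections["car_class"])
features = counts.div(counts.sum(axis=1), axis=0)

data = features.join(income.set_index("region_id"), how="inner")
X = data.drop(columns=["median_income"]).to_numpy()
y = data["median_income"].to_numpy()

# Out-of-fold predictions avoid rewarding an overfit model.
pred = cross_val_predict(Ridge(alpha=1.0), X, y, cv=5)
r, _ = pearsonr(pred, y)
print(f"Pearson r between predicted and ground-truth income: {r:.2f}")
```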

    Towards an Intelligent Database System Founded on the SP Theory of Computing and Cognition

    The SP theory of computing and cognition, described in previous publications, is an attractive model for intelligent databases because it provides a simple but versatile format for different kinds of knowledge, it has capabilities in artificial intelligence, and it can also function like established database models when that is required. This paper describes how the SP model can emulate other models used in database applications and compares the SP model with those other models. The artificial intelligence capabilities of the SP model are reviewed and its relationship with other artificial intelligence systems is described. Also considered are ways in which current prototypes may be translated into an 'industrial strength' working system.
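    As a loose illustration of the abstract's claim that a single, simple knowledge format can also serve database-style retrieval (this is not the SP model itself; all names and structures below are invented for the example), the sketch stores facts as flat symbol patterns and answers a relational-style selection by partial, in-order matching.

```python
# Illustrative sketch only, assuming a made-up pattern format:
# facts are flat tuples of symbols, and a query is a partial pattern
# whose symbols must appear in the same order within a stored pattern.
records = [
    ("person", "name", "Ada", "role", "engineer"),
    ("person", "name", "Ben", "role", "analyst"),
]

def match(query, pattern):
    """True if the query symbols appear, in order, within the pattern."""
    it = iter(pattern)
    return all(symbol in it for symbol in query)

# Relational-style selection expressed as a partial pattern.
print([r for r in records if match(("person", "role", "engineer"), r)])
```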

    Executive Orders: Promoting Democracy and Openness in New York State Government

    This joint report outlines 11 executive actions Gov. Andrew Cuomo can take to open up New York State government, increase the accountability of state agencies and reduce barriers to voting. The orders are centered on the basic goal of empowering the citizenry with more and better information about what its government is doing and how it is spending taxpayer dollars.

    Scholarly communication: The quest for Pasteur's Quadrant

    The scholarly communication system is sustained by its functions of a) registration, b) certification or legitimization, c) dissemination and awareness, d) archiving or curation and e) reward. These functions have remained stable during the development of scholarly communication, but the means through which they are achieved have not. It has been a long journey from the days when scientists communicated primarily through correspondence. The impact of modern-day technological changes is significant and has destabilized the scholarly communication system to some extent, because many more options have become available for communicating scholarly information. Pasteur's Quadrant was articulated by Donald E. Stokes in his book Pasteur's Quadrant: Basic Science and Technological Innovation. It is the idea that basic science (as practiced by Niels Bohr) and applied science (as exemplified by Thomas Edison) can be brought together to create a synergy that will produce results of significant benefit, as Louis Pasteur did. Given the theory (fundamental understanding) we have of scholarly communication and given how modern-day technological advances can be applied, a case can be made that use-inspired basic research (Pasteur's Quadrant) should be the focus for current research in scholarly communication. In doing so, the different types of digital scholarly resources and their characteristics must be investigated to determine how the fundamentals of scholarly communication are being supported. How libraries could advocate for and contribute to the improvement of scholarly communication is also noted. These resources could include: e-journals, repositories, reviews, annotated content, data, preprint and working-paper servers, blogs, discussion forums, professional and academic hubs.

    The Cure: Making a game of gene selection for breast cancer survival prediction

    Motivation: Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility and biological interpretability. Methods that take advantage of structured prior knowledge (e.g. protein interaction networks) show promise in helping to define better signatures, but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes previously unheard of. Here, we developed and evaluated a game called The Cure on the task of gene selection for breast cancer survival prediction. Our central hypothesis was that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from game players. We envisioned capturing knowledge both from the players' prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game. Results: Between its launch in Sept. 2012 and Sept. 2013, The Cure attracted more than 1,000 registered players who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data clearly demonstrated the accumulation of relevant expert knowledge. In terms of predictive accuracy, these gene sets provided comparable performance to gene sets generated using other methods, including those used in commercial tests. The Cure is available at http://genegames.org/cure
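    The aggregation step described above can be hinted at with a minimal sketch (not The Cure's actual code): count how often each gene is selected across games and keep the most frequently chosen genes as a candidate signature to pass to a downstream survival model. The data layout, the example gene names, and the cutoff are illustrative assumptions.

```python
# Minimal sketch, assuming a hypothetical per-game record of selected genes.
from collections import Counter

games = [
    {"player": "p1", "selected_genes": ["BRCA1", "TP53", "ESR1"]},
    {"player": "p2", "selected_genes": ["TP53", "ERBB2", "ESR1"]},
    {"player": "p3", "selected_genes": ["TP53", "AURKA", "ESR1"]},
]

# Vote counting: one vote per gene per game in which it was selected.
votes = Counter(g for game in games for g in game["selected_genes"])

# Rank genes by vote count; the top-k set would then be evaluated in a
# survival prediction model (e.g. a Cox regression on expression data).
signature = [gene for gene, _ in votes.most_common(2)]
print(signature)  # ['TP53', 'ESR1']
```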

    Investigating people: a qualitative analysis of the search behaviours of open-source intelligence analysts

    The Internet and the World Wide Web have become integral parts of the lives of many modern individuals, enabling almost instantaneous communication, sharing and broadcasting of thoughts, feelings and opinions. Much of this information is publicly facing, and as such, it can be utilised in a multitude of online investigations, ranging from employee vetting and credit checking to counter-terrorism and fraud prevention/detection. However, the search needs and behaviours of these investigators are not well documented in the literature. In order to address this gap, an in-depth qualitative study was carried out in cooperation with a leading investigation company. The research contribution is an initial identification of Open-Source Intelligence investigator search behaviours and the procedures and practices that they undertake, along with an overview of the difficulties and challenges that they encounter as part of their domain. This lays the foundation for future research into the varied domain of Open-Source Intelligence gathering.