10,046 research outputs found

    A Survey of Current Datasets for Vision and Language Research

    Full text link
    Integrating vision and language has long been a dream in work on artificial intelligence (AI). In the past two years, we have witnessed an explosion of work that brings together vision and language from images to videos and beyond. The available corpora have played a crucial role in advancing this area of research. In this paper, we propose a set of quality metrics for evaluating and analyzing the vision & language datasets and categorize them accordingly. Our analyses show that the most recent datasets have been using more complex language and more abstract concepts, however, there are different strengths and weaknesses in each.Comment: To appear in EMNLP 2015, short proceedings. Dataset analysis and discussion expanded, including an initial examination into reporting bias for one of them. F.F. and N.M. contributed equally to this wor

    Sharing Human-Generated Observations by Integrating HMI and the Semantic Sensor Web

    Get PDF
    Current “Internet of Things” concepts point to a future where connected objects gather meaningful information about their environment and share it with other objects and people. In particular, objects embedding Human Machine Interaction (HMI), such as mobile devices and, increasingly, connected vehicles, home appliances, urban interactive infrastructures, etc., may not only be conceived as sources of sensor information, but, through interaction with their users, they can also produce highly valuable context-aware human-generated observations. We believe that the great promise offered by combining and sharing all of the different sources of information available can be realized through the integration of HMI and Semantic Sensor Web technologies. This paper presents a technological framework that harmonizes two of the most influential HMI and Sensor Web initiatives: the W3C’s Multimodal Architecture and Interfaces (MMI) and the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) with its semantic extension, respectively. Although the proposed framework is general enough to be applied in a variety of connected objects integrating HMI, a particular development is presented for a connected car scenario where drivers’ observations about the traffic or their environment are shared across the Semantic Sensor Web. For implementation and evaluation purposes an on-board OSGi (Open Services Gateway Initiative) architecture was built, integrating several available HMI, Sensor Web and Semantic Web technologies. A technical performance test and a conceptual validation of the scenario with potential users are reported, with results suggesting the approach is soun

    A unified view of data-intensive flows in business intelligence systems : a survey

    Get PDF
    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.Peer ReviewedPostprint (author's final draft

    Experimental evaluation of big data querying tools

    Get PDF
    Nos Ășltimos anos, o termo Big Data tornou-se um tĂłpico bastanta debatido em vĂĄrias ĂĄreas de negĂłcio. Um dos principais desafios relacionados com este conceito Ă© como lidar com o enorme volume e variedade de dados de forma eficiente. Devido Ă  notĂłria complexidade e volume de dados associados ao conceito de Big Data, sĂŁo necessĂĄrios mecanismos de consulta eficientes para fins de anĂĄlise de dados. Motivado pelo rĂĄpido desenvolvimento de ferramentas e frameworks para Big Data, hĂĄ muita discussĂŁo sobre ferramentas de consulta e, mais especificamente, quais sĂŁo as mais apropriadas para necessidades analĂ­ticas especĂ­fica. Esta dissertação descreve e compara as principais caracterĂ­sticas e arquiteturas das seguintes conhecidas ferramentas analĂ­ticas para Big Data: Drill, HAWQ, Hive, Impala, Presto e Spark. Para testar o desempenho dessas ferramentas analĂ­ticas para Big Data, descrevemos tambĂ©m o processo de preparação, configuração e administração de um Cluster Hadoop para que possamos instalar e utilizar essas ferramentas, tendo um ambiente capaz de avaliar seu desempenho e identificar quais cenĂĄrios mais adequados Ă  sua utilização. Para realizar esta avaliação, utilizamos os benchmarks TPC-H e TPC-DS, onde os resultados mostraram que as ferramentas de processamento em memĂłria como HAWQ, Impala e Presto apresentam melhores resultados e desempenho em datasets de dimensĂŁo baixa e mĂ©dia. No entanto, as ferramentas que apresentaram tempos de execuçÔes mais lentas, especialmente o Hive, parecem apanhar as ferramentas de melhor desempenho quando aumentamos os datasets de referĂȘncia
    • 

    corecore