10,046 research outputs found
A Survey of Current Datasets for Vision and Language Research
Integrating vision and language has long been a dream in work on artificial
intelligence (AI). In the past two years, we have witnessed an explosion of
work that brings together vision and language from images to videos and beyond.
The available corpora have played a crucial role in advancing this area of
research. In this paper, we propose a set of quality metrics for evaluating and
analyzing the vision & language datasets and categorize them accordingly. Our
analyses show that the most recent datasets have been using more complex
language and more abstract concepts, however, there are different strengths and
weaknesses in each.Comment: To appear in EMNLP 2015, short proceedings. Dataset analysis and
discussion expanded, including an initial examination into reporting bias for
one of them. F.F. and N.M. contributed equally to this wor
Sharing Human-Generated Observations by Integrating HMI and the Semantic Sensor Web
Current âInternet of Thingsâ concepts point to a future where connected objects gather meaningful information about their environment and share it with other objects and people. In particular, objects embedding Human Machine Interaction (HMI), such as mobile devices and, increasingly, connected vehicles, home appliances, urban interactive infrastructures, etc., may not only be conceived as sources of sensor information, but, through interaction with their users, they can also produce highly valuable context-aware human-generated observations. We believe that the great promise offered by combining and sharing all of the different sources of information available can be realized through the integration of HMI and Semantic Sensor Web technologies. This paper presents a technological framework that harmonizes two of the most influential HMI and Sensor Web initiatives: the W3Câs Multimodal Architecture and Interfaces (MMI) and the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) with its semantic extension, respectively. Although the proposed framework is general enough to be applied in a variety of connected objects integrating HMI, a particular development is presented for a connected car scenario where driversâ observations about the traffic or their environment are shared across the Semantic Sensor Web. For implementation and evaluation purposes an on-board OSGi (Open Services Gateway Initiative) architecture was built, integrating several available HMI, Sensor Web and Semantic Web technologies. A technical performance test and a conceptual validation of the scenario with potential users are reported, with results suggesting the approach is soun
A unified view of data-intensive flows in business intelligence systems : a survey
Data-intensive flows are central processes in todayâs business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of todayâs research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.Peer ReviewedPostprint (author's final draft
Experimental evaluation of big data querying tools
Nos Ășltimos anos, o termo Big Data tornou-se um tĂłpico bastanta debatido em vĂĄrias
ĂĄreas de negĂłcio. Um dos principais desafios relacionados com este conceito Ă© como lidar
com o enorme volume e variedade de dados de forma eficiente. Devido Ă notĂłria
complexidade e volume de dados associados ao conceito de Big Data, sĂŁo necessĂĄrios
mecanismos de consulta eficientes para fins de anĂĄlise de dados. Motivado pelo rĂĄpido
desenvolvimento de ferramentas e frameworks para Big Data, hĂĄ muita discussĂŁo sobre
ferramentas de consulta e, mais especificamente, quais sĂŁo as mais apropriadas para
necessidades analĂticas especĂfica. Esta dissertação descreve e compara as principais
caracterĂsticas e arquiteturas das seguintes conhecidas ferramentas analĂticas para Big Data:
Drill, HAWQ, Hive, Impala, Presto e Spark. Para testar o desempenho dessas ferramentas
analĂticas para Big Data, descrevemos tambĂ©m o processo de preparação, configuração e
administração de um Cluster Hadoop para que possamos instalar e utilizar essas ferramentas,
tendo um ambiente capaz de avaliar seu desempenho e identificar quais cenĂĄrios mais
adequados à sua utilização. Para realizar esta avaliação, utilizamos os benchmarks TPC-H e
TPC-DS, onde os resultados mostraram que as ferramentas de processamento em memĂłria
como HAWQ, Impala e Presto apresentam melhores resultados e desempenho em datasets de
dimensão baixa e média. No entanto, as ferramentas que apresentaram tempos de execuçÔes
mais lentas, especialmente o Hive, parecem apanhar as ferramentas de melhor desempenho
quando aumentamos os datasets de referĂȘncia
- âŠ