3,165 research outputs found
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective.
The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines.
From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research
AiiDA: Automated Interactive Infrastructure and Database for Computational Science
Computational science has seen in the last decades a spectacular rise in the
scope, breadth, and depth of its efforts. Notwithstanding this prevalence and
impact, it is often still performed using the renaissance model of individual
artisans gathered in a workshop, under the guidance of an established
practitioner. Great benefits could follow instead from adopting concepts and
tools coming from computer science to manage, preserve, and share these
computational efforts. We illustrate here our paradigm sustaining such vision,
based around the four pillars of Automation, Data, Environment, and Sharing. We
then discuss its implementation in the open-source AiiDA platform
(http://www.aiida.net), that has been tuned first to the demands of
computational materials science. AiiDA's design is based on directed acyclic
graphs to track the provenance of data and calculations, and ensure
preservation and searchability. Remote computational resources are managed
transparently, and automation is coupled with data storage to ensure
reproducibility. Last, complex sequences of calculations can be encoded into
scientific workflows. We believe that AiiDA's design and its sharing
capabilities will encourage the creation of social ecosystems to disseminate
codes, data, and scientific workflows.Comment: 30 pages, 7 figure
Managing contextual information in semantically-driven temporal information systems
Context-aware (CA) systems have demonstrated the provision of a robust solution for personalized information delivery in the current content-rich and dynamic information age we live in. They allow software agents to autonomously interact with users by modeling the user’s environment (e.g. profile, location, relevant public information etc.) as dynamically-evolving and interoperable contexts. There is a flurry of research activities in a wide spectrum at context-aware research areas such as managing the user’s profile, context acquisition from external environments, context storage, context representation and interpretation, context service delivery and matching of context attributes to users‘ queries etc. We propose SDCAS, a Semantic-Driven Context Aware System that facilitates public services recommendation to users at temporal location. This paper focuses on information management and service recommendation using semantic technologies, taking into account the challenges of relationship complexity in temporal and contextual information
XWeB: the XML Warehouse Benchmark
With the emergence of XML as a standard for representing business data, new
decision support applications are being developed. These XML data warehouses
aim at supporting On-Line Analytical Processing (OLAP) operations that
manipulate irregular XML data. To ensure feasibility of these new tools,
important performance issues must be addressed. Performance is customarily
assessed with the help of benchmarks. However, decision support benchmarks do
not currently support XML features. In this paper, we introduce the XML
Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from
the relational decision support benchmark TPC-H. It is mainly composed of a
test data warehouse that is based on a unified reference model for XML
warehouses and that features XML-specific structures, and its associate XQuery
decision support workload. XWeB's usage is illustrated by experiments on
several XML database management systems
FishMark: A Linked Data Application Benchmark
Abstract. FishBase is an important species data collection produced by the FishBase Information and Research Group Inc (FIN), a not-forprofit NGO with the aim of collecting comprehensive information (from the taxonomic to the ecological) about all the world’s finned fish species. FishBase is exposed as a MySQL backed website (supporting a range of canned, although complex queries) and serves over 33 million hits per month. FishDelish is a transformation of FishBase into LinkedData weighing in at 1.38 billion triples. We have ported a substantial number of FishBase SQL queries to FishDelish SPARQL query which form the basis of a new linked data application benchmark (using our derivative of the Berlin SPARQL Benchmark harness). We use this benchmarking framework to compare the performance of the native MySQL application, the Virtuoso RDF triple store, and the Quest OBDA system on a fishbase.org like application.
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we have identified and analyzed gaps within European research effort during our second year.
In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio-
economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown
of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on
requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as
National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal
challenges
Experimental evaluation of big data querying tools
Nos últimos anos, o termo Big Data tornou-se um tópico bastanta debatido em várias
áreas de negócio. Um dos principais desafios relacionados com este conceito é como lidar
com o enorme volume e variedade de dados de forma eficiente. Devido à notória
complexidade e volume de dados associados ao conceito de Big Data, são necessários
mecanismos de consulta eficientes para fins de análise de dados. Motivado pelo rápido
desenvolvimento de ferramentas e frameworks para Big Data, há muita discussão sobre
ferramentas de consulta e, mais especificamente, quais são as mais apropriadas para
necessidades analíticas específica. Esta dissertação descreve e compara as principais
características e arquiteturas das seguintes conhecidas ferramentas analíticas para Big Data:
Drill, HAWQ, Hive, Impala, Presto e Spark. Para testar o desempenho dessas ferramentas
analíticas para Big Data, descrevemos também o processo de preparação, configuração e
administração de um Cluster Hadoop para que possamos instalar e utilizar essas ferramentas,
tendo um ambiente capaz de avaliar seu desempenho e identificar quais cenários mais
adequados à sua utilização. Para realizar esta avaliação, utilizamos os benchmarks TPC-H e
TPC-DS, onde os resultados mostraram que as ferramentas de processamento em memória
como HAWQ, Impala e Presto apresentam melhores resultados e desempenho em datasets de
dimensão baixa e média. No entanto, as ferramentas que apresentaram tempos de execuções
mais lentas, especialmente o Hive, parecem apanhar as ferramentas de melhor desempenho
quando aumentamos os datasets de referência
Simulated evaluation of faceted browsing based on feature selection
In this paper we explore the limitations of facet based browsing which uses sub-needs of an information need for querying and organising the search process in video retrieval. The underlying assumption of this approach is that the search effectiveness will be enhanced if such an approach is employed for interactive video retrieval using textual and visual features. We explore the performance bounds of a faceted system by carrying out a simulated user evaluation on TRECVid data sets, and also on the logs of a prior user experiment with the system. We first present a methodology to reduce the dimensionality of features by selecting the most important ones. Then, we discuss the simulated evaluation strategies employed in our evaluation and the effect on the use of both textual and visual features. Facets created by users are simulated by clustering video shots using textual and visual features. The experimental results of our study demonstrate that the faceted browser can potentially improve the search effectiveness
Grand Challenges of Traceability: The Next Ten Years
In 2007, the software and systems traceability community met at the first
Natural Bridge symposium on the Grand Challenges of Traceability to establish
and address research goals for achieving effective, trustworthy, and ubiquitous
traceability. Ten years later, in 2017, the community came together to evaluate
a decade of progress towards achieving these goals. These proceedings document
some of that progress. They include a series of short position papers,
representing current work in the community organized across four process axes
of traceability practice. The sessions covered topics from Trace Strategizing,
Trace Link Creation and Evolution, Trace Link Usage, real-world applications of
Traceability, and Traceability Datasets and benchmarks. Two breakout groups
focused on the importance of creating and sharing traceability datasets within
the research community, and discussed challenges related to the adoption of
tracing techniques in industrial practice. Members of the research community
are engaged in many active, ongoing, and impactful research projects. Our hope
is that ten years from now we will be able to look back at a productive decade
of research and claim that we have achieved the overarching Grand Challenge of
Traceability, which seeks for traceability to be always present, built into the
engineering process, and for it to have "effectively disappeared without a
trace". We hope that others will see the potential that traceability has for
empowering software and systems engineers to develop higher-quality products at
increasing levels of complexity and scale, and that they will join the active
community of Software and Systems traceability researchers as we move forward
into the next decade of research
- …