903 research outputs found

    DARIAH and the Benelux

    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied with different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web were designed to solve specific problems and operate in ad-hoc domains, while others heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims to provide a structured and comprehensive overview of the Web Data Extraction literature. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes: applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media, and Online Social Network users, offering unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed for one domain in other domains. (Comment: Knowledge-Based Systems)
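
    As a concrete illustration of one family of techniques the survey covers, the sketch below implements a tiny hand-written "wrapper" in Python: a parser tuned to one assumed page layout that turns HTML into structured records. The page layout, CSS class names, and record fields are hypothetical examples, not taken from the survey itself.

```python
# Minimal wrapper-based extraction sketch using only the standard library.
# A wrapper encodes knowledge of one page layout (assumed here) and emits
# structured (title, price) records from matching HTML.
from html.parser import HTMLParser

class ProductWrapper(HTMLParser):
    """Collects text from <h2 class="title"> and <span class="price"> tags."""
    def __init__(self):
        super().__init__()
        self._field = None     # name of the field currently being read
        self._current = {}     # record under construction
        self.records = []      # completed records

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if tag == "h2" and cls == "title":
            self._field = "title"
        elif tag == "span" and cls == "price":
            self._field = "price"

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()
            self._field = None
            if len(self._current) == 2:   # both fields seen: record complete
                self.records.append(self._current)
                self._current = {}

page = '<h2 class="title">Widget</h2><span class="price">9.99</span>'
wrapper = ProductWrapper()
wrapper.feed(page)
print(wrapper.records)   # [{'title': 'Widget', 'price': '9.99'}]
```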

    GeoBIM for built environment condition assessment supporting asset management decision making

    The digital transformation in the management of the built environment is increasingly evident. While the benefits of location data, from Building Information Modelling or Geographical Information Systems, have been explored separately, their combination - GeoBIM - in asset management has never been explored. Data collection for condition assessment is challenging due to the quantity, types, frequency, and quality of the data. We first describe the opportunities and challenges of GeoBIM for condition assessment. The theoretical approach is then validated by developing an integrated GeoBIM model of the digital built environment for a neighbourhood in Milan, Italy. Data are collected, linked, processed, and analysed through multiple software platforms, providing relevant information for asset management decision making. Good results are achieved in rapid massive data collection, improved visualisation, and analysis. While further testing and development are required, the case study outcomes demonstrate the innovation and the mid-term service-oriented potential of the proposed approach.
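
    To make the linking step concrete, here is a minimal, purely illustrative Python sketch of the kind of join such an approach relies on: BIM element records and GIS features are matched on a shared asset identifier so that condition scores can be aggregated per zone. All field names (asset_id, zone, condition) are assumptions; a real pipeline would work with IFC and CityGML data through dedicated GeoBIM tooling.

```python
# Hypothetical, minimal stand-in for the BIM/GIS linking step: join two
# sources on a shared asset identifier, then aggregate condition per zone.
from collections import defaultdict

bim_elements = [                       # BIM side: per-element condition scores
    {"asset_id": "A1", "element": "facade", "condition": 2},
    {"asset_id": "A1", "element": "roof",   "condition": 4},
    {"asset_id": "B7", "element": "facade", "condition": 5},
]
gis_features = {"A1": {"zone": "north"},   # GIS side: where each asset sits
                "B7": {"zone": "south"}}

worst_by_zone = defaultdict(int)
for el in bim_elements:
    zone = gis_features[el["asset_id"]]["zone"]
    worst_by_zone[zone] = max(worst_by_zone[zone], el["condition"])

print(dict(worst_by_zone))             # {'north': 4, 'south': 5}
```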

    Le nuage de point intelligent (The Smart Point Cloud)

    Discrete spatial datasets known as point clouds often lay the groundwork for decision-making applications. For example, we can use such data as a reference for autonomous cars and robot navigation, as a layer for floor-plan creation and building construction, and as a digital asset for environment modelling and incident prediction. Applications are numerous, and potentially increasing if we consider point clouds as digital reality assets. Yet this expansion faces technical limitations, mainly from the lack of semantic information within point ensembles. Connecting knowledge sources is still a very manual and time-consuming process that suffers from error-prone human interpretation. This highlights a strong need for domain-related data analysis to create coherent and structured information. The thesis addresses automation problems in point cloud processing to create intelligent environments, i.e. virtual copies that can be used and integrated in fully autonomous reasoning services. We tackle point cloud questions associated with knowledge extraction – particularly segmentation and classification – as well as structuration, visualisation, and interaction with cognitive decision systems. We propose to connect both point cloud properties and formalized knowledge to rapidly extract pertinent information using domain-centred graphs. The dissertation delivers the concept of a Smart Point Cloud (SPC) Infrastructure, which serves as an interoperable and modular architecture for unified processing. It permits easy integration into existing workflows and multi-domain specialization through device knowledge, analytic knowledge, or domain knowledge. Concepts, algorithms, code, and materials are given to replicate findings and extend current applications.
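
    As a rough, hypothetical illustration of the enrichment idea, the Python sketch below attaches crude semantic labels to raw points and exposes them through a tiny concept-to-points mapping. The labels, height thresholds, and flat graph layout are assumptions for illustration only, not the thesis's actual SPC data model.

```python
# Toy semantic enrichment of a point cloud: label each (x, y, z) point by a
# height rule, then group points under domain concepts for later queries.
points = [(0.1, 0.2, 0.02), (1.0, 1.5, 0.01), (0.5, 0.5, 2.49), (2.0, 0.3, 1.2)]

def classify(z, floor_max=0.05, ceiling_min=2.4):
    """Crude rule-based labelling by height alone (illustrative thresholds)."""
    if z <= floor_max:
        return "floor"
    if z >= ceiling_min:
        return "ceiling"
    return "structure"

# "Graph" as an adjacency dict: domain concept -> points carrying that label.
graph = {}
for p in points:
    graph.setdefault(classify(p[2]), []).append(p)

print(sorted(graph))         # ['ceiling', 'floor', 'structure']
print(len(graph["floor"]))   # 2 points labelled as floor
```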

    SWKM 2008: Social Web and Knowledge Management, Proceedings (CEUR Workshop Proceedings)

    Digital Innovation: A Frugal Ecosystem Perspective

    In this conceptual paper, we attempt to answer the question: how do firms develop frugal IT capabilities in a resource-constrained ecosystem? Frugal firms tend to successfully overcome severe infrastructure, financial, social, and technological constraints. "Frugal IT innovation" is a special case of frugal innovation in which IT/IS play a pivotal, core role in enabling capabilities that overcome the challenges of resource-constrained business environments. It is centered on the development of products and services with a sharp focus on affordability, simplicity, and sustainability. Taking a digital ecodynamics perspective, we focus on the co-evolution of firm-level capabilities, the frugal ecosystem, and underlying IT systems to uncover how a dynamic, higher-order, frugal IT innovation capability (FITIC) drives firm performance. Due to unique ecosystem conditions, we measure firm performance by including social and environmental measures in addition to financial measures. The paper discusses ecosystem-wide implications and contributes to the advancement of both theoretical and practice-based knowledge in this domain.

    Report of the Stanford Linked Data Workshop

    The Stanford University Libraries and Academic Information Resources (SULAIR), with the Council on Library and Information Resources (CLIR), conducted a week-long workshop on the prospects for a large-scale, multi-national, multi-institutional prototype of a Linked Data environment for discovery of and navigation among the rapidly, chaotically expanding array of academic information resources. As preparation for the workshop, CLIR sponsored a survey by Jerry Persons, Chief Information Architect emeritus of SULAIR, that was published originally for workshop participants as background to the workshop and is now publicly available. The original intention of the workshop was to devise a plan for such a prototype. However, such was the diversity of knowledge, experience, and views of the potential of Linked Data approaches that the workshop participants turned to two more fundamental goals: building common understanding and enthusiasm on the one hand, and identifying opportunities and challenges to be confronted in the preparation of the intended prototype and its operation on the other. In pursuit of those objectives, the workshop participants produced:
    1. a value statement addressing the question of why a Linked Data approach is worth prototyping;
    2. a manifesto for Linked Libraries (and Museums and Archives and …);
    3. an outline of the phases in a life cycle of Linked Data approaches;
    4. a prioritized list of known issues in generating, harvesting & using Linked Data;
    5. a workflow with notes for converting library bibliographic records and other academic metadata to URIs;
    6. examples of potential “killer apps” using Linked Data; and
    7. a list of next steps and potential projects.
    This report includes a summary of the workshop agenda, a chart showing the use of Linked Data in cultural heritage venues, and short biographies and statements from each of the participants.
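
    The record-to-URI workflow of item 5 can be pictured with a small Python sketch: mint a stable URI for each bibliographic record and emit RDF-style triples. The http://example.org/bib/ namespace, the record fields, and the minting rule are assumptions for illustration; only the Dublin Core property names (dc:title, dc:creator) are standard vocabulary.

```python
# Hypothetical sketch: convert a flat bibliographic record into RDF-style
# triples by minting a URI from the record identifier.
from urllib.parse import quote

BASE = "http://example.org/bib/"   # assumed namespace for minted URIs

def record_to_triples(record):
    uri = BASE + quote(record["id"])          # stable, URL-safe identifier
    yield (uri, "dc:title", record["title"])
    for author in record["authors"]:
        yield (uri, "dc:creator", author)

rec = {"id": "stanford:12345",
       "title": "Linked Data and Libraries",
       "authors": ["J. Persons"]}
for s, p, o in record_to_triples(rec):
    print(s, p, o)
```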

    Supporting the workflow of archaeo-related sciences by providing storage, sharing, analysis, and retrieval methods

    The recovery and analysis of material culture is the main focus of archaeo-related work. The corpus of findings, such as remains of buildings, artifacts, human burial remains, or faunal remains, is excavated, described, categorized, and analyzed in projects all over the world. This huge amount of archaeo-related data is the basis for many analyses, and the results of analyzing the collected data teach us about the past. All disciplines of the archaeo-related sciences deal with similar challenges: their workflows are similar, yet there are still differences in the nature of the data. These circumstances raise questions of how to store, share, retrieve, and analyze such heterogeneous and distributed data. The contribution of this thesis is to support archaeologists and bioarchaeologists in their work by providing methods that follow the archaeo-related workflow, which is split into five main parts. The first part of this thesis describes the xBook framework, developed to gather and store archaeological data; it allows the creation of several database applications that provide the necessary features for the archaeo-related context. The second part deals with methods to share information, collaborate with colleagues, and retrieve distributed data of cohesive archaeological contexts in order to bring archaeo-related data together. The third part addresses a dynamic framework for data analyses, featuring a flexible and easy-to-use tool that supports archaeologists and bioarchaeologists in executing analyses on their data without any programming skills and without the need to become familiar with external technologies. The fourth part introduces an interactive tool to compare the temporal position of archaeological findings, in the form of a Harris Matrix, with their spatial position as 2D and 3D site plan sketches, using the introduced data retrieval methods. Finally, the fifth part specifies an architecture for an information system that allows distributed and interdisciplinary data to be searched using dynamic joins of results from heterogeneous data formats. This novel way of information retrieval enables scientists to cross-connect archaeological information with domain-extrinsic knowledge. The concept of this information system is not limited to the archaeo-related context, however; other sciences could also benefit from this architecture.
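
    The dynamic-join idea from the fifth part can be sketched in a few lines of Python: result sets from two heterogeneous sources, here a hypothetical finds table and an isotope dataset, are merged on a shared identifier at query time rather than through a fixed common schema. All identifiers and fields are invented for illustration.

```python
# Illustrative dynamic join: enrich archaeological finds with measurements
# from a domain-extrinsic source, matched on a shared find identifier.
finds = [
    {"find_id": "F-01", "category": "faunal remains", "site": "Trench 3"},
    {"find_id": "F-02", "category": "artifact",       "site": "Trench 1"},
]
isotopes = {"F-01": {"d15N": 9.2}}   # e.g. stable isotope measurements

def joined(rows, extra):
    """Merge extra attributes into each row when its key has a match."""
    for row in rows:
        merged = dict(row)
        merged.update(extra.get(row["find_id"], {}))
        yield merged

for row in joined(finds, isotopes):
    print(row)   # F-01 gains the d15N value; F-02 passes through unchanged
```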

    Entity-Oriented Search

    This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms
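
    The core entity-ranking task described above has roughly the following shape, shown here as a toy Python sketch: score each entity's description against the query terms and return entities in decreasing score order. The tiny catalogue and the bag-of-words overlap scorer are stand-ins for the structured and unstructured retrieval models the book actually develops.

```python
# Toy entity ranking: given a textual query, return a ranked list of
# entities by counting shared terms between query and entity description.
entities = {
    "Ada Lovelace":    "english mathematician first computer programmer",
    "Charles Babbage": "english inventor analytical engine mathematician",
    "Alan Turing":     "computer scientist turing machine cryptanalysis",
}

def rank(query):
    q = set(query.lower().split())
    scores = {name: len(q & set(desc.split())) for name, desc in entities.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(rank("first computer programmer"))
# [('Ada Lovelace', 3), ('Alan Turing', 1), ('Charles Babbage', 0)]
```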

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.