
    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps in the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed through two central studies: first, a concerted vision of the functional breakdown of a generic multimedia search engine, and second, a set of representative use-case descriptions with a related discussion of the technological challenges they raise. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to the coordinators of EU projects and national initiatives. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by the domain experts who participated in the CHORUS Think-Tank and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Knowledge extraction from unstructured data and classification through distributed ontologies

    The World Wide Web has changed the way humans use and share all kinds of information. The Web removed several barriers to accessing published information and became an enormous space where users can easily navigate through heterogeneous resources (such as linked documents) and can easily edit, modify, or produce them. Documents implicitly enclose information and relationships that are accessible only to human beings. Indeed, the Web of documents evolved into a space of data silos, linked to each other only through untyped references (such as hypertext references) that only humans can interpret. A growing desire to programmatically access the pieces of data implicitly enclosed in documents has characterized recent efforts of the Web research community. Direct access means structured data, enabling computing machinery to easily exploit the links between different data sources. It became crucial for the Web community to provide a technology stack that eases data integration at large scale, first structuring the data using standard ontologies and afterwards linking it to external data. Ontologies became the best practice for defining axioms and relationships among classes, and the Resource Description Framework (RDF) became the basic data model for representing ontology instances (an instance being a value of an axiom, class, or attribute). Data has become the new oil; in particular, extracting information from semi-structured textual documents on the Web is key to realizing the Linked Data vision.

    In the literature, these problems have been addressed by several proposals and standards, which mainly focus on technologies to access the data and on formats to represent the semantics of the data and their relationships. As the volume of interconnected and serialized RDF data increases, RDF repositories may suffer from data overloading and may become a single point of failure for the overall Linked Data vision. One goal of this dissertation is therefore to propose a thorough approach to managing large-scale RDF repositories and distributing them in a redundant and reliable peer-to-peer RDF architecture. The architecture consists of a logic to distribute and mine the knowledge and of a set of physical peer nodes organized in a ring topology based on a Distributed Hash Table (DHT). Each node shares the same logic and provides an entry point that enables clients to query the knowledge base using atomic, disjunctive, and conjunctive SPARQL queries. The consistency of the results is increased by a data-redundancy algorithm that replicates each RDF triple on multiple nodes so that, in case of peer failure, other peers can retrieve the data needed to resolve the queries. Additionally, a distributed load-balancing algorithm maintains a uniform distribution of the data among the participating peers by dynamically changing the key space assigned to each node in the DHT.

    Recently, the process of structuring data has gained more and more attention when applied to the large volume of text spread across the Web, such as legacy data, newspapers, scientific papers, or (micro-)blog posts. This process mainly consists of three steps: i) the extraction from the text of atomic pieces of information, called named entities; ii) the classification of these pieces of information through ontologies; and iii) their disambiguation through Uniform Resource Identifiers (URIs) identifying real-world objects.
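
    To make these three steps concrete, the following is a minimal sketch, not the dissertation's actual pipeline: the gazetteer, type names, reference-ontology classes, and DBpedia URIs are illustrative stand-ins for a real extractor, a NERD-style reference ontology, and a disambiguation index.

```python
# i) extraction: a toy gazetteer lookup standing in for a real NER system
GAZETTEER = {
    "Tim Berners-Lee": "Person",
    "W3C": "Organization",
}

def extract_entities(text):
    """Return (surface form, extractor type) pairs found in the text."""
    return [(name, etype) for name, etype in GAZETTEER.items() if name in text]

# ii) classification: map extractor-specific types onto a reference ontology
# (the role a reference ontology such as NERD plays for real extractors)
REFERENCE_ONTOLOGY = {"Person": "nerd:Person", "Organization": "nerd:Organization"}

# iii) disambiguation: attach a URI identifying the real-world object
URI_INDEX = {
    "Tim Berners-Lee": "http://dbpedia.org/resource/Tim_Berners-Lee",
    "W3C": "http://dbpedia.org/resource/World_Wide_Web_Consortium",
}

def structure(text):
    """Run the three steps and emit one structured record per entity."""
    return [{
        "surface": surface,
        "class": REFERENCE_ONTOLOGY.get(etype, "nerd:Thing"),
        "uri": URI_INDEX.get(surface),   # None if still ambiguous
    } for surface, etype in extract_entities(text)]

print(structure("Tim Berners-Lee founded the W3C."))
```
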
    As a step towards interconnecting the Web with real-world objects via named entities, different techniques have been proposed. The second objective of this work is to compare these approaches in order to highlight their strengths and weaknesses in different scenarios, such as scientific papers, news articles, or user-generated content. We created the Named Entity Recognition and Disambiguation (NERD) web framework, publicly accessible on the Web (through a REST API and a web user interface), which unifies several named entity extraction technologies, and we proposed the NERD ontology, a reference ontology for comparing their results. Recently, the NERD ontology has been included in the NIF (Natural Language Processing Interchange Format) specification, part of the Creating Knowledge out of Interlinked Data (LOD2) project.

    Summarizing, this dissertation defines a framework for the extraction of knowledge from unstructured data and its classification via distributed ontologies. A detailed study of the Semantic Web and knowledge extraction fields defines the issues under investigation in this work. The dissertation then proposes an architecture that tackles the single point of failure introduced by RDF repositories spread across the Web. Although the use of ontologies enables a Web where data is structured and comprehensible by computing machinery, human users can take advantage of it as well, especially for annotation; hence, this work describes an annotation tool for web editing and for audio and video annotation, with a web front-end user interface built on top of a distributed ontology. Furthermore, the dissertation details a thorough comparison of the state of the art in named entity technologies: the NERD framework is presented as a technology that encompasses existing solutions in the named entity extraction field, and the NERD ontology as a reference ontology for the field. Finally, this work highlights three use cases aimed at reducing the number of data silos spread across the Web: a Linked Data approach to augment the automatic classification task in a systematic literature review, an application to lift educational data stored in Sharable Content Object Reference Model (SCORM) data silos to the Web of data, and a scientific conference venue enhancer plugged on top of several live data collectors. Significant research effort has been devoted to combining the efficiency of a reliable data structure with the power of data extraction techniques. This dissertation opens several research directions that join two communities, the Semantic Web and Natural Language Processing communities: the Web provides a considerable amount of data on which NLP techniques can shed light, and the use of URIs as unique identifiers provides a milestone for linking entities extracted from raw text to real-world objects.
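
    The peer-to-peer RDF architecture summarized above can also be illustrated with a minimal sketch, under stated assumptions: each triple is hashed onto a DHT ring and replicated on its next few successor nodes so that lookups survive peer failures. The peer names, the replication factor, and keying triples by subject are illustrative choices, not the dissertation's actual design.

```python
import hashlib
from bisect import bisect_right

REPLICAS = 3  # assumed replication factor

def h(value: str) -> int:
    """Position a key or node on the ring via a stable hash."""
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.nodes = sorted(nodes, key=h)        # ring order by hash position
        self.store = {n: set() for n in nodes}   # each peer's local triples

    def responsible(self, key):
        """The REPLICAS successor nodes of the key's ring position."""
        points = [h(n) for n in self.nodes]
        i = bisect_right(points, h(key))
        return [self.nodes[(i + j) % len(self.nodes)] for j in range(REPLICAS)]

    def put(self, s, p, o):
        """Replicate the triple on every responsible peer (keyed by subject)."""
        for node in self.responsible(s):
            self.store[node].add((s, p, o))

    def query(self, s, alive):
        """Atomic lookup by subject, falling back past failed peers."""
        for node in self.responsible(s):
            if node in alive:
                return {t for t in self.store[node] if t[0] == s}
        raise RuntimeError("all replicas unreachable")

ring = Ring(["peer-a", "peer-b", "peer-c", "peer-d", "peer-e"])
ring.put("ex:NERD", "rdf:type", "ex:Framework")
# Two peers have failed, yet the triple is still resolvable.
print(ring.query("ex:NERD", alive={"peer-b", "peer-d", "peer-e"}))
```
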

    Dagstuhl News January - December 2005

    "Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic

    Robust Trust Establishment in Decentralized Networks

    The advancement in networking technologies creates new opportunities for computer users to communicate and interact with one another. Very often, these interacting parties are strangers. A relevant concern for a user is whether to trust the other party in an interaction, especially if there are risks associated with the interaction. Reputation systems have been proposed as a method to establish trust among strangers. In a reputation system, a user who continuously exhibits good behavior builds a good reputation, while a user who exhibits malicious behavior acquires a poor one. Trust can then be established based on a user's reputation ratings. While many research efforts have demonstrated the effectiveness of reputation systems in various situations, their security is not well understood within the research community.

    In the context of trust establishment, the goal of an adversary is to gain trust. An adversary can appear trustworthy within a reputation system if the adversary has a good reputation; unfortunately, there are plenty of methods an adversary can use to achieve one, and there may be ways for an attacker to gain an advantage that are not yet known. As a result, understanding the adversary is a challenging problem. The difficulty of this problem can be witnessed in how researchers attempt to prove the security of their reputation systems: most prove security by using simulations to demonstrate that their solutions are resilient to specific attacks. Unfortunately, they do not justify their choices of attack scenarios, and more importantly, they do not demonstrate that these choices are sufficient to claim that their solutions are secure.

    In this dissertation, I focus on the security of reputation systems in a decentralized Peer-to-Peer (P2P) network. To understand the problem, I define an abstract model for trust establishment. The model consists of several layers, each corresponding to a component of trust establishment. This model serves as a common point of reference for defining security, and it can also be used as a framework for designing and implementing trust establishment methods; its modular design allows existing methods to interoperate. To address the security issues, I first define security for trust establishment as a measure of robustness. Using this definition, I provide analytical techniques for examining the robustness of trust establishment methods, and show that, in general, most reputation systems are not robust. The analytical results lead to a better understanding of the capabilities of adversaries. Based on this understanding, I design a solution that improves the robustness of reputation systems by using accountability, whose purpose is to encourage peers to behave responsibly and to provide a disincentive for malicious behavior. The effectiveness of the solution is validated using simulations. While simulations are commonly used in other research efforts to validate trust establishment methods, their simulation scenarios appear to be chosen in an ad hoc manner; many of these works neither justify their choices of simulation scenarios nor show that those choices are adequate. In this dissertation, the simulation scenarios are chosen based on the capabilities of the adversaries. The simulation results show that, under certain conditions, accountability can improve the robustness of reputation systems.
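
    The kind of non-robustness the dissertation analyzes can be illustrated with a minimal sketch, under assumptions of my own: a naive reputation score (the mean of all ratings) is inflated by colluding Sybil raters, while an accountability-flavored variant that weights ratings by rater credibility resists the attack. The numbers, weights, and attack size are illustrative, not taken from the dissertation.

```python
from statistics import mean

def reputation(ratings):
    """Naive aggregate: mean of all ratings in [0, 1]."""
    return mean(ratings)

honest = [0.2, 0.1, 0.3, 0.2]   # honest raters observed bad behavior
print(f"before attack: {reputation(honest):.2f}")             # 0.20

# Ballot stuffing: the adversary creates Sybil identities that all
# submit perfect ratings for the malicious peer.
sybil = [1.0] * 20
print(f"after attack:  {reputation(honest + sybil):.2f}")     # 0.87

def weighted_reputation(rated):
    """Weight each rating by the rater's own credibility, so cheap
    Sybil identities with no track record contribute little."""
    total = sum(w for _, w in rated)
    return sum(r * w for r, w in rated) / total

rated = [(r, 1.0) for r in honest] + [(r, 0.05) for r in sybil]
print(f"weighted:      {weighted_reputation(rated):.2f}")     # 0.36, far closer to the honest view
```
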

    Web 2.0 as syndication

    There is considerable excitement about the notion of 'Web 2.0', particularly among Internet businesspeople. In contrast, there is an almost complete lack of formal literature on the topic. It is important that movements with such energy and potential be subjected to critical attention, and that industry and social commentators have the opportunity to draw on the eCommerce research literature in formulating their views.

    IDEAS-1997-2021-Final-Programs

    This document records the final program of each of the 26 meetings of the International Database Engineering and Applications Symposium held from 1997 through 2021. These meetings were organized in various locations on three continents. Most of the papers published during these years are in the digital libraries of the IEEE (1997-2007) or the ACM (2008-2021).

    Design of a Controlled Language for Critical Infrastructures Protection

    We describe a project for the construction of a controlled language for critical infrastructures protection (CIP). The project originates from the need to coordinate and categorize communications on CIP at the European level. These communications can take the physical form of official documents, incident reports, informal communications, and plain e-mail. We explore the application of traditional library-science tools for the construction of controlled languages in order to achieve this goal. Our starting point is an analogous effort from the sixties in the field of nuclear science, known as the Euratom Thesaurus.
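
    For readers unfamiliar with such library-science tools, here is a minimal sketch of the thesaurus structure a controlled language of this kind typically builds on: each preferred descriptor carries broader-term (BT), narrower-term (NT), and related-term (RT) links, plus "use for" (UF) entries mapping informal wording to the descriptor. The CIP terms below are illustrative, not drawn from the actual project or the Euratom Thesaurus.

```python
# A tiny controlled vocabulary with standard thesaural relations.
THESAURUS = {
    "critical infrastructure": {
        "NT": ["power grid", "water supply network"],
        "RT": ["incident report"],
        "UF": ["vital infrastructure", "essential facilities"],
    },
    "power grid": {"BT": ["critical infrastructure"], "RT": ["blackout"]},
}

def preferred_term(term):
    """Resolve an informal term to its preferred descriptor, if any."""
    term = term.lower()
    if term in THESAURUS:
        return term
    for descriptor, links in THESAURUS.items():
        if term in (t.lower() for t in links.get("UF", [])):
            return descriptor
    return None  # term not yet covered by the controlled language

print(preferred_term("vital infrastructure"))  # -> "critical infrastructure"
```
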

    Machine Learning based Trust Computational Model for IoT Services

    The Internet of Things has facilitated access to a large volume of sensitive information about each participating object in an ecosystem. This imposes many threats, ranging from the risks of data management to the potential discrimination enabled by data analytics over delicate information such as locations, interests, and activities. To address these issues, the concept of trust is introduced to support both humans and services in overcoming the perception of uncertainty and risk before making decisions. However, establishing trust in a cyber world is a challenging task due to the volume of diversified influential factors from cyber-physical systems. Hence, it is essential to have an intelligent trust computation model capable of generating accurate and intuitive trust values for prospective actors. In this paper, a quantifiable trust assessment model is therefore proposed. Built on this model, individual trust attributes are calculated numerically. Moreover, a novel algorithm based on machine learning principles is devised to classify the extracted trust features and combine them to produce a final trust value to be used for decision making. Finally, our model's effectiveness is verified through a simulation. The results show that our method has advantages over other aggregation methods.
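
    The general shape of such a pipeline can be sketched as follows. This is not the paper's actual algorithm: the three trust features, the training data, and the choice of logistic regression are illustrative assumptions standing in for the paper's extracted trust attributes and learned classifier; the probability of the "trustworthy" class serves as the final trust value.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy trust features per past interaction:
# [direct experience, recommendation score, data quality]
X = np.array([
    [0.9, 0.8, 0.9],   # interactions with services later judged trustworthy
    [0.8, 0.9, 0.7],
    [0.2, 0.3, 0.1],   # interactions with services later judged untrustworthy
    [0.1, 0.2, 0.3],
])
y = np.array([1, 1, 0, 0])  # 1 = trustworthy, 0 = not

# Learn how the numeric trust attributes combine into a single judgment.
model = LogisticRegression().fit(X, y)

candidate = np.array([[0.7, 0.6, 0.8]])            # a prospective IoT service
trust_value = model.predict_proba(candidate)[0, 1]  # P(trustworthy)
print(f"trust value: {trust_value:.2f}")            # input to the go/no-go decision
```
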