
    Entity Query Feature Expansion Using Knowledge Base Links

    Recent advances in automatic entity linking and knowledge base construction have resulted in entity annotations for document and query collections, for example annotations of entities from large general-purpose knowledge bases such as Freebase and the Google Knowledge Graph. Understanding how to leverage these entity annotations to improve ad hoc document retrieval is an open research area. Query expansion is a commonly used technique for improving retrieval effectiveness, but most previous query expansion approaches focus on text, mainly using unigram concepts. In this paper, we propose a new technique, called entity query feature expansion (EQFE), which enriches the query with features from entities and their links to knowledge bases, including structured attributes and text. We experiment using both explicit query entity annotations and latent entities. We evaluate our technique on TREC text collections automatically annotated with knowledge base entity links, including the Google Freebase Annotations (FACC1) data. We find that entity-based feature expansion yields significant improvements in retrieval effectiveness over state-of-the-art text expansion approaches.
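The general idea behind entity-based query expansion can be illustrated with a minimal sketch. The toy knowledge base, its attribute names, and the matching logic below are invented for illustration and do not reproduce the paper's actual EQFE feature set:

```python
# Hypothetical sketch of entity query feature expansion: enrich a keyword
# query with terms drawn from linked knowledge-base entities. The toy KB
# and its fields ("aliases", "types", "description") are illustrative only.

TOY_KB = {
    "barack obama": {
        "aliases": ["obama", "president obama"],
        "types": ["person", "politician"],
        "description": "44th president of the united states",
    }
}

def expand_query(query, kb=TOY_KB):
    """Return the original query terms plus terms from any matched entity."""
    terms = query.lower().split()
    expansion = []
    for entity, features in kb.items():
        if entity in query.lower():  # naive stand-in for explicit entity annotation
            expansion += features["aliases"]
            expansion += features["types"]
            expansion += features["description"].split()
    return terms + expansion

print(expand_query("barack obama healthcare policy"))
```

A real system would weight the expansion terms rather than concatenating them, and could also draw on latent entities retrieved for the query.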

    A Novel Approach to Multimedia Ontology Engineering for Automated Reasoning over Audiovisual LOD Datasets

    Multimedia reasoning, which is suitable for, among other tasks, multimedia content analysis and high-level video scene interpretation, relies on a formal and comprehensive conceptualization of the represented knowledge domain. However, most multimedia ontologies are not exhaustive in terms of role definitions, and do not incorporate complex role inclusions and role interdependencies. In fact, most multimedia ontologies do not have a role box at all, and implement only a basic subset of the available logical constructors. Consequently, their applicability in multimedia reasoning is limited. To address these issues, VidOnt, the very first multimedia ontology with SROIQ(D) expressivity and a DL-safe ruleset, has been introduced for next-generation multimedia reasoning. In contrast to common practice, the formal grounding has been set in one of the most expressive description logics, and the ontology has been validated with industry-leading reasoners, namely HermiT and FaCT++. This paper also presents best practices for developing multimedia ontologies, based on my ontology engineering approach.

    Towards Interactive Geodata Analysis through a Combination of Domain-Specific Languages and 3D Geo Applications in a Web Portal Environment

    Urban planning processes affect a wide range of stakeholders, including decision makers, urban planners, businesses, and citizens. ICT-enabled tools supporting urban planning are considered key to successful and sustainable urban management. Building on previous work in web-based participation tools for urban planning, rule-based geospatial processing, and 3D virtual reality applications, we present a tool that supports experts from municipalities in planning and decision making while also providing a way for the public to engage in urban planning processes. The main contribution of this work is the combination of 3D visualization and interaction components with a new ontology-driven rule editor based on domain-specific languages. The 3D visualization, on the one hand, enables stakeholders to present and discuss urban plans. The rule editor, on the other hand, targets expert users who need to perform spatial analyses on urban data or want to configure the 3D scene according to custom rules. Compared to previous approaches, we propose a portable and interactive solution: our tool is web-based and uses HTML5 technology, making it accessible to a broad audience.
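To make the idea of a domain-specific rule language concrete, here is a toy illustration of the kind of rule such an editor might accept: a condition over feature attributes plus an action on the 3D scene. The rule syntax, attribute names, and actions are invented for illustration and are not the paper's actual DSL:

```python
# Toy domain-specific rule: a WHEN-condition over a feature attribute and
# a THEN-action on the scene. Syntax and vocabulary are hypothetical.
import re

RULE = "WHEN building.height > 30 THEN highlight building"

def parse_rule(text):
    """Parse a single WHEN/THEN rule into its components."""
    m = re.fullmatch(r"WHEN (\w+)\.(\w+) (>|<|=) (\d+) THEN (\w+) (\w+)", text)
    subject, attr, op, value, action, target = m.groups()
    return {"subject": subject, "attr": attr, "op": op,
            "value": int(value), "action": action, "target": target}

def apply_rule(rule, feature):
    """Return the rule's action if the feature satisfies the condition."""
    ops = {">": lambda a, b: a > b, "<": lambda a, b: a < b, "=": lambda a, b: a == b}
    if ops[rule["op"]](feature.get(rule["attr"], 0), rule["value"]):
        return rule["action"]
    return None

rule = parse_rule(RULE)
print(apply_rule(rule, {"height": 45}))  # prints "highlight"
```

An ontology-driven editor would additionally validate the subject and attribute names against the domain vocabulary instead of accepting arbitrary identifiers.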

    SemTree: An index for supporting semantic retrieval of documents

    In this paper, we propose SemTree, a novel semantic index for supporting the retrieval of information from large document collections, assuming that the semantics of a document can be effectively expressed by a set of (subject, predicate, object) statements, as in the RDF model. A distributed version of the KD-Tree has then been adopted to provide a scalable solution for document indexing, leveraging the mapping of triples into a vector space. We investigate the feasibility of our approach in a real case study, considering the problem of finding inconsistencies in documents related to software requirements, and report some preliminary experimental results.
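The core idea, mapping triples into a vector space and answering nearest-neighbour queries over them, can be sketched as follows. The hashing-based embedding and the brute-force search below are stand-ins for the paper's actual mapping and its distributed KD-Tree:

```python
# Illustrative sketch of the SemTree idea: embed (subject, predicate, object)
# statements into a vector space, then retrieve by nearest neighbour.
# The embedding and linear scan are placeholders, not the paper's method.
import hashlib

def triple_to_vector(triple, dims=8):
    """Deterministically embed an RDF-style triple into R^dims."""
    vec = [0.0] * dims
    for position, component in enumerate(triple):
        digest = hashlib.sha1(f"{position}:{component}".encode()).digest()
        for i in range(dims):
            vec[i] += digest[i] / 255.0
    return vec

def nearest(index, query_vec):
    """Linear scan; a (distributed) KD-tree would replace this for scale."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, query_vec))
    return min(index, key=lambda item: dist(item[1]))[0]

triples = [
    ("system", "shall_log", "all_events"),
    ("system", "shall_not_log", "user_passwords"),
]
index = [(t, triple_to_vector(t)) for t in triples]
query = triple_to_vector(("system", "shall_log", "all_events"))
print(nearest(index, query))  # the identical triple is its own nearest neighbour
```

In the inconsistency-detection use case, statements that land close together in the vector space but carry contradictory predicates would be candidates for review.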

    On social networks and collaborative recommendation

    Social network systems, like last.fm, play a significant role in Web 2.0, containing large amounts of multimedia-enriched data that are enhanced both by explicit user-provided annotations and by implicit aggregated feedback describing the personal preferences of each user. It is also a common tendency for these systems to encourage the creation of virtual networks among their users by allowing them to establish bonds of friendship, thus providing a novel and direct medium for the exchange of data. We investigate the role of these additional relationships in developing a track recommendation system. Taking into account both the social annotations and the friendships inherent in the social graph established among users, items and tags, we created a collaborative recommendation system that effectively adapts to the personal information needs of each user. We adopt the generic framework of Random Walk with Restarts in order to provide a more natural and efficient way to represent social networks. In this work we collected a representative portion of the music social network last.fm, capturing explicitly expressed bonds of friendship among users as well as social tags. We performed a series of comparison experiments between the Random Walk with Restarts model and a user-based collaborative filtering method using Pearson correlation similarity. The results show that the graph model benefits from the additional information embedded in social knowledge, and that it outperforms the standard collaborative filtering method.
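Random Walk with Restarts itself is simple to state: a walker repeatedly follows graph edges but, with some probability, jumps back to the source node, and the stationary visit probabilities serve as recommendation scores. A minimal sketch, using a made-up four-node user/track graph rather than the authors' last.fm data:

```python
# Minimal Random Walk with Restarts by power iteration:
#   p <- (1 - restart) * P^T p + restart * e_source
# The toy graph and restart value are illustrative, not the paper's setup.

def random_walk_with_restarts(adj, source, restart=0.15, iters=200):
    """Return stationary RWR visit probabilities for each node."""
    n = len(adj)
    # row-normalise the adjacency matrix into transition probabilities
    trans = []
    for row in adj:
        s = sum(row)
        trans.append([x / s if s else 0.0 for x in row])
    p = [0.0] * n
    p[source] = 1.0
    for _ in range(iters):
        new = [restart * (1.0 if i == source else 0.0) for i in range(n)]
        for j in range(n):
            for i in range(n):
                new[i] += (1 - restart) * trans[j][i] * p[j]
        p = new
    return p

# toy path graph: 0 = user, 1 = friend, 2 = track liked by friend, 3 = distant track
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restarts(adj, source=0)
print(scores)  # track 2 (reachable via the friend) outranks track 3
```

The restart probability controls how strongly the ranking is biased toward the neighbourhood of the querying user; in the actual system the graph would jointly contain users, tracks, and tags.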

    Learning Relatedness Measures for Entity Linking

    Entity linking is the task of detecting, in text documents, relevant mentions of entities from a given knowledge base. To this end, entity-linking algorithms use several signals and features extracted from the input text or from the knowledge base. The most important such feature is entity relatedness. Indeed, we argue that these algorithms benefit from maximizing the relatedness among the relevant entities selected for annotation, since this minimizes disambiguation errors. The definition of an effective relatedness function is thus a crucial point in any entity-linking algorithm. In this paper we address the problem of learning high-quality entity relatedness functions. First, we formalize the problem of learning entity relatedness as a learning-to-rank problem. We then propose a methodology to create reference datasets on the basis of manually annotated data. Finally, we show that our machine-learned entity relatedness function performs better than previously proposed relatedness functions and, more importantly, improves the overall performance of different state-of-the-art entity-linking algorithms.
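The learning-to-rank framing can be sketched in miniature: each entity pair is described by a feature vector of relatedness signals, and a model is trained so that truly related pairs score above unrelated ones. The features, training pairs, and perceptron-style update below are invented for illustration and are not the paper's actual learning algorithm:

```python
# Toy pairwise learning-to-rank for entity relatedness: learn linear
# weights over relatedness signals (e.g. link overlap, category overlap)
# so that related pairs outrank unrelated ones. Data is hypothetical.

def score(weights, features):
    """Linear relatedness score for one entity pair."""
    return sum(w * f for w, f in zip(weights, features))

def train_pairwise(pairs, dims=2, lr=0.1, epochs=50):
    """Perceptron-style updates on (preferred, other) feature-vector pairs."""
    w = [0.0] * dims
    for _ in range(epochs):
        for better, worse in pairs:
            if score(w, better) <= score(w, worse):  # ranking violated
                for i in range(dims):
                    w[i] += lr * (better[i] - worse[i])
    return w

# each pair: (features of a related entity pair, features of an unrelated one)
training = [
    ([0.9, 0.8], [0.1, 0.2]),
    ([0.7, 0.6], [0.3, 0.1]),
]
w = train_pairwise(training)
print(w)
```

In practice such preference pairs would be derived from the manually annotated reference datasets the paper describes, and a richer ranker (e.g. gradient-boosted trees) would replace the linear model.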

    Innovative approaches to urban data management using emerging technologies

    Many characteristics of smart cities rely on a sufficient quantity and quality of urban data. Local industry and developers can use this data to build applications that improve the lives of all citizens. The handling and usability of this data is therefore a major challenge for smart cities. In this paper we investigate new approaches to urban data management using emerging technologies and give an insight into further research conducted within the EC-funded smarticipate project. Geospatial data cannot be handled well in classical relational database environments: either it is stored as binary large objects, or it has to be broken down into elementary types the database can handle, in many cases resulting in a slow system, since classical relational databases are optimized for online transaction processing rather than analytic processing and are not tuned for delivering mass data. Document-based databases provide better performance, but still struggle with large binary objects. The heterogeneity of the data also requires a lot of mapping and data cleansing, and in some cases replication cannot be avoided. Another approach is to use Semantic Web technologies to enhance the data and build up relations and connections between entities. However, data formats such as RDF follow a different approach and are not well suited to geospatial data, which limits usability. Search engines are a good example of web applications with high usability: users can find the right data and get information about related or close matches, allowing information retrieval in an easy-to-use fashion. The same principles should be applied to geospatial data, which would improve the usability of open data. Combined with data mining and big data technologies, these principles would improve the usability of open geospatial data and even lead to new ways of using it.
By helping with the interpretation of data in a certain context, data is transformed into useful information. In this paper we analyse key features of open geodata portals, such as linked data and machine learning, in order to show ways of improving the user experience. Based on the smarticipate project, we then show how open data and geodata are made available online and illustrate their practical application. We also give an outlook on piloting cases in which we want to evaluate how the technologies presented in this paper can be combined into a useful open data portal. In contrast to the previous EC-funded project urbanapi, where participative processes in smart cities were created with urban data, we go one step further with the Semantic Web and open data. We thereby arrive at a more general approach to open data portals for spatial data and to improving their usability. The envisioned architecture of the smarticipate project relies on file-based storage and a no-copy strategy, which means that data is mostly kept in its original format; conversion to another format is only done when necessary (e.g. when the current format has limitations on domain-specific attributes or the user requests a specific format). A strictly functional approach and architecture is envisioned, which allows massively parallel execution and is therefore well suited to deployment in a cloud environment. The search interface uses a domain-specific vocabulary that can be customised for special purposes or for individual users, taking their context and expertise into account, and that abstracts from technology-specific peculiarities. Application programmers will also benefit from this architecture, as linked data principles will be followed extensively. For example, the JSON and JSON-LD standards will be used, so that web developers can use results from the data store directly, without the need for conversion. Links to further information will also be provided within the data, so that a drill-down for more details is possible.
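The kind of JSON-LD response such a portal could return might look like the following sketch. The vocabulary mappings, identifiers, and URLs are illustrative assumptions, not the project's actual schema:

```python
# Hedged sketch of a JSON-LD geodata response: plain JSON that web
# developers can consume directly, with an "@context" mapping terms to a
# vocabulary and an embedded link for drilling down to more detail.
# All URLs and identifiers here are made up for illustration.
import json

feature = {
    "@context": {
        "name": "http://schema.org/name",
        "geo": "http://schema.org/geo",
        "details": {"@id": "http://schema.org/url", "@type": "@id"},
    },
    "@id": "https://example.org/parks/42",
    "name": "Riverside Park",
    "geo": {"latitude": 50.11, "longitude": 8.68},
    "details": "https://example.org/parks/42/full",  # drill-down link
}

print(json.dumps(feature, indent=2))
```

Because the payload is ordinary JSON, a client can ignore the `@context` entirely and still read the attributes, while linked-data-aware tooling can expand the terms to full IRIs.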
The remainder of this paper is structured as follows. After the introduction about open data and data in general, we look at related work and existing open data portals. This leads to the main chapter on the key technology aspects of an easy-to-use open data portal. Chapter five then introduces the EC-funded project smarticipate, which incorporates the key technology aspects of chapter four.

    Graph-of-Entity: A Model for Combined Data Representation and Retrieval

    Managing large volumes of digital documents, along with the information they contain or are associated with, can be challenging. As systems become more intelligent, it increasingly makes sense to power retrieval with all available data, where every lead makes it easier to reach relevant documents or entities. Modern search is heavily powered by structured knowledge, but users still query using keywords or, at best, telegraphic natural language. As search becomes increasingly dependent on the integration of text and knowledge, novel approaches for a unified representation of combined data present the opportunity to unlock new ranking strategies. We tackle entity-oriented search using graph-based approaches for representation and retrieval. In particular, we propose the graph-of-entity, a novel approach for indexing combined data, where terms, entities and their relations are jointly represented. We compare the graph-of-entity with the graph-of-word, a text-only model, verifying that, overall, it does not yet achieve better performance, despite obtaining higher precision. Our assessment was based on a small subset of the INEX 2009 Wikipedia Collection, created from a sample of 10 topics and the corresponding judged documents. The offline evaluation presented here is complementary to its counterpart from the TREC 2017 OpenSearch track, where, during our participation, we assessed the graph-of-entity in an online setting through team-draft interleaving.
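The joint representation the abstract describes, terms and entities as nodes of one graph, can be sketched with three edge types: term-to-term adjacency (as in a graph-of-word), term-to-entity mention links, and entity-to-entity relations. The document, mentions, and entity names below are invented for illustration:

```python
# Illustrative graph-of-entity sketch: one graph jointly holding term
# nodes and entity nodes, with three edge types per the abstract's
# description. The example document and entities are hypothetical.
from collections import defaultdict

def build_graph(terms, mentions, relations):
    """Build a directed graph over ("term", t) and ("entity", e) nodes."""
    graph = defaultdict(set)
    # term -> next-term edges, as in a graph-of-word
    for a, b in zip(terms, terms[1:]):
        graph[("term", a)].add(("term", b))
    # term -> entity edges for annotated mentions
    for term, entity in mentions:
        graph[("term", term)].add(("entity", entity))
    # entity -> entity edges from the knowledge base
    for e1, e2 in relations:
        graph[("entity", e1)].add(("entity", e2))
    return graph

g = build_graph(
    terms=["new", "york", "marathon"],
    mentions=[("york", "New_York_City"), ("marathon", "New_York_City_Marathon")],
    relations=[("New_York_City_Marathon", "New_York_City")],
)
print(sorted(g[("term", "york")]))
```

Ranking over such a structure can then reward documents whose terms connect, through mention and relation edges, to entities relevant to the query.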