151 research outputs found

    CRIS-IR 2006

    The recognition of entities and their relationships in document collections is an important step towards the discovery of latent knowledge and the support of knowledge management applications. The challenge lies in how to extract and correlate entities in order to answer key knowledge management questions, such as: who works with whom, on which projects, with which customers, and in what research areas. The present work proposes a knowledge mining approach, supported by information retrieval and text mining tasks, whose core is the correlation of textual elements through the Latent Relation Discovery (LRD) method. Our experiments show that LRD outperforms other correlation methods. We also present an application that demonstrates the approach in knowledge management scenarios.
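
    The abstract leaves the LRD method itself unspecified; as a rough illustration of correlation-based relation discovery, the sketch below scores entity pairs by windowed co-occurrence. The documents, entity names, window size, and PMI scoring are assumptions for demonstration, not the published method.

        # Hypothetical sketch of co-occurrence-based entity correlation.
        # This is NOT the published LRD method, only an illustration of the idea.
        from collections import Counter
        from itertools import combinations
        import math

        def correlate(documents, entities, window=10):
            """Score entity pairs by pointwise mutual information over
            co-occurrences inside a sliding token window."""
            single, pair, total = Counter(), Counter(), 0
            for doc in documents:
                tokens = doc.lower().split()
                for i in range(len(tokens)):
                    span = set(tokens[i:i + window]) & entities
                    total += 1
                    for e in span:
                        single[e] += 1
                    for a, b in combinations(sorted(span), 2):
                        pair[(a, b)] += 1
            scores = {(a, b): math.log((n * total) / (single[a] * single[b]))
                      for (a, b), n in pair.items()}
            return sorted(scores.items(), key=lambda kv: -kv[1])

        docs = ["alice works with bob on the alpha project",
                "bob and carol collaborate on alpha"]
        print(correlate(docs, {"alice", "bob", "carol", "alpha"}, window=8))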

    Reasoning & Querying – State of the Art

    Various query languages for Web and Semantic Web data have emerged in recent years, both for practical use and as a research area in the scientific community. At the same time, the ubiquity of keyword search in applications such as search engines has familiarized casual users with keyword queries as a means of retrieving information on the internet. Unlike this easy-to-use form of querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.
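
    To make the gap concrete, the sketch below expresses the same information need twice over a toy RDF graph: once as SPARQL, which requires knowing the language and the vocabulary, and once as a naive keyword match over literals. The tiny graph and the keyword matcher are illustrative assumptions, not any particular system's approach.

        # Contrast between structured and keyword-based RDF querying.
        from rdflib import Graph, Literal, Namespace

        EX = Namespace("http://example.org/")
        g = Graph()
        g.add((EX.kwql, EX.type, EX.QueryLanguage))
        g.add((EX.kwql, EX.label, Literal("KWQL keyword query language")))

        # Structured querying: the user must know SPARQL and the schema.
        rows = g.query("""
            PREFIX ex: <http://example.org/>
            SELECT ?s WHERE { ?s ex:type ex:QueryLanguage }
        """)
        print([str(r.s) for r in rows])

        # Keyword querying: match any subject whose literal mentions the terms.
        def keyword_search(graph, *terms):
            return {s for s, p, o in graph
                    if isinstance(o, Literal)
                    and all(t in str(o).lower() for t in terms)}

        print(keyword_search(g, "keyword", "language"))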

    Development of a Framework for Ontology Population Using Web Scraping in Mechatronics

    One of the major challenges in engineering contexts is the efficient collection, management, and sharing of data. Semantic technologies and ontologies are powerful tools for addressing this problem, although some tasks, such as ontology population, usually demand high maintenance effort. This thesis proposes a framework to automate the collection of data from sparse web resources and its insertion into an ontology. First, a product ontology is created by combining several reference vocabularies, namely GoodRelations, the Basic Formal Ontology, the ECLASS standard, and an information model. Then, the study introduces a general procedure for developing a web scraping agent to collect data from the web. Subsequently, an algorithm based on lexical similarity measures is presented to map the collected data to the concepts of the ontology. Lastly, the collected data is inserted into the ontology. To validate the proposed solution, the thesis implements these steps to collect information about microcontrollers from three different websites. Finally, the thesis evaluates the use case results, draws conclusions, and suggests promising directions for future research.
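
    The mapping step lends itself to a short sketch: scraped field names are matched to ontology concepts via a lexical similarity measure. The concept names, threshold, and use of Python's difflib are assumptions for illustration, not the thesis's exact algorithm.

        # Hedged sketch: map scraped field names to ontology concepts
        # by lexical similarity.
        from difflib import SequenceMatcher

        ONTOLOGY_CONCEPTS = ["clockFrequency", "flashMemorySize", "operatingVoltage"]

        def best_concept(field, concepts, threshold=0.5):
            """Return the ontology concept lexically closest to a scraped
            field name, or None if nothing is similar enough."""
            scored = [(SequenceMatcher(None, field.lower(), c.lower()).ratio(), c)
                      for c in concepts]
            score, concept = max(scored)
            return concept if score >= threshold else None

        scraped = {"Clock Freq. (MHz)": "16", "Flash size": "32 KB"}
        for field, value in scraped.items():
            print(field, "->", best_concept(field, ONTOLOGY_CONCEPTS), "=", value)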

    Keyword-Based Querying for the Social Semantic Web

    Enabling non-experts to publish data on the web is an important achievement of the social web and one of the primary goals of the social semantic web. Making the data easily accessible in turn has received little attention, which is problematic from the point of view of incentives: users are likely to be less motivated to participate in the creation of content if the use of this content is mostly reserved to experts. Querying in semantic wikis, for example, is typically realized in terms of full-text search over the textual content and a web query language such as SPARQL for the annotations. This approach has two shortcomings that limit the extent to which data can be leveraged by users: combined queries over content and annotations are not possible, and users either are restricted to expressing their query intent using simple but vague keyword queries or have to learn a complex web query language.

    The work presented in this dissertation investigates a more suitable form of querying for semantic wikis that consolidates two seemingly conflicting characteristics of query languages: ease of use and expressiveness. This work was carried out in the context of the semantic wiki KiWi, but the underlying ideas apply more generally to the social semantic and social web. We begin by defining a simple, modular conceptual model for the KiWi wiki that enables rich and expressive knowledge representation. One component of this model is structured tags, an annotation formalism that is simple yet flexible and expressive, and that aims at bridging the gap between atomic tags and RDF. The viability of the approach is confirmed by a user study, which finds that structured tags are suitable for quickly annotating evolving knowledge and are perceived well by the users.

    The main contribution of this dissertation is the design and implementation of KWQL, a query language for semantic wikis. KWQL combines keyword search and web querying to enable querying that scales with user experience and information need: basic queries are easy to express, and as the search criteria become more complex, more expertise is needed to formulate the corresponding query. A novel aspect of KWQL is that it combines both paradigms in a bottom-up fashion: it treats neither of the two as an extension of the other, but instead integrates both in one framework. The language allows for rich combined queries over full text, metadata, document structure, and informal to formal semantic annotations. KWilt, the KWQL query engine, provides the full expressive power of first-order queries, yet can evaluate basic queries at almost the speed of the underlying search engine. KWQL is accompanied by the visual query language visKWQL and an editor that displays both the textual and visual forms of the current query and reflects changes to either representation in the other. A user study shows that participants quickly learn to construct KWQL and visKWQL queries, even when given only a short introduction.

    KWQL allows users to sift the wealth of structure and annotations in an information system for relevant data. If relevant data constitutes a substantial fraction of all data, ranking becomes important. To this end, we propose PEST, a novel ranking method that propagates relevance among structurally related or similarly annotated data. Extensive experiments, including a user study on a real-life wiki, show that PEST improves the quality of the ranking over a range of existing ranking approaches.
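
    The abstract does not spell out PEST; as a rough illustration of the idea of relevance propagation, the sketch below spreads an initial keyword-match score along a weighted graph of structurally related items. The damping factor, edge weights, and fixed iteration count are assumptions, not the published algorithm.

        # Hedged sketch of relevance propagation in the spirit of PEST.
        def propagate(initial, edges, damping=0.5, iterations=20):
            """initial: {node: keyword relevance};
            edges: {node: [(neighbor, weight)]}."""
            scores = dict(initial)
            for _ in range(iterations):
                nxt = {}
                for node in scores:
                    spread = sum(w * scores.get(nb, 0.0)
                                 for nb, w in edges.get(node, []))
                    nxt[node] = ((1 - damping) * initial.get(node, 0.0)
                                 + damping * spread)
                scores = nxt
            return scores

        # Page B matched no keywords but is linked from the matching page A,
        # so it receives part of A's relevance.
        initial = {"A": 1.0, "B": 0.0}
        edges = {"B": [("A", 1.0)], "A": []}
        print(propagate(initial, edges))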

    Proceedings of the 9th International Workshop on Information Retrieval on Current Research Information Systems

    The recognition of entities and their relationships in document collections is an important step towards the discovery of latent knowledge and the support of knowledge management applications. The challenge lies in how to extract and correlate entities in order to answer key knowledge management questions, such as: who works with whom, on which projects, with which customers, and in what research areas. The present work proposes a knowledge mining approach, supported by information retrieval and text mining tasks, whose core is the correlation of textual elements through the Latent Relation Discovery (LRD) method. Our experiments show that LRD outperforms other correlation methods. We also present an application that demonstrates the approach in knowledge management scenarios.

    Annotation-based storage and retrieval of models and simulation descriptions in computational biology

    This work aimed at enhancing the reuse of computational biology models by identifying and formalizing relevant meta-information. One type of meta-information investigated in this thesis is experiment-related meta-information attached to a model, which is necessary to accurately recreate simulations. The main results are a detailed concept for model annotation, a proposed XML format for the standardized encoding of simulation experiment setups, a storage solution for standardized model representations, and a retrieval concept.

    Automatic Extraction and Assessment of Entities from the Web

    The search for information about entities, such as people or movies, plays an increasingly important role on the Web. This information is still scattered across many Web pages, making it time-consuming for a user to find all relevant information about an entity. This thesis describes techniques to extract entities, and information about these entities, from the Web, such as facts, opinions, questions and answers, interactive multimedia objects, and events. The findings of this thesis are that it is possible to automatically create a large knowledge base using a manually crafted ontology. After applying assessment algorithms, the precision of the extracted information was found to lie between 75 % (facts) and 90 % (entities). The algorithms from this thesis can be used to create such a knowledge base, which can in turn serve various research fields, such as question answering, named entity recognition, and information retrieval.
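
    As a minimal sketch of the assessment idea, the code below filters extracted statements by a confidence score and measures precision against a small gold standard. The facts, scores, threshold, and gold set are invented for illustration and do not reproduce the thesis's assessment algorithms.

        # Hedged sketch: confidence filtering and precision measurement.
        def assess(extractions, gold, threshold=0.7):
            kept = [(fact, s) for fact, s in extractions if s >= threshold]
            correct = sum(1 for fact, _ in kept if fact in gold)
            precision = correct / len(kept) if kept else 0.0
            return kept, precision

        extractions = [(("Jim Carrey", "starsIn", "The Mask"), 0.9),
                       (("Jim Carrey", "bornIn", "1962"), 0.8),
                       (("Jim Carrey", "starsIn", "Titanic"), 0.3)]
        gold = {("Jim Carrey", "starsIn", "The Mask"),
                ("Jim Carrey", "bornIn", "1962")}
        kept, p = assess(extractions, gold)
        print(f"kept {len(kept)} facts, precision {p:.2f}")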

    Automated Identification and Exploitation of Market Insufficiencies in Fixed-Odds Betting Markets

    In this Master's thesis we study the possibility of automated identification and exploitation of market insufficiencies in less explored on-line markets. We define the problem, determine the desired market conditions, and finally examine the existence of insufficiencies that can offer arbitrage opportunities in on-line prediction markets. In the context of this thesis, we developed a software system named Delphi that is able to identify arbitrage opportunities in large volumes of data in a relatively short time. Delphi is a web data extraction and classification system capable of dealing with the heterogeneity of market information and producing classified, structured data. Finally, the problem is reduced to a mathematical model that is able to understand the market data, evaluate the market conditions, and identify arbitrage opportunities.
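
    The classic arbitrage condition for fixed-odds markets is a known result and makes the goal concrete (though not necessarily Delphi's exact model): with decimal odds o_i for mutually exclusive outcomes, a risk-free opportunity exists when the sum of 1/o_i is below 1, and stakes proportional to 1/o_i lock in the same payout on every outcome. The odds below are invented.

        # Minimal sketch of the fixed-odds arbitrage check and stake split.
        def find_arbitrage(odds, bankroll=100.0):
            margin = sum(1.0 / o for o in odds)
            if margin >= 1.0:
                return None  # no risk-free opportunity
            stakes = [bankroll * (1.0 / o) / margin for o in odds]
            payout = stakes[0] * odds[0]  # identical for every outcome
            return stakes, payout - bankroll

        # Best available odds for a two-way market across two bookmakers:
        # 1/2.10 + 1/2.05 ~= 0.964 < 1, so a small guaranteed profit exists.
        print(find_arbitrage([2.10, 2.05]))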