25,133 research outputs found

    An infrastructure for building semantic web portals

    In this paper, we present our KMi semantic web portal infrastructure, which supports two important tasks of semantic web portals, namely metadata extraction and data querying. Central to our infrastructure are three components: i) an automated metadata extraction tool, ASDI, which supports the extraction of high quality metadata from heterogeneous sources, ii) an ontology-driven question answering tool, AquaLog, which makes use of the domain specific ontology and the semantic metadata extracted by ASDI to answer questions in natural language format, and iii) a semantic search engine, which enhances traditional text-based searching by making use of the underlying ontologies and the extracted metadata. A semantic web portal application has been built, which illustrates the usage of this infrastructure.

    Structural Regularities in Text-based Entity Vector Spaces

    Entity retrieval is the task of finding entities such as people or products in response to a query, based solely on the textual documents they are associated with. Recent semantic entity retrieval algorithms represent queries and experts in finite-dimensional vector spaces, where both are constructed from text sequences. We investigate entity vector spaces and the degree to which they capture structural regularities. Such vector spaces are constructed in an unsupervised manner without explicit information about structural aspects. For concreteness, we address these questions for a specific type of entity: experts in the context of expert finding. We discover how clusterings of experts correspond to committees in organizations, the ability of expert representations to encode the co-author graph, and the degree to which they encode academic rank. We compare latent, continuous representations created using methods based on distributional semantics (LSI), topic models (LDA) and neural networks (word2vec, doc2vec, SERT). Vector spaces created using neural methods, such as doc2vec and SERT, systematically perform better at clustering than LSI, LDA and word2vec. When it comes to encoding entity relations, SERT performs best. Comment: ICTIR2017. Proceedings of the 3rd ACM International Conference on the Theory of Information Retrieval. 201

    Conversational Exploratory Search via Interactive Storytelling

    Conversational interfaces are likely to become a more efficient, intuitive and engaging way for human-computer interaction than today's text or touch-based interfaces. Current research efforts concerning conversational interfaces focus primarily on question answering functionality, thereby neglecting support for search activities beyond targeted information lookup. Users engage in exploratory search when they are unfamiliar with the domain of their goal, unsure about the ways to achieve their goals, or unsure about their goals in the first place. Exploratory search is often supported by approaches from information visualization. However, such approaches cannot be directly translated to the setting of conversational search. In this paper we investigate the affordances of interactive storytelling as a tool to enable exploratory search within the framework of a conversational interface. Interactive storytelling provides a way to navigate a document collection at the pace and in the order a user prefers. In our vision, interactive storytelling is to be coupled with a dialogue-based system that provides verbal explanations and responsive design. We discuss challenges and sketch the research agenda required to bring this vision to life. Comment: Accepted at ICTIR'17 Workshop on Search-Oriented Conversational AI (SCAI 2017)

    Comparing Traditional and LLM-based Search for Image Geolocation

    Web search engines have long served as indispensable tools for information retrieval; user behavior and query formulation strategies have been well studied. The introduction of search engines powered by large language models (LLMs) suggested more conversational search and new types of query strategies. In this paper, we compare traditional and LLM-based search for the task of image geolocation, i.e., determining the location where an image was captured. Our work examines user interactions, with a particular focus on query formulation strategies. In our study, 60 participants were assigned either traditional or LLM-based search engines as assistants for geolocation. Participants using traditional search more accurately predicted the location of the image compared to those using the LLM-based search. Distinct strategies emerged between users depending on the type of assistant. Participants using the LLM-based search issued longer, more natural language queries, but had shorter search sessions. When reformulating their search queries, traditional search participants tended to add more terms to their initial queries, whereas participants using the LLM-based search consistently rephrased their initial queries.

    Trusting (and Verifying) Online Intermediaries' Policing

    All is not well in the land of online self-regulation. However competently internet intermediaries police their sites, nagging questions will remain about their fairness and objectivity in doing so. Is Comcast blocking BitTorrent to stop infringement, to manage traffic, or to decrease access to content that competes with its own for viewers? How much digital due process does Google need to give a site it accuses of harboring malware? If Facebook censors a video of war carnage, is that a token of respect for the wounded or one more reflexive effort of a major company to ingratiate itself with the Washington establishment? Questions like these will persist, and erode the legitimacy of intermediary self-policing, as long as key operations of leading companies are shrouded in secrecy. Administrators must develop an institutional competence for continually monitoring rapidly-changing business practices. A trusted advisory council charged with assisting the Federal Trade Commission (FTC) and Federal Communications Commission (FCC) could help courts and agencies adjudicate controversies concerning intermediary practices. An Internet Intermediary Regulatory Council (IIRC) would spur the development of expertise necessary to understand whether companies’ controversial decisions are socially responsible or purely self-interested. Monitoring is a prerequisite for assuring a level playing field online.

    Security and computer forensics in web engineering education

    The integration of security and forensics into Web Engineering curricula is imperative! Poor security in web-based applications is continuing to cost organizations millions and the losses are still increasing annually. Security is frequently taught as a stand-alone course, assuming that security can be 'bolted on' to a web application at some point. Security issues must be integrated into Web Engineering processes right from the beginning to create secure solutions and therefore security should be an integral part of a Web Engineering curriculum. One aspect of Computer forensics investigates failures in security. Hence, students should be aware of the issues in forensics and how to respond when security failures occur; collecting evidence is particularly difficult for Web-based applications.