
    What Users Ask a Search Engine: Analyzing One Billion Russian Question Queries

    We analyze the question queries submitted to a large commercial web search engine to gain insights into what people ask and to better tailor the search results to users’ needs. Based on a dataset of about one billion question queries submitted during the year 2012, we investigate askers’ querying behavior with the support of automatic query categorization. While the importance of question queries is likely to increase, at present they only make up 3–4% of the total search traffic. Since questions are such a small part of the query stream and are more likely to be unique than shorter queries, clickthrough information is typically rather sparse. Thus, query categorization methods based on the categories of clicked web documents do not work well for questions. As an alternative, we propose a robust question query classification method that uses labeled questions from a large community question answering (CQA) platform as a training set. The resulting classifier is then transferred to the web search questions. Even though questions on CQA platforms tend to differ from web search questions, our categorization method proves competitive with strong baselines with respect to classification accuracy. To show the scalability of our proposed method, we apply the classifiers to about one billion question queries and discuss the trade-offs between performance and accuracy that different classification models offer. Our findings reveal what people ask a search engine and how this contrasts with their behavior on a CQA platform.
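    The classification approach described above transfers labels from a CQA platform to web search questions. Below is a minimal sketch of that transfer idea, assuming a TF-IDF plus linear-model setup; the categories, example questions, and model choice are illustrative placeholders rather than the authors' actual pipeline.

```python
# Minimal sketch: train a text classifier on labeled CQA questions, then
# apply it to (unlabeled) web search question queries. All data and category
# names below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled questions from a CQA platform: (text, category).
cqa_questions = [
    ("how do i install a ceiling fan", "home"),
    ("what is the capital of australia", "education"),
    ("why does my laptop overheat", "computers"),
]
texts, labels = zip(*cqa_questions)

# TF-IDF features and a linear model keep inference cheap enough to scale
# to a very large query stream.
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(texts, labels)

# Transfer step: categorize question queries taken from the search log.
search_queries = ["how to fix a slow laptop", "what year did wwii end"]
print(classifier.predict(search_queries))
```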

    EASIER: An Approach to Automatically Generate Active Ontologies for Intelligent Assistants

    Intelligent assistants are ubiquitous and will grow in importance. Apple's well-known assistant Siri uses Active Ontologies to process user input and to model the provided functionalities. Supporting new features requires extending the ontologies or even building new ones. The question is no longer "How to build an intelligent assistant?" but "How to do it efficiently?" We propose EASIER, an approach to automate building and extending Active Ontologies. EASIER identifies new services automatically and classifies unseen service providers with a clustering-based approach. It proposes ontology elements for new service categories and service providers, respectively, to ease ontology building. We evaluate EASIER with 292 form-based web services and two different clustering algorithms from Weka, DBScan and spectral clustering. DBScan achieves an F1 score of 51% in ten-fold cross-validation but is outperformed by spectral clustering, which achieves an F1 score of 70%.
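    As a rough illustration of the clustering step EASIER relies on, the sketch below groups form-based web services by their textual descriptions so that each cluster can be proposed as a service category. The paper uses Weka's DBScan and spectral clustering; the scikit-learn stand-ins and the toy service descriptions here are assumptions made purely for demonstration.

```python
# Toy sketch: cluster form-based web services by their field labels with
# DBSCAN and spectral clustering (scikit-learn stand-ins for the Weka
# implementations mentioned in the abstract).
from sklearn.cluster import DBSCAN, SpectralClustering
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical service descriptions: concatenated form-field labels.
services = [
    "from city to city departure date return date passengers",
    "origin destination travel date seat class",
    "pickup location dropoff location rental start rental end",
    "restaurant name party size reservation time",
]

features = TfidfVectorizer().fit_transform(services).toarray()

# Density-based clustering; services without a dense neighborhood become noise (-1).
db_labels = DBSCAN(eps=0.9, min_samples=2, metric="cosine").fit_predict(features)

# Spectral clustering with a cosine-similarity affinity between services.
sc_labels = SpectralClustering(n_clusters=2, affinity="cosine").fit_predict(features)

print("DBSCAN clusters:  ", db_labels)
print("Spectral clusters:", sc_labels)
```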

    Appmetrics - Improving impact on the go

    Mobile ‘smart’ devices have increased enormously in popularity over the last few years, with 61% of UK adults now owning a smartphone (Ofcom, 2014). With the emergence of the tablet computer adding significantly to the utilities available via mobile devices, the adoption of mobile technologies into work-related activities is ever-expanding. However, relatively few academic staff who use these devices make full use of the range of options available, and many lack awareness of the apps they could be using to promote their outputs and improve impact on the go. For information professionals there is a need to stay abreast of current and emerging developments in the world of mobile apps in order to support academic staff in using their mobile devices effectively to improve and monitor their research impact. With so many apps and tools to choose from, in this chapter we will look at an essential "toolkit" of apps that information professionals should bear in mind when supporting and advising academic staff on research impact, along with advice on how to make the best and most efficient use of them. Additionally, this chapter will examine how impact activities undertaken on a mobile device can be fitted into a flexible working day.

    Dissection of AI Job Advertisements: A Text Mining-based Analysis of Employee Skills in the Disciplines Computer Vision and Natural Language Processing

    Human capital is a well-discussed topic in information systems research. In order for companies to develop and use IT artifacts, they need specialized employees. This is especially the case when complex technologies, such as artificial intelligence, are used. Two major fields of artificial intelligence are computer vision (CV) and natural language processing (NLP). In this paper, the skills and know-how required of CV and NLP specialists are analyzed and compared from a job market perspective. For this purpose, we utilize a text mining-based analysis pipeline to dissect job advertisements for artificial intelligence. Specifically, job advertisements from both sub-disciplines were crawled from a large international online job platform and analyzed using named entity recognition and term vectors. The analysis shows that the required know-how and skills differ between the two job profiles: there is no general requirement profile for an artificial intelligence specialist, and each sub-discipline requires differentiated consideration.
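    To make the pipeline concrete, the snippet below sketches the term-vector part of such an analysis: build term frequencies per sub-discipline and compare which skill terms dominate CV versus NLP advertisements. The job-ad snippets are invented, and the named entity recognition step of the actual pipeline is omitted here.

```python
# Illustrative term-vector comparison between CV and NLP job advertisements.
from collections import Counter
import re

cv_ads = [
    "experience with opencv, image segmentation and pytorch",
    "deep learning for object detection, cuda is a plus",
]
nlp_ads = [
    "transformer models, spacy and text classification experience",
    "knowledge of tokenization, word embeddings and pytorch",
]

def term_vector(ads):
    """Bag-of-words term vector over lowercase alphabetic tokens."""
    tokens = re.findall(r"[a-z]+", " ".join(ads).lower())
    return Counter(tokens)

cv_terms, nlp_terms = term_vector(cv_ads), term_vector(nlp_ads)

# Terms appearing in only one profile hint at skills specific to that field.
print("CV-only terms: ", sorted(set(cv_terms) - set(nlp_terms)))
print("NLP-only terms:", sorted(set(nlp_terms) - set(cv_terms)))
```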

    CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss

    This paper considers contrastive training for cross-modal zero-shot transfer, wherein a pre-trained model in one modality is used for representation learning in another domain using pairwise data. The learnt models in the latter domain can then be used for a diverse set of tasks in a zero-shot way, similar to "Contrastive Language-Image Pre-training" (CLIP) and "Locked-image Tuning" (LiT), which have recently gained considerable attention. Most existing works for cross-modal representation alignment (including CLIP and LiT) use the standard contrastive training objective, which employs sets of positive and negative examples to align similar and repel dissimilar training data samples. However, similarity amongst training examples has a more continuous nature, thus calling for a more 'non-binary' treatment. To address this, we propose a novel loss function called Continuously Weighted Contrastive Loss (CWCL) that employs a continuous measure of similarity. With CWCL, we seek to align the embedding space of one modality with another. Owing to the continuous nature of similarity in the proposed loss function, these models outperform existing methods for zero-shot transfer across multiple models, datasets and modalities. In particular, we consider the modality pairs of image-text and speech-text, and our models achieve 5-8% (absolute) improvement over previous state-of-the-art methods in zero-shot image classification and 20-30% (absolute) improvement in zero-shot speech-to-intent classification and keyword classification. Comment: Accepted to the Neural Information Processing Systems (NeurIPS) 2023 conference.
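    The abstract does not spell out the loss, but a plausible PyTorch reading of a continuously weighted contrastive objective is sketched below: instead of treating only the diagonal pairs as positives, every pair is weighted by a continuous similarity computed in the frozen, pre-trained modality. The weighting scheme and temperature are assumptions, not the authors' exact formulation.

```python
# Sketch of a continuously weighted contrastive loss in the spirit of CWCL.
import torch
import torch.nn.functional as F

def cwcl_style_loss(frozen_emb, trained_emb, temperature=0.07):
    """frozen_emb:  (N, D) embeddings from the locked, pre-trained modality.
    trained_emb: (N, D) embeddings from the modality being aligned."""
    frozen_emb = F.normalize(frozen_emb, dim=-1)
    trained_emb = F.normalize(trained_emb, dim=-1)

    # Continuous pair weights from intra-modal similarity in the frozen
    # modality, mapped to [0, 1]; true pairs on the diagonal weigh the most.
    weights = (frozen_emb @ frozen_emb.T + 1.0) / 2.0

    # Cross-modal similarity logits, as in standard contrastive training.
    logits = (trained_emb @ frozen_emb.T) / temperature
    log_probs = F.log_softmax(logits, dim=-1)

    # Weighted cross-entropy: every pair contributes according to its weight.
    loss = -(weights * log_probs).sum(dim=-1) / weights.sum(dim=-1)
    return loss.mean()

# Toy usage with random embeddings.
print(cwcl_style_loss(torch.randn(8, 32), torch.randn(8, 32)).item())
```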

    A semantic memory bank assisted by an embodied conversational agent for mobile devices

    Alzheimer’s disease is a type of dementia that causes memory loss and seriously interferes with intellectual abilities. It currently has no cure, and the therapeutic efficacy of existing medication is limited. However, there is evidence that non-pharmacological treatments can be useful to stimulate cognitive abilities. In the last few years, several studies have focused on describing and understanding how Virtual Coaches (VC) could be key drivers for health promotion in home care settings. The use of VCs is gaining increased attention in discussions of medical innovation. In this paper, we propose an approach that exploits semantic technologies and Embodied Conversational Agents to help patients train cognitive abilities using mobile devices. In this work, semantic technologies are used to provide knowledge about the memory of a specific person, exploiting the structured data stored in a linked data repository and taking advantage of the flexibility provided by ontologies to define search domains and expand the agent’s capabilities. Our Memory Bank Embodied Conversational Agent (MBECA) is used to interact with the patient and ease the interaction with new devices. The framework is oriented to Alzheimer’s patients, caregivers, and therapists.
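    As a small illustration of the linked-data side of the approach, the sketch below stores a few personal facts as RDF triples and answers a question with a SPARQL query, roughly what a memory-bank agent might do behind the conversation. The vocabulary, facts, and use of rdflib are assumptions; the abstract does not describe MBECA's actual ontology.

```python
# Toy "semantic memory bank": personal memories stored as RDF triples that a
# conversational agent can query (requires the rdflib package).
from rdflib import Graph, Literal, Namespace, URIRef

MEM = Namespace("http://example.org/memory#")  # hypothetical vocabulary
graph = Graph()

# Invented facts about a person's life stored in the memory bank.
maria = URIRef("http://example.org/person/maria")
graph.add((maria, MEM.grandchildName, Literal("Lucia")))
graph.add((maria, MEM.weddingYear, Literal(1967)))

# The agent turns a question like "When did I get married?" into a query.
results = graph.query(
    "SELECT ?year WHERE { ?p <http://example.org/memory#weddingYear> ?year }"
)
for row in results:
    print("You got married in", row.year)
```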

    Augmented robotics dialog system for enhancing human-robot interaction

    Augmented reality, augmented television and second screen are cutting-edge technologies that provide end users with extra and enhanced information related to certain events in real time. This enriched information helps users better understand such events, while providing a more satisfactory experience. In the present paper, we apply this main idea to human-robot interaction (HRI), that is, to how users and robots exchange information. The ultimate goal of this paper is to improve the quality of HRI by developing a new dialog manager system that incorporates enriched information from the semantic web. This work presents the augmented robotic dialog system (ARDS), which uses natural language understanding mechanisms to provide two features: (i) non-grammar multimodal input of (verbal and/or written) text; and (ii) a contextualization of the information conveyed in the interaction. This contextualization is achieved by information enrichment techniques that link the information extracted from the dialog with extra information about the world available in semantic knowledge bases. This enriched or contextualized information (information enrichment, semantic enhancement and contextualized information are used interchangeably in the rest of this paper) offers many possibilities in terms of HRI. For instance, it can enhance the robot's pro-activeness during a human-robot dialog (the enriched information can be used to propose new topics during the dialog, while ensuring a coherent interaction). Another possibility is to display additional multimedia content related to the enriched information on a visual device. This paper describes the ARDS and shows a proof of concept of its applications. The authors gratefully acknowledge the funds provided by the Spanish MICINN (Ministry of Science and Innovation) through the project “Aplicaciones de los robots sociales”, DPI2011-26980, from the Spanish Ministry of Economy and Competitiveness. The research leading to these results has also received funding from the RoboCity2030-III-CM project (Robótica aplicada a la mejora de la calidad de vida de los ciudadanos, fase III; S2013/MIT-2748), funded by Programas de Actividades I+D en la Comunidad de Madrid and co-funded by the Structural Funds of the EU.
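    The enrichment step can be pictured with the toy sketch below: entities extracted from a user utterance are linked to a knowledge base, and the enriched facts are used to propose a coherent follow-up topic. The miniature dictionary stands in for a real semantic knowledge base such as DBpedia, and all names and facts are invented.

```python
# Toy information-enrichment step: link utterance entities to a knowledge
# base and use the retrieved facts to keep the dialog proactive.
KNOWLEDGE_BASE = {
    "madrid": {"type": "City", "country": "Spain", "landmark": "Prado Museum"},
    "cervantes": {"type": "Writer", "notable_work": "Don Quixote"},
}

def enrich(utterance):
    """Return knowledge-base facts for every entity mentioned in the utterance."""
    return {
        entity: facts
        for entity, facts in KNOWLEDGE_BASE.items()
        if entity in utterance.lower()
    }

def propose_topic(enriched):
    """Use the enriched facts to suggest a new, coherent dialog topic."""
    for entity, facts in enriched.items():
        if "landmark" in facts:
            return f"Have you ever visited the {facts['landmark']} in {entity.title()}?"
    return "Tell me more about that."

print(propose_topic(enrich("I was born in Madrid")))
```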

    Detection of Control Structures in Spoken Utterances
