
    A Word Embedding Based Approach for Focused Web Crawling Using the Recurrent Neural Network

    Learning-based focused crawlers download relevant uniform resource locators (URLs) from the web for a specific topic. Several studies have used the term frequency-inverse document frequency (TF-IDF) weighted cosine vector as an input feature vector for learning algorithms. TF-IDF-based crawlers calculate the relevance of a web page only if a topic word co-occurs on the said page, failing which it is considered irrelevant. Similarity is not considered even if a synonym of a term co-occurs on a web page. To resolve this challenge, this paper proposes a new methodology that integrates the Adagrad-optimized Skip Gram Negative Sampling (A-SGNS)-based word embedding and the Recurrent Neural Network (RNN). The cosine similarity is calculated from the word embedding matrix to form a feature vector that is given as an input to the RNN to predict the relevance of the website. The performance of the proposed method is evaluated using the harvest rate (hr) and irrelevance ratio (ir). The proposed methodology outperforms existing methodologies with an average harvest rate of 0.42 and irrelevance ratio of 0.58.
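
The two ideas in this abstract, embedding-based cosine similarity and the harvest-rate metrics, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the toy two-dimensional vectors in `EMBEDDINGS`, the choice of "maximum similarity per topic word" as the feature, and the definitions of harvest rate (relevant pages divided by pages crawled) and irrelevance ratio (its complement). The paper's actual A-SGNS embeddings and RNN classifier are not reproduced.

```python
import math

# Hypothetical toy embeddings; a real crawler would load trained vectors.
EMBEDDINGS = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def topic_features(topic_words, page_words):
    """For each topic word, the highest similarity to any known page word.

    Assumed feature construction, not the paper's exact recipe.
    """
    features = []
    for topic in topic_words:
        sims = [
            cosine(EMBEDDINGS[topic], EMBEDDINGS[word])
            for word in page_words
            if word in EMBEDDINGS
        ]
        features.append(max(sims) if sims else 0.0)
    return features

def crawl_metrics(relevance_flags):
    """Return (harvest_rate, irrelevance_ratio) for crawled pages."""
    total = len(relevance_flags)
    if total == 0:
        raise ValueError("no pages crawled")
    relevant = sum(1 for flag in relevance_flags if flag)
    return relevant / total, (total - relevant) / total

# A page that mentions only a synonym still scores high, unlike exact matching.
synonym_page = topic_features(["car"], ["automobile"])
unrelated_page = topic_features(["car"], ["banana"])

# 42 relevant pages out of 100 crawled.
hr, ir = crawl_metrics([True] * 42 + [False] * 58)
```

Under these assumed definitions the two metrics always sum to 1, which is consistent with the reported pair of 0.42 and 0.58.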

    09251 Abstracts Collection -- Scientific Visualization

    From 06-14-2009 to 06-19-2009, the Dagstuhl Seminar 09251 "Scientific Visualization" was held in Schloss Dagstuhl - Leibniz Center for Informatics. During the seminar, over 50 international participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general.

    Query Resolution for Conversational Search with Limited Supervision

    In this work we focus on multi-turn passage retrieval as a crucial component of conversational search. One of the key challenges in multi-turn passage retrieval comes from the fact that the current turn query is often underspecified due to zero anaphora, topic change, or topic return. Context from the conversational history can be used to arrive at a better expression of the current turn query, defined as the task of query resolution. In this paper, we model the query resolution task as a binary term classification problem: for each term appearing in the previous turns of the conversation decide whether to add it to the current turn query or not. We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers. We propose a distant supervision method to automatically generate training data by using query-passage relevance labels. Such labels are often readily available in a collection either as human annotations or inferred from user interactions. We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC. We incorporate QuReTeC in a multi-turn, multi-stage passage retrieval architecture and demonstrate its effectiveness on the TREC CAsT dataset. Comment: SIGIR 2020 full conference paper.
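
The framing of query resolution as binary term classification can be sketched as follows. This is only an illustration of the task setup: the function `resolve_query`, the whitespace tokenization, and the `toy_classifier` lookup are hypothetical stand-ins, and the bidirectional-transformer model that QuReTeC actually uses is not reproduced.

```python
def resolve_query(history_turns, current_query, is_relevant):
    """Append history terms that the classifier labels as relevant.

    is_relevant(term, current_query) -> bool is a stand-in for a trained
    binary term classifier.
    """
    current_terms = set(current_query.lower().split())
    added = []
    for turn in history_turns:
        for term in turn.lower().split():
            # Skip terms already present in the query or already added.
            if term in current_terms or term in added:
                continue
            if is_relevant(term, current_query):
                added.append(term)
    if not added:
        return current_query
    return current_query + " " + " ".join(added)

# Hypothetical classifier: a fixed set of terms labeled positive.
POSITIVE_TERMS = {"bronx", "zoo"}

def toy_classifier(term, current_query):
    return term in POSITIVE_TERMS

history = ["tell me about the bronx zoo"]
resolved = resolve_query(history, "when does it open", toy_classifier)
unchanged = resolve_query([], "when does it open", toy_classifier)
```

Here the underspecified query "when does it open" gains the terms "bronx" and "zoo" from the previous turn, which is the kind of expansion the abstract describes before the query is passed to retrieval.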

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.

    Responsible research and innovation in science education: insights from evaluating the impact of using digital media and arts-based methods on RRI values

    The European Commission policy approach of Responsible Research and Innovation (RRI) is gaining momentum in European research planning and development as a strategy to align scientific and technological progress with socially desirable and acceptable ends. One of the RRI agendas is science education, aiming to foster future generations' acquisition of skills and values needed to engage in society responsibly. To this end, it is argued that RRI-based science education can benefit from more interdisciplinary methods such as those based on arts and digital technologies. However, the impact of science education activities using digital media and arts-based methods on RRI values remains underexplored. This article comparatively reviews previous evidence on the evaluation of these activities, from primary to higher education, to examine whether and how RRI-related learning outcomes are evaluated and how these activities affect students' learning. Forty academic publications were selected and their content analysed according to five RRI values: creative and critical thinking, engagement, inclusiveness, gender equality and integration of ethical issues. When evaluating the impact of digital and arts-based methods in science education activities, creative and critical thinking, engagement and partly inclusiveness are the RRI values mainly addressed. In contrast, gender equality and ethics integration are neglected. Digital-based methods seem to be more focused on students' questioning and inquiry skills, whereas those using arts often examine imagination, curiosity and autonomy. Differences in the evaluation focus between studies on digital media and those on arts partly explain differences in their impact on RRI values, but also result in non-documented outcomes and undermine their potential. Further developments in interdisciplinary approaches to science education following the RRI policy agenda should reinforce the design of the activities as well as procedural aspects of the evaluation research.