Personalized Search

Author: Carlsen Fredrik Nygård
Publication date: 01/01/2015

As the volume of electronically available information grows, relevant items become harder to find. This work presents an approach to personalizing search results in scientific publication databases by re-ranking the results returned by existing search engines such as Solr or ElasticSearch. It also covers the development of Obelix, a new recommendation system used for the re-ranking. The project was proposed and carried out at CERN, using the scientific publications available on the CERN Document Server (CDS). Re-ranking is evaluated both offline and online, with users and documents in CDS. The experiments conclude that personalized search results outperform both latest-first and word-similarity ranking in terms of click position in the search results for global search in CDS.
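
To make the re-ranking idea concrete, here is a minimal sketch of blending a search engine's relevance score with a per-user recommendation score. It is not Obelix's actual algorithm, which the abstract does not describe: the function name `rerank`, the `weight` parameter, the linear blend, and the assumption that both scores lie in [0, 1] are all hypothetical.

```python
from typing import Dict, List, Tuple


def rerank(
    results: List[Tuple[str, float]],
    recommendations: Dict[str, float],
    weight: float = 0.5,
) -> List[str]:
    """Re-order engine results by a blend of engine and recommendation scores.

    results: (doc_id, engine_score) pairs as returned by the search engine
             (assumed here to be normalized to [0, 1]).
    recommendations: hypothetical per-user scores in [0, 1]; documents the
                     recommender knows nothing about count as 0.
    weight: share of the final score given to the recommendation.
    """
    def blended(item: Tuple[str, float]) -> float:
        doc_id, engine_score = item
        user_score = recommendations.get(doc_id, 0.0)
        return (1.0 - weight) * engine_score + weight * user_score

    # sorted() is stable, so documents with equal blended scores
    # keep the order the engine returned them in.
    ranked = sorted(results, key=blended, reverse=True)
    return [doc_id for doc_id, _ in ranked]
```

With `weight=0.0` the engine's order is returned unchanged, which gives a convenient baseline when comparing click positions between the original and the personalized ranking.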

arXiv.org e-Print Archive

CERN Document Server

BlogForever D5.2: Implementation of Case Studies

Author: Arampatzis S.
Arango-Docio S.
Banos E.
Gkotsis G.
Kopidaki S.
Manolopoulos I.
Pinsent E.
Rynning M.
Sleeman P.
Stepanyan K.
Trochidis I.
Publication date: 25/10/2013

This document presents the internal and external testing results for the BlogForever case studies. The evaluation of the BlogForever implementation process is tabulated under the most relevant themes and aspects obtained within the testing processes. The case studies provide relevant feedback for the sustainability of the platform in terms of potential users’ needs and relevant information on the possible long-term impact.

ZENODO

BlogForever: D3.1 Preservation Strategy Report

Author: Arango-Docio Silvia
Banos Vangelis
Garcia Llopis Jaime
Kalb Hendrik
Kim Yunhyong
Pinsent Ed
Ross Seamus
Sleeman Patricia
Stepanyan Karen
Trochidis Illias
Publication venue: BlogForever
Publication date: 25/10/2013

This report describes preservation planning approaches and strategies recommended by the BlogForever project as a core component of a weblog repository design. More specifically, we start by discussing why we would want to preserve weblogs in the first place and what it is exactly that we are trying to preserve. We further present a review of past and present work and highlight why current practices in web archiving do not address the needs of weblog preservation adequately. We make three distinctive contributions in this volume: a) we propose transferable practical workflows for applying a combination of established metadata and repository standards in developing a weblog repository, b) we provide an automated approach to identifying significant properties of weblog content that uses the notion of communities and how this affects previous strategies, c) we propose a sustainability plan that draws upon community knowledge through innovative repository design.

Enlighten

Usage-driven Application Profile Generation Using Ontologies

Author: João Miguel Rocha da Silva
Publication date: 18/05/2016

Repositório Aberto da Universidade do Porto

Development of a flexible tool for the automatic comparison of bibliographic records. Application to sample collections - Développement d'un logiciel flexible pour la comparaison de notices bibliographiques et application à différentes collections

Author: Borel Alain
Krause Jan
Publication date: 13/10/2009

Due to the multiplication of digital bibliographic catalogues (open repositories, library and bookseller catalogues), information specialists are facing the challenge of mass-processing huge amounts of metadata for various purposes. Among the many possible applications, determining the similarity between records is an important issue. Such a similarity can be interesting from a bibliographic point of view (i.e., do the records describe the same document, the answer to which can be useful for deduplication or for collection overlap studies) as well as from a thematic point of view (suggestion of documents to the user, as well as content management within the framework of a library policy, automatic classification of documents, and so on). In order to fulfil such various needs, we propose a flexible, open-source, multiplatform software tool supporting the implementation of multiple strategies for record comparisons. In a second step, we study the relevance and performance of several algorithms applied to a selection of collections (size, origin, document types...).
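
As an illustration of one possible record-comparison strategy, the sketch below scores two bibliographic records by the Jaccard overlap of the words in a chosen field. It is not taken from the tool described above; the record layout (a plain dictionary), the function names, and the choice of Jaccard similarity are assumptions made for the example.

```python
import re
from typing import Dict, Set


def tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def record_similarity(a: Dict[str, str], b: Dict[str, str], field: str = "title") -> float:
    """Jaccard similarity of the word sets found in one field of two records.

    Returns a value in [0, 1]; 0.0 when both fields are empty or missing.
    """
    words_a = tokens(a.get(field, ""))
    words_b = tokens(b.get(field, ""))
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)
```

A deduplication pass would typically flag pairs scoring above some threshold for review, while a thematic recommender would rank candidates by the score. Either threshold or ranking would need tuning on the collection at hand.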

Infoscience - École polytechnique fédérale de Lausanne

BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

Author: Banos Vangelis
Kasioumis Nikolaos
Kim Yunhyong
Kopidaki Stella
Ross Seamus
Rynning Morten
Stepanyan Karen
Publication venue: BlogForever
Publication date: 25/10/2013

This report is written as a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations related to the historical, social and practical value of spam, and proposals of other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research related to spam focusing on the kinds that appear in the weblog context, concluding in a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software.

ZENODO

Enlighten

How to improve the visibility and search engine ranking of a digital repository by using Schema.org

Author: Nevado Chiné Nuria
Publication date: 01/09/2019

University Master's in Library and Information Services Management. Faculty of Information and Audiovisual Media. Universitat de Barcelona. Academic year: 2018-2019. Supervisor: Rubén Alcaraz. Web 3.0 has transformed the way knowledge is accessed and shared. The Semantic Web opens a new space of relationships between the information published on the Web and the understanding machines can extract from it in order to answer users' searches better. In this context, search engine optimization (SEO) becomes a crucial factor as a method for improving the visibility of a website or web page in a search engine's results. On this path toward better semantic interoperability, Google, Yahoo and Bing introduced Schema.org in 2011, a vocabulary created to make web content understandable to web crawlers and other machines. The vocabulary describes the information that websites contain through a set of properties inserted into the HTML code, adding semantics to the content and making its data readable and interpretable by software applications. Digital repositories face the challenge of getting users to find their content in an environment as large as the Internet. Given that the way researchers discover, read and use scholarly literature has changed considerably as the Web has advanced, the approach digital repositories take to visibility and interoperability with the web needs to include semantic improvements, so that they do not become invisible repositories that search engines cannot retrieve properly.
This work introduces the basic concepts and the different technologies, standards and recommendations that make up the Semantic Web environment, and takes on the challenge of implementing a tool such as Schema.org in the institutional repository of the Universitat de València (RODERIC) as a way to improve mutual understanding between the web, the information it contains and search engines.
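
A minimal sketch of what inserting Schema.org properties into a repository page can look like, using the JSON-LD serialization inside a `<script>` element. The function name `scholarly_jsonld` and the choice of the `ScholarlyArticle` type with the `name`, `author` and `datePublished` properties are assumptions for illustration; the abstract does not say which types or which serialization the thesis applies to RODERIC.

```python
import json
from typing import List


def scholarly_jsonld(title: str, authors: List[str], date_published: str) -> str:
    """Return a <script> element carrying Schema.org JSON-LD for one record."""
    data = {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "author": [{"@type": "Person", "name": name} for name in authors],
        "datePublished": date_published,  # ISO 8601 date, e.g. "2019-09-01"
    }
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so a value containing "</script>" cannot close the element early;
    # "\/" is a valid JSON escape for "/", so the data is unchanged when parsed.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"
```

The returned string would be placed in the HTML of the record's landing page, where crawlers that understand Schema.org can read the title, authors and date without parsing the visible layout.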

Diposit Digital de la Universitat de Barcelona

Digital Libraries: Advanced Methods and Technologies, Digital Collections

Publication venue: KarRC RAS
Publication date: 01/01/2009

Digital libraries are a field of research and development aimed at advancing the theory and practice of processing, disseminating, storing, analysing and searching digital data of various kinds. The main goal of the RCDL conference series is to build a community of specialists in Russia who conduct research and development in digital libraries and related fields. The 2009 All-Russian Scientific Conference (RCDL'2009) is the eleventh conference on this topic (1999: St. Petersburg, 2000: Protvino, 2001: Petrozavodsk, 2002: Dubna, 2003: St. Petersburg, 2004: Pushchino, 2005: Yaroslavl, 2006: Suzdal, 2007: Pereslavl-Zalessky, 2008: Dubna). This volume contains the texts of the papers, short papers and posters selected by the RCDL'2009 Programme Committee through peer review.

The repository of KarRC RAS

A model of scientists' information seeking and a user-interface design

Author: Sadeh Tamar

Information systems that are available today do not optimally address the information-seeking behaviour of scholars, particularly those who belong to scientific communities; as a result, scholarly discovery is often cumbersome and incomplete. The hypothesis of this study is that an information-seeking system that is designed to address the nature of scholarly materials and the information seeking behaviour of scholars, particularly the members of one scientific community, will increase the effectiveness of the scholars’ searches and enable them to find and obtain relevant materials with greater ease and precision than current practices do. The information-seeking behaviour and search practices deployed by high-energy physics (HEP) researchers are explored through a series of interviews and observations. More than 2,100 responses obtained from a HEP survey are also examined; in particular, the participants’ open-ended responses are analysed. On the basis of qualitative and quantitative research regarding the characteristics of HEP scientists and their information-seeking practices, a set of six personas, representing typical members of the HEP community, is constructed. An original model is developed that leverages existing models of information behaviour, information seeking, and information searching and reflects the full spectrum of active information-seeking and information-searching practices of HEP scholars and the nature of the data that these researchers seek. The model is then evaluated by means of seven scenarios involving the personas constructed earlier. On the basis of the information-seeking model, a software user interface is designed as the future interface for the HEP INSPIRE information system. The user-interface design is corroborated through the model, and the personas are used to evaluate the design. Methods are suggested for long-term quantitative and qualitative monitoring of the ways in which this design supports HEP researchers. 
It is argued that the proposed user interface, which provides an information environment that accommodates the information-seeking practices of the HEP community in a friendly and efficient manner, will support HEP academic research (and research of other scholarly communities that share some of the HEP community’s characteristics) by shortening the search process and improving the findability of quality materials. This thesis contributes to the body of information-science knowledge in the novel modelling of information-seeking behaviour of a well-defined scientific community, the use of personas for the modelling, and the concretization of the model into a new user-interface design.

City Research Online

The Future of Information Sciences : INFuture2009 : Digital Resources and Knowledge Sharing

Publication venue: Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb
Publication date: 01/11/2009

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb