Structural Regularities in Text-based Entity Vector Spaces

Author: de Rijke Maarten
Kanoulas Evangelos
Van Gysel Christophe
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2017

Entity retrieval is the task of finding entities such as people or products in response to a query, based solely on the textual documents they are associated with. Recent semantic entity retrieval algorithms represent queries and experts in finite-dimensional vector spaces, where both are constructed from text sequences. We investigate entity vector spaces and the degree to which they capture structural regularities. Such vector spaces are constructed in an unsupervised manner without explicit information about structural aspects. For concreteness, we address these questions for a specific type of entity: experts in the context of expert finding. We discover how clusterings of experts correspond to committees in organizations, the ability of expert representations to encode the co-author graph, and the degree to which they encode academic rank. We compare latent, continuous representations created using methods based on distributional semantics (LSI), topic models (LDA) and neural networks (word2vec, doc2vec, SERT). Vector spaces created using neural methods, such as doc2vec and SERT, systematically perform better at clustering than LSI, LDA and word2vec. When it comes to encoding entity relations, SERT performs best. Comment: ICTIR2017. Proceedings of the 3rd ACM International Conference on the Theory of Information Retrieval. 201

arXiv.org e-Print Archive

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Modeling and Analysis of Scholar Mobility on Scientific Landscape

Author: Chiu Dah Ming
Venkatramanan Srinivasan
Ying Qiu Fang
Publication date: 10/03/2015

Scientific literature to date can be thought of as a partially revealed landscape, where scholars continue to unveil hidden knowledge by exploring novel research topics. How do scholars explore the scientific landscape, i.e., choose research topics to work on? We propose an agent-based model of topic mobility behavior where scholars migrate across research topics on the space of science following different strategies, seeking different utilities. We use this model to study whether strategies widely used in the current scientific community can provide a balance between individual scientific success and the efficiency and diversity of the whole academic society. Through extensive simulations, we provide insights into the roles of different strategies, such as choosing topics according to research potential or popularity. Our model provides a conceptual framework and a computational approach to analyze scholars' behavior and its impact on scientific production. We also discuss how such an agent-based modeling approach can be integrated with big real-world scholarly data. Comment: To appear in BigScholar, WWW 201

arXiv.org e-Print Archive

Crossref
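The abstract above describes an agent-based model in which scholars move between topics under different strategies. The following is only a toy sketch of that general idea, not the authors' model: the function name `simulate_topic_mobility`, the notion of a per-topic "potential" that is used up as scholars work on it, and every parameter value are assumptions made for illustration.

```python
import random


def simulate_topic_mobility(strategy, num_topics=5, num_scholars=10,
                            steps=20, seed=0):
    """Toy agent-based simulation (hypothetical rules, not from the paper).

    Returns (output per scholar, remaining potential per topic,
    initial total potential).
    """
    rng = random.Random(seed)
    # Assumption: each topic starts with some undiscovered knowledge.
    potential = [rng.uniform(5.0, 15.0) for _ in range(num_topics)]
    initial_total = sum(potential)
    # Each scholar starts on a random topic.
    location = [rng.randrange(num_topics) for _ in range(num_scholars)]
    output = [0.0] * num_scholars

    for _ in range(steps):
        for s in range(num_scholars):
            if strategy == "potential":
                # Move to the topic with the most knowledge left.
                target = max(range(num_topics), key=lambda t: potential[t])
            elif strategy == "popularity":
                # Move to the topic that currently hosts the most scholars.
                crowd = [location.count(t) for t in range(num_topics)]
                target = max(range(num_topics), key=lambda t: crowd[t])
            else:
                raise ValueError(f"unknown strategy: {strategy!r}")
            location[s] = target
            # A scholar extracts at most one unit of knowledge per step.
            gain = min(1.0, potential[target])
            potential[target] -= gain
            output[s] += gain
    return output, potential, initial_total


if __name__ == "__main__":
    for name in ("potential", "popularity"):
        out, remaining, total = simulate_topic_mobility(name)
        print(name, "discovered", round(sum(out), 2), "of", round(total, 2))
```

Under these toy rules, popularity-driven scholars herd onto a single topic and leave the others untouched, while potential-driven scholars spread across topics. This illustrates the kind of trade-off the abstract mentions without reproducing its results.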

Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure

Author: Zhuge Hai
Publication date: 18/07/2015

Big data research has attracted great attention in science, technology, industry and society. It is developing with the evolving scientific paradigm, the fourth industrial revolution, and the transformational innovation of technologies. However, its nature and fundamental challenge have not been recognized, and its own methodology has not been formed. This paper explores and answers the following questions: What is big data? What are the basic methods for representing, managing and analyzing big data? What is the relationship between big data and knowledge? Can we find a mapping from big data into knowledge space? What kind of infrastructure is required to support not only big data management and analysis but also knowledge discovery, sharing and management? What is the relationship between big data and science paradigm? What is the nature and fundamental challenge of big data computing? A multi-dimensional perspective is presented toward a methodology of big data computing. Comment: 59 page

arXiv.org e-Print Archive

CiteSeerX

Serendipitous research process

Author: Nutefall Jennifer E.
Ryder Phyllis Mentzell
Publication venue: Scholar Commons
Publication date: 01/01/2010

This article presents the results of an exploratory study asking faculty in the first-year writing program and instruction librarians about their research process, focusing on results specifically related to serendipity. Steps to prepare for serendipity are highlighted, as well as a model for incorporating serendipity into a first-year writing course.

Scholar Commons - Santa Clara University

Taxonomy for Humans or Computers? Cognitive Pragmatics for Big Data

Author: Franz Nico M.
Sterner Beckett
Publication date: 01/01/2017

Criticism of big data has focused on showing that more is not necessarily better, in the sense that data may lose their value when taken out of context and aggregated together. The next step is to incorporate an awareness of pitfalls for aggregation into the design of data infrastructure and institutions. A common strategy minimizes aggregation errors by increasing the precision of our conventions for identifying and classifying data. As a counterpoint, we argue that there are pragmatic trade-offs between precision and ambiguity that are key to designing effective solutions for generating big data about biodiversity. We focus on the importance of theory-dependence as a source of ambiguity in taxonomic nomenclature and hence a persistent challenge for implementing a single, long-term solution to storing and accessing meaningful sets of biological specimens. We argue that ambiguity does have a positive role to play in scientific progress as a tool for efficiently symbolizing multiple aspects of taxa and mediating between conflicting hypotheses about their nature. Pursuing a deeper understanding of the trade-offs and synthesis of precision and ambiguity as virtues of scientific language and communication systems then offers a productive next step for realizing sound, big biodiversity data services.

PhilPapers

Identifying creative research accomplishments: methodology and results for nanotechnology and human genetics

Author: Heinze Thomas
Kuhlmann Stefan
Senker Jacqueline
Shapira Philip

Motivated by concerns about the organizational and institutional conditions that foster research creativity in science, we focus on how creative research can be defined, operationalized, and empirically identified. A functional typology of research creativity is proposed encompassing theoretical, methodological and empirical developments in science. We then apply this typology through a process of creative research event identification in the fields of nanotechnology and human genetics in Europe and the United States, combining nominations made by several hundred experts with data on prize winners. Characteristics of creative research in the two respective fields are analyzed, and there is a discussion of broader insights offered by our approach.

Research Papers in Economics

Hypermedia-based discovery for source selection using low-cost linked data interfaces

Author: Colpaert Pieter
Dimou Anastasia
Mannens Erik
Vander Sande Miel
Verborgh Ruben
Publication venue: IGI Global
Publication date: 01/01/2016

Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources, and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often marginally discussed, even though it has a strong impact on selecting sources that contribute to the query results. Therefore, the authors introduce a discovery approach for Linked Data interfaces based on hypermedia links and controls, and apply it to federated query execution with Triple Pattern Fragments. In addition, the authors identify quantitative metrics to evaluate this discovery approach. This article describes generic evaluation measures and results for their concrete approach. With low-cost data summaries as seed, interfaces to eight large real-world datasets can discover each other within 7 minutes. Hypermedia-based client-side querying shows a promising gain of up to 50% in execution time, but demands algorithms that visit a higher number of interfaces to improve result completeness.

Ghent University Academic Bibliography

Information retrieval and machine learning methods for academic expert finding

Author: Bolaños Néstor
de Campos Luis M.
Fernández Luna Juan Manuel
Huete Juan F.
Ribadas Pena Francisco José
Publication venue: COmputational LEarning
Publication date: 07/02/2024

In the context of academic expert finding, this paper investigates and compares the performance of information retrieval (IR) and machine learning (ML) methods, including deep learning, to approach the problem of identifying academic figures who are experts in different domains when a potential user requests their expertise. IR-based methods construct multifaceted textual profiles for each expert by clustering information from their scientific publications. Several methods fully tailored for this problem are presented in this paper. In contrast, ML-based methods treat expert finding as a classification task, training automatic text classifiers using publications authored by experts. By comparing these approaches, we contribute to a deeper understanding of academic-expert-finding techniques and their applicability in knowledge discovery. These methods are tested with two large datasets from the biomedical field: PMSC-UGR and CORD-19. The results show how IR techniques were, in general, more robust with both datasets and more suitable than the ML-based ones, with some exceptions showing good performance. Funding: Agencia Estatal de Investigación | Ref. PID2019-106758GB-C31; Agencia Estatal de Investigación | Ref. PID2020-113230RB-C22; FEDER/Junta de Andalucía | Ref. A-TIC-146-UGR2

Investigo
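The abstract above contrasts IR-based expert finding, which builds textual profiles from each expert's publications, with ML-based classification. As a rough illustration of the IR side only, here is a minimal sketch that assumes one flat TF-IDF profile per expert and cosine ranking. This is a simplification chosen for brevity and is not the paper's method: the paper describes multifaceted profiles built by clustering, which the sketch omits. All function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer that keeps alphabetic tokens only."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def build_profiles(publications):
    """Map each expert to the term counts of all their publication texts."""
    return {
        expert: Counter(tok for doc in docs for tok in tokenize(doc))
        for expert, docs in publications.items()
    }


def idf_weights(profiles):
    """Smoothed inverse document frequency, one 'document' per expert."""
    n = len(profiles)
    df = Counter()
    for counts in profiles.values():
        df.update(counts.keys())
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}


def weigh(counts, idf):
    """Turn raw term counts into a sparse TF-IDF vector."""
    return {term: c * idf.get(term, 0.0) for term, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_experts(query, publications):
    """Return (expert, score) pairs sorted by similarity to the query."""
    profiles = build_profiles(publications)
    idf = idf_weights(profiles)
    query_vec = weigh(Counter(tokenize(query)), idf)
    scores = {
        expert: cosine(query_vec, weigh(counts, idf))
        for expert, counts in profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "alice": ["protein folding dynamics", "protein structure prediction"],
        "bob": ["vaccine trials for coronavirus", "coronavirus transmission models"],
    }
    for expert, score in rank_experts("coronavirus vaccine", sample):
        print(expert, round(score, 3))
```

A single flat profile blurs an expert's distinct research lines together, which is presumably one motivation for the multifaceted profiles the abstract mentions.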

Press Start: the value of an online student-led, peer-reviewed game studies journal

Author: Barr Matthew
Publication venue: Informa UK Limited
Publication date: 07/12/2017

In this article, an online student journal is described, and the ways in which student participants value the journal are discussed. Press Start is a peer-reviewed international journal of game studies, which aims to publish the best student work related to the academic study of video games. Content analysis of qualitative survey data (n = 29) provides insights into what students value about the journal, revealing six broad themes: community and support, inclusiveness and accessibility, the published research, feedback from peer review, experience of conducting peer review and the opportunity to publish. The article concludes by suggesting that engagement with online student journals should not be limited in terms of geography or the level of study, unless there are robust pedagogical reasons for doing so.

Directory of Open Access Journals

Enlighten