Structural Regularities in Text-based Entity Vector Spaces
Entity retrieval is the task of finding entities such as people or products
in response to a query, based solely on the textual documents they are
associated with. Recent semantic entity retrieval algorithms represent queries
and experts in finite-dimensional vector spaces, where both are constructed
from text sequences.
We investigate entity vector spaces and the degree to which they capture
structural regularities. Such vector spaces are constructed in an unsupervised
manner without explicit information about structural aspects. For concreteness,
we address these questions for a specific type of entity: experts in the
context of expert finding. We discover how clusterings of experts correspond to
committees in organizations, the ability of expert representations to encode
the co-author graph, and the degree to which they encode academic rank. We
compare latent, continuous representations created using methods based on
distributional semantics (LSI), topic models (LDA) and neural networks
(word2vec, doc2vec, SERT). Vector spaces created using neural methods, such as
doc2vec and SERT, systematically perform better at clustering than LSI, LDA and
word2vec. When it comes to encoding entity relations, SERT performs best.
Comment: ICTIR2017. Proceedings of the 3rd ACM International Conference on the
Theory of Information Retrieval. 201
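As a rough illustration of the general idea described in this abstract (queries and experts placed in one vector space built from text), the following minimal sketch ranks experts by cosine similarity to a query. It is not the paper's method: plain bag-of-words counts stand in for the learned LSI, LDA, word2vec, doc2vec, or SERT representations, and the names `bag_of_words`, `cosine`, `rank_experts`, and the toy `EXPERT_DOCS` data are all assumptions made for this example.

```python
import math
from collections import Counter

# Toy data (hypothetical): each expert is associated with a few short documents.
EXPERT_DOCS = {
    "alice": ["neural ranking models for retrieval",
              "query understanding with neural networks"],
    "bob": ["protein folding simulation",
            "molecular dynamics of proteins"],
}


def bag_of_words(text):
    """Return lowercase term counts; a crude stand-in for a learned embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_experts(query, expert_docs):
    """Rank expert names by similarity between the query and their pooled text."""
    q = bag_of_words(query)
    scores = {
        name: cosine(q, bag_of_words(" ".join(docs)))
        for name, docs in expert_docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(rank_experts("neural retrieval", EXPERT_DOCS))
```

Pooling all of an expert's documents into one vector is only one possible design; a learned model would replace `bag_of_words` with its own representation while the ranking step could stay the same.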
Modeling and Analysis of Scholar Mobility on Scientific Landscape
Scientific literature to date can be thought of as a partially revealed
landscape, where scholars continue to unveil hidden knowledge by exploring
novel research topics. How do scholars explore the scientific landscape, i.e.,
choose research topics to work on? We propose an agent-based model of topic
mobility behavior where scholars migrate across research topics on the space of
science following different strategies, seeking different utilities. We use
this model to study whether strategies widely used in the current scientific
community can provide a balance between individual scientific success and the
efficiency and diversity of the whole academic society. Through extensive
simulations, we provide insights into the roles of different strategies, such
as choosing topics according to research potential or popularity. Our model
provides a conceptual framework and a computational approach to analyze
scholars' behavior and its impact on scientific production. We also discuss how
such an agent-based modeling approach can be integrated with big real-world
scholarly data.
Comment: To appear in BigScholar, WWW 201
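To make the notion of an agent-based topic-mobility model more concrete, here is a deliberately tiny sketch contrasting two strategies named in the abstract: choosing by research potential versus choosing by popularity. Everything about it is an assumption for illustration (the function `simulate`, the integer "hidden knowledge" per topic, the update rules, and the parameter values); it does not reproduce the authors' model or utilities.

```python
import random


def simulate(strategy, n_topics=5, n_scholars=10, steps=20, seed=0):
    """Run a toy simulation and return (discoveries made, knowledge left per topic).

    strategy: "potential" moves each scholar to the topic with the most
    undiscovered knowledge; "popularity" moves each scholar to the topic
    currently holding the most scholars.
    """
    rng = random.Random(seed)
    # Hypothetical setup: each topic hides between 5 and 15 units of knowledge.
    potential = [rng.randint(5, 15) for _ in range(n_topics)]
    position = [rng.randrange(n_topics) for _ in range(n_scholars)]
    discovered = 0

    for _ in range(steps):
        for s in range(n_scholars):
            if strategy == "potential":
                position[s] = max(range(n_topics), key=lambda t: potential[t])
            elif strategy == "popularity":
                counts = [position.count(t) for t in range(n_topics)]
                position[s] = max(range(n_topics), key=lambda t: counts[t])
            else:
                raise ValueError(f"unknown strategy: {strategy}")
            # A scholar uncovers one unit if the chosen topic has any left.
            topic = position[s]
            if potential[topic] > 0:
                potential[topic] -= 1
                discovered += 1
    return discovered, potential


if __name__ == "__main__":
    for name in ("potential", "popularity"):
        found, left = simulate(name)
        print(name, "discovered:", found, "left per topic:", left)
```

In this toy setup, scholars following popularity herd onto a single topic and leave the others unexplored, while scholars following potential eventually exhaust every topic. That contrast is a property of these simplified rules only, not a result from the paper.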
Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure
Big data research has attracted great attention in science, technology,
industry and society. It is developing with the evolving scientific paradigm,
the fourth industrial revolution, and the transformational innovation of
technologies. However, its nature and fundamental challenge have not been
recognized, and its own methodology has not been formed. This paper explores
and answers the following questions: What is big data? What are the basic
methods for representing, managing and analyzing big data? What is the
relationship between big data and knowledge? Can we find a mapping from big
data into knowledge space? What kind of infrastructure is required to support
not only big data management and analysis but also knowledge discovery, sharing
and management? What is the relationship between big data and science paradigm?
What is the nature and fundamental challenge of big data computing? A
multi-dimensional perspective is presented toward a methodology of big data
computing.
Comment: 59 page
Serendipitous research process
This article presents the results of an exploratory study asking faculty in the first-year writing program and instruction librarians about their research process, focusing on results specifically related to serendipity. Steps to prepare for serendipity are highlighted as well as a model for incorporating serendipity into a first-year writing course.
Taxonomy for Humans or Computers? Cognitive Pragmatics for Big Data
Criticism of big data has focused on showing that more is not necessarily better, in the sense that data may lose their value when taken out of context and aggregated together. The next step is to incorporate an awareness of pitfalls for aggregation into the design of data infrastructure and institutions. A common strategy minimizes aggregation errors by increasing the precision of our conventions for identifying and classifying data. As a counterpoint, we argue that there are pragmatic trade-offs between precision and ambiguity that are key to designing effective solutions for generating big data about biodiversity. We focus on the importance of theory-dependence as a source of ambiguity in taxonomic nomenclature and hence a persistent challenge for implementing a single, long-term solution to storing and accessing meaningful sets of biological specimens. We argue that ambiguity does have a positive role to play in scientific progress as a tool for efficiently symbolizing multiple aspects of taxa and mediating between conflicting hypotheses about their nature. Pursuing a deeper understanding of the trade-offs and synthesis of precision and ambiguity as virtues of scientific language and communication systems then offers a productive next step for realizing sound, big biodiversity data services.
Identifying creative research accomplishments: methodology and results for nanotechnology and human genetics
Motivated by concerns about the organizational and institutional conditions that foster research creativity in science, we focus on how creative research can be defined, operationalized, and empirically identified. A functional typology of research creativity is proposed encompassing theoretical, methodological and empirical developments in science. We then apply this typology through a process of creative research event identification in the fields of nanotechnology and human genetics in Europe and the United States, combining nominations made by several hundred experts with data on prize winners. Characteristics of creative research in the two respective fields are analyzed, and there is a discussion of broader insights offered by our approach.
Hypermedia-based discovery for source selection using low-cost linked data interfaces
Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources, and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often marginally discussed, even though it has a strong impact on selecting sources that contribute to the query results. Therefore, the authors introduce a discovery approach for Linked Data interfaces based on hypermedia links and controls, and apply it to federated query execution with Triple Pattern Fragments. In addition, the authors identify quantitative metrics to evaluate this discovery approach. This article describes generic evaluation measures and results for their concrete approach. With low-cost data summaries as seed, interfaces to eight large real-world datasets can discover each other within 7 minutes. Hypermedia-based client-side querying shows a promising gain of up to 50% in execution time, but demands algorithms that visit a higher number of interfaces to improve result completeness.
Information retrieval and machine learning methods for academic expert finding
In the context of academic expert finding, this paper investigates and compares the performance of information retrieval (IR) and machine learning (ML) methods, including deep learning, to approach the problem of identifying academic figures who are experts in different domains when a potential user requests their expertise. IR-based methods construct multifaceted textual profiles for each expert by clustering information from their scientific publications. Several methods fully tailored for this problem are presented in this paper. In contrast, ML-based methods treat expert finding as a classification task, training automatic text classifiers using publications authored by experts. By comparing these approaches, we contribute to a deeper understanding of academic-expert-finding techniques and their applicability in knowledge discovery. These methods are tested with two large datasets from the biomedical field: PMSC-UGR and CORD-19. The results show how IR techniques were, in general, more robust with both datasets and more suitable than the ML-based ones, with some exceptions showing good performance.
Agencia Estatal de Investigación | Ref. PID2019-106758GB-C31
Agencia Estatal de Investigación | Ref. PID2020-113230RB-C22
FEDER/Junta de Andalucía | Ref. A-TIC-146-UGR2
Press Start: the value of an online student-led, peer-reviewed game studies journal
In this article, an online student journal is described, and the ways in which student participants value the journal are discussed. Press Start is a peer-reviewed international journal of game studies, which aims to publish the best student work related to the academic study of video games. Content analysis of qualitative survey data (n = 29) provides insights into what students value about the journal, revealing six broad themes: community and support, inclusiveness and accessibility, the published research, feedback from peer review, experience of conducting peer review and the opportunity to publish. The article concludes by suggesting that engagement with online student journals should not be limited in terms of geography or the level of study, unless there are robust pedagogical reasons for doing so.