
    The textual characteristics of traditional and Open Access scientific journals are similar

    Background: Recent years have seen an increased amount of natural language processing (NLP) work on full-text biomedical journal publications. Much of this work is done with Open Access journal articles. Such work assumes that Open Access articles are representative of biomedical publications in general and that methods developed for the analysis of Open Access full-text publications will generalize to the biomedical literature as a whole. If this assumption is wrong, the cost to the community will be large, including not just wasted resources but also flawed science. This paper examines that assumption.

    Results: We collected two sets of documents, one consisting only of Open Access publications and the other consisting only of traditional journal publications. We examined them for differences in surface linguistic structures that have obvious consequences for the ease or difficulty of natural language processing, and for differences in semantic content as reflected in lexical items. Regarding surface linguistic structures, we examined the incidence of conjunctions, negation, passives, and pronominal anaphora, and found that the two collections did not differ. We also examined the distribution of sentence lengths and found that both collections were characterized by the same mode. Regarding lexical items, we found that the Kullback-Leibler divergence between the two collections was low, and was lower than the divergence between either collection and a reference corpus. Where small differences did exist, log-likelihood analysis showed that they were primarily in the area of formatting and in specific named entities.

    Conclusion: We did not find structural or semantic differences between the Open Access and traditional journal collections.
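
    The lexical comparison this abstract describes rests on Kullback-Leibler divergence between the word distributions of two collections. The following is a minimal sketch of that computation in Python; the toy corpora, the Laplace smoothing constant alpha, and the helper names unigram_dist and kl_divergence are illustrative assumptions, not the paper's actual data or code.

    from collections import Counter
    import math

    def unigram_dist(tokens, vocab, alpha=1.0):
        """Laplace-smoothed unigram distribution over a shared vocabulary."""
        counts = Counter(tokens)
        total = len(tokens) + alpha * len(vocab)
        return {w: (counts[w] + alpha) / total for w in vocab}

    def kl_divergence(p, q):
        """KL(P || Q) in bits; p and q must share the same support."""
        return sum(p[w] * math.log2(p[w] / q[w]) for w in p)

    # Toy stand-ins for the tokenized Open Access and traditional collections.
    open_access = "the gene expression was measured in the cell line".split()
    traditional = "the protein expression was observed in the tissue sample".split()

    vocab = set(open_access) | set(traditional)
    p = unigram_dist(open_access, vocab)
    q = unigram_dist(traditional, vocab)
    print(f"KL(open access || traditional) = {kl_divergence(p, q):.4f} bits")

    Smoothing over the shared vocabulary keeps every probability nonzero, which KL divergence requires; a low value, as the paper reports, indicates that the two collections draw on nearly the same lexicon.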

    Encoding models for scholarly literature

    We examine the issue of digital formats for document encoding, archiving and publishing, through the specific example of "born-digital" scholarly journal articles. We will begin by looking at the traditional workflow of journal editing and publication, and how these practices have made the transition into the online domain. We will examine the range of different file formats in which electronic articles are currently stored and published. We will argue strongly that, despite the prevalence of binary and proprietary formats such as PDF and MS Word, XML is a far superior encoding choice for journal articles. Next, we look at the range of XML document structures (DTDs, Schemas) which are in common use for encoding journal articles, and consider some of their strengths and weaknesses. We will suggest that, despite the existence of specialized schemas intended specifically for journal articles (such as NLM), and more broadly-used publication-oriented schemas such as DocBook, there are strong arguments in favour of developing a subset or customization of the Text Encoding Initiative (TEI) schema for the purpose of journal-article encoding; TEI is already in use in a number of journal publication projects, and the scale and precision of the TEI tagset make it particularly appropriate for encoding scholarly articles. We will outline the document structure of a TEI-encoded journal article, and look in detail at suggested markup patterns for specific features of journal articles.
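
    To make the proposed structure concrete, here is a minimal sketch that builds the skeleton of a TEI-encoded journal article with Python's standard library. The element names (teiHeader, fileDesc, body, div, head, p) belong to the core TEI tagset, but the titles and content are hypothetical, and a real journal-article customization of the kind the paper proposes would constrain the schema well beyond this outline.

    import xml.etree.ElementTree as ET

    TEI_NS = "http://www.tei-c.org/ns/1.0"
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

    def tei(tag):
        """Qualify a tag name with the TEI namespace."""
        return f"{{{TEI_NS}}}{tag}"

    root = ET.Element(tei("TEI"))

    # teiHeader carries the bibliographic metadata that binary formats bury.
    file_desc = ET.SubElement(ET.SubElement(root, tei("teiHeader")), tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = "A hypothetical journal article"
    ET.SubElement(file_desc, tei("publicationStmt")).append(ET.Element(tei("p")))
    ET.SubElement(file_desc, tei("sourceDesc")).append(ET.Element(tei("p")))

    # text/body holds the article proper: sections (div), headings, paragraphs.
    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    section = ET.SubElement(body, tei("div"), attrib={"type": "section"})
    ET.SubElement(section, tei("head")).text = "Introduction"
    ET.SubElement(section, tei("p")).text = "Structured encoding eases reuse and archiving."

    print(ET.tostring(root, encoding="unicode"))

    The point of the skeleton is the separation the paper argues for: machine-readable metadata in the header, explicitly structured sections in the body, in contrast to the flat presentation-oriented layout of PDF or Word.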

    Multimedia search without visual analysis: the value of linguistic and contextual information

    This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features.

    Theory and Practice of Data Citation

    Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has. The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality as traditional scientific products. Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining the principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation. The current panorama is many-faceted and an overall view that brings together diverse aspects of this topic is still missing. Therefore, this paper aims to describe the lay of the land for data citation, from both the theoretical (the why and what) and the practical (the how) angles.

    Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association for Information Science and Technology (JASIST), 201

    Reviewing, indicating, and counting books for modern research evaluation systems

    In this chapter, we focus on the specialists who have helped to improve the conditions for book assessments in research evaluation exercises, with empirically based data and insights supporting their greater integration. Our review highlights the research carried out by four types of expert communities, referred to as the monitors, the subject classifiers, the indexers and the indicator constructionists. Many challenges lie ahead for scholars affiliated with these communities, particularly the latter three. By acknowledging their unique, yet interrelated roles, we show where the greatest potential is for both quantitative and qualitative indicator advancements in book-inclusive evaluation systems.

    Comment: Forthcoming in Glanzel, W., Moed, H.F., Schmoch, U., Thelwall, M. (2018). Springer Handbook of Science and Technology Indicators. Springer. Some corrections made in subsection 'Publisher prestige or quality'.

    Cultural consequences of computing technology

    Computing technology is clearly a technical revolution, but will most probably bring about a cultural revolution as well. The effects of this technology on human culture will be dramatic and far-reaching. Yet, computers and electronic networks are but the latest development in a long history of cognitive tools, such as writing and printing. We will examine this history, which exhibits long-term trends toward an increasing democratization of culture, before turning to today's technology. Within this framework, we will analyze the probable effects of computing on culture: dynamical representations, generalized networking, constant modification and reproduction. To address the problems posed by this new technical environment, we will suggest possible remedies. In particular, the role of social institutions will be discussed, and we will outline the shape of new electronic institutions able to deal with the information flow on the internet.

    Possibilities of quality enhancement in higher education by intensive use of information technology

    Quality of higher education is a multi-dimensional concept. It lies in the effectiveness of transmitting knowledge and skill; the authenticity, content, coverage and depth of information; the availability of reading and teaching materials; help in removing obstacles to learning; the applicability of knowledge in solving real-life problems; the fruitfulness of knowledge in personal and social domains; the convergence of content and variety of knowledge over space (countries and regions) and different sections of the people; and cost-effectiveness and administrative efficiency. Information technology has progressed very fast in the last three decades; it has produced equipment at affordable cost and has now made its wider application feasible. This technology has made the search, gathering, dissemination, storing, retrieval, transmission and reception of knowledge easier, cheaper and faster. In parallel, a vast virtual library vying with the print library has emerged and continues to grow rapidly. One may hold that e-libraries are the libraries of tomorrow, when print libraries will be antiques or archival objects of the past. This paper discusses in detail how information technology can be applied to enhance the quality of higher education at affordable cost. It also discusses the major obstacles to optimal utilization of information technology and measures to remove them.

    Keywords: Information Technology; Quality in Higher Education; e-library; e-book; e-journal